Analysis, Interpretation and Synthesis of Facial Expressions

Irfan A. Essa

Ph. D. Thesis Submitted to the Program in Media Arts & Sciences, MIT Media Laboratory, Sept. 1994


This thesis describes a computer vision system for observing the "action units" of a face using video sequences as input. The visual observation (sensing) is achieved by using an optimal estimation optical flow method coupled with a geometric and a physical (muscle) model describing the facial structure. This modeling results in a time-varying spatial patterning of facial shape and a parametric representation of the independent muscle action groups responsible for the observed facial motions. These muscle action patterns are then used for analysis, interpretation, recognition, and synthesis of facial expressions. Thus, by interpreting facial motions within a physics-based optimal estimation framework, a new control model of facial movement is developed. The newly extracted action units (which we name "FACS+") are both physics- and geometry-based, and extend the well-known FACS parameters for facial expressions by adding temporal information and non-local spatial patterning of facial motion.

Also available as MIT Media Laboratory, Vision and Modeling Group Technical Report # 303

Compressed ASCII PostScript file (9.385 MB)

You can do any of the following to get my tech reports:
View them via a PostScript viewer
Request to download the Tech Report
Send email to request a hard copy (please try downloading it yourself first)

Irfan A. Essa (1994)
TR#303: Analysis, Interpretation and Synthesis of Facial Expressions [9385 kBytes], Ph. D. Thesis, Massachusetts Institute of Technology, Cambridge, MA. 1994.


Irfan Essa, irfan@media.mit.edu
Last modified: Thu Feb 15 10:29:53 EST 1996