NSF Report - Facial Expression Understanding

 

V. RECOMMENDATIONS

Joseph C. Hager

The recommendations of this Workshop are summarized here under four major headings: Basic Research, Infrastructure, Tools, and Training. These recommendations were collected from the rest of this report, where justification may be found, and from the proceedings of the Workshop. The list is long, emphasizing the specific, important research and development efforts needed to advance this area. It should make the extent of the tasks concrete, although it does not exhaust the worthwhile contributions. Focusing on specifics rather than general efforts gives investigators more precise guidance about the research questions and gives funding institutions more precise guidance about the support required.

Basic Research on the Face that Answers These Crucial Questions:

  • A. In regard to the system of facial signs and their perception:
    • i. How much of a known prototypical emotion expression is needed for observers to recognize it?
    • ii. Are there differences among specific cultures in perception of basic emotion expressions?
    • iii. What variations on the prototypical emotion expressions are still perceived as belonging to the same basic emotion?
    • iv. How are expressions that contain only some of the actions of the prototypical expression judged?
    • v. How are blends of different emotion expressions within the same expression perceived and judged?
    • vi. What is the effect of asymmetry on the perception of emotion expression and other signs?
    • vii. How does the perception of emotion expression affect the perception of the expresser's other personal characteristics?
    • viii. What about an expression indicates deception versus honesty, genuine involuntary expression versus feigned or deliberately produced expression, and suppression or augmentation of an involuntary expression?
    • ix. How do emotion signs interact with other facial signs to produce an interpretation by an observer?
    • x. What is the relative contribution of the face compared to other signal systems, such as the body and language, and how are these systems integrated by the expresser and decoded by the observer?
    • xi. What are the temporal dynamics of facial movements and expressions, and do they provide additional information beyond the configuration?
    • xii. What effect do co-occurring or adjacent muscular movements in the sequence of facial movements have on each other (co-articulation)?
  • B. What are the subjective and physiological consequences of voluntary production of partial and complete prototype emotion expressions?
  • C. In regard to spontaneous expressive behavior:
    • i. What are the range and variability of spontaneous facial expressive behavior?
    • ii. How frequent are the facial prototypes of emotion in real-world situations?
    • iii. Across cultures, what components of spontaneous emotion expressions are invariant within an emotion family?
    • iv. How do expressive behaviors, such as head and lip movements, contribute to the perception and interpretation of facial expressions?
    • v. What are the relationships among speech behaviors, vocal expression, and facial expressions?
    • vi. What information is carried in physiological indices that is not available in expressive measures?
  • D. What behavioral and physiological indices differentiate between voluntary and spontaneous expressive productions?
  • E. What facial behaviors are relevant to human performance and how can these be monitored automatically?
  • F. In regard to the neural and physiological basis for facial messages:
    • i. What are the neural correlates of the perception of facial information, specifically the neural centers for facial perception, their structure, and their connections to other neural centers?
    • ii. What are the neural correlates for the production of facial expression, specifically the neural centers for involuntary motor action, their efferent pathways, and their connections to other effector neural centers?
    • iii. What, if any, are the relationships between the production and the perception of facial expressions in regard to neural activity?

Infrastructure Resources that Include the Following:

  • A. A multimedia database shared on the Internet that contains images, sounds, numeric data, text, and various tools relevant to facial expression and its understanding. The images should include:
    • i. still and moving images of the face that reflect a number of important variables and parameters, separately and in combination,
    • ii. voluntary productions of facial movements, with descriptive tags, accompanied by data from other sensors,
    • iii. spontaneous movements carefully cataloged with associated physiology,
    • iv. animation exemplars for use in perception studies,
    • v. compilation of extant findings with more fine-grained description of facial behavior,
    • vi. large numbers of standardized facial images used to evaluate performance of alternative techniques of machine measurement and modeling,
    • vii. speech sounds and other vocalizations associated with facial messages.
  • B. A survey of what images meeting the criteria in A. above currently exist and can be incorporated into the database, and a report of the images that need to be collected.
  • C. Specifications for video recordings and equipment that would enable sharing the productions of different laboratories and that anticipate the rapid developments in imaging and image compression technology.
  • D. Standards for digitized images and sounds, data formats, and other elements shared in the database that are compatible with other national databases.
  • E. A security system that protects the privileged nature of some items in the database while maximizing free access to open items.
  • F. Analysis of database performance and design.
  • G. Strategies and opportunities to share expensive equipment or complex software among laboratories.

Tools for Processing and Analyzing Faces and Related Data:

  • A. Methods for detecting and tracking faces and heads in complex images.
  • B. Programs to translate among different visual facial measurement methods.
  • C. Automated facial measurements:
    • i. detecting and tracking 3D head position and orientation,
    • ii. detecting and tracking eye movements, gaze direction and eye closure,
    • iii. detecting and measuring lip movement and mouth opening,
    • iv. detecting and measuring facial muscular actions, including the following independent capabilities:
      • a. detection of brow movements,
      • b. detection of smiles,
      • c. detection of actions that are neither smiles nor brow movements,
      • d. techniques for temporally segmenting the flow of facial behavior,
      • e. detection of onset, apex, and offset of facial muscle activity,
      • f. detection of limited subsets of facial actions.
  • D. Parametric and other models, including 3D, of the human face and head that enable accurate rendering of different expressions given a specific face.
    • i. anatomically correct physical models of the head and face,
    • ii. complete image atlas of the head, including soft and hard tissue.
  • E. Algorithms for assembling discrete measurements into meaningful chunks for interpretation.
  • F. Programs for translating facial measurements in terms of emotion, cognitive process, and other phenomena incapable of direct observation.
  • G. Programs for translating lip movements to speech.
  • H. Automated gesture recognition.
  • I. Programs for integrating and analyzing measurements of different modalities, such as visual, speech, and EMG.
  • J. Pattern discovery and recognition in multiple physiological measures.
  • K. Further exploration of novel computer vision and image processing techniques in processing the face, such as the use of color and 3D.
  • L. Development of "real-time" distance range sensors useful in constructing 3D head and face models.
  • M. Development of interactive systems for facial analysis.
  • N. Development and adaptation of parallel processing hardware to automated measurement.
  • O. Video sensors and control equipment to enable "active vision" cameras that would free behavioral scientists from requirements to keep subjects relatively stationary.
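
To make one of the tools above concrete, item C.iv.e (detection of onset, apex, and offset of facial muscle activity) could be sketched as follows. This is only a minimal illustration, not a method proposed by the Workshop. It assumes that a per-frame intensity value for a single facial action has already been measured by some other means, and the function name, the sample trace, and the threshold of 0.2 are hypothetical choices made for the example.

```python
from typing import List, Optional, Tuple


def find_onset_apex_offset(
    intensity: List[float], threshold: float = 0.2
) -> Optional[Tuple[int, int, int]]:
    """Return (onset, apex, offset) frame indices for the first episode
    in which intensity exceeds the threshold, or None if it never does.

    onset:  first frame with intensity above the threshold
    apex:   frame of maximum intensity in the episode (first one if tied)
    offset: last consecutive frame with intensity above the threshold

    The threshold is an assumed, illustrative value, not an empirical one.
    """
    # Find the first frame that rises above the threshold.
    onset = None
    for i, value in enumerate(intensity):
        if value > threshold:
            onset = i
            break
    if onset is None:
        return None

    # Extend the episode while the following frames stay above threshold.
    offset = onset
    while offset + 1 < len(intensity) and intensity[offset + 1] > threshold:
        offset += 1

    # The apex is the frame of peak intensity inside the episode.
    apex = max(range(onset, offset + 1), key=lambda i: intensity[i])
    return onset, apex, offset


# Hypothetical intensity trace for one facial action, one value per frame.
trace = [0.0, 0.1, 0.3, 0.7, 0.9, 0.6, 0.25, 0.1, 0.0]
print(find_onset_apex_offset(trace))  # prints (2, 4, 6)
```

A fixed threshold is the simplest possible rule. Real recordings would probably need smoothing and a threshold calibrated per subject, which touches on the temporal segmentation problem listed in item C.iv.d.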

Training and Education for Experienced and Beginning Investigators:

  • A. Resources providing specialized training:
    • i. Post-doctoral, interdisciplinary training using multiple institutions,
    • ii. Summer Institutes for the study of machine understanding of the face,
    • iii. Centers of excellence in geographic centers where high concentrations of relevant investigators and resources exist.
  • B. Resources to facilitate communication among investigators:
    • i. Special journal sections to bring information from different disciplines to the attention of other relevant disciplines,
    • ii. Computer bulletin boards,
    • iii. On-line journal or newsletter publishing and information exchange,
    • iv. Liaison to the business/industrial private sector.

NOTE: These recommendations were compiled by J. Hager from the discussions, reports, and working sessions of the Workshop.