Submitted to the Program in Media Arts & Sciences, School of Architecture & Planning on July 19, 1996 in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology
Abstract
Face-to-face interaction between people is generally effortless and effective. We exchange glances, take turns speaking and make facial and manual gestures to achieve the goals of the dialogue. Endowing computers with such an interaction style marks the beginning of a new era in our relationship with machines: one that relies on communication, social convention and dialogue skills. This thesis presents a computational model of psychosocial dialogue expertise, bridging between perceptual analysis of multimodal events and multimodal action generation, supporting the creation of interfaces that afford full-duplex, real-time face-to-face interaction between a human and autonomous computer characters. The architecture, called Ymir, has been implemented in software, and a prototype humanoid has been created. The humanoid, named Gandalf, commands a graphical model of the solar system, and can interact with people using speech and manual and facial gesture. Gandalf has been tested in interaction with users and has been shown capable of fluid face-to-face dialogue. The prototype demonstrates several new ideas in the creation of communicative computer agents, including perceptual integration of multimodal events, distributed processing and decision making, layered input analysis and motor control, and the integration of reactive and reflective perception and action. Applications of the work presented in this thesis can be expected in such diverse fields as education, psychological and social research, work environments, and entertainment.
JUSTINE CASSELL, Assistant Professor of Media Arts & Sciences, MIT Program in Media Arts & Sciences
PATTIE MAES, Associate Professor of Media Arts & Sciences, Sony Corporation Career Development Professor of Media Arts & Sciences
STEPHEN WHITTAKER, Research Scientist, AT&T Labs Research
ALL FILES PDF: YOU WILL NEED ADOBE ACROBAT READER
Table of Contents
0. Abstract & Table of Contents [PDF]
1. Introduction [PDF]
2. Face-to-face Interface [PDF]
3. Multimodal Dialogue: Psychological and Interface Research [PDF] [Table 1 (ps)]
4. Agents, Robots and Artificial Intelligence [PDF]
5. Computational Characteristics of Multimodal Dialogue [PDF]
6. J.Jr.: A Study in Reactivity [PDF]
7. Ymir: A Generative Model of Psychosocial Dialogue Skills [PDF]
8. Ymir: An Implementation in LISP [PDF]
9. Gandalf: Humanoid One [PDF]
- Black-and-White QuickTime of Gandalf - 2 MB
- QuickTime of Gandalf 2 - 20 MB
- QuickTime of Gandalf 3 - 25.3 MB
10. Ymir / Gandalf: An Evaluation in Three Parts [PDF]
11. Designing Humanoid Agents: Some High-Level Issues [PDF]
12. Conclusions & Future Work [PDF]
13. References [PDF]
Appendix 1. Character Animation [PDF] — [Related technical paper: PDF]
Appendix 2. System Specifications [PDF]
Appendix 3. Questionnaires & Scoring [PDF]
Copyright 1997 K.R.Thórisson. All rights reserved.