Toward Multimodal Human-Computer Interface

Rajeev Sharma, Vladimir I. Pavlovic, Thomas S. Huang

Research output: Contribution to journal (article, peer-reviewed)

Abstract

Recent advances in various signal-processing technologies, coupled with an explosion in the available computing power, have given rise to a number of novel human-computer interaction (HCI) modalities: speech, vision-based gesture recognition, eye tracking, electroencephalograph, etc. Successful embodiment of these modalities into an interface has the potential of easing the HCI bottleneck that has become noticeable with the advances in computing and communication. It has also become increasingly evident that the difficulties encountered in the analysis and interpretation of individual sensing modalities may be overcome by integrating them into a multimodal human-computer interface. In this paper, we examine several promising directions toward achieving multimodal HCI. We consider some of the emerging novel input modalities for HCI and the fundamental issues in integrating them at various levels, from early "signal" level to intermediate "feature" level to late "decision" level. We discuss the different computational approaches that may be applied at the different levels of modality integration. We also briefly review several demonstrated multimodal HCI systems and applications. Despite all the recent developments, it is clear that further research is needed for interpreting and fusing multiple sensing modalities in the context of HCI. This research can benefit from many disparate fields of study that increase our understanding of the different human communication modalities and their potential role in HCI.

Original language: English (US)
Pages (from-to): 853-869
Number of pages: 17
Journal: Proceedings of the IEEE
Volume: 86
Issue number: 5
State: Published - 1998

Keywords

  • Human-computer interface
  • Multimodality
  • Sensor fusion

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
