Contrast is a common phenomenon in spoken language and carries information that helps listeners understand the content and structure of speech. In this paper, we propose automatic contrast detection as a step toward better speech understanding. We study the automatic tagging of three specific types of contrast: symmetric contrast, contrastive focus, and contrastive topic. Words belonging to any of the three types are labeled contrast (C), and all other words are labeled noncontrast (C̄). Classification of contrast events draws on prosodic, spectral, and part-of-speech (POS) information. These knowledge sources are integrated by a time-delay recursive neural network (TDRNN). The proposed approach was tested on 235 spontaneous utterances comprising 3500 words (samples). Contrast detection was speaker independent. The tests yielded an average classification rate of 87.9%.
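
The abstract does not give the network's details, so the following is only a minimal sketch of how per-word prosodic, spectral, and POS features might be fused and passed through a time-delay window into a recurrent layer. Everything in it is an assumption made for illustration: the feature dimensions, the delay length, the Elman-style recurrence, the 0.5 decision threshold, and the random untrained weights are all hypothetical, and none of it is taken from the paper.

```python
import math
import random


def stack_delays(features, delays=2):
    """Join each word's vector with those of the `delays` preceding words."""
    dim = len(features[0])
    # Zero vectors stand in for the missing context before the first word.
    padded = [[0.0] * dim for _ in range(delays)] + features
    # Row t concatenates words t-delays .. t, oldest first.
    return [sum(padded[t:t + delays + 1], []) for t in range(len(features))]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def tdrnn_forward(inputs, w_in, w_rec, w_out):
    """Run one recurrent layer over the delayed inputs; return P(contrast) per word."""
    hidden = [0.0] * len(w_rec)
    probs = []
    for frame in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, frame), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        score = sum(w * h for w, h in zip(w_out, hidden))
        probs.append(1.0 / (1.0 + math.exp(-score)))
    return probs


rng = random.Random(0)


def rand_matrix(rows, cols):
    """Small random weights; a real system would learn these from labeled data."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


# Hypothetical per-word features: 3 prosodic + 4 spectral + 5 one-hot POS values.
n_words, n_pos = 6, 5
features = []
for _ in range(n_words):
    prosodic = [rng.gauss(0.0, 1.0) for _ in range(3)]
    spectral = [rng.gauss(0.0, 1.0) for _ in range(4)]
    pos = [0.0] * n_pos
    pos[rng.randrange(n_pos)] = 1.0
    features.append(prosodic + spectral + pos)  # fused 12-dimensional vector

x = stack_delays(features, delays=2)  # each row now has 3 * 12 = 36 values
hidden_size = 8
w_in = rand_matrix(hidden_size, len(x[0]))
w_rec = rand_matrix(hidden_size, hidden_size)
w_out = [rng.gauss(0.0, 0.1) for _ in range(hidden_size)]

probs = tdrnn_forward(x, w_in, w_rec, w_out)
labels = ["C" if p >= 0.5 else "non-C" for p in probs]
```

Because the weights are random, the resulting labels carry no meaning; the sketch only illustrates the data flow from fused features to a per-word contrast decision.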

Original language: English (US)
Number of pages: 4
State: Published - 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: Oct 4 2004 to Oct 8 2004



ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language


Title: Automatic detection of contrast for speech understanding
