Novel time domain multi-class SVMs for landmark detection

Rahul Chitturi, Mark Hasegawa-Johnson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The training of precise speech recognition models depends on accurate segmentation of the phonemes in a training corpus. Segmentation is typically performed using HMMs, but recent speech recognition work suggests that the transient acoustic features characteristic of manner-class phoneme boundaries (landmarks) may be localized more precisely by acoustic classifiers designed specifically for landmark detection. This paper empirically explores new features suited to landmark detection and applies multi-class SVMs that can improve the time alignment of phoneme boundaries proposed by binary SVMs and by an HMM-based speech recognizer. On a standard benchmark data set (a database of Telugu, an official Indian language spoken by 75 million people), we achieve new state-of-the-art performance, reducing RMS phone boundary alignment error from 32 ms to 22 ms.
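The reported metric, RMS phone boundary alignment error, can be illustrated with a minimal sketch. The abstract does not describe the evaluation protocol, so the one-to-one pairing of reference and hypothesized boundaries is an assumption, and the function name `rms_boundary_error_ms` and the boundary times are hypothetical values invented for illustration. Only the 32 ms and 22 ms figures come from the abstract.

```python
import math


def rms_boundary_error_ms(reference_ms, hypothesis_ms):
    """Root-mean-square difference between paired boundary times (ms).

    Assumes each hypothesized boundary is already matched to exactly
    one reference boundary; the paper's own matching rule is not
    stated in the abstract.
    """
    if len(reference_ms) != len(hypothesis_ms):
        raise ValueError("boundary lists must be paired one-to-one")
    if not reference_ms:
        raise ValueError("need at least one boundary")
    squared = [(h - r) ** 2 for r, h in zip(reference_ms, hypothesis_ms)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical boundary times in milliseconds (not data from the paper).
reference = [100.0, 250.0, 400.0]
hypothesis = [110.0, 230.0, 420.0]
error = rms_boundary_error_ms(reference, hypothesis)

# Relative reduction implied by the figures quoted in the abstract.
relative_reduction = (32 - 22) / 32
```

With the figures quoted in the abstract, `relative_reduction` evaluates to 0.3125, so the reported drop from 32 ms to 22 ms corresponds to roughly a 31% lower RMS error.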

Original language: English (US)
Title of host publication: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Publisher: International Speech Communication Association
Pages: 2354-2357
Number of pages: 4
ISBN (Print): 9781604234497
State: Published - Jan 1 2006
Event: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP - Pittsburgh, PA, United States
Duration: Sep 17 2006 to Sep 21 2006

Publication series

Name: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Volume: 5

Keywords

  • Landmark
  • Multi class SVM
  • Segmentation
  • Time domain flatness measure

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Chitturi, R., & Hasegawa-Johnson, M. (2006). Novel time domain multi-class SVMs for landmark detection. In INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP (pp. 2354-2357). (INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP; Vol. 5). International Speech Communication Association.