Subjective and objective evaluation of speech intelligibility enhancement under constant energy and duration constraints

Yan Tang, Martin Cooke

Research output: Contribution to journalConference articlepeer-review

Abstract

Speakers appear to adopt strategies to improve speech intelligibility for interlocutors in adverse acoustic conditions. Generated speech, whether synthetic, recorded or live, may also benefit from context-sensitive modifications in challenging situations. The current study measured the effect on intelligibility of six spectral and temporal modifications operating under global constraints of constant input-output energy and duration. Reallocation of energy from mid-frequency regions with high local SNR produced the largest intelligibility benefits, while other approaches such as pause insertion or maintenance of a constant segmental SNR actually led to a deterioration in intelligibility. Listener scores correlated only moderately well with recent objective intelligibility estimators, suggesting that further development of intelligibility models is required to improve predictions for modified speech.

Original languageEnglish (US)
Pages (from-to)345-348
Number of pages4
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
StatePublished - 2011
Externally publishedYes
Event12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy
Duration: Aug 27 2011Aug 31 2011

Keywords

  • Energy reallocation
  • Objective measures
  • Speech intelligibility

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation

Fingerprint

Dive into the research topics of 'Subjective and objective evaluation of speech intelligibility enhancement under constant energy and duration constraints'. Together they form a unique fingerprint.

Cite this