Energy reallocation strategies for speech enhancement in known noise conditions

Yan Tang, Martin Cooke

Research output: Contribution to conference (Paper)

Abstract

Speech output, whether live, recorded or synthetic, is often employed in difficult listening conditions. Context-sensitive speech modifications aim to promote intelligibility while maintaining quality and listener comfort. The current study used objective measures of intelligibility and quality to compare five energy reallocation strategies operating under equal energy and preserved duration constraints. Results in both stationary and highly nonstationary backgrounds suggest that time-varying modifications lead to large increases in objective intelligibility, but that speech quality is best preserved by time-invariant modifications. Selective amplification of time-frequency regions with low a priori SNR produced the highest objective intelligibility without severe disruption to quality.
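As a rough illustration of what reallocating energy under an equal energy constraint can mean, the Python sketch below boosts time-frequency cells whose local SNR falls below a threshold and then rescales the result so that total speech energy is unchanged. It is not the authors' implementation: the function name `reallocate_energy`, the 0 dB threshold, the 6 dB boost and the use of plain power matrices (rather than any particular auditory front end) are all assumptions made for the example.

```python
import numpy as np


def reallocate_energy(speech_power, noise_power,
                      snr_threshold_db=0.0, boost_db=6.0):
    """Boost low-SNR time-frequency cells, then restore the original energy.

    speech_power, noise_power: arrays of the same shape (frequency x time)
    holding non-negative power values. The threshold and boost defaults are
    illustrative assumptions, not values taken from the paper.
    """
    speech_power = np.asarray(speech_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    eps = 1e-12  # avoids division by zero and log of zero

    # Local (a priori) SNR in dB for every time-frequency cell.
    local_snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))

    # Power gain: amplify only the cells whose SNR is below the threshold.
    boost = 10.0 ** (boost_db / 10.0)
    gain = np.where(local_snr_db < snr_threshold_db, boost, 1.0)
    boosted = speech_power * gain

    # Equal energy constraint: rescale so the total matches the input.
    total = boosted.sum()
    if total == 0.0:
        return boosted
    return boosted * (speech_power.sum() / total)


# Toy 2x2 example: two cells are masked (SNR < 0 dB), two are not.
speech = np.array([[1.0, 4.0], [4.0, 1.0]])
noise = np.array([[2.0, 1.0], [1.0, 2.0]])
modified = reallocate_energy(speech, noise)
```

In the toy example the two masked cells gain energy at the expense of the two unmasked cells, while the sum over all cells stays the same. Duration is untouched because only per-cell gains are applied.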

Original language: English (US)
Pages: 1636-1639
Number of pages: 4
State: Published - Dec 1 2010
Externally published: Yes
Event: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 - Makuhari, Chiba, Japan
Duration: Sep 26 2010 to Sep 30 2010

Other

Other: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010
Country: Japan
City: Makuhari, Chiba
Period: 9/26/10 to 9/30/10

Keywords

  • Energy reallocation
  • Glimpsing
  • PESQ
  • SII
  • Speech intelligibility

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing


Cite this

Tang, Y., & Cooke, M. (2010). Energy reallocation strategies for speech enhancement in known noise conditions. 1636-1639. Paper presented at 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010, Makuhari, Chiba, Japan.