Learning static spectral weightings for speech intelligibility enhancement in noise

Yan Tang, Martin Cooke

Research output: Contribution to journal › Article › peer-review

Abstract

Near-end speech enhancement works by modifying speech prior to presentation in a noisy environment, typically operating under a constraint of limited or no increase in speech level. One issue is the extent to which near-end enhancement techniques require detailed estimates of the masking environment to function effectively. The current study investigated speech modification strategies based on reallocating energy statically across the spectrum using masker-specific spectral weightings. Weighting patterns were learned offline by maximising a glimpse-based objective intelligibility metric. Keyword scores in sentences in the presence of stationary and fluctuating maskers increased, in some cases by very substantial amounts, following the application of masker- and SNR-specific spectral weighting. A second experiment using generic masker-independent spectral weightings that boosted all frequencies above 1 kHz also led to significant gains in most conditions. These findings indicate that energy-neutral spectral weighting is a highly effective near-end speech enhancement approach that places minimal demands on detailed masker estimation.
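The idea of energy-neutral spectral weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the five band centre frequencies, the band energies, and the 6 dB boost are all hypothetical values chosen for illustration, and the learned weightings from the study are not reproduced here. The sketch applies a per-band gain and then rescales all bands so that the total energy stays the same.

```python
def energy_neutral_reweight(band_energies, gains_db):
    """Apply per-band gains (in dB), then rescale so total energy is unchanged."""
    if len(band_energies) != len(gains_db):
        raise ValueError("band_energies and gains_db must have the same length")
    # Convert dB gains to linear power ratios and apply them band by band.
    weighted = [e * 10 ** (g / 10.0) for e, g in zip(band_energies, gains_db)]
    total_in = sum(band_energies)
    total_out = sum(weighted)
    if total_out <= 0:
        raise ValueError("weighted energy must be positive")
    # A single global scale factor restores the original total energy.
    scale = total_in / total_out
    return [w * scale for w in weighted]


def boost_above(centre_freqs_hz, cutoff_hz=1000.0, boost_db=6.0):
    """Hypothetical generic weighting: a flat boost for bands above the cutoff."""
    return [boost_db if f > cutoff_hz else 0.0 for f in centre_freqs_hz]


# Illustrative (made-up) band layout and speech energies.
centre_freqs_hz = [250.0, 500.0, 1000.0, 2000.0, 4000.0]
band_energies = [4.0, 3.0, 2.0, 0.7, 0.3]

gains_db = boost_above(centre_freqs_hz)
reweighted = energy_neutral_reweight(band_energies, gains_db)
```

Because the rescaling is applied to every band, boosting the bands above 1 kHz necessarily attenuates the bands at or below it, which is what makes the weighting a reallocation of energy rather than an overall level increase.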

Original language: English (US)
Pages (from-to): 1-16
Number of pages: 16
Journal: Computer Speech and Language
Volume: 49
State: Published - May 2018
Externally published: Yes

Keywords

  • Glimpsing
  • Intelligibility
  • Noise
  • Pattern search
  • Spectral weighting
  • Speech

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Human-Computer Interaction
