Abstract
Near-end speech enhancement works by modifying speech prior to presentation in a noisy environment, typically under a constraint of limited or no increase in speech level. An open question is the extent to which near-end enhancement techniques require detailed estimates of the masking environment to function effectively. The current study investigated speech modification strategies based on statically reallocating energy across the spectrum using masker-specific spectral weightings. Weighting patterns were learned offline by maximising a glimpse-based objective intelligibility metric. Keyword scores for sentences presented in stationary and fluctuating maskers increased, in some cases substantially, following the application of masker- and SNR-specific spectral weighting. A second experiment using generic, masker-independent spectral weightings that boosted all frequencies above 1 kHz also led to significant gains in most conditions. These findings indicate that energy-neutral spectral weighting is a highly effective near-end speech enhancement approach that places minimal demands on detailed masker estimation.
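As a rough illustration of what "energy-neutral spectral weighting" can mean in practice, the following minimal sketch boosts frequencies above 1 kHz and then rescales the signal so that its RMS level is unchanged. It is not the study's method: the function name `energy_neutral_boost`, the 6 dB boost, the FFT-domain gain, and the two-tone test signal are all assumptions made for illustration, whereas the study's weightings were learned offline with a glimpse-based metric.

```python
import numpy as np


def energy_neutral_boost(signal, sample_rate, cutoff_hz=1000.0, boost_db=6.0):
    """Boost content above cutoff_hz, then restore the original RMS level.

    Hypothetical helper: boost_db and the FFT-domain gain are illustrative
    assumptions, not values or methods taken from the study.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)

    # Static spectral weighting: one fixed gain above the cutoff, unity below.
    gains = np.where(freqs > cutoff_hz, 10.0 ** (boost_db / 20.0), 1.0)
    weighted = np.fft.irfft(spectrum * gains, n=signal.size)

    # Energy-neutral step: rescale so the output RMS equals the input RMS.
    rms_in = np.sqrt(np.mean(signal ** 2))
    rms_out = np.sqrt(np.mean(weighted ** 2))
    if rms_out == 0.0:
        return weighted
    return weighted * (rms_in / rms_out)


# Usage on a synthetic two-tone signal (a stand-in for speech, assumed here).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
speech_like = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)
modified = energy_neutral_boost(speech_like, sample_rate)
```

Because the overall level is held constant, the boost above the cutoff is paid for by attenuation below it; energy is reallocated across the spectrum rather than added.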
Original language | English (US) |
---|---|
Pages (from-to) | 1-16 |
Number of pages | 16 |
Journal | Computer Speech and Language |
Volume | 49 |
State | Published - May 2018 |
Externally published | Yes |
Keywords
- Glimpsing
- Intelligibility
- Noise
- Pattern search
- Spectral weighting
- Speech
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Human-Computer Interaction