From macroscopic to microscopic glimpse-based models of intelligibility prediction

Martin Cooke, Yan Tang, Mate A. Toth

Research output: Contribution to journal › Article › peer-review

Abstract

Miller and Licklider's explorations of the intelligibility of temporally interrupted speech, and later studies extending their findings to the spectro-temporal plane, have shown how the twin factors of sparseness and redundancy confer a high degree of robustness on speech in noise. The current contribution addresses two questions. First, to what extent can quantitative estimates of supra-threshold unmasked speech account for average (macroscopic) intelligibility across a range of speech styles and masking conditions? We examine how well glimpse-based objective intelligibility metrics predict listeners' speech recognition scores for natural and synthetic speech in the presence of stationary and fluctuating maskers, and demonstrate reduced correlations for competing sources with an informational masking component. The second question concerns which additional components, beyond speech glimpses, are required to make (microscopic) predictions of actual listener confusions at the level of individual noisy speech tokens. Using corpora of speech-in-noise misperceptions, we show that in many cases the source of listener confusions is the misallocation of information from the masker, suggesting that estimates of supra-threshold unmasked speech alone are insufficient to explain speech intelligibility in noise.
Original language: English (US)
Pages (from-to): 2187-2187
Journal: The Journal of the Acoustical Society of America
Volume: 139
Issue number: 4
State: Published - Apr 1 2016
Externally published: Yes
