Abstract
The number of speech spectro-temporal (S-T) regions escaping from noise masking, known as “glimpses,” is proportional to speech intelligibility in noise. Previous studies have demonstrated that intelligibility can be estimated by calculating the glimpse proportion (GP). More recent evidence revealed that the contribution of glimpses to intelligibility depends on the energy level of the glimpsed regions, and that even non-glimpsed regions play a non-negligible role in speech perception in noise. This study incorporated voicing-voiceless information into glimpse-based intelligibility estimation. Before computing the GP, the counts of raw glimpsed regions, or of those with energy above the mean noise level, were weighted according to the voicing-voiceless status of the frame in which the glimpses were detected. Evaluated on speech signals processed to have thirteen glimpse compositions in both temporally stationary and fluctuating noise maskers, the linear correlation between model predictions and listeners' word recognition rates increased from 0.76 to 0.80 for weighted GP, and from 0.89 to 0.92 for weighted high-energy GP. Further accounting for the contribution of non-glimpsed regions in the model improved the correlation to 0.95, suggesting that intelligibility in noise can be better predicted when the contributions of different speech regions are finely modelled.
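The weighting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the 3 dB local-SNR criterion for detecting a glimpse, the weight values, and the normalization by the total frame weight are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def weighted_glimpse_proportion(speech_db, noise_db, voiced,
                                w_voiced=1.0, w_voiceless=1.0,
                                threshold_db=3.0, high_energy_only=False):
    """Hypothetical voicing-weighted glimpse proportion.

    speech_db, noise_db: arrays of shape (channels, frames) holding the
        S-T energy of speech and noise in dB.
    voiced: boolean array of shape (frames,), True for voiced frames.
    w_voiced, w_voiceless: per-frame weights (illustrative values only).
    threshold_db: local SNR a region must exceed to count as a glimpse
        (assumed value, not taken from the abstract).
    high_energy_only: if True, keep only glimpses whose speech energy
        exceeds the mean noise level.
    """
    speech_db = np.asarray(speech_db, dtype=float)
    noise_db = np.asarray(noise_db, dtype=float)
    voiced = np.asarray(voiced, dtype=bool)

    # A region is glimpsed when speech exceeds noise by the threshold.
    glimpsed = speech_db > noise_db + threshold_db
    if high_energy_only:
        glimpsed &= speech_db > noise_db.mean()

    # Each frame contributes according to its voicing status.
    frame_weights = np.where(voiced, w_voiced, w_voiceless)
    weighted_count = (glimpsed * frame_weights[np.newaxis, :]).sum()

    # Assumed normalization: the largest possible weighted count.
    max_count = speech_db.shape[0] * frame_weights.sum()
    return weighted_count / max_count


# Toy example: 2 channels, 3 frames, flat 0 dB noise.
speech = [[10, 0, 10], [10, 0, 0]]
noise = [[0, 0, 0], [0, 0, 0]]
voiced = [True, False, True]
gp_plain = weighted_glimpse_proportion(speech, noise, voiced)
gp_weighted = weighted_glimpse_proportion(speech, noise, voiced,
                                          w_voiced=2.0, w_voiceless=1.0)
```

With equal weights the function reduces to the plain glimpse proportion; unequal weights shift the estimate toward the frames assumed to matter more.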
Original language | English (US) |
---|---|
State | Published - 2023 |
Event | 184th Meeting of the Acoustical Society of America, ASA 2023 - Duration: May 8 2023 → May 12 2023 https://acousticalsociety.org/asa-meetings/ |