TY - JOUR
T1 - Band importance for speech-in-speech recognition in the presence of extended high-frequency cues
AU - Ananthanarayana, Rohit M.
AU - Buss, Emily
AU - Monson, Brian B.
N1 - Publisher Copyright: © 2024 Acoustical Society of America.
PY - 2024/8/1
Y1 - 2024/8/1
N2 - Band importance functions for speech-in-noise recognition, typically determined in the presence of steady background noise, indicate a negligible role for extended high frequencies (EHFs; 8-20 kHz). However, recent findings indicate that EHF cues support speech recognition in multi-talker environments, particularly when the masker has reduced EHF levels relative to the target. This scenario can occur in natural auditory scenes when the target talker is facing the listener, but the maskers are not. In this study, we measured the importance of five bands from 40 to 20 000 Hz for speech-in-speech recognition by notch-filtering the bands individually. Stimuli consisted of a female target talker recorded from 0° and a spatially co-located two-talker female masker recorded either from 0° or 56.25°, simulating a masker either facing the listener or facing away, respectively. Results indicated peak band importance in the 0.4-1.3 kHz band and a negligible effect of removing the EHF band in the facing-masker condition. However, in the non-facing condition, the peak was broader and EHF importance was higher and comparable to that of the 3.3-8.3 kHz band in the facing-masker condition. These findings suggest that EHFs contain important cues for speech recognition in listening conditions with mismatched talker head orientations.
UR - http://www.scopus.com/inward/record.url?scp=85201620780&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85201620780&partnerID=8YFLogxK
U2 - 10.1121/10.0028269
DO - 10.1121/10.0028269
M3 - Article
C2 - 39158325
AN - SCOPUS:85201620780
SN - 0001-4966
VL - 156
SP - 1202
EP - 1213
JO - Journal of the Acoustical Society of America
JF - Journal of the Acoustical Society of America
IS - 2
ER -