TY - JOUR
T1 - An Anechoic, High-Fidelity, Multidirectional Speech Corpus
AU - Miller, Margaret K.
AU - Delaram, Vahid
AU - Trine, Allison
AU - Ananthanarayana, Rohit M.
AU - Buss, Emily
AU - Monson, Brian B.
AU - Stecker, G. Christopher
N1 - This work was supported by National Institutes of Health Grant R01-DC019745 (Principal Investigator: Monson). Portions of this work were previously presented to the American Auditory Society (Miller et al., 2023). The authors would like to thank all of our participants for lending us their voices. G.C.S., E.B., and B.B.M. designed the protocol; M.K.M., V.D., and G.C.S. wrote the first draft of the paper; and all authors participated in subsequent revisions.
PY - 2025/1
Y1 - 2025/1
N2 - Introduction: We currently lack speech testing materials faithful to broader aspects of real-world auditory scenes such as speech directivity and extended high frequency (EHF; > 8 kHz) content that have demonstrable effects on speech perception. Here, we describe the development of a multidirectional, high-fidelity speech corpus using multichannel anechoic recordings that can be used for future studies of speech perception in complex environments by diverse listeners. Design: Fifteen male and 15 female talkers (21.3–60.5 years) recorded Bamford-Kowal-Bench (BKB) Standard Sentence Test lists, digits 0–10, and a 2.5-min unscripted narrative. Recordings were made in an anechoic chamber with 17 free-field condenser microphones spanning 0°–180° azimuth angle around the talker using a 48 kHz sampling rate. Results: Recordings resulted in a large corpus containing four BKB lists, 10 digits, and narratives produced by 30 talkers, and an additional 17 BKB lists (21 total) produced by a subset of six talkers. Conclusions: The goal of this study was to create an anechoic, high-fidelity, multidirectional speech corpus using standard speech materials. More naturalistic narratives, useful for the creation of babble noise and speech maskers, were also recorded. A large group of 30 talkers permits testers to select speech materials based on talker characteristics relevant to a specific task. The resulting speech corpus allows for more diverse and precise speech recognition testing, including testing effects of speech directivity and EHF content. Recordings are publicly available.
AB - Introduction: We currently lack speech testing materials faithful to broader aspects of real-world auditory scenes such as speech directivity and extended high frequency (EHF; > 8 kHz) content that have demonstrable effects on speech perception. Here, we describe the development of a multidirectional, high-fidelity speech corpus using multichannel anechoic recordings that can be used for future studies of speech perception in complex environments by diverse listeners. Design: Fifteen male and 15 female talkers (21.3–60.5 years) recorded Bamford-Kowal-Bench (BKB) Standard Sentence Test lists, digits 0–10, and a 2.5-min unscripted narrative. Recordings were made in an anechoic chamber with 17 free-field condenser microphones spanning 0°–180° azimuth angle around the talker using a 48 kHz sampling rate. Results: Recordings resulted in a large corpus containing four BKB lists, 10 digits, and narratives produced by 30 talkers, and an additional 17 BKB lists (21 total) produced by a subset of six talkers. Conclusions: The goal of this study was to create an anechoic, high-fidelity, multidirectional speech corpus using standard speech materials. More naturalistic narratives, useful for the creation of babble noise and speech maskers, were also recorded. A large group of 30 talkers permits testers to select speech materials based on talker characteristics relevant to a specific task. The resulting speech corpus allows for more diverse and precise speech recognition testing, including testing effects of speech directivity and EHF content. Recordings are publicly available.
UR - http://www.scopus.com/inward/record.url?scp=85208283645&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85208283645&partnerID=8YFLogxK
U2 - 10.1044/2024_JSLHR-24-00296
DO - 10.1044/2024_JSLHR-24-00296
M3 - Article
C2 - 39620949
AN - SCOPUS:85208283645
SN - 1092-4388
VL - 68
SP - 411
EP - 418
JO - Journal of Speech, Language, and Hearing Research
JF - Journal of Speech, Language, and Hearing Research
IS - 1
ER -