TY - CHAP
T1 - Corpora, Databases, and Internet Resources
T2 - Corpus Phonology with Speech Resources; Using the Internet for Collecting Phonological Data; Speech Manipulation, Synthesis, and Automatic Recognition in Laboratory Phonology; Phonotactic Patterns in Lexical Corpora
AU - Cole, Jennifer
AU - Hasegawa-Johnson, Mark Allan
AU - Loehr, Dan
AU - Van Guilder, Linda
AU - Reetz, Henning
AU - Frisch, Stefan A.
PY - 2012/9/18
Y1 - 2012/9/18
N2 - This article introduces a wide range of approaches to using large bodies of data for linguistic research. Corpus analysis for phonological research involves the investigation of the phonetic, phonological, and lexical properties of speech for the purpose of understanding the patterns of variation in the phonetic expression of words, and the distributional patterns of sound elements in relation to the linguistic context. A speech corpus provides a basis for investigating variability in phonetic form and also provides a rich resource for studying the relationship between phonological form and other levels of linguistic structure. Linguistic metadata provides information about the speakers, such as sex, age, ethnicity, and region of residence. Metadata may also provide information about speaker recruitment and recording procedures. Forced alignment is done using algorithms from automatic speech recognition (ASR), and is most successful when each phone associated with the word in its dictionary form is actually fully pronounced. One of the easiest methods of manipulating natural speech is the splicing technique, where parts of a speech signal are cut out, repeated, or cross-spliced with another piece of the signal. The gating technique is another form of natural speech signal manipulation often applied in psycholinguistic experiments, where parts of a speech signal are cut off, and incrementally more of the signal is presented to a listener. Another speech signal manipulation is the mixing of two signals.
KW - Automatic speech recognition
KW - Corpus analysis
KW - Gating technique
KW - Lexical properties
KW - Linguistic metadata
KW - Phonology
KW - Speech signal manipulation
KW - Usage frequency
UR - http://www.scopus.com/inward/record.url?scp=85066544562&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066544562&partnerID=8YFLogxK
U2 - 10.1093/oxfordhb/9780199575039.013.0017
DO - 10.1093/oxfordhb/9780199575039.013.0017
M3 - Chapter
SN - 9780199575039
BT - The Oxford Handbook of Laboratory Phonology
PB - Oxford University Press
ER -