TY - GEN
T1 - A Theory of Unsupervised Speech Recognition
AU - Wang, Liming
AU - Hasegawa-Johnson, Mark
AU - Yoo, Chang D.
N1 - This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government(MSIT) (No.2022-0-00184, Development and Study of AI Technologies to Inexpensively Conform to Evolving Policy on Ethics)
PY - 2023
Y1 - 2023
N2 - Unsupervised speech recognition (ASR-U) is the problem of learning automatic speech recognition (ASR) systems from unpaired speech-only and text-only corpora. While various algorithms exist to solve this problem, a theoretical framework is missing to study their properties and address such issues as sensitivity to hyperparameters and training instability. In this paper, we proposed a general theoretical framework to study the properties of ASR-U systems based on random matrix theory and the theory of neural tangent kernels. Such a framework allows us to prove various learnability conditions and sample complexity bounds of ASR-U. Extensive ASR-U experiments on synthetic languages with three classes of transition graphs provide strong empirical evidence for our theory (code available at cactuswiththoughts/UnsupASRTheory.git).
AB - Unsupervised speech recognition (ASR-U) is the problem of learning automatic speech recognition (ASR) systems from unpaired speech-only and text-only corpora. While various algorithms exist to solve this problem, a theoretical framework is missing to study their properties and address such issues as sensitivity to hyperparameters and training instability. In this paper, we proposed a general theoretical framework to study the properties of ASR-U systems based on random matrix theory and the theory of neural tangent kernels. Such a framework allows us to prove various learnability conditions and sample complexity bounds of ASR-U. Extensive ASR-U experiments on synthetic languages with three classes of transition graphs provide strong empirical evidence for our theory (code available at cactuswiththoughts/UnsupASRTheory.git).
UR - http://www.scopus.com/inward/record.url?scp=85174386324&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174386324&partnerID=8YFLogxK
U2 - 10.18653/v1/2023.acl-long.67
DO - 10.18653/v1/2023.acl-long.67
M3 - Conference contribution
AN - SCOPUS:85174386324
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 1192
EP - 1215
BT - Long Papers
PB - Association for Computational Linguistics (ACL)
T2 - 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Y2 - 9 July 2023 through 14 July 2023
ER -