TY - CONF
T1 - Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition
T2 - Findings of the Association for Computational Linguistics, ACL 2023
AU - Wang, Liming
AU - Ni, Junrui
AU - Gao, Heting
AU - Li, Jialu
AU - Chang, Kai Chieh
AU - Fan, Xulin
AU - Wu, Junkai
AU - Hasegawa-Johnson, Mark
AU - Yoo, Chang D.
N1 - This work utilizes resources supported by the National Science Foundation’s Major Research Instrumentation program, grant #1725729 (Kindratenko et al., 2020), as well as the University of Illinois at Urbana-Champaign. This work was also supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00184, Development and Study of AI Technologies to Inexpensively Conform to Evolving Policy on Ethics).
PY - 2023
Y1 - 2023
N2 - Existing supervised sign language recognition systems rely on an abundance of well-annotated data. Instead, an unsupervised speech-to-sign language recognition (SSR-U) system learns to translate between spoken and sign languages by observing only non-parallel speech and sign-language corpora. We propose speech2signU, a neural network-based approach capable of both character-level and word-level SSR-U. Our approach significantly outperforms baselines directly adapted from unsupervised speech recognition (ASR-U) models by as much as 50% recall@10 on several challenging American Sign Language corpora with varying sample sizes, vocabulary sizes, and levels of audio and visual variability. The code is available at https://github.com/cactuswiththoughts/UnsupSpeech2Sign.git.
AB - Existing supervised sign language recognition systems rely on an abundance of well-annotated data. Instead, an unsupervised speech-to-sign language recognition (SSR-U) system learns to translate between spoken and sign languages by observing only non-parallel speech and sign-language corpora. We propose speech2signU, a neural network-based approach capable of both character-level and word-level SSR-U. Our approach significantly outperforms baselines directly adapted from unsupervised speech recognition (ASR-U) models by as much as 50% recall@10 on several challenging American Sign Language corpora with varying sample sizes, vocabulary sizes, and levels of audio and visual variability. The code is available at https://github.com/cactuswiththoughts/UnsupSpeech2Sign.git.
UR - http://www.scopus.com/inward/record.url?scp=85175446459&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85175446459&partnerID=8YFLogxK
U2 - 10.18653/v1/2023.findings-acl.424
DO - 10.18653/v1/2023.findings-acl.424
M3 - Conference contribution
AN - SCOPUS:85175446459
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 6785
EP - 6800
BT - Findings of the Association for Computational Linguistics, ACL 2023
PB - Association for Computational Linguistics (ACL)
Y2 - 9 July 2023 through 14 July 2023
ER -