TY - GEN
T1 - BabyBERTa: Learning More Grammar With Small-Scale Child-Directed Language
T2 - 25th Conference on Computational Natural Language Learning, CoNLL 2021
AU - Huebner, Philip A.
AU - Sulem, Elior
AU - Fisher, Cynthia
AU - Roth, Dan
N1 - Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
AB - Transformer-based language models have taken the NLP world by storm. However, their potential for addressing important questions in language acquisition research has been largely ignored. In this work, we examined the grammatical knowledge of RoBERTa (Liu et al., 2019) when trained on a 5M word corpus of language acquisition data to simulate the input available to children between the ages of 1 and 6. Using the behavioral probing paradigm, we found that a smaller version of RoBERTa-base that never predicts unmasked tokens, which we term BabyBERTa, acquires grammatical knowledge comparable to that of pre-trained RoBERTa-base, and does so with approximately 15X fewer parameters and 6,000X fewer words. We discuss implications for building more efficient models and the learnability of grammar from input available to children. Lastly, to support research on this front, we release our novel grammar test suite that is compatible with the small vocabulary of child-directed input.
UR - http://www.scopus.com/inward/record.url?scp=85119118398&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119118398&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85119118398
T3 - CoNLL 2021 - 25th Conference on Computational Natural Language Learning, Proceedings
SP - 624
EP - 646
BT - CoNLL 2021 - 25th Conference on Computational Natural Language Learning, Proceedings
A2 - Bisazza, Arianna
A2 - Abend, Omri
PB - Association for Computational Linguistics (ACL)
Y2 - 10 November 2021 through 11 November 2021
ER -