Memory-bounded left-corner unsupervised grammar induction on child-directed input

Cory Shain, William Bryce, Lifeng Jin, Victoria Krakovna, Finale Doshi-Velez, Timothy Miller, William Schuler, Lane Schwartz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a new memory-bounded left-corner parsing model for unsupervised raw-text syntax induction, using unsupervised hierarchical hidden Markov models (UHHMM). We deploy this algorithm to shed light on the extent to which human language learners can discover hierarchical syntax through distributional statistics alone, by modeling two widely-accepted features of human language acquisition and sentence processing that have not been simultaneously modeled by any existing grammar induction algorithm: (1) a left-corner parsing strategy and (2) limited working memory capacity. To model realistic input to human language learners, we evaluate our system on a corpus of child-directed speech rather than typical newswire corpora. Results beat or closely match those of three competing systems.

Original language: English (US)
Title of host publication: COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
Subtitle of host publication: Technical Papers
Publisher: Association for Computational Linguistics, ACL Anthology
Pages: 964-975
Number of pages: 12
ISBN (Print): 9784879747020
State: Published - 2016
Event: 26th International Conference on Computational Linguistics, COLING 2016 - Osaka, Japan
Duration: Dec 11 2016 to Dec 16 2016

Publication series

Name: COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers

Other

Other: 26th International Conference on Computational Linguistics, COLING 2016
Country/Territory: Japan
City: Osaka
Period: 12/11/16 to 12/16/16

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language
