A Hybrid Approach Combining Statistical Knowledge with Conditional Random Fields for Chinese Grammatical Error Detection

Yiyi Wang, Chilin Shih

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a method that combines a Conditional Random Fields (CRF) model with a post-processing layer that uses Google n-gram statistics, tailored to detect word selection and word order errors made by learners of Chinese as a Foreign Language (CFL). We describe the architecture of the model and its performance in the shared task of the ACL 2018 Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA). This hybrid approach yields a comparably high false positive rate (FPR = 0.1274) and precision (Pd = 0.7519; Pi = 0.6311), but low recall (Rd = 0.3035; Ri = 0.1696) in the grammatical error detection and identification tasks. Additional statistical information and linguistic rules can be added in the future to improve the model's performance.
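
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall, and false positive rate are conventionally derived from confusion counts. It assumes the standard definitions and that the subscripts d and i refer to the detection and identification tasks named in the abstract. The `Confusion` class, the function names, and the counts are hypothetical illustrations, not the authors' evaluation code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts for one evaluation level (hypothetical container)."""
    tp: int  # erroneous items correctly flagged
    fp: int  # correct items wrongly flagged
    tn: int  # correct items correctly left unflagged
    fn: int  # erroneous items missed


def precision(c: Confusion) -> float:
    # Share of flagged items that are truly erroneous.
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Confusion) -> float:
    # Share of truly erroneous items that were flagged.
    erroneous = c.tp + c.fn
    return c.tp / erroneous if erroneous else 0.0


def false_positive_rate(c: Confusion) -> float:
    # Share of correct items that were wrongly flagged.
    correct = c.fp + c.tn
    return c.fp / correct if correct else 0.0


# Illustrative counts only; these are NOT the paper's data.
example = Confusion(tp=30, fp=10, tn=90, fn=70)
print(precision(example), recall(example), false_positive_rate(example))
```

A system with this profile flags few items and is usually right when it does, but misses most errors, which is the same qualitative pattern as the high-precision, low-recall result described in the abstract.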

Original language: English (US)
Title of host publication: ACL 2018 - Natural Language Processing Techniques for Educational Applications, Proceedings of the 5th Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 194-198
Number of pages: 5
ISBN (Electronic): 9781948087353
State: Published - 2018
Event: ACL 2018 5th Workshop on Natural Language Processing Techniques for Educational Applications, NLPTEA 2018 - Melbourne, Australia
Duration: Jul 19 2018 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: ACL 2018 5th Workshop on Natural Language Processing Techniques for Educational Applications, NLPTEA 2018
Country/Territory: Australia
City: Melbourne
Period: 7/19/18 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
