Structural parse tree features for text representation

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

We propose and study novel text representation features created from parse tree structures. Unlike traditional parse tree features, which include all the attached syntactic categories to capture linguistic properties of text, the new features are defined solely or primarily by the tree structure, and thus better reflect the purely structural properties of parse trees. We hypothesize that these new complex structural features capture a perspective of text that is orthogonal even to advanced syntactic features. Evaluation on three different text categorization tasks (nationality detection, essay scoring, and sentiment analysis) shows that the proposed tree structure features complement the existing ones and enrich text representation. Experimental results further show that combining the proposed structure features with word n-grams can improve F1 score and classification accuracy.
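The abstract does not spell out the feature definitions, so the following is only an illustrative sketch of what "structure-only" parse tree features could look like, not the authors' method. It assumes a Penn-style bracketed parse as input, and the feature names (`shape=`, `branching=`, `depth`) and all function names are hypothetical choices made for this example. Syntactic labels and words are discarded, so only the shape of the tree contributes.

```python
from collections import Counter


def parse_bracketed(s):
    """Parse a Penn-style bracketed tree into nested (label, children) tuples."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append((tokens[pos], []))  # word leaf, no children
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return node()


def skeleton(tree):
    """Label-free shape of a subtree: one pair of parentheses per node."""
    _, children = tree
    return "(" + "".join(skeleton(c) for c in children) + ")"


def depth(tree):
    """Number of nodes on the longest root-to-leaf path."""
    _, children = tree
    return 1 + max((depth(c) for c in children), default=0)


def structural_features(tree):
    """Count label-free subtree shapes and branching factors (hypothetical features)."""
    feats = Counter()

    def walk(t):
        _, children = t
        if children:  # skip word leaves
            feats["shape=" + skeleton(t)] += 1
            feats["branching=%d" % len(children)] += 1
            for c in children:
                walk(c)

    walk(tree)
    feats["depth"] = depth(tree)
    return feats


tree = parse_bracketed("(S (NP (DT the) (NN cat)) (VP (VBD sat)))")
features = structural_features(tree)
```

Because labels are ignored, two sentences whose parses have the same shape but different syntactic categories map to the same feature vector. In a setup like the one the abstract describes, counts of this kind would be concatenated with word n-gram features before classification.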

Original language: English (US)
Title of host publication: Proceedings - 2013 IEEE 7th International Conference on Semantic Computing, ICSC 2013
Pages: 9-16
Number of pages: 8
State: Published - Dec 1 2013
Event: 2013 IEEE 7th International Conference on Semantic Computing, ICSC 2013 - Irvine, CA, United States
Duration: Sep 16 2013 to Sep 18 2013

Publication series

Name: Proceedings - 2013 IEEE 7th International Conference on Semantic Computing, ICSC 2013

Other

Other: 2013 IEEE 7th International Conference on Semantic Computing, ICSC 2013
Country: United States
City: Irvine, CA
Period: 9/16/13 to 9/18/13

ASJC Scopus subject areas

  • Software
