Shallow information extraction from medical forum data

Parikshit Sondhi, Manish Gupta, Cheng Xiang Zhai, Julia Hockenmaier

Research output: Contribution to conference (Paper)

Abstract

We study a novel shallow information extraction problem that involves extracting sentences of a given set of topic categories from medical forum data. Given a corpus of medical forum documents, our goal is to extract two related types of sentences that describe a biomedical case (i.e., medical problem descriptions and medical treatment descriptions). Such an extraction task directly generates medical case descriptions that can be useful in many applications. We solve the problem using two popular machine learning methods, Support Vector Machines (SVM) and Conditional Random Fields (CRF). We propose novel features to improve the accuracy of extraction. Experimental results show that we can obtain an accuracy of up to 75%.

Original language: English (US)
Pages: 1158-1166
Number of pages: 9
State: Published - Dec 1 2010
Event: 23rd International Conference on Computational Linguistics, Coling 2010 - Beijing, China
Duration: Aug 23 2010 to Aug 27 2010

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Linguistics and Language

Cite this

Sondhi, P., Gupta, M., Zhai, C. X., & Hockenmaier, J. (2010). Shallow information extraction from medical forum data. 1158-1166. Paper presented at 23rd International Conference on Computational Linguistics, Coling 2010, Beijing, China.