Abstract
Online health forums provide a convenient way for patients to obtain medical information and connect with physicians and peers outside of clinical settings. However, the large quantities of unstructured and diverse content generated on these forums make it difficult for users to digest and extract useful information. Understanding user intents would enable forums to find and recommend relevant information to users by filtering out threads that do not match particular intents. In this paper, we derive a taxonomy of intents to capture user information needs in online health forums and propose novel pattern-based features for use with a multiclass support vector machine (SVM) classifier to classify original thread posts according to their underlying intents. Since no dataset existed for this task, we employ three annotators to manually label a dataset of 1192 HealthBoards posts spanning four forum topics. Experimental results show that an SVM using pattern-based features is highly capable of identifying user intents in forum posts, reaching a maximum precision of 75%, and that an SVM-based hierarchical classifier using both pattern and word features outperforms its SVM counterpart that uses only word features. Furthermore, comparable classification performance can be achieved by training and testing on posts from different forum topics.
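As a rough illustration of what combining pattern and word features might look like, the sketch below builds a feature vector for a post from a few regular-expression patterns plus bag-of-words counts. The pattern names, the regular expressions, the vocabulary, and the function names are all hypothetical and invented for this example; the abstract does not specify the paper's actual patterns or intent taxonomy.

```python
import re
from typing import List

# Hypothetical patterns, invented for illustration only; they are not
# the pattern set or intent taxonomy used in the paper.
PATTERNS = {
    "asks_cause": re.compile(r"\bwhat (?:causes|could cause)\b", re.IGNORECASE),
    "asks_treatment": re.compile(r"\bhow (?:do|can|should) i (?:treat|manage)\b", re.IGNORECASE),
    "seeks_experience": re.compile(r"\b(?:has|does) anyone\b", re.IGNORECASE),
}


def pattern_features(post: str) -> List[int]:
    """Return one binary feature per pattern: 1 if the pattern occurs in the post."""
    return [1 if rx.search(post) else 0 for rx in PATTERNS.values()]


def word_features(post: str, vocabulary: List[str]) -> List[int]:
    """Return bag-of-words counts of the post over a fixed vocabulary."""
    tokens = re.findall(r"[a-z']+", post.lower())
    return [tokens.count(word) for word in vocabulary]


def combined_features(post: str, vocabulary: List[str]) -> List[int]:
    """Concatenate pattern features and word features into one vector."""
    return pattern_features(post) + word_features(post, vocabulary)
```

Vectors of this kind could then be passed to any multiclass SVM implementation (for example, scikit-learn's `LinearSVC`, which is an assumption here rather than the tool named in the abstract).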
Original language | English (US) |
---|---|
Article number | 7066225 |
Pages (from-to) | 1392-1398 |
Number of pages | 7 |
Journal | IEEE Journal of Biomedical and Health Informatics |
Volume | 19 |
Issue number | 4 |
State | Published - Jul 1 2015 |
Keywords
- Forum intents
- online health forums
- pattern-based features
- support vector machines
- user intent classification
ASJC Scopus subject areas
- Health Information Management
- Health Informatics
- Electrical and Electronic Engineering
- Computer Science Applications