TY - GEN
T1 - Understanding user intents in online health forums
AU - Zhang, Thomas
AU - Cho, Jason H.D.
AU - Zhai, Chengxiang
N1 - Publisher Copyright: Copyright © 2014 ACM.
PY - 2014/9/20
Y1 - 2014/9/20
N2 - Online health forums provide a convenient way for patients to obtain medical information and connect with physicians and peers outside of clinical settings. However, large quantities of unstructured and diversified content generated on these forums make it difficult for users to digest and extract useful information. Understanding user intents would enable forums to more accurately and efficiently find relevant information by filtering out threads that do not match particular intents. In this paper, we derive a taxonomy of intents to capture user information needs in online health forums, and propose novel pattern based features for use with a multiclass support vector machine (SVM) classifier to classify original thread posts according to their underlying intents. Since no dataset existed for this task, we employ three annotators to manually label a dataset of 1,200 HealthBoards posts spanning four forum topics. Experimental results show that SVM with pattern based features is highly capable of identifying user intents in forum posts, reaching a maximum precision of 75%. Furthermore, comparable classification performance can be achieved by training and testing on posts from different forum topics (e.g. training on allergy posts, testing on depression posts). Finally, we run a trained classifier on a MedHelp dataset to analyze the distribution of intents of posts from different forum topics.
AB - Online health forums provide a convenient way for patients to obtain medical information and connect with physicians and peers outside of clinical settings. However, large quantities of unstructured and diversified content generated on these forums make it difficult for users to digest and extract useful information. Understanding user intents would enable forums to more accurately and efficiently find relevant information by filtering out threads that do not match particular intents. In this paper, we derive a taxonomy of intents to capture user information needs in online health forums, and propose novel pattern based features for use with a multiclass support vector machine (SVM) classifier to classify original thread posts according to their underlying intents. Since no dataset existed for this task, we employ three annotators to manually label a dataset of 1,200 HealthBoards posts spanning four forum topics. Experimental results show that SVM with pattern based features is highly capable of identifying user intents in forum posts, reaching a maximum precision of 75%. Furthermore, comparable classification performance can be achieved by training and testing on posts from different forum topics (e.g. training on allergy posts, testing on depression posts). Finally, we run a trained classifier on a MedHelp dataset to analyze the distribution of intents of posts from different forum topics.
KW - Forum intents
KW - Online health forums
KW - Pattern based features
KW - Support vector machines
KW - User intent classification
UR - http://www.scopus.com/inward/record.url?scp=84920715065&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84920715065&partnerID=8YFLogxK
U2 - 10.1145/2649387.2649445
DO - 10.1145/2649387.2649445
M3 - Conference contribution
AN - SCOPUS:84920715065
T3 - ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
SP - 220
EP - 229
BT - ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
PB - Association for Computing Machinery
T2 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014
Y2 - 20 September 2014 through 23 September 2014
ER -