Reliability prediction of webpages in the medical domain

Parikshit Sondhi, V. G. Vinod Vydiswaran, Chengxiang Zhai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we study how to automatically predict the reliability of webpages in the medical domain. Assessing the reliability of online medical information is especially critical because it may influence vulnerable patients seeking help online. Unfortunately, no automated systems are currently available that can classify a medical webpage as reliable, and manual assessment cannot scale to the large number of medical pages on the Web. We propose a supervised learning approach to automatically predict the reliability of medical webpages. We developed a gold standard dataset using the standard reliability criteria defined by the Health on the Net Foundation and systematically experimented with different link-based and content-based feature sets. Our experiments show promising results, with prediction accuracies of over 80%. We also show that our proposed prediction method is useful in applications such as reliability-based re-ranking and automatic website accreditation.

Original language: English (US)
Title of host publication: Advances in Information Retrieval - 34th European Conference on IR Research, ECIR 2012, Proceedings
Pages: 219-231
Number of pages: 13
State: Published - Apr 27 2012
Event: 34th European Conference on Information Retrieval, ECIR 2012 - Barcelona, Spain
Duration: Apr 1 2012 to Apr 5 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7224 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 34th European Conference on Information Retrieval, ECIR 2012
Country/Territory: Spain
City: Barcelona
Period: 4/1/12 to 4/5/12

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
