Uncertainty reduction for knowledge discovery and information extraction on the world wide web

Heng Ji, Hongbo Deng, Jiawei Han

Research output: Contribution to journal (Article)

Abstract

In this paper, we give an overview of knowledge discovery (KD) and information extraction (IE) techniques on the World Wide Web (WWW). We intend to answer the following questions: What kind of additional uncertainty challenges are introduced by the WWW setting to basic KD and IE techniques? What are the fundamental techniques that can be used to reduce such uncertainty and achieve reasonable KD and IE performance on the WWW? What is the impact of each novel method? What types of interactions can be conducted between these techniques and information networks to make them benefit from each other? In what way can we utilize the results in more interesting applications? What are the remaining challenges and what are the possible ways to address these challenges? We hope this can provide a road map to advance KD and IE on the WWW to a higher level of performance, portability and utilization.

Original language: English (US)
Article number: 6212297
Pages (from-to): 2658-2674
Number of pages: 17
Journal: Proceedings of the IEEE
Volume: 100
Issue number: 9
State: Published - Jan 1 2012

Keywords

  • natural language processing
  • text analysis
  • text mining

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
