Building Web collections for vertical markets

Research output: Contribution to conference › Paper › peer-review

Abstract

Gaining actionable business intelligence from the Web is an active and promising area of research. An important facet of this line of work is the ability to build collections of Web pages (or Web collections) that are relevant to a vertical market of interest. Such collections provide raw data that can be further processed with text and data mining tools to gain business intelligence. Web crawlers are programs that can be used to build Web collections ranging in size from thousands to millions of pages. Through a systematic study we show that crawlers face a difficult task when building a Web collection for vertical markets due to the competitive nature of the business communities. However, the difficulty of the task can be greatly reduced through limited use of a search engine to identify candidate "information broker" Web pages that may serve as hubs leading to many relevant pages and competing Web sites.

Original language: English (US)
Pages: 39-44
Number of pages: 6
State: Published - 2005
Externally published: Yes
Event: 15th Workshop on Information Technology and Systems, WITS 2005 - Las Vegas, NV, United States
Duration: Dec 10 2005 → Dec 11 2005

Other

Other: 15th Workshop on Information Technology and Systems, WITS 2005
Country/Territory: United States
City: Las Vegas, NV
Period: 12/10/05 → 12/11/05

ASJC Scopus subject areas

  • Information Systems
  • Control and Systems Engineering
