Abstract
Gaining actionable business intelligence from the Web is an active and promising area of research. An important facet of this work is the ability to build collections of Web pages (Web collections) that are relevant to a vertical market of interest. Such collections provide raw data that can be further processed with text and data mining tools to gain business intelligence. Web crawlers are programs that can be used to build Web collections ranging in size from thousands to millions of pages. Through a systematic study, we show that crawlers face a difficult task when building a Web collection for a vertical market because of the competitive nature of business communities. However, the difficulty of the task can be greatly reduced through limited use of a search engine to identify candidate "information broker" Web pages, which may serve as hubs leading to many relevant pages and competing Web sites.
Original language | English (US) |
---|---|
Pages | 39-44 |
Number of pages | 6 |
State | Published - 2005 |
Externally published | Yes |
Event | 15th Workshop on Information Technology and Systems, WITS 2005 - Las Vegas, NV, United States; Duration: Dec 10 2005 → Dec 11 2005 |
Other
Other | 15th Workshop on Information Technology and Systems, WITS 2005 |
---|---|
Country/Territory | United States |
City | Las Vegas, NV |
Period | 12/10/05 → 12/11/05 |
ASJC Scopus subject areas
- Information Systems
- Control and Systems Engineering