Search engine-crawler symbiosis: Adapting to community interests

Gautam Pant, Shannon Bradshaw, Filippo Menczer

Research output: Chapter in Book/Report/Conference proceeding > Chapter

Abstract

Web crawlers have been used for nearly a decade as a search engine component to create and update large collections of documents. Typically the crawler and the rest of the search engine are not closely integrated. If the purpose of a search engine is to have as large a collection as possible to serve the general Web community, a close integration may not be necessary. However, if the search engine caters to a specific community with shared focused interests, it can take advantage of such an integration. In this paper we investigate a tightly coupled system in which the crawler and the search engine engage in a symbiotic relationship. The crawler feeds the search engine and the search engine in turn helps the crawler to better its performance. We show that the symbiosis can help the system learn about a community's interests and serve such a community with better focus.

Original language: English (US)
Title of host publication: Research and Advanced Technology for Digital Libraries
Editors: Traugott Koch, Ingeborg Torvik Sølvberg
Publisher: Springer
Pages: 221-232
Number of pages: 12
ISBN (Print): 354040726X, 9783540407263
State: Published - 2003
Externally published: Yes

Publication series

Name: Lecture Notes in Computer Science
Volume: 2769
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
