Deriving link-context from HTML tag tree

Research output: Contribution to conference › Paper › peer-review


HTML anchors are often surrounded by text that seems to describe the destination page appropriately. The text surrounding a link, or the link-context, is used for a variety of tasks associated with Web information retrieval. These tasks can benefit from identifying regularities in the manner in which "good" contexts appear around links. In this paper, we describe a framework for conducting such a study. The framework serves as an evaluation platform for comparing various link-context derivation methods. We apply the framework to a sample of Web pages obtained from more than 10,000 different categories of the ODP. Our focus is on understanding the potential merits of using a Web page's tag tree structure for deriving link-contexts. We find that good link-context can be associated with the tag tree hierarchy. Our results show that climbing up the tag tree when the link-context provided by greater depths is too short can provide better performance than some of the traditional techniques.
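The idea of climbing the tag tree when a context is too short can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact procedure, so the word-count threshold `min_words`, the helper names (`Node`, `TreeBuilder`, `link_contexts`), and the simplified tag handling are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag (partial list, assumed sufficient here).
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """A tag-tree node; children are Node objects or raw text strings."""

    def __init__(self, tag, parent=None, attrs=None):
        self.tag = tag
        self.parent = parent
        self.attrs = dict(attrs or [])
        self.children = []

    def text(self):
        # Concatenate all text in this subtree and normalize whitespace.
        parts = [c if isinstance(c, str) else c.text() for c in self.children]
        return " ".join(" ".join(parts).split())


class TreeBuilder(HTMLParser):
    """Builds a simple tag tree and records every <a> element."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current, attrs)
        self.current.children.append(node)
        if tag == "a":
            self.anchors.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def derive_link_context(anchor, min_words=5):
    """Climb from the anchor until the subtree text has at least min_words words."""
    node = anchor
    while len(node.text().split()) < min_words and node.parent is not None:
        node = node.parent
    return node.text()


def link_contexts(html, min_words=5):
    """Return (href, context) pairs for every anchor in the page."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return [(a.attrs.get("href"), derive_link_context(a, min_words))
            for a in builder.anchors]


page = ('<html><body><div><p>Read the <a href="x">paper</a> '
        'on tag trees and link context.</p></div></body></html>')
for href, context in link_contexts(page):
    print(href, "->", context)
```

With a threshold of one word the anchor text alone is returned; raising the threshold makes the sketch climb to the enclosing paragraph. A real evaluation, as the abstract describes, would compare such derived contexts against other derivation methods rather than fix a threshold by hand.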

Original language: English (US)
Number of pages: 7
State: Published - 2003
Externally published: Yes
Event: 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, DMKD '03 - San Diego, CA, United States
Duration: Jun 13 2003 - Jun 13 2003


Conference: 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, DMKD '03
Country/Territory: United States
City: San Diego, CA


  • DOM
  • Link-context
  • Tag tree

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Science Applications

