Performance and implications of semantic indexing in a distributed environment

Conrad T.K. Chang, Bruce R. Schatz

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

A research prototype for semantic indexing and retrieval is presented. The prototype is motivated by the desire for an information retrieval system that is more efficient and effective than the current state of the art. An overview of the Interspace architecture layers is given, and an object model supporting semantic operations is developed. The model contains a rich set of classes and relationships over the data used by the semantic indexing module. Our semantic indexing is based on the creation of concept spaces. A concept space is an index of a collection that uses document statistics to capture the relationships between concepts. It is useful for boosting text search by suggesting alternative terms that are semantically related to the query terms. Over the years, we have developed generic technology for computing concept spaces on large collections across many subjects. Recent computations on discipline-scale collections have been performed on high-end supercomputers. This paper describes our implementation of the computation in a distributed computing environment and its implications. Experimental results for different collection sizes and numbers of processes are presented to show the feasibility of this approach. We also show that laboratory and community collections are already easily computable using a group of PCs in a lab via a message-passing model. We conclude that PC clusters will shortly be able to compute semantic indexes for any real collection.
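
As a rough illustration of the idea in the abstract, the sketch below builds a toy concept space from document co-occurrence statistics and uses it for term suggestion. It is not the authors' implementation: the abstract does not give the weighting scheme, so the score used here (co-occurrence count divided by the query term's document frequency) is an assumption, as are the function names `count_partition`, `merge`, and `suggest` and the sample documents. The per-partition counting followed by a merge only mimics how the work might be split across processes; no actual message passing is performed.

```python
from collections import Counter
from itertools import combinations


def count_partition(docs):
    """Count document frequencies of terms and term pairs in one partition."""
    term_df = Counter()
    pair_df = Counter()
    for doc in docs:
        terms = sorted(set(doc))  # each term counts once per document
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))  # pairs are stored in sorted order
    return term_df, pair_df


def merge(partials):
    """Combine partial counts, as a coordinator process might after receiving them."""
    term_df, pair_df = Counter(), Counter()
    for t, p in partials:
        term_df += t
        pair_df += p
    return term_df, pair_df


def suggest(term, term_df, pair_df, k=3):
    """Rank co-occurring terms by an assumed weight: pair count / df(term)."""
    scores = {}
    for (a, b), n in pair_df.items():
        if term == a:
            other = b
        elif term == b:
            other = a
        else:
            continue
        scores[other] = n / term_df[term]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical toy collection, split into two partitions.
docs = [
    ["semantic", "indexing", "retrieval"],
    ["semantic", "indexing", "cluster"],
    ["cluster", "message", "passing"],
    ["semantic", "retrieval"],
]
partials = [count_partition(docs[:2]), count_partition(docs[2:])]
term_df, pair_df = merge(partials)
suggestions = suggest("semantic", term_df, pair_df)
```

Because the counts are plain sums, merging the partial results gives the same statistics as a single pass over the whole collection, which is the property that makes this kind of computation straightforward to distribute.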

Original language: English (US)
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Publisher: ACM
Pages: 391-398
Number of pages: 8
ISBN (Print): 1581131461, 9781581131468
State: Published - 1999
Event: Proceedings of the 1999 8th International Conference on Information Knowledge Management (CIKM'99) - Kansas City, MO, USA
Duration: Nov 2 1999 to Nov 6 1999

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: Proceedings of the 1999 8th International Conference on Information Knowledge Management (CIKM'99)
City: Kansas City, MO, USA
Period: 11/2/99 to 11/6/99

ASJC Scopus subject areas

  • General Business, Management and Accounting
