Sanitization and anonymization of document repositories

Yücel Saygin, Dilek Hakkani-Tür, Gökhan Tür

Research output: Chapter in Book/Report/Conference proceeding › Chapter


Information security and privacy in the context of the World Wide Web (WWW) are important issues that are still being investigated. However, most present research deals with access control and authentication-based trust. With the popularity of the WWW as one of the largest information sources, the privacy of individuals is now as important as the security of information. In this chapter, our focus is text, which is probably the most frequently seen data type on the WWW. Our aim is to highlight the possible threats to privacy that exist due to the availability of document repositories and of sophisticated tools to browse and analyze these documents. We first identify possible threats to privacy in document repositories. We then discuss a measure for privacy in documents, along with some possible solutions to avoid or, at least, alleviate these threats.

Original language: English (US)
Title of host publication: Web and Information Security
Publisher: IGI Global
Number of pages: 16
ISBN (Print): 9781591405887
State: Published - 2005
Externally published: Yes

ASJC Scopus subject areas

  • General Computer Science

