Discovering interesting usage patterns in text collections: Integrating text mining with visualization

Anthony Don, Elena Zheleva, Machon Gregory, Sureyya Tarkan, Loretta Auvil, Tanya Clement, Ben Shneiderman, Catherine Plaisant

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper addresses the problem of making text mining results more comprehensible to humanities scholars, journalists, intelligence analysts, and other researchers, in order to support the analysis of text collections. Our system, FeatureLens, visualizes a text collection at several levels of granularity and enables users to explore interesting text patterns. The current implementation focuses on frequent itemsets of n-grams, as they capture the repetition of exact or similar expressions in the collection. Users can find meaningful co-occurrences of text patterns by visualizing them within and across documents in the collection. This also lets users identify how usage evolves over time, such as an increase, a decrease, or the sudden appearance of text patterns. The interface could be used to explore other text features as well. Initial studies suggest that FeatureLens helped a literary scholar and eight users generate new hypotheses and interesting insights using two text collections.
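To make the idea of "frequent itemsets of n-grams" concrete, the following is a minimal sketch and not the FeatureLens implementation. It assumes each text unit (for example, a paragraph) is treated as a transaction whose items are its word n-grams, and it counts itemsets naively rather than mining closed itemsets as the keywords suggest the paper does. The function names `word_ngrams` and `frequent_ngram_itemsets` and all parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations


def word_ngrams(text, n=3):
    """Return the set of word n-grams in a text unit (whitespace tokenization)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def frequent_ngram_itemsets(units, n=3, min_support=2, max_size=2):
    """Map each frequent set of n-grams to the number of units containing it.

    Naive illustration: every text unit is one transaction, and an itemset
    is frequent if it occurs in at least `min_support` units.
    """
    transactions = [word_ngrams(unit, n) for unit in units]

    # Support of single n-grams: number of units that contain each one.
    counts = Counter(gram for t in transactions for gram in t)
    frequent_items = {g for g, c in counts.items() if c >= min_support}
    result = {frozenset([g]): counts[g] for g in frequent_items}

    # Larger itemsets can only be built from n-grams that are frequent alone.
    for size in range(2, max_size + 1):
        size_counts = Counter()
        for t in transactions:
            kept = sorted(t & frequent_items)
            for combo in combinations(kept, size):
                size_counts[frozenset(combo)] += 1
        result.update(
            {s: c for s, c in size_counts.items() if c >= min_support}
        )
    return result


if __name__ == "__main__":
    # Made-up example units, not data from the paper.
    units = [
        "she said it was a beginning again and again",
        "he said it was a beginning again and again",
        "nothing here repeats at all in this unit",
    ]
    for itemset, support in frequent_ngram_itemsets(units).items():
        print(support, sorted(itemset))
```

An itemset with more than one n-gram here indicates expressions that recur together in the same units, which is the kind of co-occurrence the abstract describes visualizing within and across documents.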

Original language: English (US)
Title of host publication: CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management
Pages: 213-221
Number of pages: 9
State: Published - 2007
Event: 16th ACM Conference on Information and Knowledge Management, CIKM 2007 - Lisboa, Portugal
Duration: Nov 6 2007 to Nov 9 2007

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 16th ACM Conference on Information and Knowledge Management, CIKM 2007
Country/Territory: Portugal
City: Lisboa
Period: 11/6/07 to 11/9/07

Keywords

  • Digital humanities
  • Frequent closed itemsets
  • N-grams
  • Text mining
  • User interface

ASJC Scopus subject areas

  • General Decision Sciences
  • General Business, Management and Accounting
