TY - GEN
T1 - Discovering interesting usage patterns in text collections
T2 - 16th ACM Conference on Information and Knowledge Management, CIKM 2007
AU - Don, Anthony
AU - Zheleva, Elena
AU - Gregory, Machon
AU - Tarkan, Sureyya
AU - Auvil, Loretta
AU - Clement, Tanya
AU - Shneiderman, Ben
AU - Plaisant, Catherine
PY - 2007
Y1 - 2007
N2 - This paper addresses the problem of making text mining results more comprehensible to humanities scholars, journalists, intelligence analysts, and other researchers, in order to support the analysis of text collections. Our system, FeatureLens, visualizes a text collection at several levels of granularity and enables users to explore interesting text patterns. The current implementation focuses on frequent itemsets of n-grams, as they capture the repetition of exact or similar expressions in the collection. Users can find meaningful co-occurrences of text patterns by visualizing them within and across documents in the collection. This also permits users to identify the temporal evolution of usage such as increasing, decreasing or sudden appearance of text patterns. The interface could be used to explore other text features as well. Initial studies suggest that FeatureLens helped a literary scholar and 8 users generate new hypotheses and interesting insights using 2 text collections.
AB - This paper addresses the problem of making text mining results more comprehensible to humanities scholars, journalists, intelligence analysts, and other researchers, in order to support the analysis of text collections. Our system, FeatureLens, visualizes a text collection at several levels of granularity and enables users to explore interesting text patterns. The current implementation focuses on frequent itemsets of n-grams, as they capture the repetition of exact or similar expressions in the collection. Users can find meaningful co-occurrences of text patterns by visualizing them within and across documents in the collection. This also permits users to identify the temporal evolution of usage such as increasing, decreasing or sudden appearance of text patterns. The interface could be used to explore other text features as well. Initial studies suggest that FeatureLens helped a literary scholar and 8 users generate new hypotheses and interesting insights using 2 text collections.
KW - Digital humanities
KW - Frequent closed itemsets
KW - N-grams
KW - Text mining
KW - User interface
UR - http://www.scopus.com/inward/record.url?scp=51749094818&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51749094818&partnerID=8YFLogxK
U2 - 10.1145/1321440.1321473
DO - 10.1145/1321440.1321473
M3 - Conference contribution
AN - SCOPUS:51749094818
SN - 9781595938039
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 213
EP - 221
BT - CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management
Y2 - 6 November 2007 through 9 November 2007
ER -