Semi-automated collection evaluation for large-scale aggregations

Katrina Fenlon, Peter Organisciak, Jacob Jett, Miles Efron

Research output: Contribution to journal › Article › peer-review


Library and museum digital collections are increasingly aggregated at various levels. Large-scale aggregations, often characterized by heterogeneous or messy metadata, pose unique and growing challenges to aggregation administrators - not only in facilitating end-user discovery and access, but in performing basic administrative and curatorial tasks in a scalable way, such as finding messy data and determining the overall topical landscape of the aggregation. This poster describes early findings on using statistical text analysis techniques to improve the scalability of an aggregation development workflow for a large-scale aggregation. These techniques hold great promise for automating historically labor-intensive evaluative aspects of aggregation development and form the basis for the development of an aggregator's dashboard. The aggregator's dashboard is planned as a statistical text analysis-driven tool for supporting large-scale aggregation development and maintenance, through multifaceted, automatic visualization of an aggregation's metadata quality and topical coverage. The administrator's dashboard will support principled yet scalable aggregation development.

Original language: English (US)
Journal: Proceedings of the ASIST Annual Meeting
State: Published - 2011


Keywords

  • Collection evaluation
  • Digital aggregations
  • Digital collections
  • Digital libraries
  • Document representation
  • Latent topic models
  • Subject access
  • Subject analysis

ASJC Scopus subject areas

  • Information Systems
  • Library and Information Sciences

