Abstract
Library and museum digital collections are increasingly aggregated at various levels. Large-scale aggregations, often characterized by heterogeneous or messy metadata, pose unique and growing challenges to aggregation administrators, not only in facilitating end-user discovery and access but also in performing basic administrative and curatorial tasks in a scalable way, such as finding messy data and determining the overall topical landscape of the aggregation. This poster describes early findings on using statistical text analysis techniques to improve the scalability of an aggregation development workflow for a large-scale aggregation. These techniques hold great promise for automating historically labor-intensive evaluative aspects of aggregation development and form the basis for the development of an aggregator's dashboard. The aggregator's dashboard is planned as a statistical text-analysis-driven tool for supporting large-scale aggregation development and maintenance, through multifaceted, automatic visualization of an aggregation's metadata quality and topical coverage. The administrator's dashboard will support principled yet scalable aggregation development.
Original language | English (US) |
---|---|
Journal | Proceedings of the ASIST Annual Meeting |
Volume | 48 |
State | Published - 2011 |
Keywords
- Collection evaluation
- Digital aggregations
- Digital collections
- Digital libraries
- Document representation
- Latent topic models
- Subject access
- Subject analysis
ASJC Scopus subject areas
- Information Systems
- Library and Information Sciences