The HathiTrust (HT) digital library comprises 4 billion pages (composing 11 million volumes). The HathiTrust Research Center (HTRC), a unique collaboration between the University of Illinois and Indiana University, is developing tools to connect scholars to this large and diverse corpus. This poster discusses HTRC’s activities surrounding the discovery, formation, and optimization of useful analytic subsets of the HT corpus (i.e., workset creation and use). As part of this development, we are prototyping an RDF-based triple store designed to record and serialize metadata describing worksets and the bibliographic entities collected within them. At the heart of this work is the construction of a formal conceptual model that captures sufficient descriptive information about worksets, including provenance, curatorial intent, and other useful metadata, so that digital humanities scholars can more easily select, group, and cite their research data collections based upon HT and external corpora. The prototype’s data model is being designed to be extensible and to fit well within the Linked Open Data community.
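To make the idea of recording workset metadata as RDF triples concrete, the following is a minimal illustrative sketch, not the HTRC prototype's actual data model. It describes a workset with a title, creator, statement of curatorial intent, and member volumes, then serializes the description as N-Triples. The `http://example.org/` namespace, the identifiers, the function names, and the choice of Dublin Core Terms properties are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Dublin Core Terms is a real vocabulary; using it for worksets is an assumption.
DCTERMS = "http://purl.org/dc/terms/"
# Hypothetical namespace for worksets and volumes (not an HTRC namespace).
EX = "http://example.org/"

Triple = Tuple[str, str, str]


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes, quotes, and line breaks."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def describe_workset(
    workset_id: str,
    title: str,
    creator: str,
    intent: str,
    volume_ids: Iterable[str],
) -> List[Triple]:
    """Build triples describing one workset and the volumes it gathers."""
    subject = iri(EX + "workset/" + workset_id)
    triples: List[Triple] = [
        (subject, iri(DCTERMS + "title"), literal(title)),
        (subject, iri(DCTERMS + "creator"), literal(creator)),
        # Curatorial intent is stored here as a free-text description (assumption).
        (subject, iri(DCTERMS + "description"), literal(intent)),
    ]
    for volume_id in volume_ids:
        member = iri(EX + "volume/" + volume_id)
        triples.append((subject, iri(DCTERMS + "hasPart"), member))
    return triples


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize triples one per line, each terminated by ' .'."""
    return "".join(f"{s} {p} {o} .\n" for s, p, o in triples)


if __name__ == "__main__":
    demo = describe_workset(
        "ws1",
        "Example workset",
        "A. Scholar",
        "Volumes gathered for a hypothetical study",
        ["vol-001", "vol-002"],
    )
    print(to_ntriples(demo), end="")
```

A production system would more likely rely on an RDF library and a dedicated workset vocabulary. The plain-tuple form is used here only to show how provenance and membership statements reduce to subject, predicate, object triples.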
|Original language||English (US)|
|Title of host publication||iConference 2015 Proceedings|
|State||Published - Mar 15 2015|
- Conceptual Models
- HathiTrust Research Center
- Linked Open Data
- Digital Humanities