Abstract
PDF/A (Portable Document Format–Archival) was established by the International Organization for Standardization as the ISO 19005 standard for long-term preservation of electronic documents. In a case study of the Oxford institutional repository theses collection, PDF/A was evaluated as a possible format for standardizing theses disseminated online. While the ISO requirements for a well-formed PDF/A promise sustainability and easy recovery of content, the case study found that the standard prevents some document features from being incorporated into a well-formed PDF/A. Non-conformances to the standard are found across electronic theses and dissertations, from non-Latin glyphs used in scientific and language papers to embedded content, such as images. A further complication for achieving ISO 19005 compliance is that validation tools do not always catch non-conformance errors in documents that claim to conform to PDF/A. While PDF/A is a logical solution for long-term digital preservation, the stringency of the standard prevents some content that is frequently used in academic research from conforming to ISO 19005.
Original language | English (US) |
---|---|
Title of host publication | iConference 2018 Proceedings |
State | Published - 2018 |
Keywords
- ISO 19005
- electronic theses and dissertations
- digital preservation
- institutional repositories