Navigating the PDF/A Standard: A Case Study of Theses in Oxford's Institutional Repository

Anna Oates, J. Stephen Downie, Edith Halvarsson, Michael Popham

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

PDF/A (Portable Document Format–Archival) was established by the International Organization for Standardization as the ISO 19005 standard for long-term preservation of electronic documents. In a case study of the Oxford institutional repository theses collection, PDF/A was evaluated as a possible format for standardizing theses disseminated online. While the ISO requirements for a well-formed PDF/A promise sustainability and easy recovery of content, the case study found that the standard restricts some document features from being incorporated into a well-formed PDF/A. Non-conformances to the standard are found across electronic theses and dissertations, from non-Latin glyphs used in scientific and language papers to embedded content, such as images. A further complication for achieving ISO 19005 compliance is that validation tools do not always catch non-conformance errors in documents that claim to conform to PDF/A. While PDF/A is a logical solution for long-term digital preservation, the stringent standard prevents some content that is frequently used in academic research from conforming to it.
Original language: English (US)
Title of host publication: iConference 2018 Proceedings
State: Published - 2018

Keywords

  • ISO 19005
  • electronic theses and dissertations
  • digital preservation
  • institutional repositories
