A framework for understanding file format conversions

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


This paper addresses the workshop question: "Can data generated from the infancy of the digital age be ingestible by software today?" We have prototyped a set of e-services that serve as a framework for understanding the content preservation, automation, and computational requirements of preserving electronic records. The framework consists of e-services for (a) finding file format conversion software, (b) executing file format conversions using available software, and (c) evaluating information loss across conversions. While the target audience for the technology is the US National Archives, these basic e-services are of interest to any manager of electronic records and to all citizens trying to keep their files current with rapidly changing information technology. The novelty of the framework lies in organizing information about file format conversions, in providing services about file format conversion paths, in prototyping a general architecture for reusing existing third-party software with import/export capabilities, and in evaluating information loss due to file format conversions. The impact of these e-services lies in the widely accessible conversion software registry (CSR), conversion engine (Polyglot), and comparison engine (Versus), which can increase the productivity of the digital preservation community and other users of digital files.
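The idea of a conversion path can be illustrated with a small sketch. It is not the CSR or Polyglot implementation, whose interfaces the abstract does not describe. It assumes only that a registry can be viewed as a list of (tool, input format, output format) entries. The names `REGISTRY`, `find_conversion_path`, and the tools `ToolA`, `ToolB`, and `ToolC` are hypothetical.

```python
from collections import deque

# Hypothetical registry: each entry says a tool can read one format and
# write another. Tool and format names are illustrative assumptions only.
REGISTRY = [
    ("ToolA", "wpd", "doc"),
    ("ToolB", "doc", "odt"),
    ("ToolB", "doc", "pdf"),
    ("ToolC", "odt", "pdf"),
]


def find_conversion_path(registry, source, target):
    """Return the shortest list of (tool, from_format, to_format) hops, or None."""
    # Build an adjacency map: input format -> list of (tool, output format).
    edges = {}
    for tool, src, dst in registry:
        edges.setdefault(src, []).append((tool, dst))

    # Breadth-first search returns a path with the fewest conversion steps.
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, path = queue.popleft()
        if fmt == target:
            return path
        for tool, nxt in edges.get(fmt, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(tool, fmt, nxt)]))
    return None  # no chain of registered tools connects the two formats
```

Calling `find_conversion_path(REGISTRY, "wpd", "pdf")` would chain `ToolA` and `ToolB` in this toy registry. Preferring fewer hops is only one plausible heuristic, since each conversion may lose information. A path search weighted by measured information loss would be a natural variation.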

Original language: English (US)
Title of host publication: Proceedings of the 2010 Roadmap for Digital Preservation Interoperability Framework Workshop, US-DPIF'10
State: Published - 2010
Externally published: Yes
Event: 2010 Roadmap for Digital Preservation Interoperability Framework Workshop, US-DPIF'10 - Gaithersburg, MD, United States
Duration: Mar 29 2010 to Mar 31 2010

Publication series

Name: ACM International Conference Proceeding Series


Other: 2010 Roadmap for Digital Preservation Interoperability Framework Workshop, US-DPIF'10
Country/Territory: United States
City: Gaithersburg, MD


  • File format conversions
  • Information loss evaluations

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

