TY - GEN
T1 - A framework for understanding file format conversions
AU - Bajcsy, Peter
AU - Kooper, Rob
AU - Marini, Luigi
AU - McHenry, Kenton
AU - Ondrejcek, Michal
PY - 2010
Y1 - 2010
AB - This paper addresses the workshop question: "Can data generated from the infancy of the digital age be ingestible by software today?" We have prototyped a set of e-services that serve as a framework for understanding the content preservation, automation, and computational requirements for the preservation of electronic records. The framework consists of e-services for (a) finding file format conversion software, (b) executing file format conversions using available software, and (c) evaluating information loss across conversions. While the target audience for the technology is the US National Archives, these basic e-services are of interest to any manager of electronic records and to all citizens trying to keep their files current with rapidly changing information technology. The novelty of the framework lies in organizing information about file format conversions, in providing services for file format conversion paths, in prototyping a general architecture for reusing existing third-party software with import/export capabilities, and in evaluating information loss due to file format conversions. The impact of these e-services lies in the widely accessible conversion software registry (CSR), conversion engine (Polyglot), and comparison engine (Versus), which can increase the productivity of the digital preservation community and other users of digital files.
KW - File format conversions
KW - Information loss evaluations
UR - http://www.scopus.com/inward/record.url?scp=80054076194&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80054076194&partnerID=8YFLogxK
U2 - 10.1145/2039274.2039284
DO - 10.1145/2039274.2039284
M3 - Conference contribution
AN - SCOPUS:80054076194
SN - 9781450301091
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 2010 Roadmap for Digital Preservation Interoperability Framework Workshop, US-DPIF'10
T2 - 2010 Roadmap for Digital Preservation Interoperability Framework Workshop, US-DPIF'10
Y2 - 29 March 2010 through 31 March 2010
ER -