A methodology for file relationship discovery

Michal Ondrejcek, Jason Kastner, Rob Kooper, Peter Bajcsy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


This paper addresses the problem of discovering temporal and contextual relationships across the document, data, and software categories of electronic records. We designed a methodology that discovers unknown relationships through file system and file content analyses. The work also investigates the automation of metadata extraction from engineering drawings and the storage requirements of that extraction. The methodology has been applied to extract information from a test collection of electronic records about the Navy ship (TWR 841) archived by the U.S. National Archives and Records Administration (NARA). This test collection poses a problem of unknown relationships among its files, which include 784 2D image drawings and 22 CAD models.

Original language: English (US)
Title of host publication: e-Science 2009 - 5th IEEE International Conference on e-Science
Number of pages: 8
State: Published - 2009
Event: 5th IEEE International Conference on e-Science, e-Science 2009 - Oxford, United Kingdom
Duration: Dec 9 2009 → Dec 11 2009

Publication series

Name: e-Science 2009 - 5th IEEE International Conference on e-Science


Other: 5th IEEE International Conference on e-Science, e-Science 2009
Country/Territory: United Kingdom


Keywords

  • Data conversion
  • Data processing
  • Optical character recognition

ASJC Scopus subject areas

  • General Arts and Humanities
  • General Computer Science
  • General Earth and Planetary Sciences
  • Health Information Management

