IDChase: Mitigating identifier migration trap in biological databases

Anupam Bhattacharjee, Aminul Islam, Hasan Jamil, Derek Wildman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Identifiers or handles, commonly called IDs, are a convenient way to refer to large biological objects such as sequences, structures, and networks. An ID serves as a unique placeholder in an application for an object that is too large to be of immediate use in a table and is instead retrieved from a secondary archive when needed. Applications usually rely on IDs of objects managed by remote databases over which they have no control, such as GenBank, EMBL, and UCSC. Unfortunately, IDs are generally not unique and frequently change as the objects they refer to change. Consequently, applications built on such IDs must monitor ID migration in databases they do not control, or risk producing inconsistent or out-of-date results, or even losing functionality. In this paper, we develop a wrapper-based approach to recognizing ID migration in secondary databases, mapping obsolete IDs to valid new IDs, and updating the databases to restore their intended functionality. We present our technique in detail using an example with NCBI RefSeq as the primary database and OCPAT as the secondary database. Based on the proposed technique, we introduce a new wrapper-like tool, called IDChase, that addresses the ID migration problem in biological databases and also serves as a general platform.
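
The three steps named in the abstract (recognizing a migration, mapping an obsolete ID to a valid new one, and updating the secondary database) can be illustrated with a minimal sketch. This is not the IDChase implementation, which the abstract does not describe at this level. The migration table, the record layout with a `ref_id` field, and the function names `resolve_id` and `refresh_ids` are all assumptions made for illustration, and the IDs used with it would be placeholders rather than real accession numbers.

```python
from typing import Dict, List, Optional, Tuple


def resolve_id(old_id: str, migrations: Dict[str, Optional[str]]) -> Optional[str]:
    """Follow replacement links until an ID with no recorded migration is reached.

    `migrations` is a hypothetical table mapping an obsolete ID to its successor,
    or to None when the ID was withdrawn without a successor.
    Returns None if the ID cannot be resolved (withdrawn, or the chain loops).
    """
    seen = set()
    current = old_id
    while current in migrations:
        if current in seen:
            return None  # cyclic chain: treat as unresolvable
        seen.add(current)
        successor = migrations[current]
        if successor is None:
            return None  # withdrawn without a replacement
        current = successor
    return current


def refresh_ids(
    records: List[dict], migrations: Dict[str, Optional[str]]
) -> Tuple[List[dict], List[str]]:
    """Return copies of the secondary records with current IDs, plus unresolved IDs."""
    updated: List[dict] = []
    unresolved: List[str] = []
    for rec in records:
        new_id = resolve_id(rec["ref_id"], migrations)
        if new_id is None:
            unresolved.append(rec["ref_id"])
            updated.append(dict(rec))  # keep the stale record for manual review
        else:
            updated.append({**rec, "ref_id": new_id})
    return updated, unresolved
```

In this sketch an ID that never migrated resolves to itself, so unaffected records pass through unchanged, while IDs that cannot be resolved are reported separately instead of being silently dropped.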

Original language: English (US)
Title of host publication: Contemporary Computing
Subtitle of host publication: Second International Conference, IC3 2009, Noida, India, August 17-19, 2009. Proceedings
Editors: Sanjay Ranka, Srinivas Aluru, Rajkumar Buyya, Y.C. Chung, Sumeet Dua, Ananth Grama, Sandeep Gupta, Rajeev Kumar, Vir Phoha
Pages: 461-472
Number of pages: 12
State: Published - Oct 19 2009

Publication series

Name: Communications in Computer and Information Science
Volume: 40
ISSN (Print): 1865-0929

ASJC Scopus subject areas

  • Computer Science (all)
  • Mathematics (all)
