Efficiently identifying max-gap clusters in pairwise genome comparison

Xu Ling, Xin He, Dong Xin, Jaiwei Han

Research output: Contribution to journalArticlepeer-review


The spatial clustering of genes across different genomes has been used to study important problems in comparative genomics, from identification of operons to detection of homologous regions. A set of formal models and algorithms of so-called max-gap clusters have been proposed recently. These algorithms guarantee the completeness of the results, and the simplicity of the model enables a rigorous statistical test of significance. These features overcome the weakness of many previous methods, which are often heuristic in nature. We developed a very efficient algorithm to compute max-gap clusters in pairwise genome comparison. Our algorithm is an order-of-magnitude faster than the previous algorithm based on the same model under a number of different settings. In our evaluation on two bacterial genomes, we showed that our method could identify known operons as well as some novel structures in the genome. We also demonstrated that the current framework for conserved spatial clustering of genes can be used to detect homologous regions in higher organisms, through the comparison of human and mouse genomes.

Original languageEnglish (US)
Pages (from-to)593-609
Number of pages17
JournalJournal of Computational Biology
Issue number6
StatePublished - Jul 1 2008


  • Algorithms
  • Gene clusters
  • Gene networks
  • Recognition of genes
  • Regulatory elements
  • Sequence analysis

ASJC Scopus subject areas

  • Modeling and Simulation
  • Molecular Biology
  • Genetics
  • Computational Mathematics
  • Computational Theory and Mathematics


Dive into the research topics of 'Efficiently identifying max-gap clusters in pairwise genome comparison'. Together they form a unique fingerprint.

Cite this