Separating real motifs from their artifacts

Mathieu Blanchette, Saurabh Sinha

Research output: Contribution to journal › Article › peer-review


The typical output of many computational methods to identify binding sites is a long list of motifs containing some real motifs (those most likely to correspond to the actual binding sites) along with a large number of random variations of these. We present a statistical method to separate real motifs from their artifacts. This produces a short list of high quality motifs that is sufficient to explain the over-representation of all motifs in the given sequences. Using synthetic data sets, we show that the output of our method is very accurate. On various sets of upstream sequences in S. cerevisiae, our program identifies several known binding sites, as well as a number of significant novel motifs.

Original language: English (US)
Pages (from-to): S30-S38
Issue number: SUPPL. 1
State: Published - 2001
Externally published: Yes

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

