TY - JOUR
T1 - Computing DNA duplex instability profiles efficiently with a two-state model
T2 - Trends of promoters and binding sites
AU - Kantorovitz, Miriam R.
AU - Rapti, Zoi
AU - Gelev, Vladimir
AU - Usheva, Anny
N1 - Funding Information:
MRK was partially supported by the NSF grant DBI-0835718. ZR acknowledges support by the NSF through grant DMS-0708421. This research was funded in part by the NIH (grant # GM073911 to AU).
PY - 2010/12/21
Y1 - 2010/12/21
N2 - Background: DNA instability profiles have been used recently for predicting the transcriptional start site and the location of core promoters, and to gain insight into promoter action. It was also shown that the use of these profiles can significantly improve the performance of motif-finding programs. Results: In this work we introduce a new method for computing DNA instability profiles. The model that we use is a modified Ising-type model and it is implemented via statistical mechanics. Our linear-time algorithm computes the profile of a 10,000 base-pair long sequence in less than one second. The method we use also allows the computation of the probability that several consecutive bases are unpaired simultaneously. This is a feature that is not available in other linear-time algorithms. We use the model to compare the thermodynamic trends of promoter sequences of several genomes. In addition, we report results that associate the location of local extrema in the instability profiles with the presence of core promoter elements at these locations and with the location of the transcription start sites (TSS). We also analyzed the instability scores of binding sites of several human core promoter elements. We show that the instability scores of functional binding sites of a given core promoter element are significantly different from the scores of sites with the same motif occurring outside the functional range (relative to the TSS). Conclusions: The time efficiency of the algorithm and its genome-wide applications make this work of broad interest to scientists interested in transcriptional regulation, motif discovery, and comparative genomics.
AB - Background: DNA instability profiles have been used recently for predicting the transcriptional start site and the location of core promoters, and to gain insight into promoter action. It was also shown that the use of these profiles can significantly improve the performance of motif-finding programs. Results: In this work we introduce a new method for computing DNA instability profiles. The model that we use is a modified Ising-type model and it is implemented via statistical mechanics. Our linear-time algorithm computes the profile of a 10,000 base-pair long sequence in less than one second. The method we use also allows the computation of the probability that several consecutive bases are unpaired simultaneously. This is a feature that is not available in other linear-time algorithms. We use the model to compare the thermodynamic trends of promoter sequences of several genomes. In addition, we report results that associate the location of local extrema in the instability profiles with the presence of core promoter elements at these locations and with the location of the transcription start sites (TSS). We also analyzed the instability scores of binding sites of several human core promoter elements. We show that the instability scores of functional binding sites of a given core promoter element are significantly different from the scores of sites with the same motif occurring outside the functional range (relative to the TSS). Conclusions: The time efficiency of the algorithm and its genome-wide applications make this work of broad interest to scientists interested in transcriptional regulation, motif discovery, and comparative genomics.
UR - http://www.scopus.com/inward/record.url?scp=78650259674&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650259674&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-11-604
DO - 10.1186/1471-2105-11-604
M3 - Article
C2 - 21172036
AN - SCOPUS:78650259674
SN - 1471-2105
VL - 11
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 604
ER -