Evaluating thermodynamic models of enhancer activity on cellular resolution gene expression data

Md Abul Hassan Samee, Saurabh Sinha

Research output: Contribution to journal › Article › peer-review


With the advent of high-throughput sequencing and high-resolution transcriptomic technologies, there exists today an unprecedented opportunity to understand gene regulation at a quantitative level. State-of-the-art models of the relationship between regulatory sequence and gene expression have shown great promise, but also suffer from some major shortcomings. In this paper, we identify and address methodological challenges pertaining to quantitative modeling of gene expression from sequence, and test our models on the anterior-posterior patterning system in the Drosophila embryo. We first develop a framework to process cellular resolution three-dimensional gene expression data from the Drosophila embryo and create data sets on which quantitative models can be trained. Next, we propose a new score, called 'weighted pattern generating potential' (w-PGP), to evaluate model predictions, and show its advantages over the two most common scoring schemes in use today. The model-building exercise uses w-PGP as the evaluation score and adopts a systematic strategy to increase a model's complexity while guarding against over-fitting. Our model identifies three transcription factors - ZELDA, SLOPPY-PAIRED, and NUBBIN - that have not been previously incorporated in quantitative models of this system, as having significant regulatory influence. Finally, we show how fitting quantitative models on data sets comprising a handful of enhancers, as reported in earlier work, may lead to unreliable models.

Original language: English (US)
Pages (from-to): 79-90
Number of pages: 12
Issue number: 1
State: Published - Jul 15 2013


Keywords

  • Cellular resolution data
  • Drosophila A/P patterning system
  • Enhancer
  • Quantitative model
  • Transcription factor
  • Transcriptional regulation

ASJC Scopus subject areas

  • Molecular Biology
  • Biochemistry, Genetics and Molecular Biology (all)

