Abstract
Deciphering the sequence-function relationship encoded in enhancers holds the key to interpreting noncoding variants and understanding mechanisms of transcriptomic variation. Several quantitative models exist for predicting enhancer function and underlying mechanisms; however, there has been no systematic comparison of these models characterizing their relative strengths and shortcomings. Here, we interrogated a rich data set of neuroectodermal enhancers in Drosophila, representing cis- and trans- sources of expression variation, with a suite of biophysical and machine learning models. We performed rigorous comparisons of thermodynamics-based models implementing different mechanisms of activation, repression and cooperativity. Moreover, we developed a convolutional neural network (CNN) model, called CoNSEPT, that learns enhancer ‘grammar’ in an unbiased manner. CoNSEPT is the first general-purpose CNN tool for predicting enhancer function in varying conditions, such as different cell types and experimental conditions, and we show that such complex models can suggest interpretable mechanisms. We found model-based evidence for mechanisms previously established for the studied system, including cooperative activation and short-range repression. The data also favored one hypothesized activation mechanism over another and suggested an intriguing role for a direct, distance-independent repression mechanism. Our modeling shows that while fundamentally different models can yield similar fits to data, they vary in their utility for mechanistic inference. CoNSEPT is freely available at: https://github.com/PayamDiba/CoNSEPT.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 10309-10327 |
| Number of pages | 19 |
| Journal | Nucleic acids research |
| Volume | 49 |
| Issue number | 18 |
| Early online date | Sep 11 2021 |
| State | Published - Oct 11 2021 |
ASJC Scopus subject areas
- Genetics
Fingerprint
Dive into the research topics of 'Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks'. Together they form a unique fingerprint.
Datasets
- Convolutional Neural Network-based Sequence-to-Expression Prediction Tool (CoNSEPT)
Dibaeinia, P. (Creator) & Sinha, S. (Creator), University of Illinois Urbana-Champaign, Jan 8 2026
DOI: 10.13012/B2IDB-8692568_V1
Dataset