Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein-protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action.
ASJC Scopus subject areas