Identification of the cis-regulatory elements that control coordinate gene expression is one of the longstanding goals of molecular biology. We have developed an integrated methodology and accompanying algorithms, including EvoPrinter and a suite of web-accessed search algorithms, cis-Decoder, that identifies functionally related cis-regulatory enhancers. For this analysis we have generated a Drosophila genome-wide database of conserved DNA consisting of greater than 100,000 conserved sequence clusters (CSCs) derived from EvoPrints spanning over 90% of the genome. cis-Decoder first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function.
To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer, characterized in our analysis of regulation of the temporal determinant castor, was used to identify other functionally related enhancers and analyze their structural organization. cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process.
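The search strategy summarized above (find repeat elements in an input enhancer, then rank database CSCs by shared elements and their balance) can be illustrated with a small sketch. This is not the cis-Decoder implementation: the fixed word length `k`, the function names, and in particular the `balance` formula (ratio of shared-element occurrences in the two sequences) are assumptions chosen for illustration, and reverse-complement matches are ignored for brevity.

```python
from collections import Counter


def kmer_counts(seq, k=6):
    """Count every overlapping k-mer in a sequence (forward strand only)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def repeat_elements(seq, k=6):
    """Return the k-mers that occur at least twice within one sequence."""
    return {kmer for kmer, n in kmer_counts(seq, k).items() if n >= 2}


def score_against(input_seq, candidate_seq, k=6):
    """Score a candidate CSC against the input (hypothetical scoring scheme)."""
    input_counts = kmer_counts(input_seq, k)
    cand_counts = kmer_counts(candidate_seq, k)
    shared = set(input_counts) & set(cand_counts)
    shared_repeats = shared & repeat_elements(input_seq, k)
    # Assumed balance measure: how evenly the shared elements are
    # represented in the two sequences (1.0 = perfectly even).
    in_hits = sum(input_counts[m] for m in shared)
    cand_hits = sum(cand_counts[m] for m in shared)
    balance = min(in_hits, cand_hits) / max(in_hits, cand_hits) if shared else 0.0
    return {
        "shared_repeats": len(shared_repeats),
        "shared_total": len(shared),
        "balance": balance,
    }


def rank_database(input_seq, database, k=6):
    """Rank named candidate sequences, best match first."""
    scored = [(name, score_against(input_seq, seq, k)) for name, seq in database.items()]
    return sorted(
        scored,
        key=lambda item: (item[1]["shared_repeats"], item[1]["balance"]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy sequences, not real enhancer data.
    toy_db = {"csc_a": "TTTTTTTT", "csc_b": "GGACGTGG"}
    for name, score in rank_database("ACGTACGTTTGCA", toy_db, k=4):
        print(name, score)
```

Ranking first by shared repeats and then by balance is one simple way to combine the two kinds of evidence; a real scoring function would likely weight element length and conservation as well.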
For the last seventeen years I have maintained the Web resource entitled The Interactive Fly: A Cyberspace Guide to Drosophila Development and metazoan evolution. Drosophila is a good starting place from which to design an interactive model of development, and cyberspace is the made-to-order medium. The Interactive Fly is used at every level of study, including high school, college, graduate school and as a reference for researchers. It has also been used by programmers to develop methods of machine reading of biology text.
Identification of novel embryonic neural precursor cell enhancers based on shared repeat sequences
cis-Decoder database searches using conserved sequences of the castor late temporal network enhancer (cas-6) identified other enhancers that share balanced repeat sequences with the cas-6 enhancer. These structurally similar enhancers also function as late NB sub-lineage enhancers. Many identified CSCs are adjacent to known NB expressed genes (vvl, nab, cas, tkr, and grh).
Although the cas-6-related enhancers are active in overlapping neural precursor cells, each has its own unique cis-regulatory identity. Each has a different pattern of expression in subsets of NBs, GMCs, and/or nascent neurons. For example, three identified enhancers (nab-1, CG6559-28, and tkr-15) exhibit early expression in a subset of ventral cord midline cells, while other enhancers do not activate reporter expression in the midline precursor cells. The cas-8 CSC activated reporter expression in many more precursors at stage 11 than any of the other reporter constructs. tkr-15 is expressed in many cells at stage 11. Since these cells are too small to be considered NBs, they are most likely GMCs or nascent neurons. cas-6 enhancers both drive reporter expression in overlapping subsets of cells that represent sub-patterns of endogenous cas
Our studies also revealed that there is no apparent consistency in the ordering, overlap, or orientation of shared elements between functionally related enhancers. For example, repeat and palindromic elements shared between cas-6, CG7229-5, and grh-15 appear in unique contexts within each enhancer.
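One way to inspect the ordering and orientation of shared elements is to map each element onto each enhancer on both strands and compare the resulting layouts. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, matches are exact, and it is not taken from the cis-Decoder code.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(element):
    """True if the element reads the same on both strands."""
    return element.upper() == reverse_complement(element)


def find_element(seq, element):
    """Return sorted (position, strand) pairs for every exact occurrence.

    Palindromic elements are reported once, on the '+' strand.
    """
    seq = seq.upper()
    element = element.upper()
    targets = [("+", element)]
    rc = reverse_complement(element)
    if rc != element:
        targets.append(("-", rc))
    hits = []
    for strand, target in targets:
        start = seq.find(target)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(target, start + 1)  # allow overlapping hits
    return sorted(hits)


def element_layout(seq, elements):
    """List (position, element, strand) for all elements, in sequence order."""
    layout = [
        (pos, element.upper(), strand)
        for element in elements
        for pos, strand in find_element(seq, element)
    ]
    return sorted(layout)


if __name__ == "__main__":
    # Toy sequence, not real enhancer data.
    for hit in element_layout("AACGTTCGTT", ["AACG"]):
        print(hit)
```

Running `element_layout` on two enhancers with the same element list gives two ordered layouts that can be compared side by side, which makes differences in order, spacing, and strand easy to see.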