1
Villarreal F, Burguener GF, Sosa EJ, Stocchi N, Somoza GM, Turjanski AG, Blanco A, Viñas J, Mechaly AS. Genome sequencing and analysis of black flounder (Paralichthys orbignyanus) reveals new insights into Pleuronectiformes genomic size and structure. BMC Genomics 2024; 25:297. [PMID: 38509481 PMCID: PMC10956332 DOI: 10.1186/s12864-024-10081-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 02/02/2024] [Indexed: 03/22/2024] Open
Abstract
Black flounder (Paralichthys orbignyanus, Pleuronectiformes) is a commercially significant marine fish with promising aquaculture potential in Argentina. Despite extensive studies on Black flounder aquaculture, the limited genetic information available hampers the crucial role genetics plays in the development of this activity. In this study, we first employed Illumina sequencing technology to sequence the entire genome of Black flounder. Utilizing two independent libraries, one from a female and another from a male, with 150 bp paired-end reads, a mean insert length of 350 bp, and over 35-fold coverage, we achieved assemblies resulting in a genome size of ~538 Mbp. Analysis of the assemblies revealed that more than 98% of the core genes were present, with more than 78% of them having more than 50% coverage. This indicates a reasonably complete and accurate genome at the coding sequence level. This genome contains 25,231 protein-coding genes, 445 tRNAs, 3 rRNAs, and more than 1,500 non-coding RNAs of other types. Black flounder, along with pufferfishes, seahorses, pipefishes, and anabantid fish, displays a smaller genome compared to most other teleost groups. In vertebrates, the number of transposable elements (TEs) is often correlated with genome size. However, it remains unclear whether the sizes of introns and exons also play a role in determining genome size. Hence, to elucidate the potential factors contributing to this reduced genome size, we conducted a comparative genomic analysis between Black flounder and other teleost orders to determine whether the small genome size could be explained by repetitive elements or gene features, including genome-wide gene and intron sizes. We show that the smaller genome size of flounders can be attributed to several factors, including changes in the number of repetitive elements and decreased gene size, particularly due to a lower number of very large and small introns. Thus, these components appear to be involved in the genome reduction in Black flounder. Despite these insights, the full implications and potential benefits of genome reduction in Black flounder for reproduction and aquaculture remain incompletely understood, necessitating further research.
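The sequencing figures quoted in the abstract (150 bp reads, a ~538 Mbp genome, more than 35-fold coverage) are linked by the standard mean-coverage relation, coverage = reads × read length / genome size. The sketch below only illustrates that arithmetic; the function names are hypothetical, and it assumes the 35-fold figure applies to a single library, which the abstract does not state.

```python
import math


def mean_coverage(num_reads: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


def reads_for_coverage(target_coverage: float, read_length_bp: int, genome_size_bp: float) -> int:
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


# Values taken from the abstract; treating 35x as a per-library target is an assumption.
GENOME_SIZE_BP = 538e6
READ_LENGTH_BP = 150
TARGET_COVERAGE = 35

reads_needed = reads_for_coverage(TARGET_COVERAGE, READ_LENGTH_BP, GENOME_SIZE_BP)
pairs_needed = math.ceil(reads_needed / 2)  # paired-end: two reads per fragment

print(f"reads needed: {reads_needed:,}")
print(f"read pairs needed: {pairs_needed:,}")
print(f"coverage achieved: {mean_coverage(reads_needed, READ_LENGTH_BP, GENOME_SIZE_BP):.2f}x")
```

Under these assumptions, roughly 126 million reads (about 63 million pairs) would be the minimum for 35-fold depth; actual read counts are not given in the abstract.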
Affiliation(s)
- Fernando Villarreal
  - Facultad de Ciencias Exactas y Naturales, Instituto de Investigaciones Biológicas (IIB-CONICET-UNMdP), Universidad Nacional de Mar del Plata, Mar del Plata, Argentina
- Germán F Burguener
  - Plataforma de Bioinformática Argentina, Facultad de Ciencias Exactas y Naturales, Instituto de Cálculo, UBA, Pabellón 2, Ciudad Universitaria, Buenos Aires, Argentina
- Ezequiel J Sosa
  - Plataforma de Bioinformática Argentina, Facultad de Ciencias Exactas y Naturales, Instituto de Cálculo, UBA, Pabellón 2, Ciudad Universitaria, Buenos Aires, Argentina
  - Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN) CONICET, Ciudad Universitaria, Buenos Aires, Argentina
- Nicolas Stocchi
  - Facultad de Ciencias Exactas y Naturales, Instituto de Investigaciones Biológicas (IIB-CONICET-UNMdP), Universidad Nacional de Mar del Plata, Mar del Plata, Argentina
- Gustavo M Somoza
  - Instituto Tecnológico de Chascomús (CONICET-UNSAM), Chascomús, Buenos Aires, Argentina
  - Escuela de Bio y Nanotecnologías (UNSAM), Buenos Aires, Argentina
- Adrián G Turjanski
  - Plataforma de Bioinformática Argentina, Facultad de Ciencias Exactas y Naturales, Instituto de Cálculo, UBA, Pabellón 2, Ciudad Universitaria, Buenos Aires, Argentina
  - Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN) CONICET, Ciudad Universitaria, Buenos Aires, Argentina
  - Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Andrés Blanco
  - Facultade de Veterinaria, Universidade de Santiago de Compostela, Santiago de Compostela, Lugo, Spain
  - Departamento de Zoología, Genética y Antropología Física, Facultad de Veterinaria, Campus Terra, Universidade de Santiago de Compostela, Lugo, Spain
- Jordi Viñas
  - Laboratori d'Ictiologia Genètica, Departament de Biologia, Universitat de Girona, Maria Aurèlia Campmany, 40, Girona, Spain
- Alejandro S Mechaly
  - Instituto de Investigaciones en Biodiversidad y Biotecnología (INBIOTEC-CONICET), Mar del Plata, Argentina
  - Fundación Para Investigaciones Biológicas Aplicadas (FIBA), Mar del Plata, Argentina
2
Sabarís G, Ortíz DM, Laiker I, Mayansky I, Naik S, Cavalli G, Stern DL, Preger-Ben Noon E, Frankel N. The Density of Regulatory Information Is a Major Determinant of Evolutionary Constraint on Noncoding DNA in Drosophila. Mol Biol Evol 2024; 41:msae004. [PMID: 38364113 PMCID: PMC10871701 DOI: 10.1093/molbev/msae004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 11/26/2023] [Accepted: 01/05/2024] [Indexed: 02/18/2024] Open
Abstract
Evolutionary analyses have estimated that ∼60% of nucleotides in intergenic regions of the Drosophila melanogaster genome are functionally relevant, suggesting that regulatory information may be encoded more densely in intergenic regions than has been revealed by most functional dissections of regulatory DNA. Here, we approached this issue through a functional dissection of the regulatory region of the gene shavenbaby (svb). Most of the ∼90 kb of this large regulatory region is highly conserved in the genus Drosophila, though characterized enhancers occupy a small fraction of this region. By analyzing the regulation of svb in different contexts of Drosophila development, we found that the regulatory information that drives svb expression in the abdominal pupal epidermis is organized in a different way than the elements that drive svb expression in the embryonic epidermis. While in the embryonic epidermis svb is activated by compact enhancers separated by large inactive DNA regions, svb expression in the pupal epidermis is driven by regulatory information distributed over broader regions of svb cis-regulatory DNA. In the same vein, we observed that other developmental genes also display a dense distribution of putative regulatory elements in their regulatory regions. Furthermore, we found that a large percentage of conserved noncoding DNA of the Drosophila genome is contained within regions of open chromatin. These results suggest that part of the evolutionary constraint on noncoding DNA of Drosophila is explained by the density of regulatory information, which may be greater than previously appreciated.
Affiliation(s)
- Gonzalo Sabarís
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
  - Institute of Human Genetics, UMR 9002 CNRS-Université de Montpellier, Montpellier, France
- Daniela M Ortíz
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
- Ian Laiker
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
- Ignacio Mayansky
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
- Sujay Naik
  - Department of Genetics and Developmental Biology, The Rappaport Faculty of Medicine and Research Institute, Technion—Israel Institute of Technology, Haifa 3109601, Israel
- Giacomo Cavalli
  - Institute of Human Genetics, UMR 9002 CNRS-Université de Montpellier, Montpellier, France
- David L Stern
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Ella Preger-Ben Noon
  - Department of Genetics and Developmental Biology, The Rappaport Faculty of Medicine and Research Institute, Technion—Israel Institute of Technology, Haifa 3109601, Israel
- Nicolás Frankel
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
  - Departamento de Ecología, Genética y Evolución, Facultad de Ciencias Exactas y Naturales (FCEN), Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
3
Cheatle Jarvela AM, Trelstad CS, Pick L. Anterior-posterior patterning of segments in Anopheles stephensi offers insights into the transition from sequential to simultaneous segmentation in holometabolous insects. J Exp Zool B Mol Dev Evol 2023; 340:116-130. [PMID: 34734470 PMCID: PMC9061899 DOI: 10.1002/jez.b.23102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2021] [Revised: 10/13/2021] [Accepted: 10/16/2021] [Indexed: 11/10/2022]
Abstract
The gene regulatory network for segmentation in arthropods offers valuable insights into how networks evolve owing to the breadth of species examined and the extremely detailed knowledge gained in the model organism Drosophila melanogaster. These studies have shown that Drosophila's network represents a derived state that acquired changes to accelerate segment patterning, whereas most insects specify segments gradually as the embryo elongates. Such heterochronic shifts in segmentation have potentially emerged multiple times within holometabolous insects, resulting in many mechanistic variants and difficulties in isolating underlying commonalities that permit such shifts. Recent studies identified regulatory genes that work as timing factors, coordinating gene expression transitions during segmentation. These studies predict that changes in timing factor deployment explain shifts in segment patterning relative to other developmental events. Here, we test this hypothesis by characterizing the temporal and spatial expression of the pair-rule patterning genes in the malaria vector mosquito, Anopheles stephensi. This insect is a Dipteran (fly), like Drosophila, but represents an ancient divergence within this clade, offering a useful counterpart for evo-devo studies. In mosquito embryos, we observe anterior to posterior sequential addition of stripes for many pair-rule genes and a wave of broad timer gene expression across this axis. Segment polarity gene stripes are added sequentially in the wake of the timer gene wave and the full pattern is not complete until the embryo is fully elongated. This "progressive segmentation" mode in Anopheles displays commonalities with both Drosophila's rapid segmentation mechanism and sequential modes used by more distantly related insects.
Affiliation(s)
- Alys M. Cheatle Jarvela
  - Department of Entomology, University of Maryland, College Park, 4291 Fieldhouse Drive, College Park, MD 20742, U.S.A
- Catherine S. Trelstad
  - Department of Entomology, University of Maryland, College Park, 4291 Fieldhouse Drive, College Park, MD 20742, U.S.A
- Leslie Pick
  - Department of Entomology, University of Maryland, College Park, 4291 Fieldhouse Drive, College Park, MD 20742, U.S.A
4
5
Maeso I, Tena JJ. Favorable genomic environments for cis-regulatory evolution: A novel theoretical framework. Semin Cell Dev Biol 2015; 57:2-10. [PMID: 26673387 DOI: 10.1016/j.semcdb.2015.12.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 09/21/2015] [Revised: 12/02/2015] [Accepted: 12/05/2015] [Indexed: 12/22/2022]
Abstract
Cis-regulatory changes are arguably the primary evolutionary source of animal morphological diversity. With the recent explosion of genome-wide comparisons of the cis-regulatory content in different animal species, it is now possible to infer general principles underlying enhancer evolution. However, these studies have also revealed numerous discrepancies and paradoxes, suggesting that the mechanistic causes and modes of cis-regulatory evolution are still not well understood and are probably much more complex than generally appreciated. Here, we argue that the mutational mechanisms and genomic regions generating new regulatory activities must comply with the constraints imposed by the molecular properties of cis-regulatory elements (CREs) and the organizational features of long-range chromatin interactions. Accordingly, we propose a new integrative evolutionary framework for cis-regulatory evolution based on two major premises for the origin of novel enhancer activity: (i) an accessible chromatin environment and (ii) compatibility with the 3D structure and interactions of pre-existing CREs. Mechanisms and DNA sequences not fulfilling these premises will be less likely to have a measurable impact on gene expression and, as such, will make a minor contribution to the evolution of gene regulation. Finally, we discuss current comparative cis-regulatory data in the light of this new evolutionary model, and propose that the two most prominent mechanisms for the evolution of cis-regulatory changes are the overprinting of ancestral CREs and the exaptation of transposable elements.
Affiliation(s)
- Ignacio Maeso
  - Centro Andaluz de Biología del Desarrollo (CSIC/UPO/JA), Universidad Pablo de Olavide, 41013 Seville, Spain
- Juan J Tena
  - Centro Andaluz de Biología del Desarrollo (CSIC/UPO/JA), Universidad Pablo de Olavide, 41013 Seville, Spain
6
Negre B, Simpson P. The achaete-scute complex in Diptera: patterns of noncoding sequence evolution. J Evol Biol 2015; 28:1770-81. [PMID: 26134680 PMCID: PMC4832353 DOI: 10.1111/jeb.12687] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 04/07/2015] [Revised: 06/26/2015] [Accepted: 06/29/2015] [Indexed: 11/29/2022]
Abstract
The achaete‐scute complex (AS‐C) has been a useful paradigm for the study of pattern formation and its evolution. achaete‐scute genes have duplicated and evolved distinct expression patterns during the evolution of cyclorrhaphous Diptera. Are the expression patterns in different species driven by conserved regulatory elements? If so, when did such regulatory elements arise? Here, we have sequenced most of the AS‐C of the fly Calliphora vicina (including the genes achaete, scute and lethal of scute) to compare noncoding sequences with known cis‐regulatory sequences in Drosophila. The organization of the complex is conserved with respect to Drosophila species. There are numerous small stretches of conserved noncoding sequence that, in spite of high sequence turnover, display binding sites for known transcription factors. Synteny of the blocks of conserved noncoding sequences is maintained, suggesting not only conservation of the position of regulatory elements but also an origin prior to the divergence between these two species. We propose that some of these enhancers originated by duplication with their target genes.
Affiliation(s)
- B Negre
  - Department of Zoology, University of Cambridge, Cambridge, UK
- P Simpson
  - Department of Zoology, University of Cambridge, Cambridge, UK
7
Kazemian M, Suryamohan K, Chen JY, Zhang Y, Samee MAH, Halfon MS, Sinha S. Evidence for deep regulatory similarities in early developmental programs across highly diverged insects. Genome Biol Evol 2015; 6:2301-20. [PMID: 25173756 PMCID: PMC4217690 DOI: 10.1093/gbe/evu184] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 12/16/2022] Open
Abstract
Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like "long germband" development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250-350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as "training data" to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. Our work represents an important step toward being able to trace the evolutionary history of gene regulatory networks and defining the mechanisms underlying insect evolution.
Affiliation(s)
- Majid Kazemian
  - Department of Computer Science, University of Illinois at Urbana-Champaign
  - Laboratory of Molecular Immunology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland
- Kushal Suryamohan
  - Department of Biochemistry, University at Buffalo-State University of New York
  - NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, New York
- Jia-Yu Chen
  - Department of Computer Science, University of Illinois at Urbana-Champaign
- Yinan Zhang
  - Department of Computer Science, University of Illinois at Urbana-Champaign
- Marc S Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York
  - NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, New York
  - Department of Biological Sciences, University at Buffalo-State University of New York
  - Molecular and Cellular Biology Department and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, New York
- Saurabh Sinha
  - Department of Computer Science, University of Illinois at Urbana-Champaign
  - Institute of Genomic Biology, University of Illinois at Urbana-Champaign
8
Shadow enhancers enable Hunchback bifunctionality in the Drosophila embryo. Proc Natl Acad Sci U S A 2015; 112:785-90. [PMID: 25564665 DOI: 10.1073/pnas.1413877112] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 12/22/2022] Open
Abstract
Hunchback (Hb) is a bifunctional transcription factor that activates and represses distinct enhancers. Here, we investigate the hypothesis that Hb can activate and repress the same enhancer. Computational models predicted that Hb bifunctionally regulates the even-skipped (eve) stripe 3+7 enhancer (eve3+7) in Drosophila blastoderm embryos. We measured and modeled eve expression at cellular resolution under multiple genetic perturbations and found that the eve3+7 enhancer could not explain endogenous eve stripe 7 behavior. Instead, we found that eve stripe 7 is controlled by two enhancers: the canonical eve3+7 and a sequence encompassing the minimal eve stripe 2 enhancer (eve2+7). Hb bifunctionally regulates eve stripe 7, but it executes these two activities on different pieces of regulatory DNA: it activates the eve2+7 enhancer and represses the eve3+7 enhancer. These two "shadow enhancers" use different regulatory logic to create the same pattern.
9
Wotton KR, Jiménez-Guri E, Crombach A, Janssens H, Alcaine-Colet A, Lemke S, Schmidt-Ott U, Jaeger J. Quantitative system drift compensates for altered maternal inputs to the gap gene network of the scuttle fly Megaselia abdita. eLife 2015; 4:e04785. [PMID: 25560971 PMCID: PMC4337606 DOI: 10.7554/elife.04785] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 09/16/2014] [Accepted: 01/02/2015] [Indexed: 12/20/2022] Open
Abstract
The segmentation gene network in insects can produce equivalent phenotypic outputs despite differences in upstream regulatory inputs between species. We investigate the mechanistic basis of this phenomenon through a systems-level analysis of the gap gene network in the scuttle fly Megaselia abdita (Phoridae). It combines quantification of gene expression at high spatio-temporal resolution with systematic knock-downs by RNA interference (RNAi). Initiation and dynamics of gap gene expression differ markedly between M. abdita and Drosophila melanogaster, while the output of the system converges to equivalent patterns at the end of the blastoderm stage. Although the qualitative structure of the gap gene network is conserved, there are differences in the strength of regulatory interactions between species. We term such network rewiring 'quantitative system drift'. It provides a mechanistic explanation for the developmental hourglass model in the dipteran lineage. Quantitative system drift is likely to be a widespread mechanism for developmental evolution.
Affiliation(s)
- Karl R Wotton
  - European Molecular Biology Laboratory, CRG Systems Biology Research Unit, Centre for Genomic Regulation, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Eva Jiménez-Guri
  - European Molecular Biology Laboratory, CRG Systems Biology Research Unit, Centre for Genomic Regulation, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Anton Crombach
  - European Molecular Biology Laboratory, CRG Systems Biology Research Unit, Centre for Genomic Regulation, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Hilde Janssens
  - European Molecular Biology Laboratory, CRG Systems Biology Research Unit, Centre for Genomic Regulation, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Anna Alcaine-Colet
  - European Molecular Biology Laboratory, CRG Systems Biology Research Unit, Centre for Genomic Regulation, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
  - Universitat de Barcelona, Barcelona, Spain
- Steffen Lemke
  - Department of Organismal Biology and Anatomy, University of Chicago, Chicago, United States
- Urs Schmidt-Ott
  - Department of Organismal Biology and Anatomy, University of Chicago, Chicago, United States
- Johannes Jaeger
  - European Molecular Biology Laboratory, CRG Systems Biology Research Unit, Centre for Genomic Regulation, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
10
Gilchrist AS, Shearman DCA, Frommer M, Raphael KA, Deshpande NP, Wilkins MR, Sherwin WB, Sved JA. The draft genome of the pest tephritid fruit fly Bactrocera tryoni: resources for the genomic analysis of hybridising species. BMC Genomics 2014; 15:1153. [PMID: 25527032 PMCID: PMC4367827 DOI: 10.1186/1471-2164-15-1153] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 08/04/2014] [Accepted: 12/12/2014] [Indexed: 01/08/2023] Open
Abstract
Background: The tephritid fruit flies include a number of economically important pests of horticulture, with a large accumulated body of research on their biology and control. Amongst the Tephritidae, the genus Bactrocera, containing over 400 species, presents various species groups of potential utility for genetic studies of speciation, behaviour or pest control. In Australia, there exists a triad of closely-related, sympatric Bactrocera species which do not mate in the wild but which, despite distinct morphologies and behaviours, can be force-mated in the laboratory to produce fertile hybrid offspring. To exploit the opportunities offered by genomics, such as the efficient identification of genetic loci central to pest behaviour and to the earliest stages of speciation, investigators require genomic resources for future investigations.
Results: We produced a draft de novo genome assembly of Australia’s major tephritid pest species, Bactrocera tryoni. The male genome (650-700 Mbp) includes approximately 150 Mb of interspersed repetitive DNA sequences and 60 Mb of satellite DNA. Assessment using conserved core eukaryotic sequences indicated 98% completeness. Over 16,000 MAKER-derived gene models showed a large degree of overlap with other Dipteran reference genomes. The sequence of the ribosomal RNA transcribed unit was also determined. Unscaffolded assemblies of B. neohumeralis and B. jarvisi were then produced; comparison with B. tryoni showed that the species are more closely related than any Drosophila species pair. The similarity of the genomes was exploited to identify 4924 potentially diagnostic indels between the species, all of which occur in non-coding regions.
Conclusions: This first draft B. tryoni genome resembles other dipteran genomes in terms of size and putative coding sequences. For all three species included in this study, we have identified a comprehensive set of non-redundant repetitive sequences, including the ribosomal RNA unit, and have quantified the major satellite DNA families. These genetic resources will facilitate further investigation of the genetic mechanisms responsible for the behavioural and morphological differences between these three species and other tephritids. We have also shown how whole genome sequence data can be used to generate simple diagnostic tests between very closely-related species where only one of the species is scaffolded.
Affiliation(s)
- Anthony Stuart Gilchrist
  - Evolution and Ecology Research Centre, School of Biological, Earth and Environmental Sciences, The University of New South Wales, Sydney, NSW 2052 Australia
11
Glassford WJ, Rebeiz M. Assessing constraints on the path of regulatory sequence evolution. Philos Trans R Soc Lond B Biol Sci 2013; 368:20130026. [PMID: 24218638 PMCID: PMC3826499 DOI: 10.1098/rstb.2013.0026] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/22/2023] Open
Abstract
Structural and functional constraints are known to play a major role in restricting the path of evolution of protein activities. However, constraints acting on evolving transcriptional regulatory sequences, e.g. enhancers, are largely unknown. Recently, we elucidated how a novel expression pattern of the Neprilysin-1 (Nep1) gene in the optic lobe of Drosophila santomea evolved via co-option of existing enhancer activities. Drosophila santomea, which diverged from Drosophila yakuba approximately 400 000 years ago, has accumulated four fixed mutations that each contribute to the full activity of this enhancer. Recreating and testing the optic lobe enhancer of the ancestor of D. santomea and D. yakuba revealed that the strong D. santomea enhancer activity evolved from a weak ancestral activity. Because each mutation on the path from the D. yakuba/santomea ancestor to modern-day D. santomea contributes to the newly derived optic lobe enhancer activity, we sought here to use this system to study the path of evolution of enhancer sequences. We inferred likely paths of evolution of this enhancer by observing the transcriptional output of all possible intermediate steps between the ancestral D. yakuba/santomea enhancer and the modern D. santomea enhancer. Many possible paths had epistatic and cooperative effects. Furthermore, we found that several paths significantly increased ectopic transcriptional activity or affected existing enhancer activities from which the novel activity was co-opted. We suggest that these attributes highlight constraints that guide the path of evolution of enhancers.
Affiliation(s)
- Mark Rebeiz
  - Department of Biological Sciences, University of Pittsburgh, 4249 Fifth Avenue, Pittsburgh, PA 15260, USA
12
Chong Z, Zhai W, Li C, Gao M, Gong Q, Ruan J, Li J, Jiang L, Lv X, Hungate E, Wu CI. The evolution of small insertions and deletions in the coding genes of Drosophila melanogaster. Mol Biol Evol 2013; 30:2699-708. [PMID: 24077769 DOI: 10.1093/molbev/mst167] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/24/2023] Open
Abstract
Studies of protein evolution have focused on amino acid substitutions, with much less systematic analysis of insertions and deletions (indels) in protein-coding genes. We hence surveyed 7,500 genes between Drosophila melanogaster and D. simulans, using D. yakuba as an outgroup for this purpose. The evolutionary rate of coding indels is indeed low, at only 3% of that of nonsynonymous substitutions. As coding indels follow a geometric distribution in size and tend to fall in low-complexity regions of proteins, it is unclear whether selection or mutation underlies this low rate. To resolve the issue, we collected genomic sequences from an isogenic African line of D. melanogaster (ZS30) at a high coverage of 70× and analyzed indel polymorphism between ZS30 and the reference genome. In comparing polymorphism and divergence, we found that the divergence to polymorphism ratio (i.e., fixation index) for smaller indels (size ≤ 10 bp) is very similar to that for synonymous changes, suggesting that most of the within-species polymorphism and between-species divergence for indels are selectively neutral. Interestingly, deletions of larger sizes (size ≥ 11 bp and ≤ 30 bp) have a much higher fixation index than synonymous mutations, and 44.4% of fixed middle-sized deletions are estimated to be adaptive. To our surprise, this pattern is not found for insertions. Protein indel evolution appears to be in a dynamic flux of neutrally driven expansion (insertions) together with adaptive-driven contraction (deletions), and these observations provide important insights for understanding the fitness of new mutations as well as the evolutionary driving forces for genomic evolution in Drosophila species.
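The comparison described in this abstract, a divergence-to-polymorphism ratio for one mutation class set against the same ratio for synonymous changes, can be illustrated with a McDonald-Kreitman-style calculation. The sketch below is an assumption about how such a comparison is commonly done, not a confirmed reproduction of the authors' estimator; the function names and all counts are hypothetical and chosen only to show the arithmetic.

```python
def fixation_index(divergent_sites: int, polymorphic_sites: int) -> float:
    """Divergence-to-polymorphism ratio for one class of mutations."""
    return divergent_sites / polymorphic_sites


def adaptive_fraction(d_test: int, p_test: int, d_neutral: int, p_neutral: int) -> float:
    """McDonald-Kreitman-style estimate of the share of fixed test-class
    changes in excess of the neutral expectation:
    alpha = 1 - (D_neutral * P_test) / (D_test * P_neutral)."""
    return 1.0 - (d_neutral * p_test) / (d_test * p_neutral)


# Hypothetical counts, not taken from the paper.
D_SYN, P_SYN = 1000, 500   # synonymous changes, used as the neutral reference
D_DEL, P_DEL = 60, 20      # mid-sized deletions, the class being tested

print(f"synonymous fixation index: {fixation_index(D_SYN, P_SYN):.2f}")
print(f"deletion fixation index:   {fixation_index(D_DEL, P_DEL):.2f}")
print(f"estimated adaptive share:  {adaptive_fraction(D_DEL, P_DEL, D_SYN, P_SYN):.3f}")
```

When the test class has the same ratio as the neutral reference, the estimate is zero, which corresponds to the abstract's reading of small indels as selectively neutral; a higher ratio in the test class yields a positive share.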
Affiliation(s)
- Zechen Chong
- Center for Computational Biology and Laboratory of Disease Genomics and Individualized Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
|
13
|
Jiménez-Guri E, Huerta-Cepas J, Cozzuto L, Wotton KR, Kang H, Himmelbauer H, Roma G, Gabaldón T, Jaeger J. Comparative transcriptomics of early dipteran development. BMC Genomics 2013; 14:123. [PMID: 23432914 PMCID: PMC3616871 DOI: 10.1186/1471-2164-14-123] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 08/14/2012] [Accepted: 02/19/2013] [Indexed: 12/24/2022]
Abstract
Background Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies).
Affiliation(s)
- Eva Jiménez-Guri
- EMBL/CRG Research Unit in Systems Biology, Centre de Regulació Genòmica (CRG), and Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
14
|
Nirmala X, Schetelig MF, Yu F, Handler AM. An EST database of the Caribbean fruit fly, Anastrepha suspensa (Diptera: Tephritidae). Gene 2013; 517:212-7. [PMID: 23296060 DOI: 10.1016/j.gene.2012.12.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 09/16/2012] [Revised: 12/06/2012] [Accepted: 12/07/2012] [Indexed: 10/27/2022]
Abstract
Invasive tephritid fruit flies are a great threat to agriculture worldwide and warrant serious pest control measures. Molecular strategies that promote embryonic lethality in these agricultural pests are limited by the small amount of nucleotide sequence data available for tephritids. To increase the dataset for sequence mining, we generated an EST database by 454 sequencing of the caribfly, Anastrepha suspensa, a model tephritid pest. This database yielded 95,803 assembled sequences with 24% identified as independent transcripts. The percentage of caribfly sequences with hits to the closely related tephritid, Rhagoletis pomonella, transcriptome was higher (28%) than to Drosophila proteins/genes (18%) in NCBI. The database contained genes specifically expressed in embryos, genes involved in the cell death, sex-determination, and RNAi pathways, and transposable elements and microsatellites. This study significantly expands the nucleotide data available for caribflies and will be a valuable resource for gene isolation and genomic studies in tephritid insects.
Affiliation(s)
- Xavier Nirmala
- USDA/ARS, Center for Medical, Agricultural and Veterinary Entomology, 1700 SW 23rd Drive, Gainesville, FL 32608, USA.
|
15
|
Jenkins C, Chapman TA, Micallef JL, Reynolds OL. Molecular Techniques for the Detection and Differentiation of Host and Parasitoid Species and the Implications for Fruit Fly Management. Insects 2012; 3:763-88. [PMID: 26466628 PMCID: PMC4553589 DOI: 10.3390/insects3030763] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 06/13/2012] [Revised: 07/31/2012] [Accepted: 08/01/2012] [Indexed: 12/17/2022]
Abstract
Parasitoid detection and identification is a necessary step in the development and implementation of fruit fly biological control strategies employing parasitoid augmentive release. In recent years, DNA-based methods have been used to identify natural enemies of pest species where morphological differentiation is problematic. Molecular techniques also offer a considerable advantage over traditional morphological methods of fruit fly and parasitoid discrimination as well as within-host parasitoid identification, which currently relies on dissection of immature parasitoids from the host, or lengthy and labour-intensive rearing methods. Here we review recent research focusing on the use of molecular strategies for fruit fly and parasitoid detection and differentiation and discuss the implications of these studies on fruit fly management.
Affiliation(s)
- Cheryl Jenkins
- Elizabeth Macarthur Agricultural Institute, NSW Department of Primary Industries, Woodbridge Road, Menangle, NSW 2568, Australia.
- Toni A Chapman
- Elizabeth Macarthur Agricultural Institute, NSW Department of Primary Industries, Woodbridge Road, Menangle, NSW 2568, Australia.
- Jessica L Micallef
- Elizabeth Macarthur Agricultural Institute, NSW Department of Primary Industries, Woodbridge Road, Menangle, NSW 2568, Australia.
- Olivia L Reynolds
- Elizabeth Macarthur Agricultural Institute, NSW Department of Primary Industries, Woodbridge Road, Menangle, NSW 2568, Australia.
|
16
|
Aerts S. Computational strategies for the genome-wide identification of cis-regulatory elements and transcriptional targets. Curr Top Dev Biol 2012; 98:121-45. [PMID: 22305161 DOI: 10.1016/b978-0-12-386499-4.00005-7] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Abstract
Transcription factors (TFs) are key proteins that decode the information in our genome to express a precise and unique set of proteins and RNA molecules in each cell type in our body. These factors play a pivotal role in all biological processes, including the determination of a cell's fate during development and the maintenance of a cell's physiological function. To achieve this, a TF binds to specific DNA sequences in the noncoding part of the genome, recruits chromatin modifiers and cofactors, and directs the transcription initiation rate of its "target genes." Therefore, a key challenge in deciphering a transcriptional switch is to identify the direct target genes of the master regulators that control the switch, the cis-regulatory elements implementing (auto-)regulatory loops, and the target genes of all the TFs in the downstream regulatory network. A better knowledge of a TF's targetome during specification and differentiation of a particular cell type will generate mechanistic insight into its developmental program. Here, I review computational strategies and methods to predict transcriptional targets by genome-wide searches for TF binding sites using position weight matrices, motif clusters, phylogenetic footprinting, chromatin binding and accessibility data, enhancer classification, motif enrichment, and gene expression signatures.
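Of the strategies listed above, position weight matrix (PWM) scanning is the simplest to illustrate. The sketch below is a generic log-odds PWM scan of the forward strand only, not a method taken from the review; the motif counts, the uniform background, the pseudocount, and the threshold are all hypothetical choices made for illustration.

```python
import math

BASES = "ACGT"

# Hypothetical counts for a 4-bp motif built from 10 aligned sites.
TOY_COUNTS = [
    {"A": 1, "C": 0, "G": 0, "T": 9},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 9, "C": 0, "G": 1, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a
    uniform background (an assumed background model)."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + pseudocount * len(BASES)
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def scan_forward_strand(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    seq = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


for start, window, score in scan_forward_strand(
        "GGTAATCC", log_odds_matrix(TOY_COUNTS), threshold=4.0):
    print(start, window, round(score, 2))
```

A practical scan would also score the reverse complement and would usually combine hits with the other evidence types named in the abstract, such as motif clustering or conservation.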
Affiliation(s)
- Stein Aerts
- Laboratory of Computational Biology, Center for Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
|
17
|
Wittkopp PJ, Kalay G. Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nat Rev Genet 2011; 13:59-69. [DOI: 10.1038/nrg3095] [Citation(s) in RCA: 659] [Impact Index Per Article: 50.7] [Indexed: 12/14/2022]
|
18
|
Geuten K, Viaene T, Irish VF. Robustness and evolvability in the B-system of flower development. Annals of Botany 2011; 107:1545-56. [PMID: 21441246 PMCID: PMC3108807 DOI: 10.1093/aob/mcr061] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 10/13/2010] [Revised: 12/17/2010] [Accepted: 01/24/2011] [Indexed: 05/23/2023]
Abstract
BACKGROUND Gene duplication has often been invoked as a key mechanism responsible for evolution of new morphologies. The floral homeotic B-group gene family has undergone a number of gene duplication events, and yet the functions of these genes appear to be largely conserved. However, detailed comparative analysis has indicated that such duplicate genes have considerable cryptic variability in their functions. In the Solanaceae, two duplicate B-group gene lineages have been retained in three subfamilies. Comparisons of orthologous genes across members of the Solanaceae have demonstrated that the combined function of all four B-gene members is to establish petal and stamen identity, but that this function was partitioned differently in each species. These observations emphasize both the robustness and the evolvability of the B-system. SCOPE We provide an overview of how the B-function genes can robustly specify petal and stamen identity and at the same time evolve through changes in protein-protein interaction, gene expression patterns, copy number variation or alterations in the downstream genes they control. By using mathematical models we explore regulatory differences between species and how these impose constraints on downstream gene regulation. CONCLUSIONS Evolvability of the B-genes can be understood through the multiple ways in which the B-system can be robust. Quantitative approaches should allow for the incorporation of more biological realism in the representations of these regulatory systems and this should contribute to understanding the constraints under which different B-systems can function and evolve. This, in turn, can provide a better understanding of the ways in which B-genes have contributed to flower diversity.
Affiliation(s)
- K Geuten
- Department of Biology, K.U. Leuven, Kasteelpark Arenberg 31, 3001 Heverlee, Belgium.
|
19
|
Bullaughey K. Changes in selective effects over time facilitate turnover of enhancer sequences. Genetics 2011; 187:567-82. [PMID: 21098721 PMCID: PMC3030497 DOI: 10.1534/genetics.110.121590] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 08/02/2010] [Accepted: 11/10/2010] [Indexed: 11/18/2022]
Abstract
Correct gene expression is often critical and consequently stabilizing selection on expression is widespread. Yet few genes possess highly conserved regulatory DNA, and for the few enhancers that have been carefully characterized, substantial functional reorganization has often occurred. Given that natural selection removes mutations of even very small deleterious effect, how can transcription factor binding evolve so readily when it underlies a conserved phenotype? As a first step toward addressing this question, I combine a computational model for regulatory function that incorporates many aspects of our present biological knowledge with a model for the fitness effects of misexpression. I then use this model to study the evolution of enhancers. Several robust behaviors emerge: First, the selective effects of mutations at a site change dramatically over time due to substitutions elsewhere in the enhancer, and even the overall degree of constraint across the enhancer can change considerably. Second, many of the substitutions responsible for changes in binding occur at sites where previously the mutation would have been strongly deleterious, suggesting that fluctuations in selective effects at a site are important for functional turnover. Third, most substitutions contributing to the repatterning of binding and constraint are effectively neutral, highlighting the importance of genetic drift, even for enhancers underlying conserved phenotypes. These findings have important implications for phylogenetic inference of function and for interpretations of selection coefficients estimated for regulatory DNA.
Affiliation(s)
- Kevin Bullaughey
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA.
|
20
|
Hox gene Ultrabithorax regulates distinct sets of target genes at successive stages of Drosophila haltere morphogenesis. Proc Natl Acad Sci U S A 2011; 108:2855-60. [PMID: 21282633 DOI: 10.1073/pnas.1015077108] [Citation(s) in RCA: 88] [Impact Index Per Article: 6.8] [Indexed: 12/20/2022]
Abstract
Hox genes encode highly conserved transcription factors that regionalize the animal body axis by controlling complex developmental processes. Although they are known to operate in multiple cell types and at different stages, we are still missing the batteries of genes targeted by any one Hox gene over the course of a single developmental process to achieve a particular cell and organ morphology. The transformation of wings into halteres by the Hox gene Ultrabithorax (Ubx) in Drosophila melanogaster presents an excellent model system to study the Hox control of transcriptional networks during successive stages of appendage morphogenesis and cell differentiation. We have used an inducible misexpression system to switch on Ubx in the wing epithelium at successive stages during metamorphosis: in the larva, prepupa, and pupa. We have then used extensive microarray expression profiling and quantitative RT-PCR to identify the primary transcriptional responses to Ubx. We find that Ubx targets range from regulatory genes like transcription factors and signaling components to terminal differentiation genes affecting a broad repertoire of cell behaviors and metabolic reactions. Ubx up- and down-regulates hundreds of downstream genes at each stage, mostly in a subtle manner. Strikingly, our analysis reveals that Ubx target genes are largely distinct at different stages of appendage morphogenesis, suggesting extensive interactions between Hox genes and hormone-controlled regulatory networks to orchestrate complex genetic programs during metamorphosis.
|
21
|
Abstract
Gap genes are involved in segment determination during the early development of the fruit fly Drosophila melanogaster as well as in other insects. This review attempts to synthesize the current knowledge of the gap gene network through a comprehensive survey of the experimental literature. I focus on genetic and molecular evidence, which provides us with an almost-complete picture of the regulatory interactions responsible for trunk gap gene expression. I discuss the regulatory mechanisms involved, and highlight the remaining ambiguities and gaps in the evidence. This is followed by a brief discussion of molecular regulatory mechanisms for transcriptional regulation, as well as precision and size-regulation provided by the system. Finally, I discuss evidence on the evolution of gap gene expression from species other than Drosophila. My survey concludes that studies of the gap gene system continue to reveal interesting and important new insights into the role of gene regulatory networks in development and evolution.
Affiliation(s)
- Johannes Jaeger
- Centre de Regulació Genòmica, Universitat Pompeu Fabra, Barcelona, Spain.
|
22
|
Shanley L, Davidson S, Lear M, Thotakura AK, McEwan IJ, Ross RA, MacKenzie A. Long-range regulatory synergy is required to allow control of the TAC1 locus by MEK/ERK signalling in sensory neurones. Neurosignals 2010; 18:173-85. [PMID: 21160161 PMCID: PMC3718575 DOI: 10.1159/000322010] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 08/31/2010] [Accepted: 10/13/2010] [Indexed: 01/05/2023]
Abstract
Changes in the expression of the neuropeptide substance P (SP) in different populations of sensory neurones are associated with the progression of chronic inflammatory disease. Thus, understanding the genomic and cellular mechanisms driving the expression of the TAC1 gene, which encodes SP, in sensory neurones is essential to understanding its role in inflammatory disease. We used a novel combination of computational genomics, primary-cell culture and mouse transgenics to determine the genomic and cellular mechanisms that control the expression of TAC1 in sensory neurones. Intriguingly, we demonstrated that the promoter of the TAC1 gene must act in synergy with a remote enhancer, identified using comparative genomics, to respond to MAPK signalling that modulates the expression of TAC1 in sensory neurones. We also reveal that noxious stimulation of sensory neurones triggers this synergy in larger diameter sensory neurones – an expression of SP associated with hyperalgesia. This noxious stimulation of TAC1 enhancer-promoter synergy could be strongly blocked by antagonism of the MEK pathway. This study provides a unique insight into the role of long-range enhancer-promoter synergy and selectivity in the tissue-specific response of promoters to specific signal transduction pathways and suggests a possible new avenue for the development of novel anti-inflammatory therapies.
Affiliation(s)
- Lynne Shanley
- School of Medical Sciences, University of Aberdeen, Aberdeen, UK
|
23
|
Su J, Teichmann SA, Down TA. Assessing computational methods of cis-regulatory module prediction. PLoS Comput Biol 2010; 6:e1001020. [PMID: 21152003 PMCID: PMC2996316 DOI: 10.1371/journal.pcbi.1001020] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.5] [Received: 03/25/2010] [Accepted: 10/29/2010] [Indexed: 01/02/2023]
Abstract
Computational methods attempting to identify instances of cis-regulatory modules (CRMs) in the genome face a challenging problem of searching for potentially interacting transcription factor binding sites while knowledge of the specific interactions involved remains limited. Without a comprehensive comparison of their performance, the reliability and accuracy of these tools remain unclear. Faced with a large number of different tools that address this problem, we summarized and categorized them based on search strategy and input data requirements. Twelve representative methods were chosen and applied to predict CRMs from the Drosophila CRM database REDfly, and across the human ENCODE regions. Our results show that the optimal choice of method varies depending on species and composition of the sequences in question. When discriminating CRMs from non-coding regions, those methods considering evolutionary conservation have a stronger predictive power than methods designed to be run on a single genome. Different CRM representations and search strategies rely on different CRM properties, and different methods can complement one another. For example, some favour homotypical clusters of binding sites, while others perform best on short CRMs. Furthermore, most methods appear to be sensitive to the composition and structure of the genome to which they are applied. We analyze the principal features that distinguish the methods that performed well, identify weaknesses leading to poor performance, and provide a guide for users. We also propose key considerations for the development and evaluation of future CRM-prediction methods.
Transcriptional regulation involves multiple transcription factors binding to DNA sequences. A limited repertoire of transcription factors performs this complex regulatory step through various spatial and temporal interactions between themselves and their binding sites. These transcription factor binding interactions are clustered as distinct modules: cis-regulatory modules (CRMs). Computational methods attempting to identify instances of CRMs in the genome face a challenging problem because a majority of these interactions between transcription factors remain unknown. To investigate the reliability and accuracy of these methods, we chose twelve representative methods and applied them to predict CRMs on both the fly and human genomes. Our results show that the optimal choice of method varies depending on species and composition of the sequences in question. Different CRM representations and search strategies rely on different CRM properties, and different methods can complement one another. We provide a guide for users and key considerations for developers. We also expect that, along with new technology generating new types of genomic data, future CRM prediction methods will be able to reveal transcription binding interactions in three-dimensional space.
Affiliation(s)
- Jing Su
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
|
24
|
Meader S, Ponting CP, Lunter G. Massive turnover of functional sequence in human and other mammalian genomes. Genome Res 2010; 20:1335-43. [PMID: 20693480 DOI: 10.1101/gr.108795.110] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.7] [Indexed: 02/06/2023]
Abstract
Despite the availability of dozens of animal genome sequences, two key questions remain unanswered: First, what fraction of any species' genome confers biological function, and second, are apparent differences in organismal complexity reflected in an objective measure of genomic complexity? Here, we address both questions by applying, across the mammalian phylogeny, an evolutionary model that estimates the amount of functional DNA that is shared between two species' genomes. Our main findings are, first, that as the divergence between mammalian species increases, the predicted amount of pairwise shared functional sequence drops off dramatically. We show by simulations that this is not an artifact of the method, but rather indicates that functional (and mostly noncoding) sequence is turning over at a very high rate. We estimate that between 200 and 300 Mb (∼6.5%-10%) of the human genome is under functional constraint, which includes five to eight times as many constrained noncoding bases as bases that code for protein. In contrast, in D. melanogaster we estimate only 56-66 Mb to be constrained, implying a ratio of noncoding to coding constrained bases of about 2. This suggests that, rather than genome size or protein-coding gene complement, it is the number of functional bases that might best mirror our naïve preconceptions of organismal complexity.
Affiliation(s)
- Stephen Meader
- MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, United Kingdom
|
25
|
Marlétaz F, Gyapay G, Le Parco Y. High level of structural polymorphism driven by mobile elements in the Hox genomic region of the Chaetognath Spadella cephaloptera. Genome Biol Evol 2010; 2:665-77. [PMID: 20829282 PMCID: PMC2997562 DOI: 10.1093/gbe/evq047] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Accepted: 07/27/2010] [Indexed: 11/22/2022]
Abstract
Little is known about the relationships between genome polymorphism, mobile element dynamics, and population size among animal populations. The chaetognath species Spadella cephaloptera offers a unique perspective to examine this issue because it displays a high level of genetic polymorphism at the population level. Here, we have investigated in detail the extent of nucleotide and structural polymorphism in a region harboring Hox1 and several coding genes and presumptive functional elements. Sequencing of several bacterial artificial chromosome inserts representative of this nuclear region uncovered a high level of structural heterogeneity, which is mainly caused by the polymorphic insertion of a diversity of genetic mobile elements. By anchoring this variation through individual genotyping, we demonstrated that sequence diversity could be attributed to the allelic pool of a single population, which was confirmed by detection of extensive recombination within the genomic region studied. The high average level of nucleotide heterozygosity provides clues of selection in both coding and noncoding domains. This pattern stresses how selective processes remarkably cope with intense sequence turnover due to substitutions, mobile element insertions, and recombination to preserve the integrity of the functional landscape. These findings suggest that genome polymorphism could provide pivotal information for future functional annotation of genomes.
Affiliation(s)
- Ferdinand Marlétaz
- Centre d'Océanologie de Marseille, CNRS UMR 6540 DIMAR, Université de la Méditerranée (Aix-Marseille II), Station Marine d'Endoume, Marseille, France
- Gabor Gyapay
- Genoscope (CEA), CNRS UMR 8030, Université d'Evry, Evry, France
- Yannick Le Parco
- Centre d'Océanologie de Marseille, CNRS UMR 6540 DIMAR, Université de la Méditerranée (Aix-Marseille II), Station Marine d'Endoume, Marseille, France
|
26
|
Gabrieli P, Falaguerra A, Siciliano P, Gomulski LM, Scolari F, Zacharopoulou A, Franz G, Malacrida AR, Gasperi G. Sex and the single embryo: early development in the Mediterranean fruit fly, Ceratitis capitata. BMC Developmental Biology 2010; 10:12. [PMID: 20102629 PMCID: PMC2826288 DOI: 10.1186/1471-213x-10-12] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Received: 06/24/2009] [Accepted: 01/26/2010] [Indexed: 01/17/2023]
Abstract
Background In embryos the maternal-to-zygotic transition (MTZ) integrates post-transcriptional regulation of maternal transcripts with transcriptional activation of the zygotic genome. Although the molecular mechanisms underlying this event are being clarified in Drosophila melanogaster, little is known about the embryogenic processes in other insect species. The recent publication of expressed sequence tags (ESTs) from embryos of the global pest species Ceratitis capitata (medfly) has enabled the investigation of embryogenesis in this species and has allowed a comparison of the embryogenic processes in these two related dipteran species, C. capitata and D. melanogaster, that shared a common ancestor 80-100 mya. Results Using a novel PCR-based sexing method, which takes advantage of a putative LTR retrotransposon MITE insertion on the medfly Y chromosome, the transcriptomes of individual early male and female embryos were analysed using RT-PCR. This study is focused on two crucial aspects of the onset of embryonic development: sex determination and cellular blastoderm formation. Together with the three known medfly genes (Cctransformer, Cctransformer2 and Ccdoublesex), the expression patterns of other medfly genes that are similar to the D. melanogaster sex-determination genes (sisterlessA, groucho, deadpan, Sex-lethal, female lethal d, sans fille and intersex) and four cellular blastoderm formation genes (Rho1, spaghetti squash, slow-as-molasses and serendipity-α) were analyzed, allowing us to sketch a preliminary outline of the embryonic process in the medfly. Furthermore, a putative homologue of the Zelda gene has been considered, which in D. melanogaster encodes a DNA-binding factor responsible for the maternal-to-zygotic transition. Conclusions Our novel sexing method facilitates the study of i) when the MTZ transition occurs in males and females of C. capitata, ii) when and how the maternal information of "female-development" is reprogrammed in the embryos and iii) similarities and differences in the regulation of gene expression in C. capitata and D. melanogaster. We suggest a new model for the onset of the sex determination cascade in the medfly: the maternally inherited Cctra transcripts in the female embryos are insufficient to produce enough active protein to inhibit the male mode of Cctra splicing. The slow rate of development and the inefficiency of the splicing mechanism in the pre-cellular blastoderm facilitates the male-determining factor (M) activity, which probably acts by inhibiting CcTRA protein activity.
Affiliation(s)
- Paolo Gabrieli
- Department of Animal Biology, University of Pavia, Piazza Botta 9, 27100 Pavia, Italy.
|
27
|
Borok MJ, Tran DA, Ho MCW, Drewell RA. Dissecting the regulatory switches of development: lessons from enhancer evolution in Drosophila. Development 2010; 137:5-13. [PMID: 20023155 PMCID: PMC2796927 DOI: 10.1242/dev.036160] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Indexed: 12/11/2022]
Abstract
Cis-regulatory modules are non-protein-coding regions of DNA essential for the control of gene expression. One class of regulatory modules is embryonic enhancers, which drive gene expression during development as a result of transcription factor protein binding at the enhancer sequences. Recent comparative studies have begun to investigate the evolution of the sequence architecture within enhancers. These analyses are illuminating the way that developmental biologists think about enhancers by revealing their molecular mechanism of function.
Affiliation(s)
- Margaret C. W. Ho
- Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA
- Robert A. Drewell
- Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA
|
28
|
Meireles-Filho ACA, Stark A. Comparative genomics of gene regulation-conservation and divergence of cis-regulatory information. Curr Opin Genet Dev 2009; 19:565-70. [PMID: 19913403 DOI: 10.1016/j.gde.2009.10.006] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Received: 08/24/2009] [Revised: 10/06/2009] [Accepted: 10/06/2009] [Indexed: 01/13/2023]
Abstract
We recently witnessed a tremendous increase in genomics studies on gene regulation and in entirely sequenced genomes from closely related species. This has triggered analyses that suggest a wide range of evolutionary dynamics of gene regulation, from rapid turnover of transcription-factor binding sites to conservation of enhancer function across large evolutionary distances. Many examples show that enhancers can evolve beyond recognizable sequence similarity while retaining function. However, bioinformatics approaches are increasingly able to detect conserved regulatory elements through characteristic evolutionary sequence signatures. Cis-regulatory changes are also a major source of morphological evolution, which might be facilitated by many biochemically functional elements that are selectively neutral and by the buffering function of redundant enhancers and 'shadow' enhancers.
|
29
|
Wilson MD, Odom DT. Evolution of transcriptional control in mammals. Curr Opin Genet Dev 2009; 19:579-85. [PMID: 19913406 DOI: 10.1016/j.gde.2009.10.003] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Received: 07/22/2009] [Revised: 09/07/2009] [Accepted: 10/07/2009] [Indexed: 01/18/2023]
Abstract
Changes in gene expression directed by transcriptional regulators can give rise to new phenotypes. While gene expression profiles can be maintained across large evolutionary distances, transcription factor-DNA interactions diverge rapidly. The application of new genome-wide methodologies has begun refining our global understanding of when and where mammalian transcription factors interact with DNA, thereby providing new insight into the mechanisms of transcriptional evolution. The interplay between cis and trans regulation of gene expression is an increasingly active area of investigation, and recent studies suggest that mutations in cis-regulatory DNA can explain many inter-species differences in gene expression.
Affiliation(s)
- Michael D Wilson
- Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK
|
30
|
Ho MCW, Johnsen H, Goetz SE, Schiller BJ, Bae E, Tran DA, Shur AS, Allen JM, Rau C, Bender W, Fisher WW, Celniker SE, Drewell RA. Functional evolution of cis-regulatory modules at a homeotic gene in Drosophila. PLoS Genet 2009; 5:e1000709. [PMID: 19893611 PMCID: PMC2763271 DOI: 10.1371/journal.pgen.1000709] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 06/30/2009] [Accepted: 10/05/2009] [Indexed: 11/19/2022] Open
Abstract
It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules.
Affiliation(s)
- Margaret C. W. Ho
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Holly Johnsen
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Sara E. Goetz
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Benjamin J. Schiller
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Esther Bae
- College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, California, United States of America
- Diana A. Tran
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Andrey S. Shur
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- John M. Allen
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Christoph Rau
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Welcome Bender
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, United States of America
- William W. Fisher
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Susan E. Celniker
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Robert A. Drewell
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
|
31
|
Vavouri T, Lehner B. Conserved noncoding elements and the evolution of animal body plans. Bioessays 2009; 31:727-35. [PMID: 19492354 DOI: 10.1002/bies.200900014] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 01/29/2023]
Abstract
The genomes of vertebrates, flies, and nematodes contain highly conserved noncoding elements (CNEs). CNEs cluster around genes that regulate development, and where tested, they can act as transcriptional enhancers. Within an animal group CNEs are the most conserved sequences but between groups they are normally diverged beyond recognition. Alternative CNEs are, however, associated with an overlapping set of genes that control development in all animals. Here, we discuss the evidence that CNEs are part of the core gene regulatory networks (GRNs) that specify alternative animal body plans. The major animal groups arose >550 million years ago. We propose that the cis-regulatory inputs identified by CNEs arose during the "re-wiring" of regulatory interactions that occurred during early animal evolution. Consequently, different animal groups, with different core GRNs, contain alternative sets of CNEs. Due to the subsequent stability of animal body plans, these core regulatory sequences have been evolving in parallel under strong purifying selection in different animal groups.
Affiliation(s)
- Tanya Vavouri
- EMBL-CRG Systems Biology Research Unit, Dr. Aiguader 88, Barcelona, Spain.
|
32
|
Rasmussen DA, Noor MAF. What can you do with 0.1x genome coverage? A case study based on a genome survey of the scuttle fly Megaselia scalaris (Phoridae). BMC Genomics 2009; 10:382. [PMID: 19689807 PMCID: PMC2735751 DOI: 10.1186/1471-2164-10-382] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Received: 04/22/2009] [Accepted: 08/18/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The declining cost of DNA sequencing is making genome sequencing a feasible option for more organisms, including many of interest to ecologists and evolutionary biologists. While obtaining high-depth, completely assembled genome sequences for most non-model organisms remains challenging, low-coverage genome survey sequences (GSS) can provide a wealth of biologically useful information at low cost. Here, using a random pyrosequencing approach, we sequence the genome of the scuttle fly Megaselia scalaris and evaluate the utility of our low-coverage GSS approach. RESULTS: Random pyrosequencing of the M. scalaris genome provided a depth of coverage (0.05x-0.1x) much lower than typical GSS studies. We demonstrate that, even with extremely low-coverage sequencing, bioinformatics approaches can yield extensive information about functional and repetitive elements. We also use our GSS data to develop genomic resources such as a nearly complete mitochondrial genome sequence and microsatellite markers for M. scalaris. CONCLUSION: We conclude that low-coverage genome surveys are effective at generating useful information about organisms currently lacking genomic sequence data.
|
33
|
Genomics: Big is beautiful. Nature 2009. [DOI: 10.1038/458263a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|