1
Schember I, Reid W, Sterling-Lentsch G, Halfon MS. Conserved and novel enhancers in the Aedes aegypti single-minded locus recapitulate embryonic ventral midline gene expression. PLoS Genet 2024; 20:e1010891. [PMID: 38683842 PMCID: PMC11081499 DOI: 10.1371/journal.pgen.1010891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 05/09/2024] [Accepted: 04/16/2024] [Indexed: 05/02/2024]
Abstract
Transcriptional cis-regulatory modules, e.g., enhancers, control the time and location of metazoan gene expression. While changes in enhancers can provide a powerful force for evolution, there is also significant deep conservation of enhancers for developmentally important genes, with function and sequence characteristics maintained over hundreds of millions of years of divergence. Not well understood, however, is how the overall regulatory composition of a locus evolves, with important outstanding questions such as how many enhancers are conserved vs. novel, and to what extent are the locations of conserved enhancers within a locus maintained? We begin here to address these questions with a comparison of the respective single-minded (sim) loci in the two dipteran species Drosophila melanogaster (fruit fly) and Aedes aegypti (mosquito). sim encodes a highly conserved transcription factor that mediates development of the arthropod embryonic ventral midline. We identify two enhancers in the A. aegypti sim locus and demonstrate that they function equivalently in both transgenic flies and transgenic mosquitoes. One A. aegypti enhancer is highly similar to known Drosophila counterparts in its activity, location, and autoregulatory capability. The other differs from any known Drosophila sim enhancers with a novel location, failure to autoregulate, and regulation of expression in a unique subset of midline cells. Our results suggest that the conserved pattern of sim expression in the two species is the result of both conserved and novel regulatory sequences. Further examination of this locus will help to illuminate how the overall regulatory landscape of a conserved developmental gene evolves.
Affiliation(s)
- Isabella Schember
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
- William Reid
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
- Geyenna Sterling-Lentsch
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
- Marc S. Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, New York, United States of America
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, New York, United States of America
  - New York State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, New York, United States of America
2
Dyer NA, Lucas ER, Nagi SC, McDermott DP, Brenas JH, Miles A, Clarkson CS, Mawejje HD, Wilding CS, Halfon MS, Asma H, Heinz E, Donnelly MJ. Mechanisms of transcriptional regulation in Anopheles gambiae revealed by allele specific expression. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.22.568226. [PMID: 38045426 PMCID: PMC10690255 DOI: 10.1101/2023.11.22.568226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
Malaria control relies on insecticides targeting the mosquito vector, but this is increasingly compromised by insecticide resistance, which can be achieved by elevated expression of detoxifying enzymes that metabolize the insecticide. In diploid organisms, gene expression is regulated both in cis, by regulatory sequences on the same chromosome, and by trans acting factors, affecting both alleles equally. Differing levels of transcription can be caused by mutations in cis-regulatory modules (CRM), but few of these have been identified in mosquitoes. We crossed bendiocarb resistant and susceptible Anopheles gambiae strains to identify cis-regulated genes that might be responsible for the resistant phenotype using RNAseq, and cis-regulatory module sequences controlling gene expression in insecticide resistance relevant tissues were predicted using machine learning. We found 115 genes showing allele specific expression in hybrids of insecticide susceptible and resistant strains, suggesting cis regulation is an important mechanism of gene expression regulation in Anopheles gambiae. The genes showing allele specific expression included a higher proportion of Anopheles specific genes and were, on average, younger than those with balanced allelic expression.
Affiliation(s)
- Naomi A Dyer
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
- Eric R Lucas
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
- Sanjay C Nagi
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
- Daniel P McDermott
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
- Jon H Brenas
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Alistair Miles
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Chris S Clarkson
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Henry D Mawejje
  - Infectious Diseases Research Collaboration (IDRC), Plot 2C Nakasero Hill Road, P.O.Box 7475, Kampala, Uganda
- Craig S Wilding
  - School of Biological and Environmental Sciences, Liverpool John Moores University, Byrom Street, Liverpool, L3 3AF, UK
- Marc S Halfon
  - Department of Biochemistry, Jacobs School of Medicine & Biomedical Sciences, University at Buffalo-State University of New York, 955 Main Street, Buffalo, New York 14203, USA
- Hasiba Asma
  - Department of Biochemistry, Jacobs School of Medicine & Biomedical Sciences, University at Buffalo-State University of New York, 955 Main Street, Buffalo, New York 14203, USA
- Eva Heinz
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
  - Department of Clinical Sciences, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
- Martin J Donnelly
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK
3
Mazo-Vargas A, Langmüller AM, Wilder A, van der Burg KRL, Lewis JJ, Messer PW, Zhang L, Martin A, Reed RD. Deep cis-regulatory homology of the butterfly wing pattern ground plan. Science 2022; 378:304-308. [PMID: 36264807 DOI: 10.1126/science.abi9407] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 11/02/2022]
Abstract
Butterfly wing patterns derive from a deeply conserved developmental ground plan yet are diverse and evolve rapidly. It is poorly understood how gene regulatory architectures can accommodate both deep homology and adaptive change. To address this, we characterized the cis-regulatory evolution of the color pattern gene WntA in nymphalid butterflies. Comparative assay for transposase-accessible chromatin using sequencing (ATAC-seq) and in vivo deletions spanning 46 cis-regulatory elements across five species revealed deep homology of ground plan-determining sequences, except in monarch butterflies. Furthermore, noncoding deletions displayed both positive and negative regulatory effects that were often broad in nature. Our results provide little support for models predicting rapid enhancer turnover and suggest that deeply ancestral, multifunctional noncoding elements can underlie rapidly evolving trait systems.
Affiliation(s)
- Anyi Mazo-Vargas
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA
  - Department of Biological Sciences, The George Washington University, Washington, DC, USA
- Anna M Langmüller
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Alexis Wilder
  - Department of Biological Sciences, The George Washington University, Washington, DC, USA
- James J Lewis
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA
  - Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA
- Philipp W Messer
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Linlin Zhang
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA
  - CAS and Shandong Province Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, China
- Arnaud Martin
  - Department of Biological Sciences, The George Washington University, Washington, DC, USA
- Robert D Reed
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA
4
Keränen SVE, Villahoz-Baleta A, Bruno AE, Halfon MS. REDfly: An Integrated Knowledgebase for Insect Regulatory Genomics. INSECTS 2022; 13:618. [PMID: 35886794 PMCID: PMC9323752 DOI: 10.3390/insects13070618] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/16/2022] [Revised: 07/01/2022] [Accepted: 07/06/2022] [Indexed: 11/29/2022]
Abstract
We provide here an updated description of the REDfly (Regulatory Element Database for Fly) database of transcriptional regulatory elements, a unique resource that provides regulatory annotation for the genome of Drosophila and other insects. The genomic sequences regulating insect gene expression, namely transcriptional cis-regulatory modules (CRMs, e.g., "enhancers") and transcription factor binding sites (TFBSs), are not currently curated by any other major database resources. However, knowledge of such sequences is important, as CRMs play critical roles with respect to disease as well as normal development, phenotypic variation, and evolution. Characterized CRMs also provide useful tools for both basic and applied research, including developing methods for insect control. REDfly, which is the most detailed existing platform for metazoan regulatory-element annotation, includes over 40,000 experimentally verified CRMs and TFBSs along with their DNA sequences, their associated genes, and the expression patterns they direct. Here, we briefly describe REDfly's contents and data model, with an emphasis on the new features implemented since 2020. We then provide an illustrated walk-through of several common REDfly search use cases.
Affiliation(s)
- Angel Villahoz-Baleta
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Andrew E. Bruno
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Marc S. Halfon
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
5
Common Themes and Future Challenges in Understanding Gene Regulatory Network Evolution. Cells 2022; 11:cells11030510. [PMID: 35159319 PMCID: PMC8834487 DOI: 10.3390/cells11030510] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/23/2021] [Revised: 01/26/2022] [Accepted: 01/29/2022] [Indexed: 12/18/2022]
Abstract
A major driving force behind the evolution of species-specific traits and novel structures is alterations in gene regulatory networks (GRNs). Comprehending evolution therefore requires an understanding of the nature of changes in GRN structure and the responsible mechanisms. Here, we review two insect pigmentation GRNs in order to examine common themes in GRN evolution and to reveal some of the challenges associated with investigating changes in GRNs across different evolutionary distances at the molecular level. The pigmentation GRN in Drosophila melanogaster and other drosophilids is a well-defined network for which studies from closely related species illuminate the different ways co-option of regulators can occur. The pigmentation GRN for butterflies of the Heliconius species group is less fully detailed but it is emerging as a useful model for exploring important questions about redundancy and modularity in cis-regulatory systems. Both GRNs serve to highlight the ways in which redeployment of trans-acting factors can lead to GRN rewiring and network co-option. To gain insight into GRN evolution, we discuss the importance of defining GRN architecture at multiple levels both within and between species and of utilizing a range of complementary approaches.
6
Holm I, Nardini L, Pain A, Bischoff E, Anderson CE, Zongo S, Guelbeogo WM, Sagnon N, Gohl DM, Nowling RJ, Vernick KD, Riehle MM. Comprehensive Genomic Discovery of Non-Coding Transcriptional Enhancers in the African Malaria Vector Anopheles coluzzii. Front Genet 2022; 12:785934. [PMID: 35082832 PMCID: PMC8784733 DOI: 10.3389/fgene.2021.785934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Accepted: 12/10/2021] [Indexed: 11/24/2022]
Abstract
Almost all regulation of gene expression in eukaryotic genomes is mediated by the action of distant non-coding transcriptional enhancers upon proximal gene promoters. Enhancer locations cannot be accurately predicted bioinformatically because of the absence of a defined sequence code, and thus functional assays are required for their direct detection. Here we used a massively parallel reporter assay, Self-Transcribing Active Regulatory Region sequencing (STARR-seq), to generate the first comprehensive genome-wide map of enhancers in Anopheles coluzzii, a major African malaria vector in the Gambiae species complex. The screen was carried out by transfecting reporter libraries created from the genomic DNA of 60 wild A. coluzzii from Burkina Faso into A. coluzzii 4a3A cells, in order to functionally query enhancer activity of the natural population within the homologous cellular context. We report a catalog of 3,288 active genomic enhancers that were significant across three biological replicates, 74% of them located in intergenic and intronic regions. The STARR-seq enhancer screen is chromatin-free and thus detects inherent activity of a comprehensive catalog of enhancers that may be restricted in vivo to specific cell types or developmental stages. Testing of a validation panel of enhancer candidates using manual luciferase assays confirmed enhancer function in 26 of 28 (93%) of the candidates over a wide dynamic range of activity from two to at least 16-fold activity above baseline. The enhancers occupy only 0.7% of the genome, and display distinct composition features. The enhancer compartment is significantly enriched for 15 transcription factor binding site signatures, and displays divergence for specific dinucleotide repeats, as compared to matched non-enhancer genomic controls. The genome-wide catalog of A. coluzzii enhancers is publicly available in a simple searchable graphic format. 
This enhancer catalogue will be valuable in linking genetic and phenotypic variation, in identifying regulatory elements that could be employed in vector manipulation, and in better targeting of chromosome editing to minimize extraneous regulation influences on the introduced sequences. Importance: Understanding the role of the non-coding regulatory genome in complex disease phenotypes is essential, but even in well-characterized model organisms, identification of regulatory regions within the vast non-coding genome remains a challenge. We used a large-scale assay to generate a genome wide map of transcriptional enhancers. Such a catalogue for the important malaria vector, Anopheles coluzzii, will be an important research tool as the role of non-coding regulatory variation in differential susceptibility to malaria infection is explored and as a public resource for research on this important insect vector of disease.
Affiliation(s)
- Inge Holm
  - Institut Pasteur, Université de Paris, CNRS UMR 2000, Unit of Insect Vector Genetics and Genomics, Department of Parasites and Insect Vectors, Paris, France
- Luisa Nardini
  - Institut Pasteur, Université de Paris, CNRS UMR 2000, Unit of Insect Vector Genetics and Genomics, Department of Parasites and Insect Vectors, Paris, France
- Adrien Pain
  - Institut Pasteur, Université de Paris, CNRS UMR 2000, Unit of Insect Vector Genetics and Genomics, Department of Parasites and Insect Vectors, Paris, France
  - Institut Pasteur, Université de Paris, Hub de Bioinformatique et Biostatistique, Paris, France
- Emmanuel Bischoff
  - Institut Pasteur, Université de Paris, CNRS UMR 2000, Unit of Insect Vector Genetics and Genomics, Department of Parasites and Insect Vectors, Paris, France
- Cameron E Anderson
  - Department of Microbiology and Immunology, Medical College of Wisconsin, Milwaukee, WI, United States
- Soumanaba Zongo
  - Centre National de Recherche et de Formation sur le Paludisme (CNRFP), Ministry of Health, Ouagadougou, Burkina Faso
- Wamdaogo M Guelbeogo
  - Centre National de Recherche et de Formation sur le Paludisme (CNRFP), Ministry of Health, Ouagadougou, Burkina Faso
- N'Fale Sagnon
  - Centre National de Recherche et de Formation sur le Paludisme (CNRFP), Ministry of Health, Ouagadougou, Burkina Faso
- Daryl M Gohl
  - University of Minnesota Genomics Center, Minneapolis, MN, United States
  - Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN, United States
- Ronald J Nowling
  - Department of Electrical Engineering and Computer Science, Milwaukee School of Engineering (MSOE), Milwaukee, WI, United States
- Kenneth D Vernick
  - Institut Pasteur, Université de Paris, CNRS UMR 2000, Unit of Insect Vector Genetics and Genomics, Department of Parasites and Insect Vectors, Paris, France
- Michelle M Riehle
  - Department of Microbiology and Immunology, Medical College of Wisconsin, Milwaukee, WI, United States
7
Schember I, Halfon MS. Identification of new Anopheles gambiae transcriptional enhancers using a cross-species prediction approach. INSECT MOLECULAR BIOLOGY 2021; 30:410-419. [PMID: 33866636 PMCID: PMC8266755 DOI: 10.1111/imb.12705] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/05/2020] [Revised: 02/09/2021] [Accepted: 03/31/2021] [Indexed: 06/12/2023]
Abstract
The success of transgenic mosquito vector control approaches relies on well-targeted gene expression, requiring the identification and characterization of a diverse set of mosquito promoters and transcriptional enhancers. However, few enhancers have been characterized in Anopheles gambiae to date. Here, we employ the SCRMshaw method we previously developed to predict enhancers in the A. gambiae genome, preferentially targeting vector-relevant tissues such as the salivary glands, midgut and nervous system. We demonstrate a high overall success rate, with at least 8 of 11 (73%) tested sequences validating as enhancers in an in vivo xenotransgenic assay. Four tested sequences drive expression in either the salivary gland or the midgut, making them directly useful for probing the biology of these infection-relevant tissues. The success of our study suggests that computational enhancer prediction should serve as an effective means for identifying A. gambiae enhancers with activity in tissues involved in malaria propagation and transmission.
Affiliation(s)
- Isabella Schember
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203
- Marc S. Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203
  - NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203
  - Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14263
8
Asma H, Halfon MS. Annotating the Insect Regulatory Genome. INSECTS 2021; 12:591. [PMID: 34209769 PMCID: PMC8305585 DOI: 10.3390/insects12070591] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Revised: 06/23/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]
Abstract
An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.
Affiliation(s)
- Hasiba Asma
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Marc S. Halfon
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203, USA
9
Hong J, Gao R, Yang Y. CrepHAN: Cross-species prediction of enhancers by using hierarchical attention networks. Bioinformatics 2021; 37:3436-3443. [PMID: 33978703 DOI: 10.1093/bioinformatics/btab349] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/21/2020] [Revised: 04/21/2021] [Accepted: 05/06/2021] [Indexed: 01/17/2023]
Abstract
MOTIVATION: Enhancers are important functional elements in genome sequences. The identification of enhancers is a very challenging task due to the great diversity of enhancer sequences and their flexible localization on genomes. To date, the interactions between enhancers and genes are not fully understood. To speed up studies of the regulatory roles of enhancers, computational tools for the prediction of enhancers have emerged in recent years. In particular, thanks to the ENCODE project and advances in high-throughput experimental techniques, a large number of experimentally verified enhancers have been annotated on the human genome, which allows large-scale predictions of unknown enhancers using data-driven methods. However, except for human and some model organisms, validated enhancer annotations are scarce for most species, making computational identification of enhancers in their genomes more difficult. RESULTS: In this study, we propose a deep learning-based predictor for enhancers, named CrepHAN, which features a hierarchical attention neural network and word embedding-based representations for DNA sequences. We use experimentally supported data from the human genome to train the model, and perform experiments on human and other mammals, including mouse, cow, and dog. The experimental results show that CrepHAN has an advantage in cross-species predictions and outperforms existing models by a large margin. In particular, for human-mouse cross-predictions, the AUC score of the ROC curve is increased by 0.033∼0.145 on the combined tissue dataset and 0.032∼0.109 on tissue-specific datasets. AVAILABILITY: bcmi.sjtu.edu.cn/~yangyang/CrepHAN.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jianwei Hong
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China
  - School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ruitian Gao
  - Department of Bioinformatics and Biostatistics, Shanghai Jiao Tong University, Shanghai 200240, China
- Yang Yang
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China
  - Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, 200240, China
10
Ruiz JL, Ranford-Cartwright LC, Gómez-Díaz E. The regulatory genome of the malaria vector Anopheles gambiae: integrating chromatin accessibility and gene expression. NAR Genom Bioinform 2021; 3:lqaa113. [PMID: 33987532 PMCID: PMC8092447 DOI: 10.1093/nargab/lqaa113] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/21/2020] [Revised: 12/15/2020] [Accepted: 12/26/2020] [Indexed: 12/12/2022]
Abstract
Anopheles gambiae mosquitoes are primary human malaria vectors, but we know very little about their mechanisms of transcriptional regulation. We profiled chromatin accessibility by the assay for transposase-accessible chromatin by sequencing (ATAC-seq) in laboratory-reared A. gambiae mosquitoes experimentally infected with the human malaria parasite Plasmodium falciparum. By integrating ATAC-seq, RNA-seq and ChIP-seq data, we showed a positive correlation between accessibility at promoters and introns, gene expression and active histone marks. By comparing expression and chromatin structure patterns in different tissues, we were able to infer cis-regulatory elements controlling tissue-specific gene expression and to predict the in vivo binding sites of relevant transcription factors. The ATAC-seq assay also allowed the precise mapping of active regulatory regions, including novel transcription start sites and enhancers that were annotated to mosquito immune-related genes. Not only is this study important for advancing our understanding of mechanisms of transcriptional regulation in the mosquito vector of human malaria, but the information we produced also has great potential for developing new mosquito-control and anti-malaria strategies.
Affiliation(s)
- José L Ruiz
  - Instituto de Parasitología y Biomedicina López-Neyra (IPBLN), Consejo Superior de Investigaciones Científicas, 18016 Granada, Spain
- Lisa C Ranford-Cartwright
  - Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Science, University of Glasgow, Glasgow G12 8QQ, UK
- Elena Gómez-Díaz
  - Instituto de Parasitología y Biomedicina López-Neyra (IPBLN), Consejo Superior de Investigaciones Científicas, 18016 Granada, Spain
11
Lezcano ÓM, Sánchez-Polo M, Ruiz JL, Gómez-Díaz E. Chromatin Structure and Function in Mosquitoes. Front Genet 2020; 11:602949. [PMID: 33365050 PMCID: PMC7750206 DOI: 10.3389/fgene.2020.602949] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/04/2020] [Accepted: 10/29/2020] [Indexed: 12/27/2022]
Abstract
The principles and function of chromatin and nuclear architecture have been extensively studied in model organisms, such as Drosophila melanogaster. However, little is known about the role of these epigenetic processes in transcriptional regulation in other insects including mosquitoes, which are major disease vectors and a worldwide threat for human health. Some of these life-threatening diseases are malaria, which is caused by protozoan parasites of the genus Plasmodium and transmitted by Anopheles mosquitoes; dengue fever, which is caused by an arbovirus mainly transmitted by Aedes aegypti; and West Nile fever, which is caused by an arbovirus transmitted by Culex spp. In this contribution, we review what is known about chromatin-associated mechanisms and the 3D genome structure in various mosquito vectors, including Anopheles, Aedes, and Culex spp. We also discuss the similarities between epigenetic mechanisms in mosquitoes and the model organism Drosophila melanogaster, and advocate that the field could benefit from the cross-application of state-of-the-art functional genomic technologies that are well-developed in the fruit fly. Uncovering the mosquito regulatory genome can lead to the discovery of unique regulatory networks associated with the parasitic life-style of these insects. It is also critical to understand the molecular interactions between the vectors and the pathogens that they transmit, which could hold the key to major breakthroughs on the fight against mosquito-borne diseases. Finally, it is clear that epigenetic mechanisms controlling mosquito environmental plasticity and evolvability are also of utmost importance, particularly in the current context of globalization and climate change.
Affiliation(s)
- José L. Ruiz
  - Instituto de Parasitología y Biomedicina López-Neyra (IPBLN), Consejo Superior de Investigaciones Científicas, Granada, Spain
- Elena Gómez-Díaz
  - Instituto de Parasitología y Biomedicina López-Neyra (IPBLN), Consejo Superior de Investigaciones Científicas, Granada, Spain
|
12
|
Sinha S, Jones BM, Traniello IM, Bukhari SA, Halfon MS, Hofmann HA, Huang S, Katz PS, Keagy J, Lynch VJ, Sokolowski MB, Stubbs LJ, Tabe-Bordbar S, Wolfner MF, Robinson GE. Behavior-related gene regulatory networks: A new level of organization in the brain. Proc Natl Acad Sci U S A 2020; 117:23270-23279. [PMID: 32661177 PMCID: PMC7519311 DOI: 10.1073/pnas.1921625117] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 12/17/2022] Open
Abstract
Neuronal networks are the standard heuristic model today for describing brain activity associated with animal behavior. Recent studies have revealed an extensive role for a completely distinct layer of networked activities in the brain, the gene regulatory network (GRN), that orchestrates expression levels of hundreds to thousands of genes in a behavior-related manner. We examine emerging insights into the relationships between these two types of networks and discuss their interplay in spatial as well as temporal dimensions, across multiple scales of organization. We discuss properties expected of behavior-related GRNs by drawing inspiration from the rich literature on GRNs related to animal development, comparing and contrasting these two broad classes of GRNs as they relate to their respective phenotypic manifestations. Developmental GRNs also represent a third layer of network biology, playing out over a third timescale, which is believed to play a crucial mediatory role between neuronal networks and behavioral GRNs. We end with a special emphasis on social behavior, discuss whether unique GRN organization and cis-regulatory architecture underlies this special class of behavior, and review literature that suggests an affirmative answer.
Affiliation(s)
- Saurabh Sinha
  - Department of Computer Science, University of Illinois, Urbana-Champaign, IL 61801
  - Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL 61801
- Beryl M Jones
  - Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL 61801
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544
- Ian M Traniello
  - Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL 61801
  - Neuroscience Program, University of Illinois, Urbana-Champaign, IL 61801
- Syed A Bukhari
  - Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL 61801
  - Informatics Program, University of Illinois, Urbana-Champaign, IL 61820
- Marc S Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203
- Hans A Hofmann
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX 78712
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX 78712
  - Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX 78712
- Sui Huang
  - Institute for Systems Biology, Seattle, WA 98109
- Paul S Katz
  - Department of Biology, University of Massachusetts, Amherst, MA 01003
- Jason Keagy
  - Department of Evolution, Ecology, and Behavior, School of Integrative Biology, University of Illinois, Urbana-Champaign, IL 61801
- Vincent J Lynch
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14260
- Marla B Sokolowski
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON M5S 3B2, Canada
  - Program in Child and Brain Development, Canadian Institute for Advanced Research, Toronto, ON M5G 1M1, Canada
- Lisa J Stubbs
  - Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL 61801
  - Department of Cell and Developmental Biology, University of Illinois, Urbana-Champaign, IL 61801
- Shayan Tabe-Bordbar
  - Department of Computer Science, University of Illinois, Urbana-Champaign, IL 61801
- Mariana F Wolfner
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850
- Gene E Robinson
  - Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL 61801
  - Neuroscience Program, University of Illinois, Urbana-Champaign, IL 61801
  - Department of Entomology, University of Illinois, Urbana-Champaign, IL 61801
|
13
|
Rivera J, Keränen SVE, Gallo SM, Halfon MS. REDfly: the transcriptional regulatory element database for Drosophila. Nucleic Acids Res 2020; 47:D828-D834. [PMID: 30329093 PMCID: PMC6323911 DOI: 10.1093/nar/gky957] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 09/06/2018] [Accepted: 10/04/2018] [Indexed: 12/21/2022] Open
Abstract
The REDfly database provides a comprehensive curation of experimentally-validated Drosophila transcriptional cis-regulatory elements and includes information on DNA sequence, experimental evidence, patterns of regulated gene expression, and more. Now in its thirteenth year, REDfly has grown to over 23 000 records of tested reporter gene constructs and 2200 tested transcription factor binding sites. Recent developments include the start of curation of predicted cis-regulatory modules in addition to experimentally-verified ones, improved search and filtering, and increased interaction with the authors of curated papers. An expanded data model that will capture information on temporal aspects of gene regulation, regulation in response to environmental and other non-developmental cues, sexually dimorphic gene regulation, and non-endogenous (ectopic) aspects of reporter gene expression is under development and expected to be in place within the coming year. REDfly is freely accessible at http://redfly.ccr.buffalo.edu, and news about database updates and new features can be followed on Twitter at @REDfly_database.
Affiliation(s)
- John Rivera
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Steven M Gallo
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Marc S Halfon
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
|
14
|
Tomoyasu Y, Halfon MS. How to study enhancers in non-traditional insect models. J Exp Biol 2020; 223:223/Suppl_1/jeb212241. [PMID: 32034049 DOI: 10.1242/jeb.212241] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
Transcriptional enhancers are central to the function and evolution of genes and gene regulation. At the organismal level, enhancers play a crucial role in coordinating tissue- and context-dependent gene expression. At the population level, changes in enhancers are thought to be a major driving force that facilitates evolution of diverse traits. An amazing array of diverse traits seen in insect morphology, physiology and behavior has been the subject of research for centuries. Although enhancer studies in insects outside of Drosophila have been limited, recent advances in functional genomic approaches have begun to make such studies possible in an increasing selection of insect species. Here, instead of comprehensively reviewing currently available technologies for enhancer studies in established model organisms such as Drosophila, we focus on a subset of computational and experimental approaches that are likely applicable to non-Drosophila insects, and discuss the pros and cons of each approach. We discuss the importance of validating enhancer function and evaluate several possible validation methods, such as reporter assays and genome editing. Key points and potential pitfalls when establishing a reporter assay system in non-traditional insect models are also discussed. We close with a discussion of how to advance enhancer studies in insects, both by improving computational approaches and by expanding the genetic toolbox in various insects. Through these discussions, this Review provides a conceptual framework for studying the function and evolution of enhancers in non-traditional insect models.
Affiliation(s)
- Marc S Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
|
15
|
Peng PC, Khoueiry P, Girardot C, Reddington JP, Garfield DA, Furlong EEM, Sinha S. The Role of Chromatin Accessibility in cis-Regulatory Evolution. Genome Biol Evol 2020; 11:1813-1828. [PMID: 31114856 PMCID: PMC6601868 DOI: 10.1093/gbe/evz103] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 05/13/2019] [Indexed: 02/07/2023] Open
Abstract
Transcription factor (TF) binding is determined by sequence as well as chromatin accessibility. Although the role of accessibility in shaping TF-binding landscapes is well recorded, its role in evolutionary divergence of TF binding, which in turn can alter cis-regulatory activities, is not well understood. In this work, we studied the evolution of genome-wide binding landscapes of five major TFs in the core network of mesoderm specification, between Drosophila melanogaster and Drosophila virilis, and examined its relationship to accessibility and sequence-level changes. We generated chromatin accessibility data from three important stages of embryogenesis in both Drosophila melanogaster and Drosophila virilis and recorded conservation and divergence patterns. We then used multivariable models to correlate accessibility and sequence changes to TF-binding divergence. We found that accessibility changes can in some cases, for example, for the master regulator Twist and for earlier developmental stages, more accurately predict binding change than is possible using TF-binding motif changes between orthologous enhancers. Accessibility changes also explain a significant portion of the codivergence of TF pairs. We noted that accessibility and motif changes offer complementary views of the evolution of TF binding and developed a combined model that captures the evolutionary data much more accurately than either view alone. Finally, we trained machine learning models to predict enhancer activity from TF binding and used these functional models to argue that motif and accessibility-based predictors of TF-binding change can substitute for experimentally measured binding change, for the purpose of predicting evolutionary changes in enhancer activity.
Affiliation(s)
- Pei-Chen Peng
  - Department of Computer Science, University of Illinois at Urbana-Champaign
  - Center for Bioinformatics and Functional Genomics, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA
- Pierre Khoueiry
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
  - American University of Beirut (AUB), Department of Biochemistry and Molecular Genetics, Beirut, Lebanon
- Charles Girardot
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- James P Reddington
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- David A Garfield
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
  - IRI-Life Sciences, Humboldt Universität zu Berlin, Berlin, Germany
- Eileen E M Furlong
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Saurabh Sinha
  - Department of Computer Science, University of Illinois at Urbana-Champaign
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign
|
16
|
Asma H, Halfon MS. Computational enhancer prediction: evaluation and improvements. BMC Bioinformatics 2019; 20:174. [PMID: 30953451 PMCID: PMC6451241 DOI: 10.1186/s12859-019-2781-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/15/2019] [Accepted: 03/27/2019] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Identifying transcriptional enhancers and other cis-regulatory modules (CRMs) is an important goal of post-sequencing genome annotation. Computational approaches provide a useful complement to empirical methods for CRM discovery, but it is critical that we develop effective means to evaluate their performance in terms of estimating their sensitivity and specificity.
RESULTS: We introduce here pCRMeval, a pipeline for in silico evaluation of any enhancer prediction tools that are flexible enough to be applied to the Drosophila melanogaster genome. pCRMeval compares the result of predictions with the extensive existing knowledge of experimentally-validated Drosophila CRMs in order to estimate the precision and relative sensitivity of the prediction method. In the case of supervised prediction methods (when training data composed of validated CRMs are used), pCRMeval can also assess the sensitivity of specific training sets. We demonstrate the utility of pCRMeval through evaluation of our SCRMshaw CRM prediction method and training data. By measuring the impact of different parameters on SCRMshaw performance, as assessed by pCRMeval, we develop a more robust version of SCRMshaw, SCRMshaw_HD, that improves the number of predictions while maintaining sensitivity and specificity. Our analysis also demonstrates that SCRMshaw_HD, when applied to increasingly less well-assembled genomes, maintains its strong predictive power with only a minor drop-off in performance.
CONCLUSION: Our pCRMeval pipeline provides a general framework for evaluation that can be applied to any CRM prediction method, particularly a supervised method. While we make use of it here primarily to test and improve a particular method for CRM prediction, SCRMshaw, pCRMeval should provide a valuable platform to the research community not only for evaluating individual methods, but also for comparing between competing methods.
Affiliation(s)
- Hasiba Asma
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
- Marc S Halfon
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
  - Department of Biochemistry, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
  - Department of Biological Sciences, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
  - NY State Center of Excellence in Bioinformatics and Life Sciences, 701 Ellicott St, Buffalo, NY, 14203, USA
  - Molecular and Cellular Biology Department and Program in Cancer Genetics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, 14263, USA
|
17
|
Abstract
Although sequenced insect genomes now number in the hundreds, little is known about gene regulatory sequences in any species other than the well-studied Drosophila melanogaster. We provide here a detailed protocol for using SCRMshaw, a computational method for predicting cis-regulatory modules (CRMs, also "enhancers") in sequenced insect genomes. SCRMshaw is effective for CRM discovery throughout the range of holometabolous insects and potentially in even more diverged species, with true-positive prediction rates of 75% or better. Minimal requirements for using SCRMshaw are a genome sequence and training data in the form of known Drosophila CRMs; a comprehensive set of the latter can be obtained from the SCRMshaw download site. For basic applications, a user with only modest computational know-how can run SCRMshaw on a desktop computer. SCRMshaw can be run with a single, narrow set of training data to predict CRMs regulating a specific pattern of gene expression, or with multiple sets of training data covering a broad range of CRM activities to provide an initial rough regulatory annotation of a complete, newly-sequenced genome.
Affiliation(s)
- Majid Kazemian
  - Departments of Biochemistry and Computer Science, Purdue University, West Lafayette, IN, USA
- Marc S Halfon
  - Departments of Biochemistry, Biomedical Informatics, and Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY, USA
  - NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY, USA
  - Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
|
18
|
Chen L, Fish AE, Capra JA. Prediction of gene regulatory enhancers across species reveals evolutionarily conserved sequence properties. PLoS Comput Biol 2018; 14:e1006484. [PMID: 30286077 PMCID: PMC6191148 DOI: 10.1371/journal.pcbi.1006484] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 03/03/2018] [Revised: 10/16/2018] [Accepted: 09/02/2018] [Indexed: 12/30/2022] Open
Abstract
Genomic regions with gene regulatory enhancer activity turn over rapidly across mammals. In contrast, gene expression patterns and transcription factor binding preferences are largely conserved between mammalian species. Based on this conservation, we hypothesized that enhancers active in different mammals would exhibit conserved sequence patterns in spite of their different genomic locations. To investigate this hypothesis, we evaluated the extent to which sequence patterns that are predictive of enhancers in one species are predictive of enhancers in other mammalian species by training and testing two types of machine learning models. We trained support vector machine (SVM) and convolutional neural network (CNN) classifiers to distinguish enhancers defined by histone marks from the genomic background based on DNA sequence patterns in human, macaque, mouse, dog, cow, and opossum. The classifiers accurately identified many adult liver, developing limb, and developing brain enhancers, and the CNNs outperformed the SVMs. Furthermore, classifiers trained in one species and tested in another performed nearly as well as classifiers trained and tested on the same species. We observed similar cross-species conservation when applying the models to human and mouse enhancers validated in transgenic assays. This indicates that many short sequence patterns predictive of enhancers are largely conserved. The sequence patterns most predictive of enhancers in each species matched the binding motifs for a common set of TFs enriched for expression in relevant tissues, supporting the biological relevance of the learned features. Thus, despite the rapid change of active enhancer locations between mammals, cross-species enhancer prediction is often possible. Our results suggest that short sequence patterns encoding enhancer activity have been maintained across more than 180 million years of mammalian evolution.
Affiliation(s)
- Ling Chen
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States of America
- Alexandra E. Fish
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, United States of America
- John A. Capra
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, United States of America
  - Departments of Biomedical Informatics and Computer Science, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States of America
|
19
|
Saul MC, Blatti C, Yang W, Bukhari SA, Shpigler HY, Troy JM, Seward CH, Sloofman L, Chandrasekaran S, Bell AM, Stubbs L, Robinson GE, Zhao SD, Sinha S. Cross-species systems analysis of evolutionary toolkits of neurogenomic response to social challenge. Genes Brain Behav 2018; 18:e12502. [PMID: 29968347 DOI: 10.1111/gbb.12502] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 04/18/2018] [Revised: 06/18/2018] [Accepted: 06/20/2018] [Indexed: 12/15/2022]
Abstract
Social challenges like territorial intrusions evoke behavioral responses in widely diverging species. Recent work has shown that evolutionary "toolkits" (genes and modules with lineage-specific variations but deep conservation of function) participate in the behavioral response to social challenge. Here, we develop a multispecies computational-experimental approach to characterize such a toolkit at a systems level. Brain transcriptomic responses to social challenge were probed via RNA-seq profiling in three diverged species (honey bees, mice and three-spined stickleback fish) following a common methodology, allowing fair comparisons across species. Data were collected from multiple brain regions and multiple time points after social challenge exposure, achieving anatomical and temporal resolution substantially greater than previous work. We developed statistically rigorous analyses equipped to find homologous functional groups among these species at the levels of individual genes, functional and coexpressed gene modules, and transcription factor subnetworks. We identified six orthogroups involved in response to social challenge, including groups represented by mouse genes Npas4 and Nr4a1, as well as common modulation of systems such as transcriptional regulators, ion channels, G-protein-coupled receptors and synaptic proteins. We also identified conserved coexpression modules enriched for mitochondrial fatty acid metabolism and heat shock that constitute the shared neurogenomic response. Our analysis suggests a toolkit wherein nuclear receptors, interacting with chaperones, induce transcriptional changes in mitochondrial activity, neural cytoarchitecture and synaptic transmission after social challenge. It shows systems-level mechanisms that have been repeatedly co-opted during evolution of analogous behaviors, thus advancing the genetic toolkit concept beyond individual genes.
Affiliation(s)
- Michael C Saul
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Charles Blatti
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Wei Yang
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Syed A Bukhari
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Interdisciplinary Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Hagai Y Shpigler
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Ecology, Evolution and Behavior, Hebrew University, Jerusalem, Israel
- Joseph M Troy
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Interdisciplinary Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Christopher H Seward
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Laura Sloofman
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Genetics and Genomic Sciences, Mount Sinai Health System, New York, New York
- Alison M Bell
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Interdisciplinary Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Animal Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Lisa Stubbs
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Interdisciplinary Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Gene E Robinson
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Sihai D Zhao
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Statistics, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Saurabh Sinha
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
  - Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, Illinois
|
20
|
Lai YT, Deem KD, Borràs-Castells F, Sambrani N, Rudolf H, Suryamohan K, El-Sherif E, Halfon MS, McKay DJ, Tomoyasu Y. Enhancer identification and activity evaluation in the red flour beetle, Tribolium castaneum. Development 2018. [PMID: 29540499 DOI: 10.1242/dev.160663] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Indexed: 12/13/2022]
Abstract
Evolution of cis-regulatory elements (such as enhancers) plays an important role in the production of diverse morphology. However, a mechanistic understanding is often limited by the absence of methods for studying enhancers in species other than established model systems. Here, we sought to establish methods to identify and test enhancer activity in the red flour beetle, Tribolium castaneum. To identify possible enhancer regions, we first obtained genome-wide chromatin profiles from various tissues and stages of Tribolium using FAIRE (formaldehyde-assisted isolation of regulatory elements)-sequencing. Comparison of these profiles revealed a distinct set of open chromatin regions in each tissue and at each stage. In addition, comparison of the FAIRE data with sets of computationally predicted (i.e. supervised cis-regulatory module-predicted) enhancers revealed a very high overlap between the two datasets. Second, using nubbin in the wing and hunchback in the embryo as case studies, we established the first universal reporter assay system that works in various contexts in Tribolium, and in a cross-species context. Together, these advances will facilitate investigation of cis-evolution and morphological diversity in Tribolium and other insects.
Affiliation(s)
- Yi-Ting Lai
  - Department of Biology, Miami University, Oxford, OH 45056, USA
- Kevin D Deem
  - Department of Biology, Miami University, Oxford, OH 45056, USA
- Nagraj Sambrani
  - Department of Biology, Miami University, Oxford, OH 45056, USA
- Heike Rudolf
  - Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91058, Germany
- Kushal Suryamohan
  - Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
- Ezzat El-Sherif
  - Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91058, Germany
- Marc S Halfon
  - Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
- Daniel J McKay
  - Department of Biology, Department of Genetics, Integrative Program for Biological and Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
21
|
Gursky VV, Kozlov KN, Kulakovskiy IV, Zubair A, Marjoram P, Lawrie DS, Nuzhdin SV, Samsonova MG. Translating natural genetic variation to gene expression in a computational model of the Drosophila gap gene regulatory network. PLoS One 2017; 12:e0184657. [PMID: 28898266 PMCID: PMC5595321 DOI: 10.1371/journal.pone.0184657] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/05/2017] [Accepted: 08/28/2017] [Indexed: 11/18/2022] Open
Abstract
Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects.
Affiliation(s)
- Vitaly V. Gursky
  - Theoretical Department, Ioffe Institute, Saint Petersburg, Russia
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Konstantin N. Kozlov
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
- Ivan V. Kulakovskiy
  - Engelhardt Institute of Molecular Biology, Moscow, Russia
  - Vavilov Institute of General Genetics, Moscow, Russia
  - Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Asif Zubair
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Paul Marjoram
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- David S. Lawrie
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Sergey V. Nuzhdin
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Maria G. Samsonova
  - Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
|
22
|
Perspectives on Gene Regulatory Network Evolution. Trends Genet 2017; 33:436-447. [PMID: 28528721 DOI: 10.1016/j.tig.2017.04.005] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 03/07/2017] [Revised: 04/24/2017] [Accepted: 04/25/2017] [Indexed: 11/23/2022]
Abstract
Animal development proceeds through the activity of genes and their cis-regulatory modules (CRMs) working together in sets of gene regulatory networks (GRNs). The emergence of species-specific traits and novel structures results from evolutionary changes in GRNs. Recent work in a wide variety of animal models, and particularly in insects, has started to reveal the modes and mechanisms of GRN evolution. I discuss here various aspects of GRN evolution and argue that developmental system drift (DSD), in which conserved phenotype is nevertheless a result of changed genetic interactions, should regularly be viewed from the perspective of GRN evolution. Advances in methods to discover related CRMs in diverse insect species, a critical requirement for detailed GRN characterization, are also described.
|
23
|
Yang W, Sinha S. A novel method for predicting activity of cis-regulatory modules, based on a diverse training set. Bioinformatics 2016; 33:1-7. [PMID: 27609510 DOI: 10.1093/bioinformatics/btw552] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 02/28/2016] [Revised: 07/26/2016] [Accepted: 08/17/2016] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION: With the rapid emergence of technologies for locating cis-regulatory modules (CRMs) genome-wide, the next pressing challenge is to assign precise functions to each CRM, i.e. to determine the spatiotemporal domains or cell-types where it drives expression. A popular approach to this task is to model the typical k-mer composition of a set of CRMs known to drive a common expression pattern, and assign that pattern to other CRMs exhibiting a similar k-mer composition. This approach does not rely on prior knowledge of transcription factors relevant to the CRM or their binding motifs, and is thus more widely applicable than motif-based methods for predicting CRM activity, but is also prone to false positive predictions. RESULTS: We present a novel strategy to improve the above-mentioned approach: to predict if a CRM drives a specific gene expression pattern, assess not only how similar the CRM is to other CRMs with similar activity but also to CRMs with distinct activities. We use a state-of-the-art statistical method to quantify a CRM's sequence similarity to many different training sets of CRMs, and employ a classification algorithm to integrate these similarity scores into a single prediction of the CRM's activity. This strategy is shown to significantly improve CRM activity prediction over current approaches. AVAILABILITY AND IMPLEMENTATION: Our implementation of the new method, called IMMBoost, is freely available as source code, at https://github.com/weiyangedward/IMMBoost. CONTACT: sinhas@illinois.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
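The idea of scoring a query CRM against several training sets at once, rather than only against the set of interest, can be sketched as follows. This is not IMMBoost: cosine similarity over raw k-mer counts is swapped in for the statistical similarity measure the abstract refers to, and the classifier that would integrate the scores is omitted. The function names, the choice of k = 3, and the toy sequences are all hypothetical.

```python
from collections import Counter
import math

def kmer_profile(seq, k=3):
    """Count every overlapping k-mer in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def set_profile(seqs, k=3):
    """Pool k-mer counts over all sequences in one training set."""
    total = Counter()
    for s in seqs:
        total.update(kmer_profile(s, k))
    return total

def cosine(p, q):
    """Cosine similarity between two k-mer count profiles."""
    dot = sum(p[key] * q[key] for key in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def similarity_features(query, training_sets, k=3):
    """One similarity score per training set, usable as classifier input."""
    q = kmer_profile(query, k)
    return {name: cosine(q, set_profile(seqs, k))
            for name, seqs in training_sets.items()}

# Hypothetical toy training sets, one per expression pattern.
training_sets = {
    "pattern_A": ["ACGACGACGACG"],
    "pattern_B": ["TTTTTTTTTTTT"],
}
features = similarity_features("ACGACGACG", training_sets)
```

The resulting dictionary holds one score per training set. Feeding the whole vector to a classifier, instead of thresholding the single score for the pattern of interest, is the step the abstract credits with reducing false positives.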
Affiliation(s)
- Wei Yang
  - Department of Computer Science, University of Illinois, Urbana-Champaign, Urbana, IL, USA
- Saurabh Sinha
  - Department of Computer Science, University of Illinois, Urbana-Champaign, Urbana, IL, USA
|
24
|
Suryamohan K, Hanson C, Andrews E, Sinha S, Scheel MD, Halfon MS. Redeployment of a conserved gene regulatory network during Aedes aegypti development. Dev Biol 2016; 416:402-13. [PMID: 27341759 DOI: 10.1016/j.ydbio.2016.06.031] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 04/07/2016] [Revised: 06/13/2016] [Accepted: 06/20/2016] [Indexed: 10/21/2022]
Abstract
Changes in gene regulatory networks (GRNs) underlie the evolution of morphological novelty and developmental system drift. The fruitfly Drosophila melanogaster and the dengue and Zika vector mosquito Aedes aegypti have substantially similar nervous system morphology. Nevertheless, they show significant divergence in a set of genes co-expressed in the midline of the Drosophila central nervous system, including the master regulator single minded and downstream genes including short gastrulation, Star, and NetrinA. In contrast to Drosophila, we find that midline expression of these genes is either absent or severely diminished in A. aegypti. Instead, they are co-expressed in the lateral nervous system. This suggests that in A. aegypti this "midline GRN" has been redeployed to a new location while lost from its previous site of activity. In order to characterize the relevant GRNs, we employed the SCRMshaw method we previously developed to identify transcriptional cis-regulatory modules in both species. Analysis of these regulatory sequences in transgenic Drosophila suggests that the altered gene expression observed in A. aegypti is the result of trans-dependent redeployment of the GRN, potentially stemming from cis-mediated changes in the expression of sim and other as-yet unidentified regulators. Our results illustrate a novel "repeal, replace, and redeploy" mode of evolution in which a conserved GRN acquires a different function at a new site while its original function is co-opted by a different GRN. This represents a striking example of developmental system drift in which the dramatic shift in gene expression does not result in gross morphological changes, but in more subtle differences in development and function of the late embryonic nervous system.
Affiliation(s)
- Kushal Suryamohan
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY, United States; NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY, United States
- Casey Hanson
  - Department of Computer Science, University of Illinois Urbana-Champaign, Champaign, IL, United States
- Emily Andrews
  - Indiana University School of Medicine, Department of Medical and Molecular Genetics, South Bend, IN, United States
- Saurabh Sinha
  - Department of Computer Science, University of Illinois Urbana-Champaign, Champaign, IL, United States
- Molly Duman Scheel
  - Indiana University School of Medicine, Department of Medical and Molecular Genetics, South Bend, IN, United States; University of Notre Dame, Eck Inst. for Global Health and Department of Biological Sciences, South Bend, IN, United States
- Marc S Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY, United States; NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY, United States; Department of Biological Sciences and Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY, United States; Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY, United States
|
25
|
Rothschild JB, Tsimiklis P, Siggia ED, François P. Predicting Ancestral Segmentation Phenotypes from Drosophila to Anopheles Using In Silico Evolution. PLoS Genet 2016; 12:e1006052. [PMID: 27227405 PMCID: PMC4882032 DOI: 10.1371/journal.pgen.1006052] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 02/24/2016] [Accepted: 04/23/2016] [Indexed: 12/23/2022] Open
Abstract
Molecular evolution is an established technique for inferring gene homology but regulatory DNA turns over so rapidly that inference of ancestral networks is often impossible. In silico evolution is used to compute the most parsimonious path in regulatory space for anterior-posterior patterning linking two dipteran species. The expression pattern of gap genes has evolved between Drosophila (fly) and Anopheles (mosquito), yet one of their targets, eve, has remained invariant. Our model predicts that stripe 5 in fly disappears and a new posterior stripe is created in mosquito, thus eve stripe modules 3+7 and 4+6 in fly are homologous to 3+6 and 4+5 in mosquito. We can place Clogmia on this evolutionary pathway and it shares the mosquito homologies. To account for the evolution of the other pair-rule genes in the posterior we have to assume that the ancestral dipteran utilized a dynamic method to phase those genes in relation to eve. The last common ancestor of the fruit fly (Drosophila) and mosquito (Anopheles) lived more than 200 million years ago. Can we use available data on insects alive today to infer what their ancestor looked like? In this manuscript, we focus on early embryonic development, when stripes of genetic expression appear and define the location of insect segments (“segmentation”). We use an evolutionary algorithm to reconstruct and predict dynamics of genes controlling stripes in the last common ancestor of fly and mosquito. We predict a new and different combinatorial logic of stripe formation in mosquito compared to fly, which is fully consistent with development of intermediate species such as moth-fly (Clogmia). Our simulations further suggest that the dynamics of gene expression in this last common ancestor were similar to other insects, such as wasps (Nasonia).
Our method illustrates how computational methods inspired by machine learning and non-linear physics can be used to infer gene dynamics in species that disappeared millions of years ago.
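For readers unfamiliar with evolutionary algorithms, the generic mutate-and-select loop behind the term can be sketched in a few lines. This is not the authors' in silico evolution procedure, which evolves regulatory networks. Here a hypothetical binary "stripe pattern" is mutated one position at a time and a change is kept only when it does not move the pattern further from a target. The function names, the patterns, and the mismatch-count fitness are all assumptions made for illustration.

```python
import random

def mismatch(pattern, target):
    """Fitness to minimise: number of positions that differ from the target."""
    return sum(1 for p, t in zip(pattern, target) if p != t)

def mutate(pattern, rng):
    """Return a copy of the pattern with one randomly chosen position flipped."""
    child = list(pattern)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child

def evolve(start, target, generations=200, seed=0):
    """Mutate-and-select loop: keep a mutant only if it is not worse."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    current = list(start)
    history = [mismatch(current, target)]
    for _ in range(generations):
        child = mutate(current, rng)
        if mismatch(child, target) <= mismatch(current, target):
            current = child
        history.append(mismatch(current, target))
    return current, history

# Hypothetical patterns along the anterior-posterior axis (1 = stripe).
start = [0] * 14
target = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
final, history = evolve(start, target)
```

By construction the mismatch count in `history` never increases from one generation to the next. Whether the target is reached within the allotted generations depends on the random seed and the pattern length.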
Affiliation(s)
- Jeremy B. Rothschild
  - Physics Department, McGill University, Ernest Rutherford Physics Building, Montreal, Quebec, Canada
- Panagiotis Tsimiklis
  - Physics Department, McGill University, Ernest Rutherford Physics Building, Montreal, Quebec, Canada
- Eric D. Siggia
  - Center for Studies in Physics and Biology, The Rockefeller University, New York, New York, United States of America
- Paul François
  - Physics Department, McGill University, Ernest Rutherford Physics Building, Montreal, Quebec, Canada
|
26
|
Davies NJ, Krusche P, Tauber E, Ott S. Analysis of 5' gene regions reveals extraordinary conservation of novel non-coding sequences in a wide range of animals. BMC Evol Biol 2015; 15:227. [PMID: 26482678 PMCID: PMC4613772 DOI: 10.1186/s12862-015-0499-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 06/09/2015] [Accepted: 09/28/2015] [Indexed: 01/20/2023] Open
Abstract
Background: Phylogenetic footprinting is a comparative method based on the principle that functional sequence elements will acquire fewer mutations over time than non-functional sequences. Successful comparisons of distantly related species will thus yield highly important sequence elements likely to serve fundamental biological roles. RNA regulatory elements are less well understood than those in DNA. In this study we use the emerging model organism Nasonia vitripennis, a parasitic wasp, in a comparative analysis against 12 insect genomes to identify deeply conserved non-coding elements (CNEs) conserved in large groups of insects, with a focus on 5' UTRs and promoter sequences. Results: We report the identification of 322 CNEs conserved across a broad range of insect orders. The identified regions are associated with regulatory and developmental genes, and contain short footprints revealing aspects of their likely function in translational regulation. The most ancient regions identified in our analysis were all found to overlap transcribed regions of genes, reflecting stronger conservation of translational regulatory elements than transcriptional elements. Further expanding sequence analyses to non-insect species we also report the discovery of, to our knowledge, the two oldest and most ubiquitous CNEs yet described in the animal kingdom (700 MYA). These ancient conserved non-coding elements are associated with the two ribosomal stalk genes, RPLP1 and RPLP2, and were very likely functional in some of the earliest animals. Conclusions: We report the identification of the most deeply conserved CNEs found to date, and several other deeply conserved elements which are, without exception, part of 5' untranslated regions of transcripts, and occur in a number of key translational regulatory genes, highlighting translational regulation of translational regulators as a conserved feature of insect genomes.
Electronic supplementary material: The online version of this article (doi:10.1186/s12862-015-0499-6) contains supplementary material, which is available to authorized users.
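The principle behind phylogenetic footprinting, that functional elements show up as runs of unchanged alignment columns, can be sketched with a sliding window over a pre-computed multiple alignment. This is a minimal illustration and not the pipeline used in the study. The function names, the window size, the identity threshold, and the three toy sequences are hypothetical, and the alignment step itself is assumed to have been done elsewhere.

```python
def column_identity(aligned):
    """For each alignment column, True if all sequences share one non-gap base."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    return [len(set(col)) == 1 and col[0] != "-" for col in zip(*aligned)]

def conserved_windows(aligned, window=5, min_fraction=1.0):
    """Return (start, end) half-open intervals whose identity meets the threshold."""
    identical = column_identity(aligned)
    hits = []
    for start in range(len(identical) - window + 1):
        fraction = sum(identical[start:start + window]) / window
        if fraction >= min_fraction:
            hits.append((start, start + window))
    return hits

# Hypothetical three-species alignment; columns 0-4 are fully conserved.
aligned = [
    "ACGTACGTAA",
    "ACGTACCTAT",
    "ACGTAGGTAC",
]
footprints = conserved_windows(aligned, window=5)
```

With these toy sequences the only window of five fully identical columns starts at position 0. Lowering `min_fraction` or shortening `window` trades specificity for sensitivity, which is the same trade-off that makes comparisons between distantly related species more selective.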
Affiliation(s)
- Peter Krusche
  - Warwick Systems Biology Centre, University of Warwick, Coventry, UK.
- Eran Tauber
  - Department of Genetics, University of Leicester, Leicester, UK.
- Sascha Ott
  - Warwick Systems Biology Centre, University of Warwick, Coventry, UK.
|
27
|
Abstract
Sexual dimorphism, a poorly understood but crucial aspect of vector mosquito biology, encompasses sex-specific physical, physiological, and behavioral traits related to mosquito reproduction. The study of mosquito sexual dimorphism has largely focused on analysis of the differences between adult female and male mosquitoes, particularly with respect to sex-specific behaviors related to disease transmission. However, sexually dimorphic behaviors are the products of differential gene expression that initiates during development and therefore must also be studied during development. Recent technical advancements are facilitating functional genetic studies in the dengue vector Aedes aegypti, an emerging model for mosquito development. These methodologies, many of which could be extended to other non-model insect species, are facilitating analysis of the development of sexual dimorphism in neural tissues, particularly the olfactory system. These studies are providing insight into the neurodevelopmental genetic basis for sexual dimorphism in vector mosquitoes.
Affiliation(s)
- Molly Duman-Scheel
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, South Bend, Indiana, USA; Eck Institute for Global Health, University of Notre Dame, Notre Dame, Indiana, USA; Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
- Zainulabeuddin Syed
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, Indiana, USA; Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, USA
|
28
|
Suryamohan K, Halfon MS. Identifying transcriptional cis-regulatory modules in animal genomes. Wiley Interdisciplinary Reviews. Developmental Biology 2015; 4:59-84. [PMID: 25704908 PMCID: PMC4339228 DOI: 10.1002/wdev.168] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 09/24/2014] [Revised: 11/04/2014] [Accepted: 11/16/2014] [Indexed: 11/08/2022]
Abstract
Gene expression is regulated through the activity of transcription factors (TFs) and chromatin-modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods have led to an explosion of both computational and empirical methods for CRM discovery in model and nonmodel organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against TFs or histone post-translational modifications, identification of nucleosome-depleted 'open' chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted TF-binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed.
CONFLICT OF INTEREST: The authors have declared no conflicts of interest for this article.
Affiliation(s)
- Kushal Suryamohan
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY 14203, USA
- Marc S. Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY 14203, USA
  - Molecular and Cellular Biology Department and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
|