1
Baranowski B, Pawłowski K. Protein family neighborhood analyzer-ProFaNA. PeerJ 2023; 11:e15715. [PMID: 37492397 PMCID: PMC10364804 DOI: 10.7717/peerj.15715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Accepted: 06/16/2023] [Indexed: 07/27/2023]
Abstract
Background: Functionally related genes are often grouped in close vicinity in genomes, particularly in prokaryotes. Notwithstanding the diverse evolutionary mechanisms leading to this phenomenon, it can be used to predict functions of uncharacterized genes. Methods: Here, we provide a simple but robust statistical approach that leverages the vast amounts of genomic data available today. Considering a protein domain as a functional unit, one can explore other functional units (domains) that occur significantly often within the genomic neighborhoods of the queried domain. This analysis can be performed across different taxonomic levels. Provisions can also be made to correct for the uneven sampling of the taxonomic space by genomic sequencing projects, which often focus on large numbers of very closely related strains, e.g., pathogenic ones. To this end, an optional procedure for averaging occurrences within subtaxa is available. Results: Several examples show that this approach can provide useful functional predictions for uncharacterized gene families and how this information can be combined with other approaches. The method is made available as a web server at http://bioinfo.sggw.edu.pl/neighborhood_analysis.
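The neighborhood idea described in this abstract can be illustrated with a minimal Python sketch. It is not ProFaNA's implementation: the abstract does not name the statistic, so a hypergeometric tail test is assumed here, and the function names, the `window` parameter, and the toy representation of a genome (an ordered list of per-gene domain sets) are all hypothetical. Taxonomic averaging and multiple-testing correction are left out.

```python
from collections import Counter
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes from M, of which n carry the domain."""
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / comb(M, N)


def neighborhood_enrichment(genome, query, window=2):
    """Score domains found near genes that carry the `query` domain.

    genome: ordered list of sets of domain names, one set per gene
            (an assumed toy representation, single contig, no strand).
    Returns {domain: p-value} for domains seen in the neighborhoods.
    """
    neighbor_idx = set()
    for i, domains in enumerate(genome):
        if query in domains:
            lo, hi = max(0, i - window), min(len(genome), i + window + 1)
            neighbor_idx.update(j for j in range(lo, hi) if j != i)

    # Number of genes carrying each domain, genome-wide and in neighborhoods.
    background = Counter(d for domains in genome for d in domains)
    observed = Counter(d for j in neighbor_idx for d in genome[j])

    M, N = len(genome), len(neighbor_idx)
    return {
        d: hypergeom_sf(k, M, background[d], N)
        for d, k in observed.items()
        if d != query
    }
```

A call such as `neighborhood_enrichment(genome, "Q", window=1)` returns one p-value per neighboring domain; in practice these would need correction for the number of domains tested.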
Affiliation(s)
- Bartosz Baranowski
  - Department of Biochemistry and Microbiology, Warsaw University of Life Sciences, Warszawa, Poland
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warszawa, Poland
- Krzysztof Pawłowski
  - Department of Biochemistry and Microbiology, Warsaw University of Life Sciences, Warszawa, Poland
  - Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas, United States
  - Department of Translational Sciences, Lund University, Lund, Sweden
2
Nair RR, Pataki E, Gerst JE. Transperons: RNA operons as effectors of coordinated gene expression in eukaryotes. Trends Genet 2022; 38:1217-1227. [PMID: 35934590 DOI: 10.1016/j.tig.2022.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 07/13/2022] [Accepted: 07/15/2022] [Indexed: 01/24/2023]
Abstract
Coordinated gene expression allows spatiotemporal control of cellular processes and is achieved by the cotranscription/translation of functionally related genes/proteins. Prokaryotes evolved polycistronic messages (operons) to confer expression from a single promoter to efficiently cotranslate proteins functioning on the same pathway. Yet, despite having far greater diversity (e.g., gene number, distribution, modes of expression), eukaryotic cells employ individual promoters and monocistronic messages. Although gene expression is modular, it does not account for how eukaryotes achieve coordinated localized translation. The RNA operon theory states that mRNAs derived from different chromosomes assemble into ribonucleoprotein particles (RNPs) that act as functional operons to generate protein cohorts upon cotranslation. Work in yeast has now validated this theory and shown that intergenic associations and noncanonical histone functions create pathway-specific RNA operons (transperons) that regulate cell physiology. Herein the involvement of chromatin organization in transperon formation and programmed gene coexpression is discussed.
Affiliation(s)
- Rohini R Nair
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Emese Pataki
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Jeffrey E Gerst
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
3
Elhabashy H, Merino F, Alva V, Kohlbacher O, Lupas AN. Exploring protein-protein interactions at the proteome level. Structure 2022; 30:462-475. [DOI: 10.1016/j.str.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/09/2021] [Revised: 10/26/2021] [Accepted: 02/02/2022] [Indexed: 02/08/2023]
4
Genome Comparisons of the Fission Yeasts Reveal Ancient Collinear Loci Maintained by Natural Selection. J Fungi (Basel) 2021; 7:jof7100864. [PMID: 34682285 PMCID: PMC8537764 DOI: 10.3390/jof7100864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2021] [Revised: 10/06/2021] [Accepted: 10/12/2021] [Indexed: 11/30/2022]
Abstract
Fission yeasts have a unique life history and exhibit evolutionary patterns distinct from those of other yeasts. In addition, the species show stable genome structures despite the relatively fast evolution of their genomic sequences. To investigate the reason for this, comparative genomic analyses were carried out. Our results provided evidence that the structural and sequence evolution of the fission yeasts were correlated. Moreover, we revealed ancestral locally collinear blocks (aLCBs), which could have been inherited from their last common ancestor. These aLCBs proved to be the most conserved regions of the genomes, as they contain almost eight genes per block on average in the same orientation and order across the species. Gene order of the aLCBs is mainly fission-yeast-specific but supports the idea of filamentous ancestors. Nevertheless, the sequences and gene structures within the aLCBs are as mutable as any sequences in other parts of the genomes. Although genes of certain Gene Ontology (GO) categories tend to cluster at the aLCBs, those GO enrichments are not related to biological functions or high co-expression rates; rather, they are determined by the density of essential genes and Rec12 cleavage sites. These data and our simulations indicated that aLCBs might not only be remnants of ancestral gene order but are also maintained by natural selection.
5
Van Dyke K, Lutz S, Mekonnen G, Myers CL, Albert FW. Trans-acting genetic variation affects the expression of adjacent genes. Genetics 2021; 217:6126816. [PMID: 33789351 DOI: 10.1093/genetics/iyaa051] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/13/2020] [Accepted: 12/16/2020] [Indexed: 11/13/2022]
Abstract
Gene expression differences among individuals are shaped by trans-acting expression quantitative trait loci (eQTLs). Most trans-eQTLs map to hotspot locations that influence many genes. The molecular mechanisms perturbed by hotspots are often assumed to involve "vertical" cascades of effects in pathways that can ultimately affect the expression of thousands of genes. Here, we report that trans-eQTLs can affect the expression of adjacent genes via "horizontal" mechanisms that extend along a chromosome. Genes affected by trans-eQTL hotspots in the yeast Saccharomyces cerevisiae were more likely to be located next to each other than expected by chance. These paired hotspot effects tended to occur at adjacent genes that also show coexpression in response to genetic and environmental perturbations, suggesting shared mechanisms. Physical proximity and shared chromatin state, in addition to regulation of adjacent genes by similar transcription factors, were independently associated with paired hotspot effects among adjacent genes. Paired effects of trans-eQTLs can occur at neighboring genes even when these genes do not share a common function. This phenomenon could result in unexpected connections between regulatory genetic variation and phenotypes.
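The claim that affected genes were "more likely to be located next to each other than expected by chance" can be illustrated with a simple permutation test. The Python sketch below is an assumption-laden toy, not the authors' analysis: it treats genes as integer positions on a single chromosome, ignores strand and intergenic distance, and draws random gene sets of the same size as the null model. The function names and parameters are hypothetical.

```python
import random


def count_adjacent_pairs(affected):
    """Number of gene pairs (g, g + 1) where both genes are in the affected set."""
    s = set(affected)
    return sum(1 for g in s if g + 1 in s)


def adjacency_permutation_test(n_genes, affected, n_perm=10000, seed=0):
    """Empirical p-value for seeing at least the observed number of adjacent pairs.

    n_genes:  number of genes on the (single, assumed) chromosome, indexed 0..n_genes-1
    affected: indices of genes affected by a hotspot
    """
    rng = random.Random(seed)
    observed = count_adjacent_pairs(affected)
    k = len(set(affected))
    # Null model: the same number of affected genes placed uniformly at random.
    hits = sum(
        count_adjacent_pairs(rng.sample(range(n_genes), k)) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A uniform null is the simplest choice; a more careful analysis would likely preserve features such as local gene density or chromosome boundaries when shuffling.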
Affiliation(s)
- Krisna Van Dyke
  - Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA
- Sheila Lutz
  - Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA
- Gemechu Mekonnen
  - Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA
- Chad L Myers
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455, USA
- Frank W Albert
  - Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA
6
Merlo MA, Portela-Bens S, Rodríguez ME, García-Angulo A, Cross I, Arias-Pérez A, García E, Rebordinos L. A Comprehensive Integrated Genetic Map of the Complete Karyotype of Solea senegalensis (Kaup 1858). Genes (Basel) 2020; 12:genes12010049. [PMID: 33396249 PMCID: PMC7824234 DOI: 10.3390/genes12010049] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/30/2020] [Revised: 12/24/2020] [Accepted: 12/28/2020] [Indexed: 12/23/2022]
Abstract
Solea senegalensis aquaculture production has increased greatly in the last decade and, consequently, knowledge of the species' genome is gaining attention. In this context, a high-density genome map of the species could offer clues for improving aspects of its aquaculture that remain unresolved. In the present article, a review combined with newly processed data yielded a high-density BAC-based cytogenetic map of S. senegalensis, alongside sequence analysis of these BAC clones to produce integrated data. A total of 93 BAC clones were used to localize the chromosome complement of the species and 588 genes were annotated, thus covering almost 2.5% of the S. senegalensis genome sequences. As a result, important data about its genome organization and evolution were obtained, such as the lower gene density of the large metacentric pair compared with the other metacentric chromosomes, which supports the theory of a sex proto-chromosome pair. In addition, chromosomes with a high number of linked genes that are conserved, even in distant species, were detected. This kind of result widens the knowledge of this species' chromosome dynamics and evolution.
7
Maney DL, Merritt JR, Prichard MR, Horton BM, Yi SV. Inside the supergene of the bird with four sexes. Horm Behav 2020; 126:104850. [PMID: 32937166 PMCID: PMC7725849 DOI: 10.1016/j.yhbeh.2020.104850] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/12/2020] [Revised: 09/04/2020] [Accepted: 09/06/2020] [Indexed: 02/07/2023]
Abstract
The white-throated sparrow (Zonotrichia albicollis) offers unique opportunities to understand the adaptive value of supergenes, particularly their role in alternative phenotypes. In this species, alternative plumage morphs segregate with a nonrecombining segment of chromosome 2, which has been called a 'supergene'. The species mates disassortatively with respect to the supergene; that is, each breeding pair consists of one individual with it and one without it. This species has therefore been called the "bird with four sexes". The supergene segregates with a behavioral phenotype; birds with it are more aggressive and less parental than birds without it. Here, we review our efforts to identify the genes inside the supergene that are responsible for the behavioral polymorphism. The gene ESR1, which encodes estrogen receptor α, differs between the morphs and predicts both territorial and parental behavior. Variation in the regulatory regions of ESR1 causes an imbalance in expression of the two alleles, and the degree to which this imbalance favors the supergene allele predicts territorial singing. In heterozygotes, knockdown of ESR1 causes a phenotypic switch, from more aggressive to less aggressive. We recently showed that another gene important for social behavior, vasoactive intestinal peptide (VIP), is differentially expressed between the morphs and predicts territorial singing. We hypothesize that ESR1 and VIP contribute to behavior in a coordinated way and could represent co-adapted alleles. Because the supergene contains more than 1000 individual genes, this species provides rich possibilities for discovering alleles that work together to mediate life-history trade-offs and maximize the fitness of alternative complex phenotypes.
Affiliation(s)
- Donna L Maney
  - Department of Psychology, Emory University, Atlanta, GA, USA
- Brent M Horton
  - Department of Biology, Millersville University, Millersville, PA, USA
- Soojin V Yi
  - School of Biological Sciences, Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA, USA
8
Wang J, Street NR, Park EJ, Liu J, Ingvarsson PK. Evidence for widespread selection in shaping the genomic landscape during speciation of Populus. Mol Ecol 2020; 29:1120-1136. [PMID: 32068935 DOI: 10.1111/mec.15388] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 10/15/2019] [Revised: 02/13/2020] [Accepted: 02/14/2020] [Indexed: 12/13/2022]
Abstract
Increasing our understanding of how evolutionary processes drive the genomic landscape of variation is fundamental to a better understanding of the genomic consequences of speciation. However, genome-wide patterns of within- and between- species variation have not been fully investigated in most forest tree species despite their global ecological and economic importance. Here, we use whole-genome resequencing data from four Populus species spanning the speciation continuum to reconstruct their demographic histories and investigate patterns of diversity and divergence within and between species. Using Populus trichocarpa as an outgroup species, we further infer the genealogical relationships and estimate the extent of ancient introgression among the three aspen species (Populus tremula, Populus davidiana and Populus tremuloides) throughout the genome. Our results show substantial variation in these patterns along the genomes with this variation being strongly predicted by local recombination rates and the density of functional elements. This implies that the interaction between recurrent selection and intrinsic genomic features has dramatically sculpted the genomic landscape over long periods of time. In addition, our findings provide evidence that, apart from background selection, recent positive selection and long-term balancing selection have also been crucial components in shaping patterns of genome-wide variation during the speciation process.
Affiliation(s)
- Jing Wang
  - Key Laboratory for Bio-Resources and Eco-Environment, College of Life Science, Sichuan University, Chengdu, China
- Nathaniel R Street
  - Department of Plant Physiology, Umeå Plant Science Centre, Umeå University, Umeå, Sweden
- Eung-Jun Park
  - Department of Bioresources, National Institute of Forest Science, Suwon, Korea
- Jianquan Liu
  - Key Laboratory for Bio-Resources and Eco-Environment, College of Life Science, Sichuan University, Chengdu, China
- Pär K Ingvarsson
  - Department of Plant Biology, Uppsala BioCenter, Swedish University of Agricultural Sciences, Uppsala, Sweden
9
Patterns of diverse gene functions in genomic neighborhoods predict gene function and phenotype. Sci Rep 2019; 9:19537. [PMID: 31863070 PMCID: PMC6925100 DOI: 10.1038/s41598-019-55984-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 03/26/2019] [Accepted: 12/02/2019] [Indexed: 01/01/2023]
Abstract
Genes with similar roles in the cell cluster on chromosomes, thus benefiting from coordinated regulation. This allows gene function to be inferred by transferring annotations from genomic neighbors, following the guilt-by-association principle. We performed a systematic search for co-occurrence of >1000 gene functions in genomic neighborhoods across 1669 prokaryotic, 49 fungal and 80 metazoan genomes, revealing prevalent patterns that cannot be explained by clustering of functionally similar genes. It is a very common occurrence that pairs of dissimilar gene functions – corresponding to semantically distant Gene Ontology terms – are significantly co-located on chromosomes. These neighborhood associations are often as conserved across genomes as the known associations between similar functions, suggesting selective benefits from clustering of certain diverse functions, which may conceivably play complementary roles in the cell. We propose a simple encoding of chromosomal gene order, the neighborhood function profiles (NFP), which draws on diverse gene clustering patterns to predict gene function and phenotype. NFPs yield a 26–46% increase in predictive power over state-of-the-art approaches that propagate function across neighborhoods, thus providing hundreds of novel, high-confidence gene function inferences per genome. Furthermore, we demonstrate that copy number-neutral structural variation that shapes gene function distribution across chromosomes can predict phenotype of individuals from their genome sequence.
10
Swenson KM, Blanchette M. Large-scale mammalian genome rearrangements coincide with chromatin interactions. Bioinformatics 2019; 35:i117-i126. [PMID: 31510664 PMCID: PMC6612848 DOI: 10.1093/bioinformatics/btz343] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/04/2022]
Abstract
Motivation: Genome rearrangements drastically change gene order along great stretches of a chromosome. There has been initial evidence that these apparently non-local events in the 1D sense may have breakpoints that are close in the 3D sense. We harness the power of the Double Cut and Join model of genome rearrangement, along with Hi-C chromosome conformation capture data, to test this hypothesis between human and mouse. Results: We devise novel statistical tests that show that, indeed, rearrangement scenarios that transform the human into the mouse gene order are enriched for pairs of breakpoints that have frequent chromosome interactions. This is observed for both intra-chromosomal and inter-chromosomal breakpoint pairs. For intra-chromosomal rearrangements, the enrichment exists from close (<20 Mb) to very distant (100 Mb) pairs. Further, the pattern exists across multiple cell lines in Hi-C data produced by different laboratories and at different stages of the cell cycle. We show that similarities in the contact frequencies between these many experiments contribute to the enrichment. We conclude that either (i) rearrangements usually involve breakpoints that are spatially close or (ii) there is selection against rearrangements that act on spatially distant breakpoints. Availability and implementation: Our pipeline is freely available at https://bitbucket.org/thekswenson/locality. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Krister M Swenson
  - Laboratoire d'Informatique, de Robotique, et de Microelectronique de Montpellier (LIRMM), Université Montpellier, Montpellier, France
  - Centre National de la Recherche Scientifique (CNRS), France
11
Franke J, Kim J, Hamilton JP, Zhao D, Pham GM, Wiegert-Rininger K, Crisovan E, Newton L, Vaillancourt B, Tatsis E, Buell CR, O'Connor SE. Gene Discovery in Gelsemium Highlights Conserved Gene Clusters in Monoterpene Indole Alkaloid Biosynthesis. Chembiochem 2019; 20:83-87. [PMID: 30300974 DOI: 10.1002/cbic.201800592] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 10/03/2018] [Indexed: 12/30/2022]
Abstract
Genome mining is a routine technique in microbes for discovering biosynthetic pathways. In plants, however, genomic information is not commonly used to identify novel biosynthesis genes. Here, we present the genome of the medicinal plant and oxindole monoterpene indole alkaloid (MIA) producer Gelsemium sempervirens (Gelsemiaceae). A gene cluster from Catharanthus roseus, which is utilized at least six enzymatic steps downstream from the last common intermediate shared between the two plant alkaloid types, is found in G. sempervirens, although the corresponding enzymes act on entirely different substrates. This study provides insights into the common genomic context of MIA pathways and is an important milestone in the further elucidation of the Gelsemium oxindole alkaloid pathway.
Affiliation(s)
- Jakob Franke
  - Department of Biological Chemistry, John Innes Centre, Colney Lane, Norwich, NR4 7UH, UK
- Jeongwoon Kim
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- John P Hamilton
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Dongyan Zhao
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Gina M Pham
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Emily Crisovan
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Linsey Newton
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Brieanne Vaillancourt
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Evangelos Tatsis
  - Department of Biological Chemistry, John Innes Centre, Colney Lane, Norwich, NR4 7UH, UK
- C Robin Buell
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Sarah E O'Connor
  - Department of Biological Chemistry, John Innes Centre, Colney Lane, Norwich, NR4 7UH, UK
12
Leijten W, Koes R, Roobeek I, Frugis G. Translating Flowering Time From Arabidopsis thaliana to Brassicaceae and Asteraceae Crop Species. Plants 2018; 7:plants7040111. [PMID: 30558374 PMCID: PMC6313873 DOI: 10.3390/plants7040111] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 11/01/2018] [Revised: 12/07/2018] [Accepted: 12/13/2018] [Indexed: 12/31/2022]
Abstract
Flowering and seed set are essential for plant species to survive; hence, plants need to adapt to highly variable environments to flower in the most favorable conditions. Endogenous cues such as plant age and hormones coordinate with environmental cues such as temperature and day length to determine the optimal time for the transition from vegetative to reproductive growth. In a breeding context, controlling flowering time would help to speed up the production of new hybrids and produce high yield throughout the year. The flowering time genetic network is extensively studied in the plant model species Arabidopsis thaliana; however, this knowledge is still limited in most crops. This article reviews evidence of conservation and divergence of flowering time regulation in A. thaliana with its related crop species in the Brassicaceae and with more distant vegetable crops within the Asteraceae family. Despite the overall conservation of most flowering time pathways in these families, many genes controlling this trait remain elusive, and the functions of most Arabidopsis homologs in these crops are yet to be determined. However, the knowledge gathered so far in both model and crop species can already be exploited in vegetable crop breeding for flowering time control.
Affiliation(s)
- Willeke Leijten
  - ENZA Zaden Research & Development B.V., Haling 1E, 1602 DB Enkhuizen, The Netherlands
- Ronald Koes
  - Swammerdam Institute for Life Sciences (SILS), University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
- Ilja Roobeek
  - ENZA Zaden Research & Development B.V., Haling 1E, 1602 DB Enkhuizen, The Netherlands
- Giovanna Frugis
  - Istituto di Biologia e Biotecnologia Agraria (IBBA), Operative Unit of Rome, Consiglio Nazionale delle Ricerche (CNR), Via Salaria Km. 29,300 - 00015, Monterotondo Scalo, Roma, Italy
13
Bourgeois Y, Stritt C, Walser JC, Gordon SP, Vogel JP, Roulin AC. Genome-wide scans of selection highlight the impact of biotic and abiotic constraints in natural populations of the model grass Brachypodium distachyon. Plant J 2018; 96:438-451. [PMID: 30044522 DOI: 10.1111/tpj.14042] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/16/2018] [Revised: 06/20/2018] [Accepted: 07/17/2018] [Indexed: 06/08/2023]
Abstract
Grasses are essential plants for ecosystem functioning. Quantifying the selective pressures that act on natural variation in grass species is therefore essential for biodiversity maintenance. In this study, we investigate the selection pressures that act on two distinct populations of the grass model Brachypodium distachyon without prior knowledge about the traits under selection. We took advantage of whole-genome sequencing data produced for 44 natural accessions of B. distachyon and used complementary genome-wide selection scan (GWSS) methods to detect genomic regions under balancing and positive selection. We show that selection is shaping genetic diversity at multiple temporal and spatial scales in this species, and affects different genomic regions across the two populations. Gene ontology annotation of candidate genes reveals that pathogens may constitute important factors of positive and balancing selection in B. distachyon. Finally, we cross-validated our results with quantitative trait locus data available for leaf-rust resistance in this species and demonstrate that, when paired with classical trait mapping, GWSS can help pinpoint candidate genes for further molecular validation. Thanks to a near base-perfect reference genome and the large collection of freely available natural accessions collected across its natural range, B. distachyon appears to be a prime system for studies in ecology, population genomics and evolutionary biology.
Affiliation(s)
- Yann Bourgeois
  - New York University Abu Dhabi, PO Box 129188, Saadiyat Island, Abu Dhabi, United Arab Emirates
- Christoph Stritt
  - Institute of Plant and Microbial Biology, University of Zürich, Zollikerstrasse 107, 8008, Zürich, Switzerland
- Jean-Claude Walser
  - Genetic Diversity Centre, ETH Zürich, Universitätstrasse 16, Zurich, Switzerland
- Sean P Gordon
  - DOE Joint Genome Institute, Walnut Creek, CA, 94598, USA
- John P Vogel
  - DOE Joint Genome Institute, Walnut Creek, CA, 94598, USA
- Anne C Roulin
  - Institute of Plant and Microbial Biology, University of Zürich, Zollikerstrasse 107, 8008, Zürich, Switzerland
14
Specialized plant biochemistry drives gene clustering in fungi. ISME J 2018; 12:1694-1705. [PMID: 29463891 DOI: 10.1038/s41396-018-0075-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 09/14/2017] [Revised: 01/18/2018] [Accepted: 01/26/2018] [Indexed: 01/31/2023]
Abstract
The fitness and evolution of prokaryotes and eukaryotes are affected by the organization of their genomes. In particular, the physical clustering of genes can coordinate gene expression and can prevent the breakup of co-adapted alleles. Although clustering may thus result from selection for phenotype optimization and persistence, the impact of environmental selection pressures on eukaryotic genome organization has rarely been systematically explored. Here, we investigated the organization of fungal genes involved in the degradation of phenylpropanoids, a class of plant-produced secondary metabolites that mediate many ecological interactions between plants and fungi. Using a novel gene cluster detection method, we identified 1110 gene clusters and many conserved combinations of clusters in a diverse set of fungi. We demonstrate that congruence in genome organization over small spatial scales is often associated with similarities in ecological lifestyle. Additionally, we find that while clusters are often structured as independent modules with little overlap in content, certain gene families merge multiple modules into a common network, suggesting they are important components of phenylpropanoid degradation strategies. Together, our results suggest that phenylpropanoids have repeatedly selected for gene clustering in fungi, and highlight the interplay between genome organization and ecological evolution in this ancient eukaryotic lineage.
15
Donlon TA, Morris BJ, Chen R, Masaki KH, Allsopp RC, Willcox DC, Elliott A, Willcox BJ. FOXO3 longevity interactome on chromosome 6. Aging Cell 2017; 16:1016-1025. [PMID: 28722347 PMCID: PMC5595686 DOI: 10.1111/acel.12625] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Accepted: 05/07/2017] [Indexed: 01/07/2023]
Abstract
FOXO3 has been implicated in longevity in multiple populations. By DNA sequencing in long‐lived individuals, we identified all single nucleotide polymorphisms (SNPs) in FOXO3 and showed 41 were associated with longevity. Thirteen of these had predicted alterations in transcription factor binding sites. Those SNPs appeared to be in physical contact, via RNA polymerase II binding chromatin looping, with sites in the FOXO3 promoter, and likely function together as a cis‐regulatory unit. The SNPs exhibited a high degree of LD in the Asian population, in which they define a specific longevity haplotype that is relatively common. The haplotype was less frequent in whites and virtually nonexistent in Africans. We identified distant contact points between FOXO3 and 46 neighboring genes, through long‐range physical contacts via CCCTC‐binding factor zinc finger protein (CTCF) binding sites, over a 7.3 Mb distance on chromosome 6q21. When activated by cellular stress, we visualized movement of FOXO3 toward neighboring genes. FOXO3 resides at the center of this early‐replicating and highly conserved syntenic region of chromosome 6. Thus, in addition to its role as a transcription factor regulating gene expression genomewide, FOXO3 may function at the genomic level to help regulate neighboring genes by virtue of its central location in chromatin conformation via topologically associated domains. We believe that the FOXO3 ‘interactome’ on chromosome 6 is a chromatin domain that defines an aging hub. A more thorough understanding of the functions of these neighboring genes may help elucidate the mechanisms through which FOXO3 variants promote longevity and healthy aging.
Affiliation(s)
- Timothy A. Donlon
  - Department of Research; Genetics Laboratory; Honolulu Heart Program/Honolulu-Asia Aging Study (HAAS); Kuakini Medical Center; Honolulu Hawaii
  - John A. Burns School of Medicine; University of Hawaii Manoa; Honolulu Hawaii
- Brian J. Morris
  - Department of Research; Genetics Laboratory; Honolulu Heart Program/Honolulu-Asia Aging Study (HAAS); Kuakini Medical Center; Honolulu Hawaii
  - Basic & Clinical Genomics Laboratory; School of Medical Sciences and Bosch Institute; University of Sydney; Sydney NSW Australia
  - Department of Geriatric Medicine; John A. Burns School of Medicine; University of Hawaii; Honolulu Hawaii
- Randi Chen
  - Department of Research; Genetics Laboratory; Honolulu Heart Program/Honolulu-Asia Aging Study (HAAS); Kuakini Medical Center; Honolulu Hawaii
- Kamal H. Masaki
  - Department of Research; Genetics Laboratory; Honolulu Heart Program/Honolulu-Asia Aging Study (HAAS); Kuakini Medical Center; Honolulu Hawaii
  - Department of Geriatric Medicine; John A. Burns School of Medicine; University of Hawaii; Honolulu Hawaii
- Richard C. Allsopp
  - John A. Burns School of Medicine; University of Hawaii Manoa; Honolulu Hawaii
- D. Craig Willcox
  - Department of Research; Genetics Laboratory; Honolulu Heart Program/Honolulu-Asia Aging Study (HAAS); Kuakini Medical Center; Honolulu Hawaii
  - Department of Geriatric Medicine; John A. Burns School of Medicine; University of Hawaii; Honolulu Hawaii
  - Department of Human Welfare; Okinawa International University; Okinawa Japan
- Ayako Elliott
  - Department of Research; Genetics Laboratory; Honolulu Heart Program/Honolulu-Asia Aging Study (HAAS); Kuakini Medical Center; Honolulu Hawaii
- Bradley J. Willcox
  - Department of Research; Genetics Laboratory; Honolulu Heart Program/Honolulu-Asia Aging Study (HAAS); Kuakini Medical Center; Honolulu Hawaii
  - Department of Geriatric Medicine; John A. Burns School of Medicine; University of Hawaii; Honolulu Hawaii
16
Reimegård J, Kundu S, Pendle A, Irish VF, Shaw P, Nakayama N, Sundström JF, Emanuelsson O. Genome-wide identification of physically clustered genes suggests chromatin-level co-regulation in male reproductive development in Arabidopsis thaliana. Nucleic Acids Res 2017; 45:3253-3265. [PMID: 28175342 PMCID: PMC5389543 DOI: 10.1093/nar/gkx087] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 01/11/2017] [Accepted: 01/31/2017] [Indexed: 12/02/2022]
Abstract
Co-expression of physically linked genes occurs surprisingly frequently in eukaryotes. Such chromosomal clustering may confer a selective advantage as it enables coordinated gene regulation at the chromatin level. We studied the chromosomal organization of genes involved in male reproductive development in Arabidopsis thaliana. We developed an in-silico tool to identify physical clusters of co-regulated genes from gene expression data. We identified 17 clusters (96 genes) involved in stamen development and acting downstream of the transcriptional activator MS1 (MALE STERILITY 1), which contains a PHD domain associated with chromatin re-organization. The clusters exhibited little gene homology or promoter element similarity, and largely overlapped with reported repressive histone marks. Experiments on a subset of the clusters suggested a link between expression activation and chromatin conformation: qRT-PCR and mRNA in situ hybridization showed that the clustered genes were up-regulated within 48 h after MS1 induction; out of 14 chromatin-remodeling mutants studied, expression of clustered genes was consistently down-regulated only in hta9/hta11, previously associated with metabolic cluster activation; DNA fluorescence in situ hybridization confirmed that transcriptional activation of the clustered genes was correlated with open chromatin conformation. Stamen development thus appears to involve transcriptional activation of physically clustered genes through chromatin de-condensation.
Affiliation(s)
- Johan Reimegård
  - Science for Life Laboratory, School of Biotechnology, Division of Gene Technology, KTH Royal Institute of Technology, Solna SE-171 65, Sweden
- Snehangshu Kundu
  - Department of Plant Biology, Uppsala BioCenter, Linnean Center for Plant Biology, Swedish University of Agricultural Sciences, Uppsala SE-750 07, Sweden
- Ali Pendle
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Vivian F Irish
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06520, USA
- Peter Shaw
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Naomi Nakayama
  - Institute of Molecular Plant Science, SynthSys Centre for Synthetic and Systems Biology, and Centre for Science at Extreme Conditions, University of Edinburgh, King's Buildings, Edinburgh, UK
- Jens F Sundström
  - Department of Plant Biology, Uppsala BioCenter, Linnean Center for Plant Biology, Swedish University of Agricultural Sciences, Uppsala SE-750 07, Sweden
- Olof Emanuelsson
  - Science for Life Laboratory, School of Biotechnology, Division of Gene Technology, KTH Royal Institute of Technology, Solna SE-171 65, Sweden
17
Ravinet M, Faria R, Butlin RK, Galindo J, Bierne N, Rafajlović M, Noor MAF, Mehlig B, Westram AM. Interpreting the genomic landscape of speciation: a road map for finding barriers to gene flow. J Evol Biol 2017; 30:1450-1477. [DOI: 10.1111/jeb.13047] [Citation(s) in RCA: 306] [Impact Index Per Article: 38.3] [Received: 10/07/2016] [Revised: 01/31/2017] [Accepted: 02/01/2017] [Indexed: 12/14/2022]
Affiliation(s)
- M. Ravinet
  - Centre for Ecological and Evolutionary Synthesis; University of Oslo; Oslo Norway
  - National Institute of Genetics; Mishima Shizuoka Japan
- R. Faria
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos; InBIO, Laboratório Associado; Universidade do Porto; Vairão Portugal
  - Department of Experimental and Health Sciences; IBE, Institute of Evolutionary Biology (CSIC-UPF); Pompeu Fabra University; Barcelona Spain
  - Department of Animal and Plant Sciences; University of Sheffield; Sheffield UK
- R. K. Butlin
  - Department of Animal and Plant Sciences; University of Sheffield; Sheffield UK
  - Department of Marine Sciences; Centre for Marine Evolutionary Biology; University of Gothenburg; Gothenburg Sweden
- J. Galindo
  - Department of Biochemistry, Genetics and Immunology; University of Vigo; Vigo Spain
- N. Bierne
  - CNRS; Université Montpellier; ISEM; Station Marine Sète France
- M. Rafajlović
  - Department of Physics; University of Gothenburg; Gothenburg Sweden
- B. Mehlig
  - Department of Physics; University of Gothenburg; Gothenburg Sweden
- A. M. Westram
  - Department of Animal and Plant Sciences; University of Sheffield; Sheffield UK
18
Affiliation(s)
- Brian J Morris
  - From the Basic & Clinical Genomics Laboratory, School of Medical Sciences and Bosch Institute, University of Sydney, New South Wales, Australia.
19
Guo Y, Zhang P, Sheng Q, Zhao S, Hackett TA. lncRNA expression in the auditory forebrain during postnatal development. Gene 2016; 593:201-216. [PMID: 27544636 PMCID: PMC5034298 DOI: 10.1016/j.gene.2016.08.027] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 03/01/2016] [Revised: 06/27/2016] [Accepted: 08/15/2016] [Indexed: 12/30/2022]
Abstract
The biological processes governing brain development and maturation depend on complex patterns of gene and protein expression, which can be influenced by many factors. One of the most overlooked is the long noncoding class of RNAs (lncRNAs), which are known to play important regulatory roles in an array of biological processes. Little is known about the distribution of lncRNAs in the sensory systems of the brain, and how lncRNAs interact with other mechanisms to guide the development of these systems. In this study, we profiled lncRNA expression in the mouse auditory forebrain during postnatal development at time points before and after the onset of hearing (P7, P14, P21, adult). First, we generated lncRNA profiles of the primary auditory cortex (A1) and medial geniculate body (MG) at each age. Then, we determined the differential patterns of expression by brain region and age. These analyses revealed that the lncRNA expression profile was distinct between both brain regions and between each postnatal age, indicating spatial and temporal specificity during maturation of the auditory forebrain. Next, we explored potential interactions between functionally-related lncRNAs, protein coding RNAs (pcRNAs), and associated proteins. The maturational trajectories (P7 to adult) of many lncRNA - pcRNA pairs were highly correlated, and predictive analyses revealed that lncRNA-protein interactions tended to be strong. A user-friendly database was constructed to facilitate inspection of the expression levels and maturational trajectories for any lncRNA or pcRNA in the database. Overall, this study provides an in-depth summary of lncRNA expression in the developing auditory forebrain and a broad-based foundation for future exploration of lncRNA function during brain development.
Affiliation(s)
- Yan Guo
  - Dept. of Cancer Biology, Vanderbilt University, Nashville, TN, USA
- Pan Zhang
  - Dept. of Cancer Biology, Vanderbilt University, Nashville, TN, USA
- Quanhu Sheng
  - Dept. of Cancer Biology, Vanderbilt University, Nashville, TN, USA
- Shilin Zhao
  - Dept. of Cancer Biology, Vanderbilt University, Nashville, TN, USA
- Troy A Hackett
  - Dept. of Hearing and Speech Sciences, Vanderbilt University School of Medicine, Nashville, TN, USA.
20
Mostovoy Y, Thiemicke A, Hsu TY, Brem RB. The Role of Transcription Factors at Antisense-Expressing Gene Pairs in Yeast. Genome Biol Evol 2016; 8:1748-61. [PMID: 27190003 PMCID: PMC4943177 DOI: 10.1093/gbe/evw104] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 12/14/2022]
Abstract
Genes encoded close to one another on the chromosome are often coexpressed, by a mechanism and regulatory logic that remain poorly understood. We surveyed the yeast genome for tandem gene pairs oriented tail-to-head at which expression antisense to the upstream gene was conserved across species. The intergenic region at most such tandem pairs is a bidirectional promoter, shared by the downstream gene mRNA and the upstream antisense transcript. Genomic analyses of these intergenic loci revealed distinctive patterns of transcription factor regulation. Mutation of a given transcription factor verified its role as a regulator in trans of tandem gene pair loci, including the proximally initiating upstream antisense transcript and downstream mRNA and the distally initiating upstream mRNA. To investigate cis-regulatory activity at such a locus, we focused on the stress-induced NAD(P)H dehydratase YKL151C and its downstream neighbor, the metabolic enzyme GPM1. Previous work has implicated the region between these genes in regulation of GPM1 expression; our mutation experiments established its function in rich medium as a repressor in cis of the distally initiating YKL151C sense RNA, and an activator of the proximally initiating YKL151C antisense RNA. Wild-type expression of all three transcripts required the transcription factor Gcr2. Thus, at this locus, the intergenic region serves as a focal point of regulatory input, driving antisense expression and mediating the coordinated regulation of YKL151C and GPM1. Together, our findings implicate transcription factors in the joint control of neighboring genes specialized to opposing conditions and the antisense transcripts expressed between them.
Affiliation(s)
- Yulia Mostovoy
  - Department of Molecular and Cell Biology, University of California, Berkeley, California; Present address: Cardiovascular Research Institute, University of California, San Francisco, CA
- Alexander Thiemicke
  - Department of Molecular and Cell Biology, University of California, Berkeley, California; Program in Molecular Medicine, Friedrich-Schiller-Universität, Jena, Germany; Present address: Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN
- Tiffany Y Hsu
  - Department of Molecular and Cell Biology, University of California, Berkeley, California; Present address: Graduate Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA
- Rachel B Brem
  - Department of Molecular and Cell Biology, University of California, Berkeley, California; Present address: Buck Institute for Research on Aging, Novato, CA
21
Lorent K, Gong W, Koo KA, Waisbourd-Zinman O, Karjoo S, Zhao X, Sealy I, Kettleborough RN, Stemple DL, Windsor PA, Whittaker SJ, Porter JR, Wells RG, Pack M. Identification of a plant isoflavonoid that causes biliary atresia. Sci Transl Med 2016; 7:286ra67. [PMID: 25947162 DOI: 10.1126/scitranslmed.aaa1652] [Citation(s) in RCA: 105] [Impact Index Per Article: 11.7] [Indexed: 12/17/2022]
Abstract
Biliary atresia (BA) is a rapidly progressive and destructive fibrotic disorder of unknown etiology affecting the extrahepatic biliary tree of neonates. Epidemiological studies suggest that an environmental factor, such as a virus or toxin, is the cause of the disease, although none have been definitively established. Several naturally occurring outbreaks of BA in Australian livestock have been associated with the ingestion of unusual plants by pregnant animals during drought conditions. We used a biliary secretion assay in zebrafish to isolate a previously undescribed isoflavonoid, biliatresone, from Dysphania species implicated in a recent BA outbreak. This compound caused selective destruction of the extrahepatic, but not intrahepatic, biliary system of larval zebrafish. A mutation that enhanced biliatresone toxicity mapped to a region of the zebrafish genome that has conserved synteny with an established human BA susceptibility locus. The toxin also caused loss of cilia in neonatal mouse extrahepatic cholangiocytes in culture and disrupted cell polarity and monolayer integrity in cholangiocyte spheroids. Together, these findings provide direct evidence that BA could be initiated by perinatal exposure to an environmental toxin.
Affiliation(s)
- Kristin Lorent
  - Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Weilong Gong
  - Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Kyung A Koo
  - Department of Biological Sciences, University of the Sciences, Philadelphia, PA 19104, USA
- Orith Waisbourd-Zinman
  - Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA. Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
- Sara Karjoo
  - Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Xiao Zhao
  - Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Ian Sealy
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Ross N Kettleborough
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Derek L Stemple
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Peter A Windsor
  - Faculty of Veterinary Science, University of Sydney, Camden, New South Wales 2570, Australia
- Stephen J Whittaker
  - Hume Livestock Health and Pest Authority, Albury, New South Wales 2640, Australia
- John R Porter
  - Department of Biological Sciences, University of the Sciences, Philadelphia, PA 19104, USA
- Rebecca G Wells
  - Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA. Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
- Michael Pack
  - Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA. Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
22
Aprea J, Calegari F. Long non-coding RNAs in corticogenesis: deciphering the non-coding code of the brain. EMBO J 2015; 34:2865-84. [PMID: 26516210 DOI: 10.15252/embj.201592655] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 07/24/2015] [Accepted: 10/05/2015] [Indexed: 01/17/2023]
Abstract
Evidence on the role of long non-coding (lnc) RNAs has been accumulating over decades, but it has been only recently that advances in sequencing technologies have allowed the field to fully appreciate their abundance and diversity. Despite this, only a handful of lncRNAs have been phenotypically or mechanistically studied. Moreover, novel lncRNAs and new classes of RNAs are being discovered at a growing pace, suggesting that this class of molecules may have functions as diverse as protein-coding genes. Interestingly, the brain is the organ where lncRNAs have the most peculiar features, including the highest number of expressed lncRNAs, the highest proportion of tissue-specific lncRNAs and the highest signals of evolutionary conservation. In this work, we critically review the current knowledge about the steps that have led to the identification of the non-coding transcriptome, including the general features of lncRNAs in different contexts in terms of their genomic organisation, evolutionary origin, patterns of expression, and function in the developing and adult mammalian brain.
Affiliation(s)
- Julieta Aprea
  - DFG-Research Center and Cluster of Excellence for Regenerative Therapies, Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Federico Calegari
  - DFG-Research Center and Cluster of Excellence for Regenerative Therapies, Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
23
Abstract
The world of primate genomics is expanding rapidly in new and exciting ways owing to lowered costs and new technologies in molecular methods and bioinformatics. The primate order is composed of 78 genera and 478 species, including human. Taxonomic inferences are complex and likely a consequence of ongoing hybridization, introgression, and reticulate evolution among closely related taxa. Recently, we applied large-scale sequencing methods and extensive taxon sampling to generate a highly resolved phylogeny that affirms, reforms, and extends previous depictions of primate speciation. The next stage of research uses this phylogeny as a foundation for investigating genome content, structure, and evolution across primates. Ongoing and future applications of a robust primate phylogeny are discussed, highlighting advancements in adaptive evolution of genes and genomes, taxonomy and conservation management of endangered species, next-generation genomic technologies, and biomedicine.
Affiliation(s)
- Jill Pecon-Slattery
  - Laboratory of Genomic Diversity, National Cancer Institute, Frederick, Maryland 21702; Current Affiliation: Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, Virginia 22630
24
The clustering of functionally related genes contributes to CNV-mediated disease. Genome Res 2015; 25:802-13. [PMID: 25887030 PMCID: PMC4448677 DOI: 10.1101/gr.184325.114] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 09/12/2014] [Accepted: 04/13/2015] [Indexed: 12/20/2022]
Abstract
Clusters of functionally related genes can be disrupted by a single copy number variant (CNV). We demonstrate that the simultaneous disruption of multiple functionally related genes is a frequent and significant characteristic of de novo CNVs in patients with developmental disorders (P = 1 × 10⁻³). Using three different functional networks, we identified unexpectedly large numbers of functionally related genes within de novo CNVs from two large independent cohorts of individuals with developmental disorders. The presence of multiple functionally related genes was a significant predictor of a CNV's pathogenicity when compared to CNVs from apparently healthy individuals and a better predictor than the presence of known disease or haploinsufficient genes for larger CNVs. The functionally related genes found in the de novo CNVs belonged to 70% of all clusters of functionally related genes found across the genome. De novo CNVs were more likely to affect functional clusters and affect them to a greater extent than benign CNVs (P = 6 × 10⁻⁴). Furthermore, such clusters of functionally related genes are phenotypically informative: Different patients possessing CNVs that affect the same cluster of functionally related genes exhibit more similar phenotypes than expected (P < 0.05). The spanning of multiple functionally similar genes by single CNVs contributes substantially to how these variants exert their pathogenic effects.
25
Farré M, Robinson TJ, Ruiz-Herrera A. An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity. Bioessays 2015; 37:479-88. [PMID: 25739389 DOI: 10.1002/bies.201400174] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 09/22/2014] [Revised: 02/12/2015] [Accepted: 02/13/2015] [Indexed: 12/23/2022]
Abstract
Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders.
Affiliation(s)
- Marta Farré
  - Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Campus UAB, Barcelona, Spain
26
Han MV. Characterizing gene movements between chromosomes in Drosophila. Fly (Austin) 2014; 6:121-5. [DOI: 10.4161/fly.20144] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/19/2022]
27
Lee YCG, Chang HH. The evolution and functional significance of nested gene structures in Drosophila melanogaster. Genome Biol Evol 2014; 5:1978-85. [PMID: 24084778 PMCID: PMC3814207 DOI: 10.1093/gbe/evt149] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/17/2022]
Abstract
Nearly 10% of the genes in the genome of Drosophila melanogaster are in nested structures, in which one gene is completely nested within the intron of another gene (nested and including gene, respectively). Even though the coding sequences and untranslated regions of these nested/including gene pairs do not overlap, their intimate structures and the possibility of shared regulatory sequences raise questions about the evolutionary forces governing the origination and subsequent functional and evolutionary impacts of these structures. In this study, we show that nested genes experience weaker evolutionary constraint, have faster rates of protein evolution, and are expressed in fewer tissues than other genes, while including genes show the opposite patterns. Surprisingly, despite completely overlapping with each other, nested and including genes are less likely to display correlated gene expression and biological function than the nearby yet nonoverlapping genes. Interestingly, significantly fewer nested genes are transcribed from the same strand as the including gene. We found that same-strand nested genes are more likely to be single-exon genes. In addition, same-strand including genes are less likely to have known lethal or sterile phenotypes than opposite-strand including genes only when the corresponding nested genes have introns. These results support our hypothesis that selection against potential erroneous mRNA splicing when nested and including genes are on the same strand plays an important role in the evolution of nested gene structures.
Affiliation(s)
- Yuh Chwen G Lee
  - Center for Population Biology and Department of Evolution and Ecology, University of California
28
Rubin AF, Green P. Expression-based segmentation of the Drosophila genome. BMC Genomics 2013; 14:812. [PMID: 24256206 PMCID: PMC3909303 DOI: 10.1186/1471-2164-14-812] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/27/2013] [Accepted: 11/18/2013] [Indexed: 01/22/2023]
Abstract
Background It is generally accepted that gene order in eukaryotes is nonrandom, with adjacent genes often sharing expression patterns across tissues, and that this organization may be important for gene regulation. Here we describe a novel method, based on an explicit probability model instead of correlation analysis, for identifying coordinately expressed gene clusters (‘coexpression segments’), apply it to Drosophila melanogaster, and look for epigenetic associations using publicly available data. Results We find that two-thirds of Drosophila genes fall into multigenic coexpression segments, and that such segments are of two main types, housekeeping and tissue-restricted. Consistent with correlation-based studies, we find that adjacent genes within the same segment tend to be physically closer to each other than to the adjacent genes in different segments, and that tissue-restricted segments are enriched for testis-expressed genes. Our segmentation pattern correlates with Hi-C based physical interaction domains, but segments are generally much smaller than domains. Intersegment regions (including those which do not correspond to physical domain boundaries) are enriched for insulator binding sites. Conclusions We describe a novel approach for identifying coexpression clusters that does not require arbitrary cutoff values or heuristics, and find that coexpression of adjacent genes is widespread in the Drosophila genome. Coexpression segments appear to reflect a level of regulatory organization related to, but below that of physical interaction domains, and depending in part on insulator binding.
Affiliation(s)
- Alan F Rubin
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
29
Genomic rearrangements and the evolution of clusters of locally adaptive loci. Proc Natl Acad Sci U S A 2013; 110:E1743-51. [PMID: 23610436 DOI: 10.1073/pnas.1219381110] [Citation(s) in RCA: 220] [Impact Index Per Article: 18.3] [Indexed: 12/28/2022]
Abstract
Numerous studies of ecological genetics have found that alleles contributing to local adaptation sometimes cluster together, forming "genomic islands of divergence." Divergence hitchhiking theory posits that these clusters evolve by the preferential establishment of tightly linked locally adapted mutations, because such linkage reduces the rate that recombination breaks up locally favorable combinations of alleles. Here, I use calculations based on previously developed analytical models of divergence hitchhiking to show that very few clustered mutations should be expected in a single bout of adaptation, relative to the number of unlinked mutations, suggesting that divergence hitchhiking theory alone may often be insufficient to explain empirical observations. Using individual-based simulations that allow for the transposition of a single genetic locus from one position on a chromosome to another, I then show that tight clustering of the loci involved in local adaptation tends to evolve on biologically realistic time scales. These results suggest that genomic rearrangements may often be an important component of local adaptation and the evolution of genomic islands of divergence. More generally, these results suggest that genomic architecture and functional neighborhoods of genes may be actively shaped by natural selection in heterogeneous environments. Because small-scale changes in gene order are relatively common in some taxa, comparative genomic studies could be coupled with studies of adaptation to explore how commonly such rearrangements are involved in local adaptation.
30
Aboukhalil R, Fendler B, Atwal GS. Kerfuffle: a web tool for multi-species gene colocalization analysis. BMC Bioinformatics 2013; 14:22. [PMID: 23327649 PMCID: PMC3598493 DOI: 10.1186/1471-2105-14-22] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 08/17/2012] [Accepted: 01/11/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND The evolutionary pressures that underlie the large-scale functional organization of the genome are not well understood in eukaryotes. Recent evidence suggests that functionally similar genes may colocalize (cluster) in the eukaryotic genome, suggesting the role of chromatin-level gene regulation in shaping the physical distribution of coordinated genes. However, few of the bioinformatic tools currently available allow for a systematic study of gene colocalization across several, evolutionarily distant species. Furthermore, most tools require the user to input manually curated lists of gene position information, DNA sequence or gene homology relations between species. With the growing number of sequenced genomes, there is a need to provide new comparative genomics tools that can address the analysis of multi-species gene colocalization. RESULTS Kerfuffle is a web tool designed to help discover, visualize, and quantify the physical organization of genomes by identifying significant gene colocalization and conservation across the assembled genomes of available species (currently up to 47, from humans to worms). Kerfuffle only requires the user to specify a list of human genes and the names of other species of interest. Without further input from the user, the software queries the e!Ensembl BioMart server to obtain positional information and discovers homology relations in all genes and species specified. Using this information, Kerfuffle performs a multi-species clustering analysis, presents downloadable lists of clustered genes, performs Monte Carlo statistical significance calculations, estimates how conserved gene clusters are across species, plots histograms and interactive graphs, allows users to save their queries, and generates a downloadable visualization of the clusters using the Circos software. 
These analyses may be used to further explore the functional roles of gene clusters by interrogating the enriched molecular pathways associated with each cluster. CONCLUSIONS Kerfuffle is a new, easy-to-use and publicly available tool to aid our understanding of functional genomics and comparative genomics. This software allows for flexibility and quick investigations of a user-defined set of genes, and the results may be saved online for further analysis. Kerfuffle is freely available at http://atwallab.org/kerfuffle, is implemented in JavaScript (using jQuery and jsCharts libraries) and PHP 5.2, runs on an Apache server, and stores data in flat files and an SQLite database.
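The Monte Carlo significance calculation mentioned in this abstract can be illustrated with a minimal sketch. It is not Kerfuffle's implementation: the clustering statistic (number of adjacent gene pairs within `max_gap` base pairs), the single-chromosome coordinate model, and the function names `count_close_pairs` and `colocalization_p_value` are assumptions made for illustration only.

```python
import random


def count_close_pairs(positions, max_gap):
    """Count adjacent pairs (after sorting) that lie within max_gap of each other."""
    ordered = sorted(positions)
    return sum(1 for a, b in zip(ordered, ordered[1:]) if b - a <= max_gap)


def colocalization_p_value(query_positions, all_positions, max_gap,
                           n_trials=1000, seed=0):
    """Estimate how often random gene sets cluster at least as much as the query.

    query_positions: start coordinates of the genes of interest (hypothetical input)
    all_positions:   start coordinates of every gene on the same chromosome
    Returns (observed statistic, empirical p-value).
    """
    rng = random.Random(seed)
    observed = count_close_pairs(query_positions, max_gap)
    set_size = len(query_positions)
    hits = 0
    for _ in range(n_trials):
        # Null model: draw the same number of genes uniformly at random.
        sample = rng.sample(all_positions, set_size)
        if count_close_pairs(sample, max_gap) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_trials + 1)


if __name__ == "__main__":
    genome = list(range(0, 1_000_000, 1000))          # toy chromosome, 1000 genes
    query = [5000, 6000, 7000, 8000, 500000]          # four clustered genes, one distant
    print(colocalization_p_value(query, genome, max_gap=2000))
```

A real analysis would also need to handle multiple chromosomes and homology mapping between species, which this sketch leaves out.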
|
31
|
Andolfo G, Sanseverino W, Rombauts S, Van de Peer Y, Bradeen JM, Carputo D, Frusciante L, Ercolano MR. Overview of tomato (Solanum lycopersicum) candidate pathogen recognition genes reveals important Solanum R locus dynamics. The New Phytologist 2013; 197:223-237. [PMID: 23163550 DOI: 10.1111/j.1469-8137.2012.04380.x] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.1] [Received: 06/04/2012] [Accepted: 09/11/2012] [Indexed: 05/05/2023]
Abstract
To investigate the genome-wide spatial arrangement of R loci, a complete catalogue of tomato (Solanum lycopersicum) and potato (Solanum tuberosum) nucleotide-binding site (NBS), receptor-like protein (RLP) and receptor-like kinase (RLK) gene repertories was generated. Candidate pathogen recognition genes were characterized with respect to structural diversity, phylogenetic relationships and chromosomal distribution. NBS genes frequently occur in clusters of related gene copies that also include RLP or RLK genes. This scenario is compatible with the existence of selective pressures optimizing coordinated transcription. A number of duplication events associated with lineage-specific evolution were discovered. These findings suggest that different evolutionary mechanisms shaped pathogen recognition gene cluster architecture to expand and to modulate the defence repertoire. Analysis of pathogen recognition gene clusters associated with documented resistance function allowed the identification of adaptive divergence events and the reconstruction of the evolution history of these loci. Differences in candidate pathogen recognition gene number and organization were found between tomato and potato. Most candidate pathogen recognition gene orthologues were distributed at less than perfectly matching positions, suggesting an ongoing lineage-specific rearrangement. Indeed, a local expansion of Toll/Interleukin-1 receptor (TIR)-NBS-leucine-rich repeat (LRR) (TNL) genes in the potato genome was evident. Taken together, these findings have implications for improved understanding of the mechanisms of molecular adaptive selection at Solanum R loci.
Affiliation(s)
- G Andolfo
- Department of Soil, Plant, Environmental and Animal Production Sciences, University of Naples 'Federico II', Via Universita 100, 80055, Portici, Italy
| | - W Sanseverino
- Department of Soil, Plant, Environmental and Animal Production Sciences, University of Naples 'Federico II', Via Universita 100, 80055, Portici, Italy
| | - S Rombauts
- Department of Plant Systems Biology, VIB, 9052, Gent, Belgium
| | - Y Van de Peer
- Department of Plant Systems Biology, VIB, 9052, Gent, Belgium
| | - J M Bradeen
- Department of Plant Pathology, University of Minnesota, 495 Borlaug Hall/1991 Upper Buford Circle, St. Paul, MN, 55108, USA
| | - D Carputo
- Department of Soil, Plant, Environmental and Animal Production Sciences, University of Naples 'Federico II', Via Universita 100, 80055, Portici, Italy
| | - L Frusciante
- Department of Soil, Plant, Environmental and Animal Production Sciences, University of Naples 'Federico II', Via Universita 100, 80055, Portici, Italy
| | - M R Ercolano
- Department of Soil, Plant, Environmental and Animal Production Sciences, University of Naples 'Federico II', Via Universita 100, 80055, Portici, Italy
|
32
|
Doelken SC, Köhler S, Mungall CJ, Gkoutos GV, Ruef BJ, Smith C, Smedley D, Bauer S, Klopocki E, Schofield PN, Westerfield M, Robinson PN, Lewis SE. Phenotypic overlap in the contribution of individual genes to CNV pathogenicity revealed by cross-species computational analysis of single-gene mutations in humans, mice and zebrafish. Dis Model Mech 2012; 6:358-72. [PMID: 23104991 PMCID: PMC3597018 DOI: 10.1242/dmm.010322] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Indexed: 12/16/2022] Open
Abstract
Numerous disease syndromes are associated with regions of copy number variation (CNV) in the human genome and, in most cases, the pathogenicity of the CNV is thought to be related to altered dosage of the genes contained within the affected segment. However, establishing the contribution of individual genes to the overall pathogenicity of CNV syndromes is difficult and often relies on the identification of potential candidates through manual searches of the literature and online resources. We describe here the development of a computational framework to comprehensively search phenotypic information from model organisms and single-gene human hereditary disorders, and thus speed the interpretation of the complex phenotypes of CNV disorders. There are currently more than 5000 human genes about which nothing is known phenotypically but for which detailed phenotypic information for the mouse and/or zebrafish orthologs is available. Here, we present an ontology-based approach to identify similarities between human disease manifestations and the mutational phenotypes in characterized model organism genes; this approach can therefore be used even in cases where there is little or no information about the function of the human genes. We applied this algorithm to detect candidate genes for 27 recurrent CNV disorders and identified 802 gene-phenotype associations, approximately half of which involved genes that were previously reported to be associated with individual phenotypic features and half of which were novel candidates. A total of 431 associations were made solely on the basis of model organism phenotype data. Additionally, we observed a striking, statistically significant tendency for individual disease phenotypes to be associated with multiple genes located within a single CNV region, a phenomenon that we denote as pheno-clustering. 
Many of the clusters also display statistically significant similarities in protein function or vicinity within the protein-protein interaction network. Our results provide a basis for understanding previously un-interpretable genotype-phenotype correlations in pathogenic CNVs and for mobilizing the large amount of model organism phenotype data to provide insights into human genetic disorders.
Affiliation(s)
- Sandra C Doelken
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
|
33
|
Andrews T, Webber C. Characterizing epistatic hotspots of human disease. BMC Proc 2012. [PMCID: PMC3467678 DOI: 10.1186/1753-6561-6-s6-o12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
34
|
Lemay DG, Martin WF, Hinrichs AS, Rijnkels M, German JB, Korf I, Pollard KS. G-NEST: a gene neighborhood scoring tool to identify co-conserved, co-expressed genes. BMC Bioinformatics 2012; 13:253. [PMID: 23020263 PMCID: PMC3575404 DOI: 10.1186/1471-2105-13-253] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 03/31/2012] [Accepted: 09/23/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In previous studies, gene neighborhoods (spatial clusters of co-expressed genes in the genome) have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Scoring Tool (G-NEST) which combines genomic location, gene expression, and evolutionary sequence conservation data to score putative gene neighborhoods across all possible window sizes simultaneously. RESULTS Using G-NEST on atlases of mouse and human tissue expression data, we found that large neighborhoods of ten or more genes are extremely rare in mammalian genomes. When they do occur, neighborhoods are typically composed of families of related genes. Both the highest scoring and the largest neighborhoods in mammalian genomes are formed by tandem gene duplication. Mammalian gene neighborhoods contain highly and variably expressed genes. Co-localized noisy gene pairs exhibit lower evolutionary conservation of their adjacent genome locations, suggesting that their shared transcriptional background may be disadvantageous. Genes that are essential to mammalian survival and reproduction are less likely to occur in neighborhoods, although neighborhoods are enriched with genes that function in mitosis. We also found that gene orientation and protein-protein interactions are partially responsible for maintenance of gene neighborhoods. CONCLUSIONS Our experiments using G-NEST confirm that tandem gene duplication is the primary driver of non-random gene order in mammalian genomes. Non-essentiality, co-functionality, gene orientation, and protein-protein interactions are additional forces that maintain gene neighborhoods, especially those formed by tandem duplicates. We expect G-NEST to be useful for other applications such as the identification of core regulatory modules, common transcriptional backgrounds, and chromatin domains. 
The software is available at http://docpollard.org/software.html.
Affiliation(s)
- Danielle G Lemay
- Genome Center, University of California Davis, 451 Health Science Dr, Davis, CA, 95616, United States of America.
|
35
|
Zou X, Suppanz I, Raman H, Hou J, Wang J, Long Y, Jung C, Meng J. Comparative analysis of FLC homologues in Brassicaceae provides insight into their role in the evolution of oilseed rape. PLoS One 2012; 7:e45751. [PMID: 23029223 PMCID: PMC3459951 DOI: 10.1371/journal.pone.0045751] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 06/03/2012] [Accepted: 08/24/2012] [Indexed: 11/18/2022] Open
Abstract
We identified nine FLOWERING LOCUS C homologues (BnFLC) in Brassica napus and found that the coding sequences of all BnFLCs were relatively conserved but the intronic and promoter regions were more divergent. The BnFLC homologues were mapped to six of 19 chromosomes. All of the BnFLC homologues were located in the collinear region of FLC in the Arabidopsis genome except BnFLC.A3b and BnFLC.C3b, which were mapped to noncollinear regions of chromosome A3 and C3, respectively. Four of the homologues were associated significantly with quantitative trait loci for flowering time in two mapping populations. The BnFLC homologues showed distinct expression patterns in vegetative and reproductive organs, and at different developmental stages. BnFLC.A3b was differentially expressed between the winter-type and semi-winter-type cultivars. Microsynteny analysis indicated that BnFLC.A3b might have been translocated to the present segment in a cluster with other flowering-time regulators, such as a homologue of FRIGIDA in Arabidopsis. This cluster of flowering-time genes might have conferred a selective advantage to Brassica species in terms of increased adaptability to diverse environments during their evolution and domestication process.
Affiliation(s)
- Xiaoxiao Zou
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
| | - Ida Suppanz
- Plant Breeding Institute, Christian-Albrechts-University of Kiel, Kiel, Germany
| | - Harsh Raman
- EH Graham Centre for Agricultural Innovation (an alliance between the Charles Sturt University and New South Wales Department of Primary Industries), Wagga Wagga Agricultural Institute, Wagga Wagga, New South Wales, Australia
| | - Jinna Hou
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
| | - Jing Wang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
| | - Yan Long
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
| | - Christian Jung
- Plant Breeding Institute, Christian-Albrechts-University of Kiel, Kiel, Germany
| | - Jinling Meng
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
|
36
|
Boldogköi Z. Transcriptional interference networks coordinate the expression of functionally related genes clustered in the same genomic loci. Front Genet 2012; 3:122. [PMID: 22783276 PMCID: PMC3389743 DOI: 10.3389/fgene.2012.00122] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 05/11/2012] [Accepted: 06/15/2012] [Indexed: 11/25/2022] Open
Abstract
The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too.
Affiliation(s)
- Zsolt Boldogköi
- Department of Medical Biology, Faculty of Medicine, University of Szeged, Szeged, Hungary
|
37
|
Pavlidis P, Jensen JD, Stephan W, Stamatakis A. A critical assessment of storytelling: gene ontology categories and the importance of validating genomic scans. Mol Biol Evol 2012; 29:3237-48. [PMID: 22617950 DOI: 10.1093/molbev/mss136] [Citation(s) in RCA: 159] [Impact Index Per Article: 12.2] [Indexed: 12/27/2022] Open
Abstract
In the age of whole-genome population genetics, so-called genomic scan studies often conclude with a long list of putatively selected loci. These lists are then further scrutinized to annotate these regions by gene function, corresponding biological processes, expression levels, or gene networks. Such annotations are often used to assess and/or verify the validity of the genome scan and the statistical methods that have been used to perform the analyses. Furthermore, these results are frequently considered to validate "true-positives" if the identified regions make biological sense a posteriori. Here, we show that this approach can be potentially misleading. By simulating neutral evolutionary histories, we demonstrate that it is possible not only to obtain an extremely high false-positive rate but also to make biological sense out of the false-positives and construct a sensible biological narrative. Results are compared with a recent polymorphism data set from Drosophila melanogaster.
Affiliation(s)
- Pavlos Pavlidis
- The Exelixis Lab, Scientific Computing Group, Heidelberg Institute for Theoretical Studies (HITS gGmbH), Heidelberg, Germany.
|
38
|
Walker MB, King BL, Paigen K. Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species. PLoS One 2012; 7:e35274. [PMID: 22563380 PMCID: PMC3338513 DOI: 10.1371/journal.pone.0035274] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/04/2011] [Accepted: 03/14/2012] [Indexed: 11/22/2022] Open
Abstract
Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent [1], [2], [3]. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity. Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
Affiliation(s)
| | - Benjamin L. King
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
| | - Kenneth Paigen
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
|
39
|
Wada M, Takahashi H, Altaf-Ul-Amin M, Nakamura K, Hirai MY, Ohta D, Kanaya S. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes. Gene 2012; 503:56-64. [PMID: 22561113 DOI: 10.1016/j.gene.2012.04.043] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 07/19/2011] [Revised: 03/19/2012] [Accepted: 04/17/2012] [Indexed: 11/24/2022]
Abstract
Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. 
Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system.
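The comparison described in this abstract (Pearson correlation of neighboring genes versus randomly selected gene pairs, with FDR control) can be sketched roughly as follows. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg step here is an assumption, as are the empirical null built from random pairs and the function names `pearson`, `neighbor_pair_p_values`, and `benjamini_hochberg`.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation undefined, treat as 0
    return sxy / math.sqrt(sxx * syy)


def neighbor_pair_p_values(expression, n_random=2000, seed=0):
    """Empirical p-value for each adjacent gene pair.

    expression: list of expression profiles, ordered by chromosomal position.
    The null distribution is built from randomly drawn gene pairs.
    """
    rng = random.Random(seed)
    n = len(expression)
    null = []
    for _ in range(n_random):
        i, j = rng.sample(range(n), 2)
        null.append(pearson(expression[i], expression[j]))
    p_values = []
    for i in range(n - 1):
        r = pearson(expression[i], expression[i + 1])
        exceed = sum(1 for v in null if v >= r)
        p_values.append((exceed + 1) / (n_random + 1))
    return p_values


def benjamini_hochberg(p_values, fdr=0.1):
    """Return a list of booleans marking p-values significant at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant
```

Runs of consecutive significant pairs could then be merged into candidate clusters; that merging step is left out here.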
|
40
|
Buggs R, Chamala S, Wu W, Tate J, Schnable P, Soltis D, Soltis P, Barbazuk W. Rapid, Repeated, and Clustered Loss of Duplicate Genes in Allopolyploid Plant Populations of Independent Origin. Curr Biol 2012; 22:248-52. [DOI: 10.1016/j.cub.2011.12.027] [Citation(s) in RCA: 129] [Impact Index Per Article: 9.9] [Received: 11/09/2011] [Revised: 12/09/2011] [Accepted: 12/09/2011] [Indexed: 11/29/2022]
|
41
|
Hopkinson BM, Barbeau KA. Iron transporters in marine prokaryotic genomes and metagenomes. Environ Microbiol 2011; 14:114-28. [DOI: 10.1111/j.1462-2920.2011.02539.x] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.6] [Indexed: 11/28/2022]
|
42
|
Lai AG, Denton-Giles M, Mueller-Roeber B, Schippers JHM, Dijkwel PP. Positional information resolves structural variations and uncovers an evolutionarily divergent genetic locus in accessions of Arabidopsis thaliana. Genome Biol Evol 2011; 3:627-40. [PMID: 21622917 PMCID: PMC3157834 DOI: 10.1093/gbe/evr038] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022] Open
Abstract
Genome sequencing of closely related individuals has yielded valuable insights that link genome evolution to phenotypic variations. However, advancement in sequencing technology has also led to an escalation in the number of poor quality–drafted genomes assembled based on reference genomes that can have highly divergent or haplotypic regions. The self-fertilizing nature of Arabidopsis thaliana poses an advantage to sequencing projects because its genome is mostly homozygous. To determine the accuracy of an Arabidopsis drafted genome in less conserved regions, we performed a resequencing experiment on a ∼371-kb genomic interval in the Landsberg erecta (Ler-0) accession. We identified novel structural variations (SVs) between Ler-0 and the reference accession Col-0 using a long-range polymerase chain reaction approach to generate an Illumina data set that has positional information, that is, a data set with reads that map to a known location. Positional information is important for accurate genome assembly and the resolution of SVs particularly in highly duplicated or repetitive regions. Sixty-one regions with misassembly signatures were identified from the Ler-0 draft, suggesting the presence of novel SVs that are not represented in the draft sequence. Sixty of those were resolved by iterative mapping using our data set. Fifteen large indels (>100 bp) identified from this study were found to be located either within protein-coding regions or upstream regulatory regions, suggesting the formation of novel alleles or altered regulation of existing genes in Ler-0. We propose future genome-sequencing experiments to follow a clone-based approach that incorporates positional information to ultimately reveal haplotype-specific differences between accessions.
Affiliation(s)
- Alvina G Lai
- Institute of Molecular BioSciences, Massey University, Private Bag 11-222, Palmerston North 4442, New Zealand
|
43
|
Alloza E, Al-Shahrour F, Cigudosa JC, Dopazo J. A large scale survey reveals that chromosomal copy-number alterations significantly affect gene modules involved in cancer initiation and progression. BMC Med Genomics 2011; 4:37. [PMID: 21548942 PMCID: PMC3112060 DOI: 10.1186/1755-8794-4-37] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/02/2010] [Accepted: 05/06/2011] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Recent observations point towards the existence of a large number of neighborhoods composed of functionally-related gene modules that lie together in the genome. This local component in the distribution of functionality across chromosomes probably affects chromosomal architecture itself by limiting the ways in which genes can be arranged and distributed across the genome. As a direct consequence, it is presumable that diseases such as cancer, harboring DNA copy number alterations (CNAs), will have a symptomatology strongly dependent on modules of functionally-related genes rather than on a unique "important" gene. METHODS We carried out a systematic analysis of more than 140,000 observations of CNAs in cancers and searched for enrichments in gene functional modules associated with high frequencies of losses or gains. RESULTS The analysis of CNAs in cancers clearly demonstrates the existence of a significant pattern of loss of gene modules functionally related to cancer initiation and progression along with the amplification of modules of genes related to unspecific defense against xenobiotics (probably chemotherapeutical agents). With the extension of this analysis to an Array-CGH dataset (glioblastomas) from The Cancer Genome Atlas we demonstrate the validity of this approach to investigate the functional impact of CNAs. CONCLUSIONS The presented results indicate promising clinical and therapeutic implications. Our findings also point directly to the necessity of adopting a function-centric, rather than a gene-centric, view in the understanding of phenotypes or diseases harboring CNAs.
Affiliation(s)
- Eva Alloza
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
| | - Fátima Al-Shahrour
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- Broad Institute, 7 Cambridge Center, Cambridge, MA 02142, USA
| | - Juan C Cigudosa
- CIBER de Enfermedades Raras (CIBERER), ISCIII, CIPF, Valencia, Spain
- Molecular Cytogenetics Group. Centro Nacional de Investigaciones Oncologicas (CNIO), Madrid, Spain
| | - Joaquín Dopazo
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- CIBER de Enfermedades Raras (CIBERER), ISCIII, CIPF, Valencia, Spain
- Functional Genomics Node (INB), CIPF, Valencia, Spain
|
44
|
Weber CC, Hurst LD. Support for multiple classes of local expression clusters in Drosophila melanogaster, but no evidence for gene order conservation. Genome Biol 2011; 12:R23. [PMID: 21414197 PMCID: PMC3129673 DOI: 10.1186/gb-2011-12-3-r23] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 01/14/2011] [Revised: 03/04/2011] [Accepted: 03/17/2011] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Gene order in eukaryotic genomes is not random, with genes with similar expression profiles tending to cluster. In yeasts, the model taxon for gene order analysis, such syntenic clusters of non-homologous genes tend to be conserved over evolutionary time. Whether similar clusters show gene order conservation in other lineages is, however, undecided. Here, we examine this issue in Drosophila melanogaster using high-resolution chromosome rearrangement data. RESULTS We show that D. melanogaster has at least three classes of expression clusters: first, as observed in mammals, large clusters of functionally unrelated housekeeping genes; second, small clusters of functionally related highly co-expressed genes; and finally, as previously defined by Spellman and Rubin, larger domains of co-expressed but functionally unrelated genes. The latter are, however, not independent of the small co-expression clusters and likely reflect a methodological artifact. While the small co-expression and housekeeping/essential gene clusters resemble those observed in yeast, in contrast to yeast, we see no evidence that any of the three cluster types are preserved as synteny blocks. If anything, adjacent co-expressed genes are more likely to become rearranged than expected. Again in contrast to yeast, in D. melanogaster, gene pairs with short intergene distance or in divergent orientations tend to have higher rearrangement rates. These findings are consistent with co-expression being partly due to shared chromatin environment. CONCLUSIONS We conclude that, while similar in terms of cluster types, gene order evolution has strikingly different patterns in yeasts and in D. melanogaster, although recombination is associated with gene order rearrangement in both.
Affiliation(s)
- Claudia C Weber
- Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
|