1
Mascher M, Marone MP, Schreiber M, Stein N. Are cereal grasses a single genetic system? Nature Plants 2024; 10:719-731. [PMID: 38605239 DOI: 10.1038/s41477-024-01674-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 03/17/2024] [Indexed: 04/13/2024]
Abstract
In 1993, a passionate and provocative call to arms urged cereal researchers to consider the taxon they study as a single genetic system and collaborate with each other. Since then, that group of scientists has seen their discipline blossom. In an attempt to understand what unity of genetic systems means and how the notion was borne out by later research, we survey the progress and prospects of cereal genomics: sequence assemblies, population-scale sequencing, resistance gene cloning and domestication genetics. Gene order may not be as extraordinarily well conserved in the grasses as once thought. Still, several recurring themes have emerged. The same ancestral molecular pathways defining plant architecture have been co-opted in the evolution of different cereal crops. Such genetic convergence as much as cross-fertilization of ideas between cereal geneticists has led to a rich harvest of genes that, it is hoped, will lead to improved varieties.
Affiliation(s)
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Marina Püpke Marone
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany
- Mona Schreiber
  - University of Marburg, Department of Biology, Marburg, Germany
- Nils Stein
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany
  - Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
2
Drown MK, Crawford DL, Oleksiak MF. Transcriptomic analysis provides insights into molecular mechanisms of thermal physiology. BMC Genomics 2022; 23:421. [PMID: 35659182 PMCID: PMC9167525 DOI: 10.1186/s12864-022-08653-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/21/2022] [Accepted: 05/18/2022] [Indexed: 11/15/2022] Open
Abstract
Physiological trait variation underlies health, responses to global climate change, and ecological performance. Yet, most physiological traits are complex, and we have little understanding of the genes and genomic architectures that define their variation. To provide insight into the genetic architecture of physiological processes, we related physiological traits to heart and brain mRNA expression using a weighted gene co-expression network analysis. mRNA expression was used to explain variation in six physiological traits (whole animal metabolism (WAM), critical thermal maximum (CTmax), and four substrate-specific cardiac metabolic rates (CaM)) under 12 °C and 28 °C acclimation conditions. Notably, the physiological trait variations among the three geographically close (within 15 km) and genetically similar F. heteroclitus populations are similar to those found among 77 aquatic species spanning 15–20° of latitude (~2,000 km). These large physiological trait variations among genetically similar individuals provide a powerful approach to determine the relationship between mRNA expression and heritable fitness-related traits unconfounded by interspecific differences. Expression patterns explained up to 82% of metabolic trait variation and were enriched for multiple signaling pathways known to impact metabolic and thermal tolerance (e.g., AMPK, PPAR, mTOR, FoxO, and MAPK) but also contained several unexpected pathways (e.g., apoptosis, cellular senescence), suggesting that physiological trait variation is affected by many diverse genes.
3
Palazzo AF, Kejiou NS. Non-Darwinian Molecular Biology. Front Genet 2022; 13:831068. [PMID: 35251134 PMCID: PMC8888898 DOI: 10.3389/fgene.2022.831068] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/07/2021] [Accepted: 01/24/2022] [Indexed: 12/14/2022] Open
Abstract
With the discovery of the double helical structure of DNA, a shift occurred in how biologists investigated questions surrounding cellular processes, such as protein synthesis. Instead of viewing biological activity through the lens of chemical reactions, this new field used biological information to gain a new profound view of how biological systems work. Molecular biologists asked new types of questions that would have been inconceivable to the older generation of researchers, such as how cellular machineries convert inherited biological information into functional molecules like proteins. This new focus on biological information also gave molecular biologists a way to link their findings to concepts developed by genetics and the modern synthesis. However, by the late 1960s this all changed. Elevated rates of mutation, unsustainable genetic loads, and high levels of variation in populations challenged Darwinian evolution, a central tenet of the modern synthesis, where adaptation was the main driver of evolutionary change. Building on these findings, Motoo Kimura advanced the neutral theory of molecular evolution, which advocates that selection in multicellular eukaryotes is weak and that most genomic changes are neutral and due to random drift. This was further elaborated by Jack King and Thomas Jukes, in their paper “Non-Darwinian Evolution”, where they pointed out that the observed changes seen in proteins and the types of polymorphisms observed in populations only become understandable when we take into account biochemistry and Kimura’s new theory. Fifty years later, most molecular biologists remain unaware of these fundamental advances. Their adaptationist viewpoint fails to explain data collected from new powerful technologies which can detect exceedingly rare biochemical events. For example, high throughput sequencing routinely detects RNA transcripts that are produced from almost the entire genome yet are present at less than one copy per thousand cells and appear to lack any function. Molecular biologists must now reincorporate ideas from classical biochemistry and absorb modern concepts from molecular evolution, to craft a new lens through which they can evaluate the functionality of transcriptional units, and make sense of our messy, intricate, and complicated genome.
4
Fagundes NJ, Bisso-Machado R, Figueiredo PI, Varal M, Zani AL. OUP accepted manuscript. Genome Biol Evol 2022; 14:6583081. [PMID: 35535669 PMCID: PMC9086759 DOI: 10.1093/gbe/evac055] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Accepted: 04/16/2022] [Indexed: 11/12/2022] Open
Abstract
“Junk DNA” is a popular yet controversial concept that states that organisms carry in their genomes DNA that has no positive impact on their fitness. Nonetheless, biochemical functions have been identified for an increasing fraction of DNA elements traditionally seen as “Junk DNA”. These findings have been interpreted as fundamentally undermining the “Junk DNA” concept. Here, we reinforce previous arguments that this interpretation relies on an inadequate concept of biological function that does not consider the selected effect of a given genomic structure, which is central to the “Junk DNA” concept. Next, we suggest that another (though ignored) confounding factor is that the discussion about biological functions includes two different dimensions: a horizontal, ecological dimension that reflects how a given genomic element affects fitness in a specific time, and a vertical, temporal dimension that reflects how a given genomic element persisted along time. We suggest that “Junk DNA” should be used exclusively relative to the horizontal dimension, while for the vertical dimension, we propose a new term, “Spam DNA”, that reflects the fact that a given genomic element may persist in the genome even if not selected for on their origin. Importantly, these concepts are complementary. An element can be both “Spam DNA” and “Junk DNA”, and “Spam DNA” can also be recruited to perform evolved biological functions, as illustrated in processes of exaptation or constructive neutral evolution.
Affiliation(s)
- Rafael Bisso-Machado
  - Postgraduate Program in Genetics and Molecular Biology, Institute of Biosciences, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
- Pedro I.C.C. Figueiredo
  - Postgraduate Program in Genetics and Molecular Biology, Institute of Biosciences, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
- Maikel Varal
  - Postgraduate Program in Genetics and Molecular Biology, Institute of Biosciences, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
- André L.S. Zani
  - Postgraduate Program in Genetics and Molecular Biology, Institute of Biosciences, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
5
Akhlaghpour H. An RNA-Based Theory of Natural Universal Computation. J Theor Biol 2021; 537:110984. [PMID: 34979104 DOI: 10.1016/j.jtbi.2021.110984] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/05/2020] [Revised: 09/30/2021] [Accepted: 12/07/2021] [Indexed: 12/15/2022]
Abstract
Life is confronted with computation problems in a variety of domains including animal behavior, single-cell behavior, and embryonic development. Yet we currently do not know of a naturally existing biological system that is capable of universal computation, i.e., Turing-equivalent in scope. Generic finite-dimensional dynamical systems (which encompass most models of neural networks, intracellular signaling cascades, and gene regulatory networks) fall short of universal computation, but are assumed to be capable of explaining cognition and development. I present a class of models that bridge two concepts from distant fields: combinatory logic (or, equivalently, lambda calculus) and RNA molecular biology. A set of basic RNA editing rules can make it possible to compute any computable function with identical algorithmic complexity to that of Turing machines. The models do not assume extraordinarily complex molecular machinery or any processes that radically differ from what we already know to occur in cells. Distinct independent enzymes can mediate each of the rules and RNA molecules solve the problem of parenthesis matching through their secondary structure. In the most plausible of these models all of the editing rules can be implemented with merely cleavage and ligation operations at fixed positions relative to predefined motifs. This demonstrates that universal computation is well within the reach of molecular biology. It is therefore reasonable to assume that life has evolved - or possibly began with - a universal computer that yet remains to be discovered. The variety of seemingly unrelated computational problems across many scales can potentially be solved using the same RNA-based computation system. Experimental validation of this theory may immensely impact our understanding of memory, cognition, development, disease, evolution, and the early stages of life.
Affiliation(s)
- Hessameddin Akhlaghpour
  - Laboratory of Integrative Brain Function, The Rockefeller University, New York, NY, 10065, USA
6
Mortimer K, Fitzhugh K, dos Brasil AC, Lana P. Who's who in Magelona: phylogenetic hypotheses under Magelonidae Cunningham & Ramage, 1888 (Annelida: Polychaeta). PeerJ 2021; 9:e11993. [PMID: 35070516 PMCID: PMC8759375 DOI: 10.7717/peerj.11993] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/20/2021] [Accepted: 07/27/2021] [Indexed: 11/21/2022] Open
Abstract
Known as shovel head worms, members of Magelonidae comprise a group of polychaetes readily recognised by the uniquely shaped, dorso-ventrally flattened prostomium and paired ventro-laterally inserted papillated palps. The present study is the first published account of inferences of phylogenetic hypotheses within Magelonidae. Members of 72 species of Magelona and two species of Octomagelona were included, with outgroups including members of one species of Chaetopteridae and four of Spionidae. The phylogenetic inferences were performed to causally account for 176 characters distributed among 79 subjects, and produced 2,417,600 cladograms, each with 404 steps. A formal definition of Magelonidae is provided, represented by a composite phylogenetic hypothesis explaining seven synapomorphies: shovel-shaped prostomium, prostomial ridges, absence of nuchal organs, ventral insertion of palps and their papillation, presence of a burrowing organ, and unique body regionation. Octomagelona is synonymised with Magelona due to the latter being paraphyletic relative to the former. The consequence is that Magelonidae is monotypic, such that Magelona cannot be formally defined as associated with any phylogenetic hypotheses. As such, the latter name is an empirically empty placeholder, but because of the binomial name requirement mandated by the International Code of Zoological Nomenclature, the definition is identical to that of Magelonidae. Several key features for future descriptions are suggested: prostomial dimensions, presence/absence of prostomial horns, morphology of anterior lamellae, presence/absence of specialised chaetae, and lateral abdominal pouches. Additionally, great care must be taken to fully describe and illustrate all thoracic chaetigers in descriptions.
Affiliation(s)
- Kate Mortimer
  - Natural Sciences, Amgueddfa Cymru–National Museum Wales, Cardiff, Wales, United Kingdom
- Kirk Fitzhugh
  - Natural History Museum of Los Angeles County, Los Angeles, CA, United States of America
- Ana Claudia dos Brasil
  - Departamento de Biologia Animal, Instituto de Ciências Biológicas e da Saúde, Universidade Federal Rural do Rio de Janeiro, Seropédica, Rio de Janeiro, Brazil
- Paulo Lana
  - Centro de Estudos do Mar, Universidade Federal do Paraná, Pontal do Sul, Paraná, Brazil
7
Linquist S, Fullerton B. Transposon dynamics and the epigenetic switch hypothesis. Theoretical Medicine and Bioethics 2021; 42:137-154. [PMID: 34919173 PMCID: PMC8938347 DOI: 10.1007/s11017-021-09548-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/15/2021] [Indexed: 06/14/2023]
Abstract
The recent explosion of interest in epigenetics is often portrayed as the dawning of a scientific revolution that promises to transform biomedical science along with developmental and evolutionary biology. Much of this enthusiasm surrounds what we call the epigenetic switch hypothesis, which regards certain examples of epigenetic inheritance as an adaptive organismal response to environmental change. This interpretation overlooks an alternative explanation in terms of coevolutionary dynamics between parasitic transposons and the host genome. This raises a question about whether epigenetics researchers tend to overlook transposon dynamics more generally. To address this question, we surveyed a large sample of scientific publications on the topics of epigenetics and transposons over the past fifty years. We found that enthusiasm for epigenetics is often inversely related to interest in transposon dynamics across the four disciplines we examined. Most surprising was a declining interest in transposons within biomedical science and cellular and molecular biology over the past two decades. Also notable was a delayed and relatively muted enthusiasm for epigenetics within evolutionary biology. An analysis of scientific abstracts from the past twenty-five years further reveals systematic differences among disciplines in their uses of the term epigenetic, especially with respect to heritability commitments and functional interpretations. Taken together, these results paint a nuanced picture of the rise of epigenetics and the possible neglect of transposon dynamics, especially among biomedical scientists.
Affiliation(s)
- Stefan Linquist
  - Department of Philosophy, University of Guelph, Guelph, ON, Canada
- Brady Fullerton
  - Department of Philosophy, University of Guelph, Guelph, ON, Canada
8
Brunet TDP, Doolittle WF, Bielawski JP. The role of purifying selection in the origin and maintenance of complex function. Studies in History and Philosophy of Science 2021; 87:125-135. [PMID: 34111815 DOI: 10.1016/j.shpsa.2021.03.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/26/2021] [Accepted: 03/18/2021] [Indexed: 06/12/2023]
Abstract
Fitness contribution alone should not be the criterion of 'function' in molecular biology and genomics. Disagreement over the use of 'function' in molecular biology and genomics is still with us, almost eight years after publicity surrounding the Encyclopedia of DNA Elements project claimed that 80.4% of the human genome comprises "functional elements". Recent approaches attempt to resolve or reformulate this debate by redefining genomic 'function' in terms of current fitness contribution. In its favour, this redefinition for the genomic context is in apparent conformity with predominant experimental practices, especially in biomedical research, and with ascription of function by selective maintenance. We argue against approaches of this kind, however, on the grounds that they could be seen as non-Darwinian, and fail to properly account for the diversity of non-adaptive processes involved in the origin and maintenance of genomic complexity. We examine cases of molecular and organismal complexity that arise neutrally, showing how purifying selection maintains non-adaptive genomic complexity. Rather than lumping different sorts of genomic complexity together by defining 'function' as fitness contribution, we argue that it is best to separate the heterogeneous contributions of preaptation, exaptation and adaptation to the historical processes of origin and maintenance for complex features.
Affiliation(s)
- Tyler D P Brunet
  - Department of the History and Philosophy of Science, University of Cambridge, United Kingdom
- W Ford Doolittle
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Canada
- Joseph P Bielawski
  - Departments of Biology and Mathematics and Statistics, Dalhousie University, Canada
9
Schmitt-Ulms G, Mehrabian M, Williams D, Ehsani S. The IDIP framework for assessing protein function and its application to the prion protein. Biol Rev Camb Philos Soc 2021; 96:1907-1932. [PMID: 33960099 DOI: 10.1111/brv.12731] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/13/2020] [Revised: 04/22/2021] [Accepted: 04/26/2021] [Indexed: 01/06/2023]
Abstract
The quest to determine the function of a protein can represent a profound challenge. Although this task is the mandate of countless research groups, a general framework for how it can be approached is conspicuously lacking. Moreover, even expectations for when the function of a protein can be considered to be 'known' are not well defined. In this review, we begin by introducing concepts pertinent to the challenge of protein function assignments. We then propose a framework for inferring a protein's function from four data categories: 'inheritance', 'distribution', 'interactions' and 'phenotypes' (IDIP). We document that the functions of proteins emerge at the intersection of inferences drawn from these data categories and emphasise the benefit of considering them in an evolutionary context. We then apply this approach to the cellular prion protein (PrPC), well known for its central role in prion diseases, whose function continues to be considered elusive by many investigators. We document that available data converge on the conclusion that the function of the prion protein is to control a critical post-translational modification of the neural cell adhesion molecule in the context of epithelial-to-mesenchymal transition and related plasticity programmes. Finally, we argue that this proposed function of PrPC has already passed the test of time and is concordant with the IDIP framework in a way that other functions considered for this protein fail to achieve. We anticipate that the IDIP framework and the concepts analysed herein will aid the investigation of other proteins whose primary functional assignments have thus far been intractable.
Affiliation(s)
- Gerold Schmitt-Ulms
  - Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Toronto, ON, M5T 0S8, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Declan Williams
  - Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Toronto, ON, M5T 0S8, Canada
- Sepehr Ehsani
  - Theoretical and Philosophical Biology, Department of Philosophy, University College London, Bloomsbury, London, WC1E 6BT, U.K.
  - Ronin Institute for Independent Scholarship, Montclair, NJ, 07043, U.S.A.
10
Giudicelli F, Roest Crollius H. On the importance of evolutionary constraint for regulatory sequence identification. Brief Funct Genomics 2021:elab015. [PMID: 33754633 DOI: 10.1093/bfgp/elab015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2020] [Revised: 01/15/2021] [Accepted: 02/19/2021] [Indexed: 11/13/2022] Open
Abstract
Regulation of gene expression relies on the activity of specialized genomic elements, enhancers or silencers, distributed over sometimes large distances from their target gene promoters. A significant part of vertebrate genomes consists of such regulatory elements, but their identification and that of their target genes remain challenging, due to the lack of a clear signature at the nucleotide level. For many years the main hallmark used for identifying functional elements has been their sequence conservation between genomes of distant species, indicative of purifying selection. More recently, genome-wide biochemical assays have opened new avenues for detecting regulatory regions, shifting attention away from evolutionary constraints. Here, we review the respective contributions of comparative genomics and biochemical assays for the definition of regulatory elements and their targets and advocate that both sequence conservation and preserved synteny, taken as signatures of functional constraint, remain essential tools in this task.
11
Palazzo AF, Koonin EV. Functional Long Non-coding RNAs Evolve from Junk Transcripts. Cell 2020; 183:1151-1161. [PMID: 33068526 DOI: 10.1016/j.cell.2020.09.047] [Citation(s) in RCA: 129] [Impact Index Per Article: 32.3] [Received: 05/19/2020] [Revised: 08/20/2020] [Accepted: 09/17/2020] [Indexed: 12/30/2022]
Abstract
Transcriptome studies reveal pervasive transcription of complex genomes, such as those of mammals. Despite popular arguments for functionality of most, if not all, of these transcripts, genome-wide analysis of selective constraints indicates that most of the produced RNA are junk. However, junk is not garbage. On the contrary, junk transcripts provide the raw material for the evolution of diverse long non-coding (lnc) RNAs by non-adaptive mechanisms, such as constructive neutral evolution. The generation of many novel functional entities, such as lncRNAs, that fuels organismal complexity does not seem to be driven by strong positive selection. Rather, the weak selection regime that dominates the evolution of most multicellular eukaryotes provides ample material for functional innovation with relatively little adaptation involved.
Affiliation(s)
- Alexander F Palazzo
  - Department of Biochemistry, University of Toronto, Toronto, ON M5G 1M1, Canada
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
12
Blommaert J. Genome size evolution: towards new model systems for old questions. Proc Biol Sci 2020; 287:20201441. [PMID: 32842932 PMCID: PMC7482279 DOI: 10.1098/rspb.2020.1441] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 06/18/2020] [Accepted: 07/29/2020] [Indexed: 12/20/2022] Open
Abstract
Genome size (GS) variation is a fundamental biological characteristic; however, its evolutionary causes and consequences are the topic of ongoing debate. Whether GS is a neutral trait or one subject to selective pressures, and how strong these selective pressures are, may remain open questions. Fundamentally, the genomic sequences responsible for this variation directly impact the potential evolutionary outcomes and, equally, are the targets of different evolutionary pressures. For example, duplications and deletions of genic regions (large or small) can have immediate and drastic phenotypic effects, while an expansion or contraction of non-coding DNA is less likely to cause catastrophic phenotypic effects. However, in the long term, the accumulation or deletion of ncDNA is likely to have larger effects. Modern sequencing technologies are allowing for the dissection of these proximate causes, but a combination of these new technologies with more traditional evolutionary experiments and approaches could revolutionize this debate and potentially resolve many of these arguments. Here, I discuss an ambitious way forward for GS research, putting it in context of historical debates, theories and sometimes contradictory evidence, and highlighting the promise of combining new sequencing technologies and analytical developments with more traditional experimental evolution approaches.
Affiliation(s)
- Julie Blommaert
  - Department of Organismal Biology, Uppsala University, Uppsala, Sweden
13
Zile K, Dessimoz C, Wurm Y, Masel J. Only a Single Taxonomically Restricted Gene Family in the Drosophila melanogaster Subgroup Can Be Identified with High Confidence. Genome Biol Evol 2020; 12:1355-1366. [PMID: 32589737 PMCID: PMC8059200 DOI: 10.1093/gbe/evaa127] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Accepted: 06/19/2020] [Indexed: 12/12/2022] Open
Abstract
Taxonomically restricted genes (TRGs) are genes that are present only in one clade. Protein-coding TRGs may evolve de novo from previously noncoding sequences: functional ncRNA, introns, or alternative reading frames of older protein-coding genes, or intergenic sequences. A major challenge in studying de novo genes is the need to avoid both false-positives (nonfunctional open reading frames and/or functional genes that did not arise de novo) and false-negatives. Here, we search conservatively for high-confidence TRGs as the most promising candidates for experimental studies, ensuring functionality through conservation across at least two species, and ensuring de novo status through examination of homologous noncoding sequences. Our pipeline also avoids ascertainment biases associated with preconceptions of how de novo genes are born. We identify one TRG family that evolved de novo in the Drosophila melanogaster subgroup. This TRG family contains single-copy genes in Drosophila simulans and Drosophila sechellia. It originated in an intron of a well-established gene, sharing that intron with another well-established gene upstream. These TRGs contain an intron that predates their open reading frame. These genes have not been previously reported as de novo originated, and to our knowledge, they are the best Drosophila candidates identified so far for experimental studies aimed at elucidating the properties of de novo genes.
Affiliation(s)
- Karina Zile
  - Division of Biosciences, University College London, United Kingdom
- Christophe Dessimoz
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Switzerland
  - Department of Genetics, Evolution and Environment, University College London, United Kingdom
  - Department of Computer Science, University College London, United Kingdom
- Yannick Wurm
  - School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom
  - Alan Turing Institute, London, United Kingdom
- Joanna Masel
  - Department of Ecology and Evolutionary Biology, University of Arizona