Reference Citation Analysis

For:	Neme R, Tautz D. Evolution: dynamics of de novo gene emergence. Curr Biol 2014;24:R238-40. [PMID: 24650912 DOI: 10.1016/j.cub.2014.02.016] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Indexed: 02/01/2023]

Cited by Other Article(s)

Klimovich A, Bosch TCG. Novel technologies uncover novel 'anti'-microbial peptides in Hydra shaping the species-specific microbiome. Philos Trans R Soc Lond B Biol Sci 2024;379:20230058. [PMID: 38497265 PMCID: PMC10945409 DOI: 10.1098/rstb.2023.0058] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/17/2023] [Accepted: 11/16/2023] [Indexed: 03/19/2024]

Domazet-Lošo M, Široki T, Šimičević K, Domazet-Lošo T. Macroevolutionary dynamics of gene family gain and loss along multicellular eukaryotic lineages. Nat Commun 2024;15:2663. [PMID: 38531970 DOI: 10.1038/s41467-024-47017-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 03/11/2024] [Indexed: 03/28/2024]

Fleck K, Luria V, Garag N, Karger A, Hunter T, Marten D, Phu W, Nam KM, Sestan N, O’Donnell-Luria AH, Erceg J. Functional associations of evolutionarily recent human genes exhibit sensitivity to the 3D genome landscape and disease. bioRxiv 2024:2024.03.17.585403. [PMID: 38559085 PMCID: PMC10980080 DOI: 10.1101/2024.03.17.585403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]

Affiliation(s)

Katherine Fleck: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269; Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
Victor Luria: Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510; Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142; Department of Systems Biology, Harvard Medical School, Boston, MA 02115
Nitanta Garag: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
Amir Karger: IT-Research Computing, Harvard Medical School, Boston, MA 02115
Trevor Hunter: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
Daniel Marten: Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
William Phu: Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
Kee-Myoung Nam: Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06510
Nenad Sestan: Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
Anne H. O’Donnell-Luria: Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142; Department of Pediatrics, Harvard Medical School, Boston, MA 02115
Jelena Erceg: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269; Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269; Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030

Álvarez-Lugo A, Becerra A. The Fate of Duplicated Enzymes in Prokaryotes: The Case of Isomerases. J Mol Evol 2023;91:76-92. [PMID: 36580111 DOI: 10.1007/s00239-022-10085-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Accepted: 12/16/2022] [Indexed: 12/30/2022]

Moutinho AF, Eyre-Walker A, Dutheil JY. Strong evidence for the adaptive walk model of gene evolution in Drosophila and Arabidopsis. PLoS Biol 2022;20:e3001775. [PMID: 36099311 PMCID: PMC9470001 DOI: 10.1371/journal.pbio.3001775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Accepted: 08/01/2022] [Indexed: 11/19/2022]

Heinen T, Xie C, Keshavarz M, Stappert D, Künzel S, Tautz D. Evolution of a New Testis-Specific Functional Promoter Within the Highly Conserved Map2k7 Gene of the Mouse. Front Genet 2022;12:812139. [PMID: 35069705 PMCID: PMC8766832 DOI: 10.3389/fgene.2021.812139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 12/08/2021] [Indexed: 12/03/2022]

Castro JF, Tautz D. The Effects of Sequence Length and Composition of Random Sequence Peptides on the Growth of E. coli Cells. Genes (Basel) 2021;12:1913. [PMID: 34946861 PMCID: PMC8702183 DOI: 10.3390/genes12121913] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/17/2021] [Revised: 11/22/2021] [Accepted: 11/26/2021] [Indexed: 12/21/2022]

Goymann W, Schwabl H. The tyranny of phylogeny-A plea for a less dogmatic stance on two-species comparisons: Funding bodies, journals and referees discourage two- or few-species comparisons, but such studies provide essential insights complementary to phylogenetic comparative studies. Bioessays 2021;43:e2100071. [PMID: 34155665 DOI: 10.1002/bies.202100071] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/08/2021] [Revised: 06/04/2021] [Accepted: 06/08/2021] [Indexed: 11/11/2022]

Xie C, Bekpen C, Künzel S, Keshavarz M, Krebs-Wheaton R, Skrabar N, Ullrich KK, Zhang W, Tautz D. Dedicated transcriptomics combined with power analysis lead to functional understanding of genes with weak phenotypic changes in knockout lines. PLoS Comput Biol 2020;16:e1008354. [PMID: 33180766 PMCID: PMC7685438 DOI: 10.1371/journal.pcbi.1008354] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/20/2020] [Revised: 11/24/2020] [Accepted: 09/20/2020] [Indexed: 12/26/2022]

Abstract

Systematic knockout studies in mice have shown that a large fraction of the gene replacements show no lethal or other overt phenotypes. This has led to the development of more refined analysis schemes, including physiological, behavioral, developmental and cytological tests. However, transcriptomic analyses have not yet been systematically evaluated for non-lethal knockouts. We conducted a power analysis to determine the experimental conditions under which even small changes in transcript levels can be reliably traced. We have applied this to two gene disruption lines of genes for which no function was known so far. Dedicated phenotyping tests informed by the tissues and stages of highest expression of the two genes show small effects on the tested phenotypes. For the transcriptome analysis of these stages and tissues, we used a prior power analysis to determine the number of biological replicates and the sequencing depth. We find that under these conditions, the knockouts have a significant impact on the transcriptional networks, with thousands of genes showing small transcriptional changes. GO analysis suggests that A930004D18Rik is involved in developmental processes through contributing to protein complexes, and A830005F24Rik in extracellular matrix functions. Subsampling analysis of the data reveals that increasing the number of biological replicates was more important than increasing the sequencing depth to arrive at these results. Hence, our proof-of-principle experiment suggests that transcriptomic analysis is indeed an option to study gene functions of genes with weak or no traceable phenotypic effects and it provides the boundary conditions under which this is possible.

Knockout mice benefit the understanding of gene functions in mammals. However, for many genes it has proven difficult to identify clear phenotypes, owing to a lack of sufficient assays. As Lewis Wolpert put it in a famous quote, “But did you take them to the opera?”, metaphorically alluding to the need to extend phenotyping efforts. This insight led to the establishment of phenotyping pipelines that are nowadays routinely used to characterize knockout lines. However, transcriptomic approaches based on RNA-Seq have been much less explored for such deep-level studies. Here we conducted both a theoretical power analysis and practical RNA-Seq experiments on two knockout lines with small phenotypic effects to investigate the relevant parameters, including sample size, sequencing depth, fold change, and dispersion. Our dedicated RNA-Seq studies discovered thousands of genes with small transcriptional changes that are enriched in specific functions in both knockout lines. We find that it is more important to increase the number of samples than to increase the sequencing depth. Our work shows that a deep RNA-Seq study on knockouts is powerful for understanding gene functions in cases of weak phenotypic effects, and provides a guideline for the experimental design of such studies.
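The trade-off between replicates and sequencing depth described above can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' power analysis. It assumes a negative binomial count model (variance = mean + dispersion × mean²) and a delta-method approximation for the variance of a log fold change between two equally sized groups. The function name `lfc_variance` and the example numbers are hypothetical.

```python
def lfc_variance(mean_count: float, dispersion: float, replicates: int) -> float:
    """Approximate variance of the log fold change between two groups.

    Assumption (not from the source): counts follow a negative binomial
    distribution, so the squared coefficient of variation of one sample is
    1/mean_count + dispersion. Averaging over `replicates` samples per group
    and comparing two groups gives 2 * (1/mean_count + dispersion) / replicates.
    """
    if mean_count <= 0 or replicates <= 0 or dispersion < 0:
        raise ValueError("mean_count and replicates must be positive, dispersion non-negative")
    per_sample_cv2 = 1.0 / mean_count + dispersion
    return 2.0 * per_sample_cv2 / replicates


# Hypothetical gene: 100 reads on average, dispersion 0.1, 3 replicates per group.
baseline = lfc_variance(mean_count=100, dispersion=0.1, replicates=3)
# Doubling the sequencing depth doubles the expected count only.
deeper = lfc_variance(mean_count=200, dispersion=0.1, replicates=3)
# Doubling the number of biological replicates instead.
more_replicates = lfc_variance(mean_count=100, dispersion=0.1, replicates=6)

print(f"baseline:        {baseline:.4f}")
print(f"2x depth:        {deeper:.4f}")
print(f"2x replicates:   {more_replicates:.4f}")
```

Under this simplified model, extra depth shrinks only the sampling term 1/mean_count, while extra replicates shrink the whole expression, including the biological dispersion. That matches the direction of the reported finding, although the real analysis depends on the observed dispersions and fold changes.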

Cutter AD, Bundus JD. Speciation and the developmental alarm clock. eLife 2020;9:e56276. [PMID: 32902377 PMCID: PMC7481004 DOI: 10.7554/elife.56276] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/05/2020] [Accepted: 08/28/2020] [Indexed: 12/16/2022]

Combination of Proteogenomics with Peptide De Novo Sequencing Identifies New Genes and Hidden Posttranscriptional Modifications. mBio 2019;10:mBio.02367-19. [PMID: 31615963 PMCID: PMC6794485 DOI: 10.1128/mbio.02367-19] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 12/20/2022]

Abstract

Next-generation sequencing techniques have considerably increased the number of completely sequenced eukaryotic genomes. These genomes are mostly automatically annotated, and ab initio gene prediction is commonly combined with homology-based search approaches and often supported by transcriptomic data. The latter in particular improve the prediction of intron splice sites and untranslated regions. However, correct prediction of translation initiation sites (TIS), alternative splice junctions, and protein-coding potential remains challenging. Here, we present an advanced proteogenomics approach, namely, the combination of proteogenomics and de novo peptide sequencing analysis, in conjunction with Blast2GO and phylostratigraphy. Using the model fungus Sordaria macrospora as an example, we provide a comprehensive view of the proteome that not only increases the functional understanding of this multicellular organism at different developmental stages but also immensely enhances the genome annotation quality.

Proteogenomics combines proteomics, genomics, and transcriptomics and has considerably improved genome annotation in poorly investigated phylogenetic groups for which homology information is lacking. Furthermore, it can be advantageous when reinvestigating well-annotated genomes. Here, we applied an advanced proteogenomics approach, combining standard proteogenomics with peptide de novo sequencing, to refine annotation of the well-studied model fungus Sordaria macrospora. We investigated samples from different developmental and physiological conditions, resulting in the detection of 104 so-far hidden proteins and annotation changes in 575 genes, including 389 splice site refinements. Significantly, our approach provides peptide-level evidence for 113 single-amino-acid variations and 15 C-terminal protein elongations originating from A-to-I RNA editing, a phenomenon recently detected in fungi. Coexpression and phylostratigraphic analysis of the refined proteome suggest that new functions in evolutionarily young genes correlate with distinct developmental stages. In conclusion, our advanced proteogenomics approach supports and promotes functional studies of fungal model systems.

In-depth analysis of Bacillus subtilis proteome identifies new ORFs and traces the evolutionary history of modified proteins. Sci Rep 2018;8:17246. [PMID: 30467398 PMCID: PMC6250715 DOI: 10.1038/s41598-018-35589-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/03/2018] [Accepted: 11/07/2018] [Indexed: 01/05/2023]

Bekpen C, Xie C, Tautz D. Dealing with the adaptive immune system during de novo evolution of genes from intergenic sequences. BMC Evol Biol 2018;18:121. [PMID: 30075701 PMCID: PMC6091031 DOI: 10.1186/s12862-018-1232-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/24/2018] [Accepted: 07/16/2018] [Indexed: 12/26/2022]

Abstract

Background

The adaptive immune system of vertebrates has an extraordinary potential to sense and neutralize foreign antigens entering the body. De novo evolution of genes implies that the genome itself expresses novel antigens from intergenic sequences which could cause a problem with this immune system. Peptides from these novel proteins could be presented by the major histocompatibility complex (MHC) receptors at the cell surface and would be recognized as foreign. The respective cells would then be attacked and destroyed, or would cause inflammatory responses. Hence, de novo expressed peptides have to be introduced to the immune system as being self-peptides to avoid such autoimmune reactions. The regulation of the distinction between self and non-self starts during embryonic development, but continues late into adulthood. It is mostly mediated by specialized cells in the thymus, but can also be conveyed in peripheral tissues, such as the lymph nodes and the spleen. The self-antigens need to be exposed to the reactive T-cells, which requires the expression of the genes in the respective tissues. Since the initial activation of a promoter for new intergenic transcription of a de novo gene could occur in any tissue, we should expect that the evolutionary establishment of a de novo gene in animals with an adaptive immune system should also involve expression in at least one of the tissues that confer self-recognition.

Results

We have studied this question by analyzing the transcriptomes of multiple tissues from young mice in three closely related natural populations of the house mouse (M. m. domesticus). We find that new intergenic transcription indeed occurs mostly in only a single tissue. When a second tissue becomes involved, thymus and spleen are significantly overrepresented.

Conclusions

We conclude that the inclusion of de novo transcripts in the processes for the induction of self-tolerance is indeed an important step in the evolution of functional de novo genes in vertebrates.

Li Z, Wan X. Long-term evolutionary DNA methylation dynamic of protein-coding genes and its underlying mechanism. Gene 2018;677:96-104. [PMID: 30031907 DOI: 10.1016/j.gene.2018.07.051] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/18/2018] [Revised: 07/05/2018] [Accepted: 07/18/2018] [Indexed: 10/28/2022]

Banerjee S, Chakraborty S. Protein intrinsic disorder negatively associates with gene age in different eukaryotic lineages. Mol Biosyst 2018;13:2044-2055. [PMID: 28783193 DOI: 10.1039/c7mb00230k] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]

Gambetta GA, Matthews MA, Syvanen M. The Xylella fastidosa RTX operons: evidence for the evolution of protein mosaics through novel genetic exchanges. BMC Genomics 2018;19:329. [PMID: 29728072 PMCID: PMC5935956 DOI: 10.1186/s12864-018-4731-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/08/2017] [Accepted: 04/26/2018] [Indexed: 01/14/2023]

Pang Y, Mao C, Liu S. Encoding activities of non-coding RNAs. Theranostics 2018;8:2496-2507. [PMID: 29721095 PMCID: PMC5928905 DOI: 10.7150/thno.24677] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 01/01/2018] [Accepted: 02/25/2018] [Indexed: 12/14/2022]

Pezer Ž, Chung AG, Karn RC, Laukaitis CM. Analysis of Copy Number Variation in the Abp Gene Regions of Two House Mouse Subspecies Suggests Divergence during the Gene Family Expansions. Genome Biol Evol 2018;9:3858091. [PMID: 28575204 PMCID: PMC5513543 DOI: 10.1093/gbe/evx099] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Accepted: 05/26/2017] [Indexed: 12/26/2022]

Lei L, Steffen JG, Osborne EJ, Toomajian C. Plant organ evolution revealed by phylotranscriptomics in Arabidopsis thaliana. Sci Rep 2017;7:7567. [PMID: 28790409 PMCID: PMC5548721 DOI: 10.1038/s41598-017-07866-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/21/2017] [Accepted: 07/04/2017] [Indexed: 11/18/2022]

Domazet-Lošo T, Carvunis AR, Albà MM, Šestak MS, Bakaric R, Neme R, Tautz D. No Evidence for Phylostratigraphic Bias Impacting Inferences on Patterns of Gene Emergence and Evolution. Mol Biol Evol 2017;34:843-856. [PMID: 28087778 PMCID: PMC5400388 DOI: 10.1093/molbev/msw284] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 12/05/2022]

Turetzek N, Khadjeh S, Schomburg C, Prpic NM. Rapid diversification of homothorax expression patterns after gene duplication in spiders. BMC Evol Biol 2017;17:168. [PMID: 28709396 PMCID: PMC5513375 DOI: 10.1186/s12862-017-1013-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 03/20/2017] [Accepted: 07/04/2017] [Indexed: 01/09/2023]

Catania F. From intronization to intron loss: How the interplay between mRNA-associated processes can shape the architecture and the expression of eukaryotic genes. Int J Biochem Cell Biol 2017;91:136-144. [PMID: 28673893 DOI: 10.1016/j.biocel.2017.06.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/14/2017] [Revised: 06/25/2017] [Accepted: 06/30/2017] [Indexed: 12/29/2022]

Villanueva-Cañas JL, Ruiz-Orera J, Agea MI, Gallo M, Andreu D, Albà MM. New Genes and Functional Innovation in Mammals. Genome Biol Evol 2017;9:1886-1900. [PMID: 28854603 PMCID: PMC5554394 DOI: 10.1093/gbe/evx136] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Accepted: 07/20/2017] [Indexed: 12/22/2022]

Basile W, Sachenkova O, Light S, Elofsson A. High GC content causes orphan proteins to be intrinsically disordered. PLoS Comput Biol 2017;13:e1005375. [PMID: 28355220 PMCID: PMC5389847 DOI: 10.1371/journal.pcbi.1005375] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 07/30/2016] [Revised: 04/12/2017] [Accepted: 01/21/2017] [Indexed: 01/29/2023]

Abstract

De novo creation of protein coding genes involves the formation of short ORFs from noncoding regions; some of these ORFs might then become fixed in the population. These orphan proteins need to, at the bare minimum, not cause serious harm to the organism, meaning that they should for instance not aggregate. Therefore, although the creation of short ORFs could be truly random, the fixation should be subjected to some selective pressure. The selective forces acting on orphan proteins have been elusive, and contradictory results have been reported. In Drosophila young proteins are more disordered than ancient ones, while the opposite trend is present in yeast. To the best of our knowledge no valid explanation for this difference has been proposed. To solve this riddle we studied structural properties and age of proteins in 187 eukaryotic organisms. We find that, with the exception of length, there are only small differences in the properties between proteins of different ages. However, when we take the GC content into account we note that it could explain the opposite trends observed for orphans in yeast (low GC) and Drosophila (high GC). GC content is correlated with codons coding for disorder promoting amino acids. This leads us to propose that intrinsic disorder is not a strong determining factor for fixation of orphan proteins. Instead these proteins largely resemble random proteins given a particular GC level. During evolution the properties of a protein change faster than the GC level, causing the relationship between disorder and GC to gradually weaken.

We show that the GC content of a genome is of great importance for the properties of an orphan protein. GC content affects the frequency of the codons and this affects the probability for each amino acid to be included in a de novo created protein. The codons encoding Ala, Pro and Gly contain 80% GC, while codons for Lys, Phe, Asn, Tyr and Ile contain 20% or less. The three high GC amino acids are all disorder promoting, while Phe, Tyr and Ile are order promoting. Therefore, random protein sequences at a high GC will be more disordered than the ones created at a low GC. The structural properties of the youngest proteins match to a large degree the properties of random proteins when the GC content is taken into account. In contrast, structural properties of ancient proteins only show a weak correlation with GC content. This suggests that even after fixation in the population, proteins largely resemble random proteins given a certain GC content. Thereafter, during evolution the correlation between structural properties and GC weakens.
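The codon GC figures quoted above can be checked directly against the standard genetic code. The following is a minimal sketch, not the authors' pipeline. It takes the unweighted average GC fraction over all synonymous codons of each amino acid and ignores real codon usage. The function name `gc_by_amino_acid` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def gc_by_amino_acid() -> dict[str, float]:
    """Return the mean GC fraction of the codons encoding each amino acid.

    Assumption: every synonymous codon is weighted equally, which is a
    simplification of real codon usage.
    """
    codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
    gc_fractions: dict[str, list[float]] = {}
    for codon, amino_acid in zip(codons, AMINO_ACIDS):
        gc = sum(base in "GC" for base in codon) / 3
        gc_fractions.setdefault(amino_acid, []).append(gc)
    return {aa: sum(values) / len(values) for aa, values in gc_fractions.items()}


gc = gc_by_amino_acid()
for amino_acid in "APGKFNYI":  # one-letter codes: Ala, Pro, Gly, Lys, Phe, Asn, Tyr, Ile
    print(f"{amino_acid}: {gc[amino_acid]:.1%}")
```

With equal weighting, Ala, Pro and Gly each come out at about 83% GC, and Lys, Phe, Asn, Tyr and Ile at 17% or less, consistent with the rounded 80% and 20% figures in the summary.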

Multi-step formation, evolution, and functionalization of new cytoplasmic male sterility genes in the plant mitochondrial genomes. Cell Res 2016;27:130-146. [PMID: 27725674 DOI: 10.1038/cr.2016.115] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 05/31/2016] [Revised: 08/04/2016] [Accepted: 09/01/2016] [Indexed: 01/28/2023]

Li ZW, Chen X, Wu Q, Hagmann J, Han TS, Zou YP, Ge S, Guo YL. On the Origin of De Novo Genes in Arabidopsis thaliana Populations. Genome Biol Evol 2016;8:2190-202. [PMID: 27401176 PMCID: PMC4987118 DOI: 10.1093/gbe/evw164] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 12/23/2022]

McLysaght A, Hurst LD. Open questions in the study of de novo genes: what, how and why. Nat Rev Genet 2016;17:567-78. [PMID: 27452112 DOI: 10.1038/nrg.2016.78] [Citation(s) in RCA: 125] [Impact Index Per Article: 15.6] [Indexed: 12/12/2022]

McLysaght A, Guerzoni D. New genes from non-coding sequence: the role of de novo protein-coding genes in eukaryotic evolutionary innovation. Philos Trans R Soc Lond B Biol Sci 2016;370:20140332. [PMID: 26323763 PMCID: PMC4571571 DOI: 10.1098/rstb.2014.0332] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Indexed: 11/16/2022]

Emera D, Yin J, Reilly SK, Gockley J, Noonan JP. Origin and evolution of developmental enhancers in the mammalian neocortex. Proc Natl Acad Sci U S A 2016;113:E2617-26. [PMID: 27114548 PMCID: PMC4868431 DOI: 10.1073/pnas.1603718113] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Indexed: 12/22/2022]

Koonin EV. The meaning of biological information. Philos Trans A Math Phys Eng Sci 2016;374:rsta.2015.0065. [PMID: 26857678 PMCID: PMC4760125 DOI: 10.1098/rsta.2015.0065] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Accepted: 07/27/2015] [Indexed: 06/05/2023]

Neme R, Tautz D. Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence. eLife 2016;5:e09977. [PMID: 26836309 PMCID: PMC4829534 DOI: 10.7554/elife.09977] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Received: 07/08/2015] [Accepted: 02/01/2016] [Indexed: 01/17/2023]

Abstract

Deep sequencing analyses have shown that a large fraction of genomes is transcribed, but the significance of this transcription is much debated. Here, we characterize the phylogenetic turnover of poly-adenylated transcripts in a comprehensive sampling of taxa of the mouse (genus Mus), spanning a phylogenetic distance of 10 Myr. Using deep RNA sequencing we find that at a given sequencing depth transcriptome coverage becomes saturated within a taxon, but keeps extending when compared between taxa, even at this very shallow phylogenetic level. Our data show a high turnover of transcriptional states between taxa and that no major transcript-free islands exist across evolutionary time. This suggests that the entire genome can be transcribed into poly-adenylated RNA when viewed at an evolutionary time scale. We conclude that any part of the non-coding genome can potentially become subject to evolutionary functionalization via de novo gene evolution within relatively short evolutionary time spans.

DOI:http://dx.doi.org/10.7554/eLife.09977.001

Traditionally, the genome – the sum total of DNA within a cell – was thought to be divided into genes and ‘non-coding’ regions. Genes are copied, or “transcribed”, into molecules called RNA that perform essential tasks in the cell. The roles of the non-coding regions were often less clear, although it has since become apparent that some are also transcribed and generate low levels of RNA molecules. However, many debate how significant this transcription is to living organisms.

Neme and Tautz have now used a technique called deep RNA sequencing to study the RNA molecules produced in several different species and types of mice whose last common ancestor lived 10 million years ago. Different species produced RNA molecules from different portions – both genes and non-coding regions – of their genomes. Comparing these RNA sequences suggests that changes to the regions that are transcribed occur relatively quickly for a large portion of the genome. Furthermore, there have been no significant areas of the common ancestor’s genome that have not been transcribed at some point in at least one of its descendant species.

This therefore suggests that over a relatively short evolutionary period, any part of the genome can acquire the ability to be transcribed and potentially form a new gene. The next challenge is to find out how often these transcribed non-coding parts of the genome show important biochemical activities, and how they find their way into becoming new genes.

DOI:http://dx.doi.org/10.7554/eLife.09977.002

Ruiz-Orera J, Hernandez-Rodriguez J, Chiva C, Sabidó E, Kondova I, Bontrop R, Marqués-Bonet T, Albà M. Origins of De Novo Genes in Human and Chimpanzee. PLoS Genet 2015;11:e1005721. [PMID: 26720152 PMCID: PMC4697840 DOI: 10.1371/journal.pgen.1005721] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 07/28/2015] [Accepted: 11/11/2015] [Indexed: 11/18/2022]

Laan L, Koschwanez JH, Murray AW. Evolutionary adaptation after crippling cell polarization follows reproducible trajectories. eLife 2015;4. [PMID: 26426479 PMCID: PMC4630673 DOI: 10.7554/elife.09638] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 06/24/2015] [Accepted: 09/30/2015] [Indexed: 12/21/2022]

Abstract

Cells are organized by functional modules, which typically contain components whose removal severely compromises the module's function. Despite their importance, these components are not absolutely conserved between parts of the tree of life, suggesting that cells can evolve to perform the same biological functions with different proteins. We evolved Saccharomyces cerevisiae for 1000 generations without the important polarity gene BEM1. Initially the bem1∆ lineages rapidly increase in fitness and then slowly reach >90% of the fitness of their BEM1 ancestors at the end of the evolution. Sequencing their genomes and monitoring polarization reveals a common evolutionary trajectory, with a fixed sequence of adaptive mutations, each improving cell polarization by inactivating proteins. Our results show that organisms can be evolutionarily robust to physiologically destructive perturbations and suggest that recovery by gene inactivation can lead to rapid divergence in the parts list for cell biologically important functions.

DOI: http://dx.doi.org/10.7554/eLife.09638.001

Cells use the genetic instructions provided by genes in particular combinations called ‘modules’ to perform particular jobs. Very different organisms can share many of the same modules because certain abilities are fundamental to the survival of all cells and so they have been retained over the course of evolution. That said, these modules may not necessarily involve the same genes because it is often possible to achieve the same result using different components.

One way to study how those modules can diversify is to deliberately disrupt one of the genes in a module, and observe how the organism and its descendants respond over many generations. Other genes in these organisms may acquire genetic mutations that enable the genes to take on the role of the missing protein. However, the removal of a single component can be detrimental to the survival of the organisms or may affect many different processes. This can make it difficult to understand what is going on.

A gene called BEM1 is crucial for yeast cells to establish polarity, that is, to allow the different sides of a cell to become distinct from one another. This activity is essential for the yeast to replicate itself. Previous studies have shown that the BEM1 gene had a different role in other species of fungi, which suggests that yeast may have other genes that previously assumed the role that BEM1 does now. In this study, Laan et al. removed BEM1 from yeast and allowed the population of mutant cells to evolve for a thousand generations. The approach differs from previous studies because Laan et al. deliberately selected for yeast that had acquired multiple genetic mutations that can together almost fully compensate for the loss of BEM1.

Initially, the mutant cells grew very slowly, were abnormal in shape and likely to burst open. However, by the end of the experiment, the cells were able to grow almost as well as the original yeast cells had before the gene deletion. Genetic analysis revealed that the deletion of BEM1 triggers the inactivation of other genes that are also involved in the regulation of polarity, which largely restored the ability of the disrupted polarity module to work. This restoration follows a ‘reproducible trajectory’, as the same genes were switched off in the same order in different populations of yeast that were studied at the same time.

The work is an example of reproducible evolution, whereby a specific order of changes to gene activity repeatedly enables cells with severe defects in important processes to adapt and restore a gene module, using whatever components they have left. The next challenge will be to understand how the particular roles of important modules affect their adaptability.

DOI: http://dx.doi.org/10.7554/eLife.09638.002


Chen JY, Shen QS, Zhou WZ, Peng J, He BZ, Li Y, Liu CJ, Luan X, Ding W, Li S, Chen C, Tan BCM, Zhang YE, He A, Li CY. Emergence, Retention and Selection: A Trilogy of Origination for Functional De Novo Proteins from Ancestral LncRNAs in Primates. PLoS Genet 2015;11:e1005391. [PMID: 26177073 PMCID: PMC4503675 DOI: 10.1371/journal.pgen.1005391] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 03/20/2015] [Accepted: 06/24/2015] [Indexed: 01/08/2023] Open

Abstract

While some human-specific protein-coding genes have been proposed to originate from ancestral lncRNAs, the transition process remains poorly understood. Here we identified 64 hominoid-specific de novo genes and report a mechanism for the origination of functional de novo proteins from ancestral lncRNAs with precise splicing structures and specific tissue expression profiles. Whole-genome sequencing of dozens of rhesus macaque animals revealed that these lncRNAs are generally not more selectively constrained than other lncRNA loci. The existence of these newly originated de novo proteins is also not beyond anticipation under neutral expectation, as they generally have a longer theoretical lifespan than their current age, because their GC-rich sequences enable stable ORFs with a lower chance of nonsense mutations. Interestingly, although the emergence and retention of these de novo genes are likely driven by neutral forces, a population genetics study in 67 human individuals and 82 macaque animals revealed signatures of purifying selection on these genes specifically in the human population, indicating that a proportion of these newly originated proteins are already functional in humans. We thus propose a mechanism for the creation of functional de novo proteins from ancestral lncRNAs during primate evolution, which may contribute to human-specific genetic novelties by taking advantage of existing genomic contexts.

Although gene duplication has long been regarded as the predominant mechanism for creating new genes, recent reports suggest that new proteins can evolve “de novo” from non-coding DNA regions. These de novo genes are also called “motherless” genes because they lack ancestral proteins as precursors, and recently we and others found that lncRNAs may represent an intermediate stage of their origination. To further elucidate this lncRNA-protein transition process, here we identified 64 hominoid-specific de novo genes and report a new mechanism for the origination of functional de novo proteins from ancestral non-coding transcripts: these non-coding “precursors” are generally not more selectively constrained than other lncRNA loci, and the existence of these de novo proteins is not beyond anticipation under neutral expectation; however, a population genetics study in 67 human individuals and 82 macaque animals revealed signatures of purifying selection on these genes specifically in the human population, indicating that a proportion of these newly originated proteins are already functional in humans. We thus propose a mechanism for the creation of functional de novo proteins from ancestral lncRNAs during primate evolution.


Affiliation(s)

Jia-Yu Chen Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
Qing Sunny Shen Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
Wei-Zhen Zhou Center for Bioinformatics, National Laboratory of Protein Engineering and Plant Genetic Engineering, College of Life Sciences, Peking University, Beijing, China
Jiguang Peng Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
Bin Z. He FAS Center for Systems Biology & Howard Hughes Medical Institute, Harvard University, Cambridge, Massachusetts, United States of America
Yumei Li Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
Chu-Jun Liu Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
Xuke Luan Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Beijing, China
Wanqiu Ding Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
Shuxian Li Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
Chunyan Chen Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
Bertrand Chin-Ming Tan Molecular Medicine Research Center, Chang Gung University, Tao-Yuan, Taiwan
Yong E. Zhang Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
Aibin He Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Beijing, China (corresponding author)
Chuan-Yun Li Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, Peking University, Beijing, China (corresponding author)


Pezer Ž, Harr B, Teschke M, Babiker H, Tautz D. Divergence patterns of genic copy number variation in natural populations of the house mouse (Mus musculus domesticus) reveal three conserved genes with major population-specific expansions. Genome Res 2015;25:1114-24. [PMID: 26149421 PMCID: PMC4509996 DOI: 10.1101/gr.187187.114] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 11/14/2014] [Accepted: 06/05/2015] [Indexed: 11/29/2022]

De La Torre AR, Lin YC, Van de Peer Y, Ingvarsson PK. Genome-wide analysis reveals diverged patterns of codon bias, gene expression, and rates of sequence evolution in picea gene families. Genome Biol Evol 2015;7:1002-15. [PMID: 25747252 PMCID: PMC4419791 DOI: 10.1093/gbe/evv044] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Indexed: 01/13/2023] Open

Bitard-Feildel T, Heberlein M, Bornberg-Bauer E, Callebaut I. Detection of orphan domains in Drosophila using "hydrophobic cluster analysis". Biochimie 2015;119:244-53. [PMID: 25736992 DOI: 10.1016/j.biochi.2015.02.019] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 12/08/2014] [Accepted: 02/20/2015] [Indexed: 11/30/2022]

Abstract

INTRODUCTION

Comparative genomics has become an important strategy in life science research. While many genes, and the proteins they code for, can be well characterized by assigning orthologs, a significant number of proteins or domains remain obscure "orphans". Some orphans are overlooked by current computational methods because they diverged rapidly; others emerged relatively recently (de novo). Recent research has demonstrated the importance of orphans, and of de novo proteins and domains, for the development of new phenotypic traits and adaptation. New approaches for detecting novel domains are thus of paramount importance.

RESULTS

The hydrophobic cluster analysis (HCA) method delineates globular-like domains from the information in a protein sequence and thereby bypasses some of the limitations of established methods that rely on conserved sequence similarity. In this study, HCA is tested for orphan domain detection on 12 Drosophila genomes. After their detection, the orphan domains are classified into two categories, depending on their presence or absence in distantly related species. The two categories show significantly different physico-chemical properties when compared to previously characterized domains from the Pfam database. The newly detected domains have a higher degree of intrinsic disorder and a particular hydrophobic cluster composition. The older the domains are, the more similar their hydrophobic cluster content is to the cluster content of Pfam domains. The results suggest that, over time, newly created domains acquire a canonical set of hydrophobic clusters but conserve some features of intrinsically disordered regions.

CONCLUSION

Our results agree with previous findings on orphan domains and suggest that the physico-chemical properties of domains change over evolutionarily long time scales. The presented HCA-based method is able to detect domains with unusual properties without relying on prior knowledge, such as the availability of homologs. Therefore, the method has large potential for complementing existing strategies to annotate genomes, and for better understanding how molecular features emerge.


Karn RC, Chung AG, Laukaitis CM. Did androgen-binding protein paralogs undergo neo- and/or Subfunctionalization as the Abp gene region expanded in the mouse genome? PLoS One 2014;9:e115454. [PMID: 25531410 PMCID: PMC4274081 DOI: 10.1371/journal.pone.0115454] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 09/15/2014] [Accepted: 11/24/2014] [Indexed: 11/19/2022] Open

Abstract

The Androgen-binding protein (Abp) region of the mouse genome contains 30 Abpa genes encoding alpha subunits and 34 Abpbg genes encoding betagamma subunits, their products forming dimers composed of an alpha and a betagamma subunit. We endeavored to determine how many Abp genes are expressed as proteins in tears and saliva, and as transcripts in the exocrine glands producing them. Using standard PCR, we amplified Abp transcripts from cDNA libraries of C57BL/6 mice and found fifteen Abp gene transcripts in the lacrimal gland and five in the submandibular gland. Proteomic analyses identified proteins corresponding to eleven of the lacrimal gland transcripts, all of them different from the three salivary ABPs reported previously. Our qPCR results showed that five of the six transcripts that lacked corresponding proteins are expressed at very low levels compared to those transcripts with proteins. We found 1) no overlap in the repertoires of expressed Abp paralogs in lacrimal gland/tears and salivary glands/saliva; 2) substantial sex-limited expression of lacrimal gland/tear expressed-paralogs in males but no sex-limited expression in females; and 3) that the lacrimal gland/tear expressed-paralogs are found exclusively in ancestral clades 1, 2 and 3 of the five clades described previously while the salivary glands/saliva expressed-paralogs are found only in clade 5. The number of instances of extremely low levels of transcription without corresponding protein production in paralogs specific to tears and saliva suggested the role of subfunctionalization, a derived condition wherein genes that may have been expressed highly in both glands ancestrally were down-regulated subsequent to duplication. Thus, evidence for subfunctionalization can be seen in our data and we argue that the partitioning of paralog expression between lacrimal and salivary glands that we report here occurred as the result of adaptive evolution.


Bosch TC. Rethinking the role of immunity: lessons from Hydra. Trends Immunol 2014;35:495-502. [DOI: 10.1016/j.it.2014.07.008] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 05/06/2014] [Revised: 07/28/2014] [Accepted: 07/29/2014] [Indexed: 12/24/2022]

Ruiz-Orera J, Messeguer X, Subirana JA, Alba MM. Long non-coding RNAs as a source of new peptides. eLife 2014;3:e03523. [PMID: 25233276 PMCID: PMC4359382 DOI: 10.7554/elife.03523] [Citation(s) in RCA: 366] [Impact Index Per Article: 36.6] [Received: 05/30/2014] [Accepted: 08/11/2014] [Indexed: 12/11/2022] Open

Abstract

Deep transcriptome sequencing has revealed the existence of many transcripts that lack long or conserved open reading frames (ORFs) and which have been termed long non-coding RNAs (lncRNAs). The vast majority of lncRNAs are lineage-specific and do not yet have a known function. In this study, we test the hypothesis that they may act as a repository for the synthesis of new peptides. We find that a large fraction of the lncRNAs expressed in cells from six different species is associated with ribosomes. The patterns of ribosome protection are consistent with the translation of short peptides. lncRNAs show coding potential and sequence constraints similar to those of evolutionarily young protein-coding sequences, indicating that they play an important role in de novo protein evolution.

DOI: http://dx.doi.org/10.7554/eLife.03523.001

Despite the terms being largely interchangeable in modern language, ‘DNA’ and ‘gene’ do not mean the same thing. A gene is made of DNA and contains the instructions to make a protein, and it is the protein that performs the function of the gene. However, cells in the body also contain DNA that does not form genes. Far from being ‘junk’ DNA with no biological purpose, this DNA has a variety of roles, including affecting how other genes are used.

To produce a protein, the DNA sequence of a gene is transcribed into an intermediate molecule called RNA, which is then translated to produce a protein. So-called long non-coding RNA (lncRNA) molecules are also transcribed from DNA, but whether these are translated to make proteins has been a subject of much debate. Indeed, the function of the vast majority of lncRNA molecules is unknown.

Ruiz-Orera et al. analyzed RNA sequences collected from earlier experiments on six different species—humans, mice, fish, flies, yeast, and a plant—and found nearly 2500 as yet unstudied lncRNAs in addition to those previously identified. Many of the lncRNAs that Ruiz-Orera et al. investigated could be found lodged inside the cellular machinery used to translate RNA into proteins. Furthermore, these lncRNA molecules are oriented in the machinery as if they are primed and ready for translation, suggesting that many lncRNAs do produce proteins. However, it is unclear how many of these proteins have a useful function.

Very few lncRNAs were found in more than one species, suggesting that they have evolved recently. The properties of lncRNA molecules also show many similarities with the properties of ‘young’—recently evolved—genes that are known to produce proteins. The combined findings of Ruiz-Orera et al. therefore suggest that lncRNAs are important for developing new proteins. The emergence of proteins with new functions has been an important driving force in evolution, and this work provides important clues into the first steps of this process.

DOI: http://dx.doi.org/10.7554/eLife.03523.002
