Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Vakirlis N, Vance Z, Duggan KM, McLysaght A. De novo birth of functional microproteins in the human lineage. Cell Rep 2022;41:111808. [PMID: 36543139 PMCID: PMC10073203 DOI: 10.1016/j.celrep.2022.111808] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 10/18/2021] [Revised: 06/21/2022] [Accepted: 11/18/2022] [Indexed: 12/24/2022]

Cited by Other Article(s)

Guay SY, Patel PH, Thomalla JM, McDermott KL, O'Toole JM, Arnold SE, Obrycki SJ, Wolfner MF, Findlay GD. A newly evolved gene is essential for efficient sperm entry into eggs in Drosophila melanogaster. bioRxiv 2024:2024.08.08.607187. [PMID: 39149251 PMCID: PMC11326263 DOI: 10.1101/2024.08.08.607187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]

Abstract

New genes arise through a variety of evolutionary processes and provide raw material for adaptation in the face of both natural and sexual selection. De novo evolved genes emerge from previously non-protein-coding DNA sequences, and many such genes are expressed in male reproductive structures. In Drosophila melanogaster, several putative de novo genes have evolved essential roles in spermatogenesis, but whether such genes can also impact sperm function beyond the male has not been investigated. We identified a putative de novo gene, katherine johnson (kj), that is required for high levels of male fertility. Males that do not express kj produce and transfer sperm that are stored normally in females, but sperm from these males enter eggs with severely reduced efficiency. Using a tagged transgenic rescue construct, we observed that KJ protein localizes to the nuclear periphery in various stages of spermatogenesis, but is not detectable in mature sperm. These data suggest that kj exerts an effect on sperm development, the loss of which results in reduced fertilization ability. While previous bioinformatic analyses suggested the kj gene was restricted to the melanogaster group of Drosophila, we identified putative orthologs with conserved synteny, male-biased expression, and predicted protein features across the genus, as well as instances of gene loss in some lineages. Thus, kj potentially arose in the Drosophila common ancestor and subsequently evolved an essential role in D. melanogaster. Our results demonstrate a new aspect of male reproduction that has been shaped by new gene evolution and provide a molecular foothold for further investigating the mechanism of sperm entry into eggs in Drosophila.

Article Summary

How fruit fly sperm enter eggs is poorly understood. Here, we identify a gene that potentially arose from non-protein-coding DNA and is required for efficient fertilization. Sperm from males lacking this gene's function cannot enter eggs. The gene appears to act during sperm production, rather than in mature sperm. This study illustrates how newly evolved genes can affect important aspects of reproduction and provides insights into sperm-egg interactions.

Vakirlis N, Acar O, Cherupally V, Carvunis AR. Ancestral Sequence Reconstruction as a Tool to Detect and Study De Novo Gene Emergence. Genome Biol Evol 2024;16:evae151. [PMID: 39004885 PMCID: PMC11299112 DOI: 10.1093/gbe/evae151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 06/17/2024] [Accepted: 07/09/2024] [Indexed: 07/16/2024]

Abstract

New protein-coding genes can evolve from previously noncoding genomic regions through a process known as de novo gene emergence. Evidence suggests that this process has likely occurred throughout evolution and across the tree of life. Yet, confidently identifying de novo emerged genes remains challenging. Ancestral sequence reconstruction is a promising approach for inferring whether a gene has emerged de novo or not, as it allows us to inspect whether a given genomic locus ancestrally harbored protein-coding capacity. However, the use of ancestral sequence reconstruction in the context of de novo emergence is still in its infancy and its capabilities, limitations, and overall potential are largely unknown. Notably, it is difficult to formally evaluate the protein-coding capacity of ancestral sequences, particularly when new gene candidates are short. How well-suited is ancestral sequence reconstruction as a tool for the detection and study of de novo genes? Here, we address this question by designing an ancestral sequence reconstruction workflow incorporating different tools and sets of parameters and by introducing a formal criterion that allows to estimate, within a desired level of confidence, when protein-coding capacity originated at a particular locus. Applying this workflow on ∼2,600 short, annotated budding yeast genes (<1,000 nucleotides), we found that ancestral sequence reconstruction robustly predicts an ancient origin for the most widely conserved genes, which constitute "easy" cases. For less robust cases, we calculated a randomization-based empirical P-value estimating whether the observed conservation between the extant and ancestral reading frame could be attributed to chance. This formal criterion allowed us to pinpoint a branch of origin for most of the less robust cases, identifying 49 genes that can unequivocally be considered de novo originated since the split of the Saccharomyces genus, including 37 Saccharomyces cerevisiae-specific genes. We find that for the remaining equivocal cases we cannot rule out different evolutionary scenarios including rapid evolution, multiple gene losses, or a recent de novo origin. Overall, our findings suggest that ancestral sequence reconstruction is a valuable tool to study de novo gene emergence but should be applied with caution and awareness of its limitations.

Vakirlis N, Kupczok A. Large-scale investigation of species-specific orphan genes in the human gut microbiome elucidates their evolutionary origins. Genome Res 2024;34:888-903. [PMID: 38977308 PMCID: PMC11293555 DOI: 10.1101/gr.278977.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 06/12/2024] [Indexed: 07/10/2024]

Rich A, Acar O, Carvunis AR. Massively integrated coexpression analysis reveals transcriptional regulation, evolution and cellular implications of the yeast noncanonical translatome. Genome Biol 2024;25:183. [PMID: 38978079 PMCID: PMC11232214 DOI: 10.1186/s13059-024-03287-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 05/20/2024] [Indexed: 07/10/2024]

Abstract

BACKGROUND

Recent studies uncovered pervasive transcription and translation of thousands of noncanonical open reading frames (nORFs) outside of annotated genes. The contribution of nORFs to cellular phenotypes is difficult to infer using conventional approaches because nORFs tend to be short, of recent de novo origins, and lowly expressed. Here we develop a dedicated coexpression analysis framework that accounts for low expression to investigate the transcriptional regulation, evolution, and potential cellular roles of nORFs in Saccharomyces cerevisiae.

RESULTS

Our results reveal that nORFs tend to be preferentially coexpressed with genes involved in cellular transport or homeostasis but rarely with genes involved in RNA processing. Mechanistically, we discover that young de novo nORFs located downstream of conserved genes tend to leverage their neighbors' promoters through transcription readthrough, resulting in high coexpression and high expression levels. Transcriptional piggybacking also influences the coexpression profiles of young de novo nORFs located upstream of genes, but to a lesser extent and without detectable impact on expression levels. Transcriptional piggybacking influences, but does not determine, the transcription profiles of de novo nORFs emerging nearby genes. About 40% of nORFs are not strongly coexpressed with any gene but are transcriptionally regulated nonetheless and tend to form entirely new transcription modules. We offer a web browser interface (https://carvunislab.csb.pitt.edu/shiny/coexpression/) to efficiently query, visualize, and download our coexpression inferences.

CONCLUSIONS

Our results suggest that nORF transcription is highly regulated. Our coexpression dataset serves as an unprecedented resource for unraveling how nORFs integrate into cellular networks, contribute to cellular phenotypes, and evolve.

Sanejouand YH. Are Most Human-Specific Proteins Encoded by Long Noncoding RNAs? J Mol Evol 2024:10.1007/s00239-024-10174-z. [PMID: 38916610 DOI: 10.1007/s00239-024-10174-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 05/03/2024] [Indexed: 06/26/2024]

Chen J, Li Q, Xia S, Arsala D, Sosa D, Wang D, Long M. The Rapid Evolution of De Novo Proteins in Structure and Complex. Genome Biol Evol 2024;16:evae107. [PMID: 38753069 PMCID: PMC11149777 DOI: 10.1093/gbe/evae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/10/2024] [Indexed: 06/06/2024]

Lindhout FW, Krienen FM, Pollard KS, Lancaster MA. A molecular and cellular perspective on human brain evolution and tempo. Nature 2024;630:596-608. [PMID: 38898293 DOI: 10.1038/s41586-024-07521-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2021] [Accepted: 04/29/2024] [Indexed: 06/21/2024]

Tong G, Hah N, Martinez TF. Comparison of software packages for detecting unannotated translated small open reading frames by Ribo-seq. Brief Bioinform 2024;25:bbae268. [PMID: 38842510 PMCID: PMC11155197 DOI: 10.1093/bib/bbae268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Revised: 05/12/2024] [Accepted: 05/21/2024] [Indexed: 06/07/2024]

Duffy EE, Assad EG, Kalish BT, Greenberg ME. Small but mighty: the rise of microprotein biology in neuroscience. Front Mol Neurosci 2024;17:1386219. [PMID: 38807924 PMCID: PMC11130481 DOI: 10.3389/fnmol.2024.1386219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 04/30/2024] [Indexed: 05/30/2024]

Nichols C, Do-Thi VA, Peltier DC. Noncanonical microprotein regulation of immunity. Mol Ther 2024:S1525-0016(24)00324-1. [PMID: 38734902 DOI: 10.1016/j.ymthe.2024.05.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/19/2024] [Accepted: 05/09/2024] [Indexed: 05/13/2024]

Kurgan N, Kjærgaard Larsen J, Deshmukh AS. Harnessing the power of proteomics in precision diabetes medicine. Diabetologia 2024;67:783-797. [PMID: 38345659 DOI: 10.1007/s00125-024-06097-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 12/20/2023] [Indexed: 03/21/2024]

Qin Z, Yang J, Zhang K, Gao X, Ran Q, Xu Y, Wang Z, Lou D, Huang C, Zellmer L, Meng G, Chen N, Ma H, Wang Z, Liao DJ. Updating mRNA variants of the human RSK4 gene and their expression in different stressed situations. Heliyon 2024;10:e27475. [PMID: 38560189 PMCID: PMC10980951 DOI: 10.1016/j.heliyon.2024.e27475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Revised: 02/11/2024] [Accepted: 02/29/2024] [Indexed: 04/04/2024]

Affiliation(s)

Zhenwei Qin: Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
Jianglin Yang: Center for Clinical Laboratories, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Rd, Guiyang, 550004, Guizhou Province, China; Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang, 550004, Guizhou Province, China
Keyin Zhang: Department of Pathology, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Road, Guiyang, 550004, Guizhou Province, China
Xia Gao: Department of Pathology, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Road, Guiyang, 550004, Guizhou Province, China
Qianchuan Ran: Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
Yuanhong Xu: Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
Zhi Wang: Department of Pathology, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Road, Guiyang, 550004, Guizhou Province, China
Didong Lou: Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
Chunhua Huang: Section of Forensic Science and Pathology, School of Basic Medical Sciences, Guizhou University of Traditional Chinese Medicine, Dong-Qing-Nan Road, Guiyang, 550025, Guizhou Province, China
Lucas Zellmer: Department of Medicine, Hennepin County Medical Center, 730 South 8th St., Minneapolis, MN, 55415, USA
Guangxue Meng: Department of Oral and Maxillofacial Surgery, School of Stomatology, Guizhou Medical University, 9 Beijing Road, Guiyang, 550004, Guizhou Province, China
Na Chen: Department of Oral and Maxillofacial Surgery, School of Stomatology, Guizhou Medical University, 9 Beijing Road, Guiyang, 550004, Guizhou Province, China
Hong Ma: Department of Oral and Maxillofacial Surgery, School of Stomatology, Guizhou Medical University, 9 Beijing Road, Guiyang, 550004, Guizhou Province, China
Zhe Wang: State Key Laboratory of Cancer Biology, Department of Pathology, Xijing Hospital, Air Force Medical University, 169 Changle West Road, Xi'an, 710032, China
Dezhong Joshua Liao: Center for Clinical Laboratories, The Affiliated Hospital of Guizhou Medical University, 4 Beijing Rd, Guiyang, 550004, Guizhou Province, China; Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang, 550004, Guizhou Province, China

Aubel M, Buchel F, Heames B, Jones A, Honc O, Bornberg-Bauer E, Hlouchova K. High-throughput Selection of Human de novo-emerged sORFs with High Folding Potential. Genome Biol Evol 2024;16:evae069. [PMID: 38597156 PMCID: PMC11024478 DOI: 10.1093/gbe/evae069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/11/2024] [Accepted: 03/23/2024] [Indexed: 04/11/2024]

Delihas N. Evolution of a Human-Specific De Novo Open Reading Frame and Its Linked Transcriptional Silencer. Int J Mol Sci 2024;25:3924. [PMID: 38612733 PMCID: PMC11011693 DOI: 10.3390/ijms25073924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 03/23/2024] [Accepted: 03/26/2024] [Indexed: 04/14/2024]

Fleck K, Luria V, Garag N, Karger A, Hunter T, Marten D, Phu W, Nam KM, Sestan N, O’Donnell-Luria AH, Erceg J. Functional associations of evolutionarily recent human genes exhibit sensitivity to the 3D genome landscape and disease. bioRxiv 2024:2024.03.17.585403. [PMID: 38559085 PMCID: PMC10980080 DOI: 10.1101/2024.03.17.585403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]

Affiliation(s)

Katherine Fleck: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269; Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
Victor Luria: Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510; Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142; Department of Systems Biology, Harvard Medical School, Boston, MA 02115
Nitanta Garag: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
Amir Karger: IT-Research Computing, Harvard Medical School, Boston, MA 02115
Trevor Hunter: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
Daniel Marten: Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
William Phu: Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
Kee-Myoung Nam: Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06510
Nenad Sestan: Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
Anne H. O’Donnell-Luria: Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142; Department of Pediatrics, Harvard Medical School, Boston, MA 02115
Jelena Erceg: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269; Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269; Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030

Liu X, Xiao C, Xu X, Zhang J, Mo F, Chen JY, Delihas N, Zhang L, An NA, Li CY. Origin of functional de novo genes in humans from "hopeful monsters". Wiley Interdiscip Rev RNA 2024;15:e1845. [PMID: 38605485 DOI: 10.1002/wrna.1845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 03/13/2024] [Accepted: 03/18/2024] [Indexed: 04/13/2024]

Affiliation(s)

Xiaoge Liu: State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
Chunfu Xiao: State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
Xinwei Xu: State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
Jie Zhang: State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
Fan Mo: State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
Jia-Yu Chen: State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Chemistry and Biomedicine Innovation Center (ChemBIC), Nanjing University, Nanjing, China
Nicholas Delihas: Department of Microbiology and Immunology, Renaissance School of Medicine, Stony Brook University, Stony Brook, New York, USA
Li Zhang: Chinese Institute for Brain Research, Beijing, China
Ni A An: State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
Chuan-Yun Li: State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China; Chinese Institute for Brain Research, Beijing, China; Southwest United Graduate School, Kunming, China

Fesenko I, Sahakyan H, Shabalina SA, Koonin EV. The Cryptic Bacterial Microproteome. bioRxiv 2024:2024.02.17.580829. [PMID: 38903115 PMCID: PMC11188072 DOI: 10.1101/2024.02.17.580829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/22/2024]

Hannon Bozorgmehr J. Four classic "de novo" genes all have plausible homologs and likely evolved from retro-duplicated or pseudogenic sequences. Mol Genet Genomics 2024;299:6. [PMID: 38315248 DOI: 10.1007/s00438-023-02090-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Accepted: 10/15/2023] [Indexed: 02/07/2024]

Tong G, Hah N, Martinez TF. Comparison of software packages for detecting unannotated translated small open reading frames by Ribo-seq. bioRxiv 2023:2023.12.30.573709. [PMID: 38234848 PMCID: PMC10793472 DOI: 10.1101/2023.12.30.573709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]

Kore H, Datta KK, Nagaraj SH, Gowda H. Protein-coding potential of non-canonical open reading frames in human transcriptome. Biochem Biophys Res Commun 2023;684:149040. [PMID: 37897910 DOI: 10.1016/j.bbrc.2023.09.068] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/13/2023] [Revised: 09/09/2023] [Accepted: 09/23/2023] [Indexed: 10/30/2023]

Frumkin I, Laub MT. Selection of a de novo gene that can promote survival of Escherichia coli by modulating protein homeostasis pathways. Nat Ecol Evol 2023;7:2067-2079. [PMID: 37945946 PMCID: PMC10697842 DOI: 10.1038/s41559-023-02224-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/11/2023] [Accepted: 09/12/2023] [Indexed: 11/12/2023]

Mohsen JJ, Martel AA, Slavoff SA. Microproteins-Discovery, structure, and function. Proteomics 2023;23:e2100211. [PMID: 37603371 PMCID: PMC10841188 DOI: 10.1002/pmic.202100211] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/04/2023] [Revised: 08/03/2023] [Accepted: 08/10/2023] [Indexed: 08/22/2023]

Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023;22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023]

Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? bioRxiv 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]

Abstract

Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".

In brief

The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.

Highlights

Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
A framework for standardized non-canonical ORF evidence will advance the research field.

Ardern Z, Uz-Zaman MH. Between noise and function: Toward a taxonomy of the non-canonical translatome. Cell Syst 2023;14:343-345. [PMID: 37201506 DOI: 10.1016/j.cels.2023.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 04/17/2023] [Indexed: 05/20/2023]

Girardini KN, Olthof AM, Kanadia RN. Introns: the "dark matter" of the eukaryotic genome. Front Genet 2023;14:1150212. [PMID: 37260773 PMCID: PMC10228655 DOI: 10.3389/fgene.2023.1150212] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/23/2023] [Accepted: 04/28/2023] [Indexed: 06/02/2023]

Jain N, Richter F, Adzhubei I, Sharp AJ, Gelb BD. Small open reading frames: a comparative genetics approach to validation. BMC Genomics 2023;24:226. [PMID: 37127568 PMCID: PMC10152738 DOI: 10.1186/s12864-023-09311-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 04/13/2023] [Indexed: 05/03/2023]

Liu J, Yuan R, Shao W, Wang J, Silman I, Sussman JL. Do "Newly Born" orphan proteins resemble "Never Born" proteins? A study using three deep learning algorithms. Proteins 2023. [PMID: 37092778 DOI: 10.1002/prot.26496] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/02/2022] [Revised: 02/26/2023] [Accepted: 04/01/2023] [Indexed: 04/25/2023]

Abstract

"Newly Born" proteins, devoid of detectable homology to any other proteins, known as orphan proteins, occur in a single species or within a taxonomically restricted gene family. They are generated by the expression of novel open reading frames, and appear throughout evolution. We were curious if three recently developed programs for predicting protein structures, namely, AlphaFold2, RoseTTAFold, and ESMFold, might be of value for comparison of such "Newly Born" proteins to random polypeptides with amino acid content similar to that of native proteins, which have been called "Never Born" proteins. The programs were used to compare the structures of two sets of "Never Born" proteins that had been expressed: Group 1, which had been shown experimentally to possess substantial secondary structure, and Group 3, which had been shown to be intrinsically disordered. Overall, although the models generated were scored as being of low quality, they nevertheless revealed some general principles. Specifically, all four members of Group 1 were predicted to be compact by all three algorithms, in agreement with the experimental data, whereas the members of Group 3 were predicted to be very extended, as would be expected for intrinsically disordered proteins, again consistent with the experimental data. These predicted differences were shown to be statistically significant by comparing their accessible surface areas. The three programs were then used to predict the structures of three orphan proteins whose crystal structures had been solved, two of which display novel folds. Surprisingly, only for the protein which did not have a novel fold, and was taxonomically restricted, rather than being a true orphan, did all three algorithms predict very similar, high-quality structures, closely resembling the crystal structure. Finally, they were used to predict the structures of seven orphan proteins with well-identified biological functions, whose 3D structures are not known. Two proteins, which were predicted to be disordered based on their sequences, are predicted by all three structure algorithms to be extended structures. The other five were predicted to be compact structures with only two exceptions in the case of AlphaFold2. All three prediction algorithms make remarkably similar and high-quality predictions for one large protein, HCO_11565, from a nematode. It is conjectured that this is due to many homologs in the taxonomically restricted family of which it is a member, and to the fact that the Dali server revealed several nonrelated proteins with similar folds. An animated Interactive 3D Complement (I3DC) is available in Proteopedia at http://proteopedia.org/w/Journal:Proteins:3.

Xu D, Tang L, Kapranov P. Complexities of mammalian transcriptome revealed by targeted RNA enrichment techniques. Trends Genet 2023;39:320-333. [PMID: 36681580 DOI: 10.1016/j.tig.2022.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2022] [Revised: 12/27/2022] [Accepted: 12/30/2022] [Indexed: 01/21/2023]

Papadopoulos C, Albà MM. Newly evolved genes in the human lineage are functional. Trends Genet 2023;39:235-236. [PMID: 36774242 DOI: 10.1016/j.tig.2023.02.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2023] [Accepted: 02/02/2023] [Indexed: 02/12/2023]

Sandmann CL, Schulz JF, Ruiz-Orera J, Kirchner M, Ziehm M, Adami E, Marczenke M, Christ A, Liebe N, Greiner J, Schoenenberger A, Muecke MB, Liang N, Moritz RL, Sun Z, Deutsch EW, Gotthardt M, Mudge JM, Prensner JR, Willnow TE, Mertins P, van Heesch S, Hubner N. Evolutionary origins and interactomes of human, young microproteins and small peptides translated from short open reading frames. Mol Cell 2023;83:994-1011.e18. [PMID: 36806354 PMCID: PMC10032668 DOI: 10.1016/j.molcel.2023.01.023] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2022] [Revised: 12/12/2022] [Accepted: 01/25/2023] [Indexed: 02/19/2023]

Affiliation(s)

Clara-L Sandmann Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany
Jana F Schulz Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany
Jorge Ruiz-Orera Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
Marieluise Kirchner Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
Matthias Ziehm Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
Eleonora Adami Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
Maike Marczenke Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
Annabel Christ Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
Nina Liebe Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
Johannes Greiner Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
Aaron Schoenenberger Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
Michael B Muecke Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany
Ning Liang Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
Robert L Moritz Institute for Systems Biology, Seattle, WA 98109, USA
Zhi Sun Institute for Systems Biology, Seattle, WA 98109, USA
Eric W Deutsch Institute for Systems Biology, Seattle, WA 98109, USA
Michael Gotthardt Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany
Jonathan M Mudge European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
John R Prensner Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Pediatric Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA
Thomas E Willnow Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Department of Biomedicine, Aarhus University, 8000 Aarhus, Denmark
Philipp Mertins Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
Sebastiaan van Heesch Princess Máxima Center for Pediatric Oncology, 3584 CS Utrecht, the Netherlands
Norbert Hubner Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany

Evolution and implications of de novo genes in humans. Nat Ecol Evol 2023:10.1038/s41559-023-02014-y. [PMID: 36928843 DOI: 10.1038/s41559-023-02014-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2022] [Accepted: 02/06/2023] [Indexed: 03/18/2023]

Jordan B. [The birth of a gene]. Med Sci (Paris) 2023;39:297-300. [PMID: 36943130 DOI: 10.1051/medsci/2023021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/23/2023] Open