1
Nelson CW, Mirabello L. Human papillomavirus genomics: Understanding carcinogenicity. Tumour Virus Res 2023; 15:200258. [PMID: 36812987 PMCID: PMC10063409 DOI: 10.1016/j.tvr.2023.200258] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 11/04/2022] [Revised: 02/01/2023] [Accepted: 02/17/2023] [Indexed: 02/22/2023]
Abstract
Human papillomavirus (HPV) causes virtually all cervical cancers and many cancers at other anatomical sites in both men and women. However, only 12 of 448 known HPV types are currently classified as carcinogens, and even the most carcinogenic type - HPV16 - only rarely leads to cancer. HPV is therefore necessary but insufficient for cervical cancer, with other contributing factors including host and viral genetics. Over the last decade, HPV whole genome sequencing has established that even fine-scale within-type HPV variation influences precancer/cancer risks, and that these risks vary by histology and host race/ethnicity. In this review, we place these findings in the context of the HPV life cycle and evolution at various levels of viral diversity: between-type, within-type, and within-host. We also discuss key concepts necessary for interpreting HPV genomic data, including features of the viral genome; events leading to carcinogenesis; the role of APOBEC3 in HPV infection and evolution; and methodologies that use deep (high-coverage) sequencing to characterize within-host variation, as opposed to relying on a single representative (consensus) sequence. Given the continued high burden of HPV-associated cancers, understanding HPV carcinogenicity remains important for better understanding, preventing, and treating cancers attributable to infection.
Affiliation(s)
- Chase W Nelson
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD, 20850, USA; Institute for Comparative Genomics, American Museum of Natural History, New York, NY, 10024, USA.
| | - Lisa Mirabello
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD, 20850, USA.
2
Li X, He W, Fang J, Liang Y, Zhang H, Chen D, Wu X, Zhang Z, Wang L, Han P, Zhang B, Xue T, Zheng W, He J, Bai C. Genomic and transcriptomic-based analysis of agronomic traits in sugar beet (Beta vulgaris L.) pure line IMA1. Front Plant Sci 2022; 13:1028885. [PMID: 36311117 PMCID: PMC9608375 DOI: 10.3389/fpls.2022.1028885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Accepted: 09/20/2022] [Indexed: 06/16/2023]
Abstract
Sugar beet (Beta vulgaris L.) is an important sugar-producing and energy crop worldwide. The sugar beet pure line IMA1, independently bred by Chinese scientists, is a standard diploid parent material that is widely used in hybrid-breeding programs. In this study, a high-quality, chromosome-level genome assembly for IMA1 was constructed, and 99.1% of genome sequences were assigned to nine chromosomes. A total of 35,003 protein-coding genes were annotated, with 91.56% functionally annotated by public databases. Compared with previously released sugar beet assemblies, the new genome was larger, with an N50 at least 1.6 times larger, thereby substantially improving the completeness and continuity of the sugar beet genome. A genome-wide association study (GWAS) identified 10 disease-resistance genes associated with three important beet diseases and five genes associated with sugar yield per hectare, which could be key targets to improve sugar productivity. Nine highly expressed genes associated with pollen fertility of sugar beet were also identified. The results of this study provide valuable information to identify and dissect functional genes affecting sugar beet agronomic traits, which can increase sugar beet production and help screen for excellent sugar beet breeding materials. In addition, the information provided can help incorporate biotechnology tools precisely into breeding efforts.
Affiliation(s)
- Xiaodong Li
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Wenjin He
- Life Science College of Fujian Normal University, Fuzhou, China
| | - Jingping Fang
- Life Science College of Fujian Normal University, Fuzhou, China
| | - Yahui Liang
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Huizhong Zhang
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Duo Chen
- Life Science College of Fujian Normal University, Fuzhou, China
| | - Xingrong Wu
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Ziqiang Zhang
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Liang Wang
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Pingan Han
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Bizhou Zhang
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Ting Xue
- Life Science College of Fujian Normal University, Fuzhou, China
| | - Wenzhe Zheng
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Jiangfeng He
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
| | - Chen Bai
- Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
- Inner Mongolia Key Laboratory of Sugarbeet Genetics & Germplasm Enhancement, Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, Hohhot, China
3
Kreitmeier M, Ardern Z, Abele M, Ludwig C, Scherer S, Neuhaus K. Spotlight on alternative frame coding: Two long overlapping genes in Pseudomonas aeruginosa are translated and under purifying selection. iScience 2022; 25:103844. [PMID: 35198897 PMCID: PMC8850804 DOI: 10.1016/j.isci.2022.103844] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/10/2021] [Revised: 10/14/2021] [Accepted: 01/27/2022] [Indexed: 12/13/2022]
Abstract
The existence of overlapping genes (OLGs) with significant coding overlaps revolutionizes our understanding of genomic complexity. We report two exceptionally long (957 nt and 1536 nt), evolutionarily novel, translated antisense open reading frames (ORFs) embedded within annotated genes in the pathogenic Gram-negative bacterium Pseudomonas aeruginosa. Both OLG pairs show sequence features consistent with being genes and transcriptional signals in RNA sequencing. Translation of both OLGs was confirmed by ribosome profiling and mass spectrometry. Quantitative proteomics of samples taken during different phases of growth revealed regulation of protein abundances, implying biological functionality. Both OLGs are taxonomically restricted, and likely arose by overprinting within the genus. Evidence for purifying selection further supports functionality. The OLGs reported here, designated olg1 and olg2, are the longest yet proposed in prokaryotes and are among the best attested in terms of translation and evolutionary constraint. These results highlight a potentially large unexplored dimension of prokaryotic genomes.
Affiliation(s)
- Michaela Kreitmeier
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
| | - Zachary Ardern
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Miriam Abele
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technische Universität München, Gregor-Mendel-Strasse 4, 85354 Freising, Germany
| | - Christina Ludwig
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technische Universität München, Gregor-Mendel-Strasse 4, 85354 Freising, Germany
| | - Siegfried Scherer
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
| | - Klaus Neuhaus
- Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
4
Ahmad HI, Afzal G, Iqbal MN, Iqbal MA, Shokrollahi B, Mansoor MK, Chen J. Positive Selection Drives the Adaptive Evolution of Mitochondrial Antiviral Signaling (MAVS) Proteins-Mediating Innate Immunity in Mammals. Front Vet Sci 2022; 8:814765. [PMID: 35174241 PMCID: PMC8841730 DOI: 10.3389/fvets.2021.814765] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/14/2021] [Accepted: 12/24/2021] [Indexed: 12/17/2022]
Abstract
The regulated production of filamentous protein complexes is essential in many biological processes and provides a new paradigm in signal transmission. The mitochondrial antiviral signaling protein (MAVS) is a critical signaling hub in innate immunity that is activated when a receptor induces a shift in the globular caspase activation and recruitment domain of MAVS into helical superstructures (filaments). It is of interest whether adaptive evolution affects the proteins involved in innate immunity. Here, we explore the role of selection and diversification on the mitochondrial antiviral signaling protein in mammalian species. We obtained the MAVS proteins of mammalian species and examined their differences in evolutionary patterns. We discovered evidence that these proteins have been subjected to substantial positive selection. Using codon-based probability methods, we demonstrate that immune system proteins, particularly recognition proteins, evolve under positive selection. Positively selected regions within recognition proteins cluster in domains involved in microorganism recognition, implying that molecular interactions between hosts and pathogens may promote adaptive evolution in mammalian immune systems. These significant variations in MAVS evolution among mammalian species highlight the involvement of MAVS in innate immunity. Our findings highlight the significance of accounting for how non-synonymous alterations affect structure and function when employing sequence-level studies to determine and quantify positive selection.
Affiliation(s)
- Hafiz Ishfaq Ahmad
- Department of Animal Breeding and Genetics, University of Veterinary and Animal Sciences, Lahore, Pakistan
| | - Gulnaz Afzal
- Department of Zoology, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
| | | | | | - Borhan Shokrollahi
- Department of Animal Science, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran
| | - Muhammad Khalid Mansoor
- Department of Microbiology, Faculty of Veterinary and Animal Science, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
| | - Jinping Chen
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Sciences, Guangzhou, China
- *Correspondence: Jinping Chen
5
Analyses of Leishmania-LRV Co-Phylogenetic Patterns and Evolutionary Variability of Viral Proteins. Viruses 2021; 13:v13112305. [PMID: 34835111 PMCID: PMC8624691 DOI: 10.3390/v13112305] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/13/2021] [Accepted: 11/09/2021] [Indexed: 01/07/2023]
Abstract
Leishmania spp. are important pathogens causing a vector-borne disease with a broad range of clinical manifestations from self-healing ulcers to the life-threatening visceral forms. Presence of Leishmania RNA virus (LRV) confers survival advantage to these parasites by suppressing anti-leishmanial immunity in the vertebrate host. The two viral species, LRV1 and LRV2 infect species of the subgenera Viannia and Leishmania, respectively. In this work we investigated co-phylogenetic patterns of leishmaniae and their viruses on a small scale (LRV2 in L. major) and demonstrated their predominant coevolution, occasionally broken by intraspecific host switches. Our analysis of the two viral genes, encoding the capsid and RNA-dependent RNA polymerase (RDRP), revealed them to be under the pressure of purifying selection, which was considerably stronger for the former gene across the whole tree. The selective pressure also differs between the LRV clades and correlates with the frequency of interspecific host switches. In addition, using experimental (capsid) and predicted (RDRP) models we demonstrated that the evolutionary variability across the structure is strikingly different in these two viral proteins.
6
Computational methods for inferring location and genealogy of overlapping genes in virus genomes: approaches and applications. Curr Opin Virol 2021; 52:1-8. [PMID: 34798370 PMCID: PMC8594276 DOI: 10.1016/j.coviro.2021.10.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/04/2021] [Revised: 10/21/2021] [Accepted: 10/22/2021] [Indexed: 12/02/2022]
Abstract
Viruses may evolve to increase the amount of encoded genetic information by means of overlapping genes, which utilize several reading frames. Such overlapping genes may be especially impactful for genomes of small size, often serving as a source of novel accessory proteins, some of which play a crucial role in viral pathogenicity or in promoting the systemic spread of the virus. Diverse genome-based metrics have been proposed to facilitate recognition of overlapping genes that may otherwise be overlooked during genome annotation. They can detect the atypical codon bias associated with the overlap (e.g. a statistically significant reduction in variability at synonymous sites) or other sequence-composition features peculiar to overlapping genes. In this review, I compare nine computational methods, discuss their strengths and limitations, and survey how they were applied to detect candidate overlapping genes in the genome of SARS-CoV-2, the etiological agent of the COVID-19 pandemic.
7
Pavesi A. Origin, Evolution and Stability of Overlapping Genes in Viruses: A Systematic Review. Genes (Basel) 2021; 12:genes12060809. [PMID: 34073395 PMCID: PMC8227390 DOI: 10.3390/genes12060809] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/06/2021] [Revised: 05/22/2021] [Accepted: 05/24/2021] [Indexed: 12/11/2022]
Abstract
During their long evolutionary history viruses generated many proteins de novo by a mechanism called “overprinting”. Overprinting is a process in which critical nucleotide substitutions in a pre-existing gene can induce the expression of a novel protein by translation of an alternative open reading frame (ORF). Overlapping genes represent an intriguing example of adaptive conflict, because they simultaneously encode two proteins whose freedom to change is constrained by each other. However, overlapping genes are also a source of genetic novelties, as the constraints under which alternative ORFs evolve can give rise to proteins with unusual sequence properties, most importantly the potential for novel functions. Starting with the discovery of overlapping genes in phages infecting Escherichia coli, this review covers a range of studies dealing with detection of overlapping genes in small eukaryotic viruses (genomic length below 30 kb) and recognition of their critical role in the evolution of pathogenicity. Origin of overlapping genes, what factors favor their birth and retention, and how they manage their inherent adaptive conflict are extensively reviewed. Special attention is paid to the assembly of overlapping genes into ad hoc databases, suitable for future studies, and to the development of statistical methods for exploring viral genome sequences in search of undiscovered overlaps.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 23/A, I-43124 Parma, Italy
8
Nelson CW, Ardern Z, Wei X. OLGenie: Estimating Natural Selection to Predict Functional Overlapping Genes. Mol Biol Evol 2021; 37:2440-2449. [PMID: 32243542 PMCID: PMC7531306 DOI: 10.1093/molbev/msaa087] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022]
Abstract
Purifying (negative) natural selection is a hallmark of functional biological sequences, and can be detected in protein-coding genes using the ratio of nonsynonymous to synonymous substitutions per site (dN/dS). However, when two genes overlap the same nucleotide sites in different frames, synonymous changes in one gene may be nonsynonymous in the other, perturbing dN/dS. Thus, scalable methods are needed to estimate functional constraint specifically for overlapping genes (OLGs). We propose OLGenie, which implements a modification of the Wei–Zhang method. Assessment with simulations and controls from viral genomes (58 OLGs and 176 non-OLGs) demonstrates low false-positive rates and good discriminatory ability in differentiating true OLGs from non-OLGs. We also apply OLGenie to the unresolved case of HIV-1’s putative antisense protein gene, showing significant purifying selection. OLGenie can be used to study known OLGs and to predict new OLGs in genome annotation. Software and example data are freely available at https://github.com/chasewnelson/OLGenie (last accessed April 10, 2020).
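The frame effect described in this abstract can be illustrated with a minimal sketch. It is not OLGenie's method or code; the function name `codon_effect`, the toy sequence, and the restriction to same-strand overlaps are assumptions made purely for illustration. It classifies one nucleotide change relative to two overlapping reading frames using the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (last base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_effect(seq, pos, new_base, frame):
    """Classify a single-base change at 0-based `pos` for the reading frame
    that starts at offset `frame` (0, 1, or 2) on the same strand.

    Hypothetical helper for illustration only.
    """
    # Start index of the codon (in this frame) that contains `pos`.
    start = pos - ((pos - frame) % 3)
    if start < 0 or start + 3 > len(seq):
        raise ValueError("position is not inside a complete codon in this frame")
    mutated = seq[:pos] + new_base + seq[pos + 1:]
    old_aa = CODON_TABLE[seq[start:start + 3]]
    new_aa = CODON_TABLE[mutated[start:start + 3]]
    return "synonymous" if old_aa == new_aa else "nonsynonymous"


# Toy sequence: frame 0 reads ATG CTG GCA, frame 1 reads TGC TGG.
seq = "ATGCTGGCA"
# G -> C at index 5: CTG -> CTC (Leu -> Leu) in frame 0,
# but TGG -> TCG (Trp -> Ser) in frame 1.
print(codon_effect(seq, 5, "C", frame=0))
print(codon_effect(seq, 5, "C", frame=1))
```

Because the same substitution counts as synonymous for one gene and nonsynonymous for the other, a dN/dS estimate computed for either gene in isolation mixes the two signals, which is the problem the abstract says OLGenie addresses.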
Affiliation(s)
- Chase W Nelson
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY; Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
| | - Zachary Ardern
- Microbial Ecology, ZIEL-Institute for Food & Health, Technische Universität München, Freising, Germany
| | - Xinzhu Wei
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI; Department of Integrative Biology and Statistics, University of California, Berkeley, CA
9
Belinky F, Ganguly I, Poliakov E, Yurchenko V, Rogozin IB. Analysis of Stop Codons within Prokaryotic Protein-Coding Genes Suggests Frequent Readthrough Events. Int J Mol Sci 2021; 22:ijms22041876. [PMID: 33672790 PMCID: PMC7918605 DOI: 10.3390/ijms22041876] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/07/2021] [Revised: 02/05/2021] [Accepted: 02/09/2021] [Indexed: 02/07/2023]
Abstract
Nonsense mutations turn a coding (sense) codon into an in-frame stop codon that is assumed to result in a truncated protein product. Thus, nonsense substitutions are the hallmark of pseudogenes and are used to identify them. Here we show that in-frame stop codons within bacterial protein-coding genes are widespread. Their evolutionary conservation suggests that many of them are not pseudogenes, since they maintain dN/dS values (ratios of substitution rates at non-synonymous and synonymous sites) significantly lower than 1 (this is a signature of purifying selection in protein-coding regions). We also found that double substitutions in codons—where an intermediate step is a nonsense substitution—show a higher rate of evolution compared to null models, indicating that a stop codon was introduced and then changed back to sense via positive selection. This further supports the notion that nonsense substitutions in bacteria are relatively common and do not necessarily cause pseudogenization. In-frame stop codons may be an important mechanism of regulation: Such codons are likely to cause a substantial decrease of protein expression levels.
Affiliation(s)
- Frida Belinky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Ishan Ganguly
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Eugenia Poliakov
- National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Vyacheslav Yurchenko
- Life Science Research Centre, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czech Republic
- Martsinovsky Institute of Medical Parasitology, Tropical and Vector Borne Diseases, Sechenov University, 119435 Moscow, Russia
- Correspondence: (V.Y.); (I.B.R.)
| | - Igor B. Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Correspondence: (V.Y.); (I.B.R.)
10
Nelson CW, Ardern Z, Goldberg TL, Meng C, Kuo CH, Ludwig C, Kolokotronis SO, Wei X. Dynamically evolving novel overlapping gene as a factor in the SARS-CoV-2 pandemic. eLife 2020; 9:e59633. [PMID: 33001029 PMCID: PMC7655111 DOI: 10.7554/elife.59633] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 06/03/2020] [Accepted: 09/30/2020] [Indexed: 12/11/2022]
Abstract
Understanding the emergence of novel viruses requires an accurate and comprehensive annotation of their genomes. Overlapping genes (OLGs) are common in viruses and have been associated with pandemics but are still widely overlooked. We identify and characterize ORF3d, a novel OLG in SARS-CoV-2 that is also present in Guangxi pangolin-CoVs but not other closely related pangolin-CoVs or bat-CoVs. We then document evidence of ORF3d translation, characterize its protein sequence, and conduct an evolutionary analysis at three levels: between taxa (21 members of Severe acute respiratory syndrome-related coronavirus), between human hosts (3978 SARS-CoV-2 consensus sequences), and within human hosts (401 deeply sequenced SARS-CoV-2 samples). ORF3d has been independently identified and shown to elicit a strong antibody response in COVID-19 patients. However, it has been misclassified as the unrelated gene ORF3b, leading to confusion. Our results liken ORF3d to other accessory genes in emerging viruses and highlight the importance of OLGs.
MESH Headings
- Amino Acid Sequence
- Animals
- Antibodies, Viral/immunology
- Antibody Specificity
- Antigens, Viral/biosynthesis
- Antigens, Viral/genetics
- Antigens, Viral/immunology
- Betacoronavirus/genetics
- Betacoronavirus/pathogenicity
- Betacoronavirus/physiology
- COVID-19
- China/epidemiology
- Chiroptera/virology
- Coronavirus/genetics
- Coronavirus Infections/epidemiology
- Coronavirus Infections/virology
- Epitopes/genetics
- Epitopes/immunology
- Europe/epidemiology
- Eutheria/virology
- Evolution, Molecular
- Gene Expression Regulation, Viral
- Genes, Overlapping
- Genes, Viral
- Genetic Variation
- Haplotypes/genetics
- Host Specificity/genetics
- Humans
- Models, Molecular
- Mutation
- Open Reading Frames/genetics
- Pandemics
- Phylogeny
- Pneumonia, Viral/epidemiology
- Pneumonia, Viral/virology
- Protein Biosynthesis
- Protein Conformation
- RNA, Viral/genetics
- SARS-CoV-2
- Sequence Alignment
- Sequence Homology, Nucleic Acid
- Viral Proteins/genetics
- Viral Proteins/immunology
Affiliation(s)
- Chase W Nelson
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Institute for Comparative Genomics, American Museum of Natural History, New York, United States
| | - Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
| | - Tony L Goldberg
- Department of Pathobiological Sciences, University of Wisconsin-Madison, Madison, United States
- Global Health Institute, University of Wisconsin-Madison, Madison, United States
| | - Chen Meng
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich, Freising, Germany
| | - Chen-Hao Kuo
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
| | - Christina Ludwig
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich, Freising, Germany
| | - Sergios-Orestis Kolokotronis
- Institute for Comparative Genomics, American Museum of Natural History, New York, United States
- Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, United States
- Institute for Genomic Health, SUNY Downstate Health Sciences University, Brooklyn, United States
- Division of Infectious Diseases, Department of Medicine, SUNY Downstate Health Sciences University, Brooklyn, United States
| | - Xinzhu Wei
- Departments of Integrative Biology and Statistics, University of California, Berkeley, Berkeley, United States
- Departments of Computer Science, Human Genetics, and Computational Medicine, University of California, Los Angeles, Los Angeles, United States
11
Ardern Z, Neuhaus K, Scherer S. Are Antisense Proteins in Prokaryotes Functional? Front Mol Biosci 2020; 7:187. [PMID: 32923454 PMCID: PMC7457138 DOI: 10.3389/fmolb.2020.00187] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/20/2020] [Accepted: 07/16/2020] [Indexed: 12/16/2022]
Abstract
Many prokaryotic RNAs are transcribed from loci outside of annotated protein coding genes. Across bacterial species hundreds of short open reading frames antisense to annotated genes show evidence of both transcription and translation, for instance in ribosome profiling data. Determining the functional fraction of these protein products awaits further research, including insights from studies of molecular interactions and detailed evolutionary analysis. There are multiple lines of evidence, however, that many of these newly discovered proteins are of use to the organism. Condition-specific phenotypes have been characterized for a few. These proteins should be added to genome annotations, and the methods for predicting them standardized. Evolutionary analysis of these typically young sequences also may provide important insights into gene evolution. This research should be prioritized for its exciting potential to uncover large numbers of novel proteins with extremely diverse potential practical uses, including applications in synthetic biology and responding to pathogens.
Affiliation(s)
- Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Munich, Germany
12
Pavesi A. Asymmetric evolution in viral overlapping genes is a source of selective protein adaptation. Virology 2019; 532:39-47. [PMID: 31004987 PMCID: PMC7125799 DOI: 10.1016/j.virol.2019.03.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/03/2019] [Revised: 03/25/2019] [Accepted: 03/26/2019] [Indexed: 12/29/2022]
Abstract
Overlapping genes represent an intriguing puzzle, as they encode two proteins whose ability to evolve is constrained by each other. Overlapping genes can undergo “symmetric evolution” (similar selection pressures on the two proteins) or “asymmetric evolution” (significantly different selection pressures on the two proteins). By sequence analysis of 75 pairs of homologous viral overlapping genes, I evaluated their accordance with one or the other model. Analysis of nucleotide and amino acid sequences revealed that half of the overlaps undergo asymmetric evolution, as the protein from one frame shows a significantly higher number of substitutions than the protein from the other frame. Interestingly, the most variable protein (often known to interact with host proteins) appeared to be encoded by the de novo frame in all cases examined. These findings suggest that overlapping genes, besides increasing the coding ability of viruses, are also a source of selective protein adaptation. A dataset of 80 pairs of homologous overlapping genes from viruses is examined. Its analysis reveals that half of overlapping genes undergo asymmetric evolution. The most variable gene product is that encoded by the de novo overlapping gene. Overlapping genes evolving asymmetrically are a source of selective protein adaptation.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 11/A, I-43124, Parma, Italy.
13
Mori S, Matsunami M. Signature of positive selection in mitochondrial DNA in Cetartiodactyla. Genes Genet Syst 2018; 93:65-73. [PMID: 29643269 DOI: 10.1266/ggs.17-00015] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 11/23/2022]
Abstract
Acceleration of the amino acid substitution rate is a good indicator of positive selection in adaptive evolutionary changes of functional genes. Genomic information about mammals has become readily available in recent years, as many researchers have attempted to clarify the adaptive evolution of mammals by examining evolutionary rate change based on multiple loci. The order Cetartiodactyla (Artiodactyla and Cetacea) is one of the most diverse orders of mammals. Species in this order are found throughout all continents and seas, except Antarctica, and they exhibit wide variation in morphology and habitat. Here, we focused on the metabolism-related genes of mitochondrial DNA (mtDNA) in species of the order Cetartiodactyla using 191 mtDNA sequences available in databases. Based on comparisons of the dN/dS ratio (ω) in 12 protein-coding genes, ATP8 was shown to have a higher ω value (ω = 0.247) throughout Cetartiodactyla than the other 11 genes (ω < 0.05). In a branch-site analysis of ATP8 sequences, a markedly higher ω value of 0.801 was observed in the ancestral lineage of the clade of Cetacea, which is indicative of adaptive evolution. Through efforts to detect positively selected amino acids, codon positions 52 and 54 of ATP8 were shown to have experienced positive selective pressure during the course of evolution; multiple substitutions have occurred at these sites throughout the cetacean lineage. At position 52, glutamic acid was replaced with asparagine, and, at position 54, lysine was replaced with non-charged amino acids. These sites are conserved in most Artiodactyla. These results imply that the ancestor of cetaceans underwent accelerated amino acid changes in ATP8 and replacements at codons 52 and 54, which adjusted metabolism to adapt to the marine environment.
Affiliation(s)
- Satoko Mori
- Laboratory of Ecology and Genetics, Graduate School of Environmental Science, Hokkaido University
- Masatoshi Matsunami
- Laboratory of Ecology and Genetics, Graduate School of Environmental Science, Hokkaido University; Graduate School of Medicine, University of the Ryukyus
14
Fernandes JD, Faust TB, Strauli NB, Smith C, Crosby DC, Nakamura RL, Hernandez RD, Frankel AD. Functional Segregation of Overlapping Genes in HIV. Cell 2017; 167:1762-1773.e12. [PMID: 27984726 DOI: 10.1016/j.cell.2016.11.031] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 05/22/2016] [Revised: 09/29/2016] [Accepted: 11/15/2016] [Indexed: 11/28/2022]
Abstract
Overlapping genes pose an evolutionary dilemma as one DNA sequence evolves under the selection pressures of multiple proteins. Here, we perform systematic statistical and mutational analyses of the overlapping HIV-1 genes tat and rev and engineer exhaustive libraries of non-overlapped viruses to perform deep mutational scanning of each gene independently. We find a "segregated" organization in which overlapped sites encode functional residues of one gene or the other, but never both. Furthermore, this organization eliminates unfit genotypes, providing a fitness advantage to the population. Our comprehensive analysis reveals the extraordinary manner in which HIV minimizes the constraint of overlapping genes and repurposes that constraint to its own advantage. Thus, overlaps are not just consequences of evolutionary constraints, but rather can provide population fitness advantages.
Affiliation(s)
- Jason D Fernandes
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA; Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, CA 94158, USA
- Tyler B Faust
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA; Tetrad Program, Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Nicolas B Strauli
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA; Biomedical Sciences Graduate Program, University of California San Francisco, San Francisco, CA 94158, USA
- Cynthia Smith
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- David C Crosby
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Robert L Nakamura
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Ryan D Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Alan D Frankel
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA.
15
The combinatorics of overlapping genes. J Theor Biol 2016; 415:90-101. [PMID: 27737786 DOI: 10.1016/j.jtbi.2016.09.018] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 03/15/2016] [Revised: 08/31/2016] [Accepted: 09/22/2016] [Indexed: 11/23/2022]
Abstract
Overlapping genes exist in all domains of life and are much more abundant than expected upon their first discovery in the late 1970s. Assuming that the reference gene is read in frame +0, an overlapping gene can be encoded in two reading frames in the sense strand, denoted by +1 and +2, and in three reading frames in the opposite strand, denoted by -0, -1, and -2. This motivated numerous researchers to study the constraints induced by the genetic code on the various overlapping frames, mostly based on information theory. Our focus in this paper is on the constraints induced on two overlapping genes in terms of amino acids, as well as polypeptides. We show that simple linear constraints bind the amino-acid composition of two proteins encoded by overlapping genes. Novel constraints are revealed when polypeptides are considered, and not just single amino acids. For example, in double-coding sequences with an overlapping reading frame -2, each Tyrosine (denoted as Tyr or Y) in the overlapping frame overlaps a Tyrosine in the reference frame +0 (and reciprocally), whereas specific words (e.g. YY) never occur. We thus distinguish between null constraints (YY = 0 in frame -2) and non-null constraints (Y in frame +0 ⇔ Y in frame -2). Our equivalence-based constraints are symmetrical and thus enable the characterization of the joint composition of overlapping proteins. We describe several formal frameworks and a graph algorithm to characterize and compute these constraints. As expected, the degrees of freedom left by these constraints vary drastically among the different overlapping frames. Interestingly, the biological meaning of constraints induced on two overlapping proteins (hydropathy, forbidden di-peptides, expected overlap length …) is also specific to the reading frame. 
We study the combinatorics of these constraints for overlapping polypeptides of length n, pointing out that, (i) except for frame -2, non-null constraints are deduced from the amino-acid (length = 1) constraints and (ii) null constraints are deduced from the di-peptide (length = 2) constraints. These results yield support for understanding the mechanisms and evolution of overlapping genes, and for developing novel overlapping gene detection methods.
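The tyrosine constraint quoted in this abstract can be checked by brute force. The sketch below is an illustration, not the authors' framework or graph algorithm. It enumerates short nucleotide windows under the standard genetic code and tests (a) whether Tyr in the reference frame coincides with Tyr in an antisense frame and (b) whether the word YY can occur without forcing a stop codon in the other frame. The function names and the `shift` parameter are hypothetical. The mapping between `shift` and the paper's frame labels (-0, -1, -2) is an assumption that has not been checked against the paper; the code only reports which antisense offset shows the equivalence.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMP)[::-1]


def tyr_equivalent(shift):
    """Check the Tyr <=> Tyr constraint for one antisense offset.

    The reference codon occupies window positions 1..3; the antisense codon
    covers positions 1+shift..3+shift on the same strand and is read as its
    reverse complement. Windows containing a stop in either frame are skipped,
    since they cannot lie inside two overlapping coding sequences.
    """
    for w in map("".join, product(BASES, repeat=5)):
        ref = CODE[w[1:4]]
        anti = CODE[revcomp(w[1 + shift:4 + shift])]
        if "*" in (ref, anti):
            continue
        if (ref == "Y") != (anti == "Y"):
            return False
    return True


def yy_possible():
    """Can YY occur in either frame (offset -1) with no stop in the other?"""
    for w in map("".join, product(BASES, repeat=7)):
        ref = CODE[w[1:4]] + CODE[w[4:7]]
        anti = CODE[revcomp(w[0:3])] + CODE[revcomp(w[3:6])]
        if "YY" in (ref, anti) and "*" not in ref + anti:
            return True
    return False


if __name__ == "__main__":
    for shift in (-1, 0, 1):
        print(f"shift {shift:+d}: Tyr equivalence holds = {tyr_equivalent(shift)}")
    print("YY possible without a stop:", yy_possible())
```

Under these assumptions, only one of the three antisense offsets shows the equivalence, and that offset also forbids YY. This is consistent with the abstract's distinction between a non-null constraint (Y ⇔ Y) and a null constraint (YY = 0) that are both specific to one reading frame.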
16
Moyers BA, Zhang J. Evaluating Phylostratigraphic Evidence for Widespread De Novo Gene Birth in Genome Evolution. Mol Biol Evol 2016; 33:1245-56. [PMID: 26758516 DOI: 10.1093/molbev/msw008] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Indexed: 12/19/2022] Open
Abstract
The source of genetic novelty is an area of wide interest and intense investigation. Although gene duplication is conventionally thought to dominate the production of new genes, this view was recently challenged by a proposal of widespread de novo gene origination in eukaryotic evolution. Specifically, distributions of various gene properties such as coding sequence length, expression level, codon usage, and probability of being subject to purifying selection among groups of genes with different estimated ages were reported to support a model in which new protein-coding proto-genes arise from noncoding DNA and gradually integrate into cellular networks. Here we show that the genomic patterns asserted to support widespread de novo gene origination are largely attributable to biases in gene age estimation by phylostratigraphy, because such patterns are also observed in phylostratigraphic analysis of simulated genes bearing identical ages. Furthermore, there is no evidence of purifying selection on very young de novo genes previously claimed to show such signals. Together, these findings are consistent with the prevailing view that de novo gene birth is a relatively minor contributor to new genes in genome evolution. They also illustrate the danger of using phylostratigraphy in the study of new gene origination without considering its inherent bias.
Affiliation(s)
- Bryan A Moyers
- Department of Computational Medicine and Bioinformatics, University of Michigan
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan