1
Roden CA, Gladfelter AS. Experimental Considerations for the Evaluation of Viral Biomolecular Condensates. Annu Rev Virol 2024; 11:105-124. [PMID: 39326881 DOI: 10.1146/annurev-virology-093022-010014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/28/2024]
Abstract
Biomolecular condensates are nonmembrane-bound assemblies of biological polymers such as protein and nucleic acids. An increasingly accepted paradigm across the viral tree of life is (a) that viruses form biomolecular condensates and (b) that the formation is required for the virus. Condensates can promote viral replication by promoting packaging, genome compaction, membrane bending, and co-opting of host translation. This review is primarily concerned with exploring methodologies for assessing virally encoded biomolecular condensates. The goal of this review is to provide an experimental framework for virologists to consider when designing experiments to (a) identify viral condensates and their components, (b) reconstitute condensation cell free from minimal components, (c) ask questions about what conditions lead to condensation, (d) map these questions back to the viral life cycle, and (e) design and test inhibitors/modulators of condensation as potential therapeutics. This experimental framework attempts to integrate virology, cell biology, and biochemistry approaches.
Affiliation(s)
- Christine A Roden
- Department of Cell Biology, Duke University School of Medicine, Durham, North Carolina, USA
- Amy S Gladfelter
- Department of Cell Biology, Duke University School of Medicine, Durham, North Carolina, USA
2
López-Pérez M, Aguirre-Garrido F, Herrera-Zúñiga L, Fernández FJ. Gene as a dynamical notion: An extensive and integrative vision. Redefining the gene concept, from traditional to genic-interaction, as a new dynamical version. Biosystems 2023; 234:105060. [PMID: 37844827 DOI: 10.1016/j.biosystems.2023.105060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 09/08/2023] [Accepted: 10/10/2023] [Indexed: 10/18/2023]
Abstract
The current concept of the gene has been very useful during the 20th and 21st centuries. However, recent advances in molecular biology and bioinformatics, which have further diversified the functional and adaptive profile of genetic information and its integration with cell physiology and environmental response, have drawn attention to additional gene properties beyond the traditional definition. Gene expression is inherently complex, and its adaptive objective can be described by the Tortoise-Hare model, in which two tendencies converge: one focused on rapid adaptation to achieve survival, and the other preventing an over-adaptation effect. In this context, the gene concept must be revised to include these new mechanisms and approaches. In this paper, we propose a new conception of the gene that moves from a static and defined version of hereditary information to a dynamic idea that gives priority to gene interaction (circumscribed to that established between protein-protein, protein-nucleic acid, and nucleic acid-nucleic acid) and the selection it exerts, as the irreducible element that works in a coordinated way in a genomic regulatory network (GRN).
Affiliation(s)
- Marcos López-Pérez
- Environmental Sciences Department, Universidad Autónoma Metropolitana (Lerma Unit) Av. de las Garzas N° 10, Col. El Panteón, Municipio de Lerma de Villada, Estado de México, C.P. 52005, Mexico.
- Félix Aguirre-Garrido
- Environmental Sciences Department, Universidad Autónoma Metropolitana (Lerma Unit) Av. de las Garzas N° 10, Col. El Panteón, Municipio de Lerma de Villada, Estado de México, C.P. 52005, Mexico
- Leonardo Herrera-Zúñiga
- Chemistry Department, Universidad Autónoma Metropolitana (Iztapalapa Unit), C.P. 09340, Mexico City, Mexico
- Francisco J Fernández
- Biotechnology Department, Universidad Autónoma Metropolitana (Iztapalapa Unit), C.P. 09340, Mexico City, Mexico.
3
Bukhnikashvili L. Overlaps Between CDS Regions of Protein-Coding Genes in the Human Genome: A Case Study on the NR1D1-THRA Gene Pair. J Mol Evol 2023; 91:963-975. [PMID: 38006429 DOI: 10.1007/s00239-023-10147-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 11/12/2023] [Indexed: 11/27/2023]
Abstract
For several decades, it has been known that a substantial number of genes within human DNA exhibit overlap; however, the biological and evolutionary significance of these overlaps remains poorly understood. This study focused on investigating specific instances of overlap where the overlapping DNA region encompasses the coding DNA sequences (CDSs) of protein-coding genes. The results revealed that proteins encoded by overlapping CDSs exhibit greater disorder than those from nonoverlapping CDSs. Additionally, these DNA regions were identified as GC-rich. This could be partially attributed to the absence of stop codons from two distinct reading frames rather than one. Furthermore, these regions were found to harbour fewer single-nucleotide polymorphism (SNP) sites, possibly due to constraints arising from the overlapping state where mutations could affect two genes simultaneously. While elucidating these properties, the NR1D1-THRA gene pair emerged as an exceptional case with highly structured proteins and a distinctly conserved sequence across eutherian mammals. Both NR1D1 and THRA are nuclear receptors lacking a ligand-binding domain at their C-terminus, which is the region where these gene pairs overlap. The NR1D1 gene is involved in the regulation of circadian rhythm, while the THRA gene encodes a thyroid hormone receptor, and both play crucial roles in various physiological processes. This study suggests that, in addition to their well-established functions, the specifically overlapping CDS regions of these genes may encode protein segments with additional, yet undiscovered, biological roles.
4
Mavaie P, Holder L, Skinner M. Identifying unique exposure-specific transgenerational differentially DNA methylated region epimutations in the genome using hybrid deep learning prediction models. Environmental Epigenetics 2023; 9:dvad007. [PMID: 38130880 PMCID: PMC10735314 DOI: 10.1093/eep/dvad007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 10/04/2023] [Accepted: 11/28/2023] [Indexed: 12/23/2023]
Abstract
Exposure to environmental toxicants can lead to epimutations in the genome and an increase in differential DNA methylated regions (DMRs) that have been linked to increased susceptibility to various diseases. However, the unique effect of particular toxicants on the genome in terms of leading to unique DMRs for the toxicants has been less studied. One hurdle to such studies is the low number of observed DMRs per toxicant. To address this hurdle, a previously validated hybrid deep-learning cross-exposure prediction model is trained per exposure and used to predict exposure-specific DMRs in the genome. Given these predicted exposure-specific DMRs, a set of unique DMRs per exposure can be identified. Analysis of these unique DMRs through visualization, DNA sequence motif matching, and gene association reveals known and unknown links between individual exposures and their unique effects on the genome. The results indicate the potential ability to define exposure-specific epigenetic markers in the genome and the potential relative impact of different exposures. Therefore, a computational approach to predict exposure-specific transgenerational epimutations was developed, which supported the exposure specificity of ancestral toxicant actions and provided epigenome information on the DMR sites predicted.
Affiliation(s)
- Pegah Mavaie
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164-2752, USA
- Lawrence Holder
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164-2752, USA
- Michael Skinner
- School of Biological Sciences, Washington State University, Pullman, WA 99164-4236, USA
5
Zhang Y, Liang X, Zhao M, Qi T, Guo H, Zhao J, Zhao J, Zhan G, Kang Z, Zheng L. A novel ambigrammatic mycovirus, PsV5, works hand in glove with wheat stripe rust fungus to facilitate infection. Plant Communications 2023; 4:100505. [PMID: 36527233 DOI: 10.1016/j.xplc.2022.100505] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/14/2022] [Revised: 11/16/2022] [Accepted: 12/14/2022] [Indexed: 05/11/2023]
Abstract
Here we describe a novel narnavirus, Puccinia striiformis virus 5 (PsV5), from the devastating wheat stripe rust fungus P. striiformis f. sp. tritici (Pst). The genome of PsV5 contains two predicted open reading frames (ORFs) that largely overlap on reverse strands: an RNA-dependent RNA polymerase (RdRp) and a reverse-frame ORF (rORF) with unknown function. Protein translations of both ORFs were demonstrated by immune technology. Transgenic wheat lines overexpressing PsV5 (RdRp-rORF), RdRp ORF, or rORF were more susceptible to Pst infection, whereas PsV5-RNA interference (RNAi) lines were more resistant. Overexpression of PsV5 (RdRp-rORF), RdRp ORF, or rORF in Fusarium graminearum also boosted fungal virulence. We thus report a novel ambigrammatic mycovirus that promotes the virulence of its fungal host. The results are a significant addition to our understanding of virosphere diversity and offer insights for sustainable wheat rust disease control.
Affiliation(s)
- Yanhui Zhang
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Xiaofei Liang
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Mengxin Zhao
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Tuo Qi
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, State Key Laboratory of Hybrid Rice, Key Laboratory of Major Crop Diseases & Collaborative Innovation Center for Hybrid Rice in Yangtze River Basin, Rice Research Institute, Sichuan Agricultural University at Wenjiang, Chengdu, Sichuan 611130, China
- Hualong Guo
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Jing Zhao
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Jie Zhao
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Gangming Zhan
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Zhensheng Kang
- State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China.
- Li Zheng
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya 572025, China; Key Laboratory of Green Prevention and Control of Tropical Plant Diseases and Pests, Ministry of Education and School of Plant Protection, Hainan University, Haikou, Hainan 570228, China.
6
Kreitmeier M, Ardern Z, Abele M, Ludwig C, Scherer S, Neuhaus K. Spotlight on alternative frame coding: Two long overlapping genes in Pseudomonas aeruginosa are translated and under purifying selection. iScience 2022; 25:103844. [PMID: 35198897 PMCID: PMC8850804 DOI: 10.1016/j.isci.2022.103844] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/10/2021] [Revised: 10/14/2021] [Accepted: 01/27/2022] [Indexed: 12/13/2022]
Abstract
The existence of overlapping genes (OLGs) with significant coding overlaps revolutionizes our understanding of genomic complexity. We report two exceptionally long (957 nt and 1536 nt), evolutionarily novel, translated antisense open reading frames (ORFs) embedded within annotated genes in the pathogenic Gram-negative bacterium Pseudomonas aeruginosa. Both OLG pairs show sequence features consistent with being genes and transcriptional signals in RNA sequencing. Translation of both OLGs was confirmed by ribosome profiling and mass spectrometry. Quantitative proteomics of samples taken during different phases of growth revealed regulation of protein abundances, implying biological functionality. Both OLGs are taxonomically restricted, and likely arose by overprinting within the genus. Evidence for purifying selection further supports functionality. The OLGs reported here, designated olg1 and olg2, are the longest yet proposed in prokaryotes and are among the best attested in terms of translation and evolutionary constraint. These results highlight a potentially large unexplored dimension of prokaryotic genomes.
Affiliation(s)
- Michaela Kreitmeier
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Zachary Ardern
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Miriam Abele
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technische Universität München, Gregor-Mendel-Strasse 4, 85354 Freising, Germany
- Christina Ludwig
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technische Universität München, Gregor-Mendel-Strasse 4, 85354 Freising, Germany
- Siegfried Scherer
- Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Klaus Neuhaus
- Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
7
Computational methods for inferring location and genealogy of overlapping genes in virus genomes: approaches and applications. Curr Opin Virol 2021; 52:1-8. [PMID: 34798370 PMCID: PMC8594276 DOI: 10.1016/j.coviro.2021.10.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/04/2021] [Revised: 10/21/2021] [Accepted: 10/22/2021] [Indexed: 12/02/2022]
Abstract
Viruses may evolve to increase the amount of encoded genetic information by means of overlapping genes, which utilize several reading frames. Such overlapping genes may be especially impactful for genomes of small size, often serving as a source of novel accessory proteins, some of which play a crucial role in viral pathogenicity or in promoting the systemic spread of the virus. Diverse genome-based metrics were proposed to facilitate recognition of overlapping genes that otherwise may be overlooked during genome annotation. They can detect the atypical codon bias associated with the overlap (e.g. a statistically significant reduction in variability at synonymous sites) or other sequence-composition features peculiar to overlapping genes. In this review, I compare nine computational methods, discuss their strengths and limitations, and survey how they were applied to detect candidate overlapping genes in the genome of SARS-CoV-2, the etiological agent of the COVID-19 pandemic.
8
Ramos-González PL, Pons T, Chabi-Jesus C, Arena GD, Freitas-Astua J. Poorly Conserved P15 Proteins of Cileviruses Retain Elements of Common Ancestry and Putative Functionality: A Theoretical Assessment on the Evolution of Cilevirus Genomes. Frontiers in Plant Science 2021; 12:771983. [PMID: 34804105 PMCID: PMC8602818 DOI: 10.3389/fpls.2021.771983] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/07/2021] [Accepted: 10/18/2021] [Indexed: 06/13/2023]
Abstract
The genus Cilevirus groups enveloped single-stranded (+) RNA virus members of the family Kitaviridae, order Martellivirales. Proteins P15, scarcely conserved polypeptides encoded by cileviruses, have no apparent homologs in public databases. Accordingly, the open reading frames (ORFs) p15, located at the 5'-end of the viral RNA2 molecules, are considered orphan genes (ORFans). In this study, we have delved into ORFs p15 and the relatively poorly understood biochemical properties of the proteins P15 to posit their importance for viruses across the genus and theorize on their origin. We detected that the ORFs p15 are under purifying selection and that, in some viral strains, the use of synonymous codons is biased, which might be a sign of adaptation to their plant hosts. Despite the high amino acid sequence divergence, proteins P15 show the conserved motif [FY]-L-x(3)-[FL]-H-x-x-[LIV]-S-C-x-C-x(2)-C-x-G-x-C, which occurs exclusively in members of this protein family. Proteins P15 also show a common predicted 3D structure that resembles the helical scaffold of the protein ORF49 encoded by radinoviruses and the phosphoprotein C-terminal domain of mononegavirids. Based on the 3D structural similarities of P15, we suggest elements of common ancestry, conserved functionality, and relevant amino acid residues. We conclude by postulating a plausible evolutionary trajectory of ORFans p15 and the 5'-end of the RNA2 of cileviruses considering both protein fold superpositions and comparative genomic analyses with the closest kitaviruses, negeviruses, nege/kita-like viruses, and unrelated viruses that share the ecological niches of cileviruses.
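The conserved motif quoted in this abstract is written in PROSITE-style pattern syntax, which maps directly onto a regular expression for scanning protein sequences. The following is a minimal Python sketch, not the authors' pipeline: the function names `prosite_to_regex` and `find_motif` are hypothetical, and the converter handles only the element types that occur in this one pattern (single residues, bracketed alternatives, `x`, and fixed repeats such as `x(3)`).

```python
import re

# Motif exactly as quoted in the abstract (PROSITE-style syntax).
P15_MOTIF = "[FY]-L-x(3)-[FL]-H-x-x-[LIV]-S-C-x-C-x(2)-C-x-G-x-C"


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern into a regex string.

    Assumption: only 'A', '[ABC]', 'x', and fixed repeats like 'x(3)'
    are supported; ranges and exclusions are deliberately left out.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        residue, count = m.groups()
        token = "." if residue == "x" else residue  # 'x' means any residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


P15_REGEX = re.compile(prosite_to_regex(P15_MOTIF))


def find_motif(protein: str):
    """Return (start index, matched segment) for each motif hit."""
    return [(m.start(), m.group()) for m in P15_REGEX.finditer(protein)]
```

On an invented test sequence, `find_motif("MAFLAAALHAAISCACAACAGACKK")` reports a single hit starting at index 2. Real P15 sequences would have to be obtained separately; none are given in the abstract.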
Affiliation(s)
- Pedro L. Ramos-González
- Laboratório de Biologia Molecular Aplicada, Instituto Biológico de São Paulo, São Paulo, Brazil
- Tirso Pons
- National Centre for Biotechnology (CNB-CSIC), Madrid, Spain
- Camila Chabi-Jesus
- Laboratório de Biologia Molecular Aplicada, Instituto Biológico de São Paulo, São Paulo, Brazil
- Escola Superior de Agricultura Luiz de Queiroz (ESALQ), Universidade de São Paulo, Piracicaba, Brazil
- Gabriella Dias Arena
- Laboratório de Biologia Molecular Aplicada, Instituto Biológico de São Paulo, São Paulo, Brazil
- Juliana Freitas-Astua
- Laboratório de Biologia Molecular Aplicada, Instituto Biológico de São Paulo, São Paulo, Brazil
- Embrapa Mandioca e Fruticultura, Cruz das Almas, Brazil
9
Pavesi A. Prediction of two novel overlapping ORFs in the genome of SARS-CoV-2. Virology 2021; 562:149-157. [PMID: 34339929 PMCID: PMC8317007 DOI: 10.1016/j.virol.2021.07.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/15/2021] [Revised: 07/21/2021] [Accepted: 07/21/2021] [Indexed: 10/25/2022]
Abstract
Six candidate overlapping genes have been detected in SARS-CoV-2, yet current methods struggle to detect overlapping genes that recently originated. However, such genes might encode proteins beneficial to the virus, and provide a model system to understand gene birth. To complement existing detection methods, I first demonstrated that selection pressure to avoid stop codons in alternative reading frames is a driving force in the origin and retention of overlapping genes. I then built a detection method, CodScr, based on this selection pressure. Finally, I combined CodScr with methods that detect other properties of overlapping genes, such as a biased nucleotide and amino acid composition. I detected two novel ORFs (ORF-Sh and ORF-Mh), overlapping the spike and membrane genes respectively, which are under selection pressure and may be beneficial to SARS-CoV-2. ORF-Sh and ORF-Mh are present, as ORFs uninterrupted by stop codons, in 100% and 95% of the SARS-CoV-2 genomes, respectively.
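The idea that an alternative reading frame can stay free of stop codons can be illustrated with a very small scan. The sketch below is a hypothetical toy in Python and is not CodScr, whose scoring is not described in this abstract: it only measures, for a chosen forward frame, the longest run of consecutive codons that contains no stop codon. The function name and the example sequence are assumptions made for illustration.

```python
# Standard stop codons on the DNA coding strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_stop_free_run(seq: str, frame: int) -> int:
    """Longest run of consecutive stop-free codons in one forward frame.

    `frame` is 0, 1, or 2: the offset of the first codon from the
    start of `seq`. The result is counted in codons, not nucleotides.
    """
    seq = seq.upper()
    best = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0  # a stop codon ends the current open stretch
        else:
            current += 1
            best = max(best, current)
    return best


if __name__ == "__main__":
    toy = "ATGAAATAGCCC"  # invented sequence, not from SARS-CoV-2
    for frame in range(3):
        print(frame, longest_stop_free_run(toy, frame))
```

A frame whose stop-free run spans an entire annotated gene is only a candidate overlapping ORF; the abstract makes clear that additional evidence, such as compositional bias and selection pressure, is combined with this signal.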
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area Delle Scienze 23/A, I-43124, Parma, Italy.
10
Abstract
Narnaviruses are RNA viruses detected in diverse fungi, plants, protists, arthropods, and nematodes. Though initially described as simple single-gene nonsegmented viruses encoding RNA-dependent RNA polymerase (RdRp), a subset of narnaviruses referred to as "ambigrammatic" harbor a unique genomic configuration consisting of overlapping open reading frames (ORFs) encoded on opposite strands. Phylogenetic analysis supports selection to maintain this unusual genome organization, but functional investigations are lacking. Here, we establish the mosquito-infecting Culex narnavirus 1 (CxNV1) as a model to investigate the functional role of overlapping ORFs in narnavirus replication. In CxNV1, a reverse ORF without homology to known proteins covers nearly the entire 3.2-kb segment encoding the RdRp. Additionally, two opposing and nearly completely overlapping novel ORFs are found on the second putative CxNV1 segment, the 0.8-kb "Robin" RNA. We developed a system to launch CxNV1 in a naive mosquito cell line and then showed that functional RdRp is required for persistence of both segments, and an intact reverse ORF is required on the RdRp segment for persistence. Mass spectrometry of persistently CxNV1-infected cells provided evidence for translation of this reverse ORF. Finally, ribosome profiling yielded a striking pattern of footprints for all four CxNV1 RNA strands that was distinct from actively translating ribosomes on host mRNA or coinfecting RNA viruses. Taken together, these data raise the possibility that the process of translation itself is important for persistence of ambigrammatic narnaviruses, potentially by protecting viral RNA with ribosomes, thus suggesting a heretofore undescribed viral tactic for replication and transmission.
IMPORTANCE Fundamental to our understanding of RNA viruses is a description of which strand(s) of RNA are transmitted as the viral genome relative to which encode the viral proteins. Ambigrammatic narnaviruses break the mold. These viruses, found broadly in fungi, plants, and insects, have the unique feature of two overlapping genes encoded on opposite strands, comprising nearly the full length of the viral genome. Such extensive overlap is not seen in other RNA viruses and comes at the cost of reduced evolutionary flexibility in the sequence. The present study is motivated by investigating the benefits which balance that cost. We show for the first time a functional requirement for the ambigrammatic genome configuration in Culex narnavirus 1, which suggests a model for how translation of both strands might benefit this virus. Our work highlights a new blueprint for viral persistence, distinct from strategies defined by canonical definitions of the coding strand.
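An ambigrammatic configuration means that a reading frame is open on the genomic strand and on its reverse complement at the same time. The Python sketch below shows one simple way to check that property for a DNA-alphabet sequence. It is an illustration under stated assumptions, not the analysis used in the study: the function names are hypothetical, the input is assumed to be written with A, C, G, and T (an RNA genome would first be transcribed to that alphabet), and "open" here means only that no stop codon appears anywhere in the frame.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_stop(seq: str, frame: int) -> bool:
    """True if any codon in the given frame (0, 1, 2) is a stop codon."""
    return any(seq[i:i + 3] in STOP_CODONS
               for i in range(frame, len(seq) - 2, 3))


def open_frames_both_strands(seq: str) -> dict:
    """List the stop-free frames on the forward and reverse strands."""
    forward = seq.upper()
    reverse = reverse_complement(seq)
    return {
        "forward": [f for f in range(3) if not has_stop(forward, f)],
        "reverse": [f for f in range(3) if not has_stop(reverse, f)],
    }
```

For the invented six-nucleotide input `"TTATGG"`, all three forward frames are stop-free, while reverse frame 0 reads `CCA TAA` and is therefore closed. A sequence would be a candidate ambigrammatic segment only if at least one frame on each strand stays open across nearly its full length.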
11
Pavesi A. Origin, Evolution and Stability of Overlapping Genes in Viruses: A Systematic Review. Genes (Basel) 2021; 12:genes12060809. [PMID: 34073395 PMCID: PMC8227390 DOI: 10.3390/genes12060809] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/06/2021] [Revised: 05/22/2021] [Accepted: 05/24/2021] [Indexed: 12/11/2022]
Abstract
During their long evolutionary history viruses generated many proteins de novo by a mechanism called “overprinting”. Overprinting is a process in which critical nucleotide substitutions in a pre-existing gene can induce the expression of a novel protein by translation of an alternative open reading frame (ORF). Overlapping genes represent an intriguing example of adaptive conflict, because they simultaneously encode two proteins whose freedom to change is constrained by each other. However, overlapping genes are also a source of genetic novelties, as the constraints under which alternative ORFs evolve can give rise to proteins with unusual sequence properties, most importantly the potential for novel functions. Starting with the discovery of overlapping genes in phages infecting Escherichia coli, this review covers a range of studies dealing with detection of overlapping genes in small eukaryotic viruses (genomic length below 30 kb) and recognition of their critical role in the evolution of pathogenicity. Origin of overlapping genes, what factors favor their birth and retention, and how they manage their inherent adaptive conflict are extensively reviewed. Special attention is paid to the assembly of overlapping genes into ad hoc databases, suitable for future studies, and to the development of statistical methods for exploring viral genome sequences in search of undiscovered overlaps.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 23/A, I-43124 Parma, Italy
12
Nelson CW, Ardern Z, Wei X. OLGenie: Estimating Natural Selection to Predict Functional Overlapping Genes. Mol Biol Evol 2021; 37:2440-2449. [PMID: 32243542 PMCID: PMC7531306 DOI: 10.1093/molbev/msaa087] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022]
Abstract
Purifying (negative) natural selection is a hallmark of functional biological sequences, and can be detected in protein-coding genes using the ratio of nonsynonymous to synonymous substitutions per site (dN/dS). However, when two genes overlap the same nucleotide sites in different frames, synonymous changes in one gene may be nonsynonymous in the other, perturbing dN/dS. Thus, scalable methods are needed to estimate functional constraint specifically for overlapping genes (OLGs). We propose OLGenie, which implements a modification of the Wei–Zhang method. Assessment with simulations and controls from viral genomes (58 OLGs and 176 non-OLGs) demonstrates low false-positive rates and good discriminatory ability in differentiating true OLGs from non-OLGs. We also apply OLGenie to the unresolved case of HIV-1’s putative antisense protein gene, showing significant purifying selection. OLGenie can be used to study known OLGs and to predict new OLGs in genome annotation. Software and example data are freely available at https://github.com/chasewnelson/OLGenie (last accessed April 10, 2020).
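The confound described in this abstract, that a change synonymous in one gene may be nonsynonymous in the overlapping gene, can be shown on a single substitution. The Python sketch below is only an illustration of that point; it does not implement OLGenie or the Wei–Zhang method. The helper names and the ten-nucleotide sequence are invented, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of `seq` starting at offset `frame`."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(frame, len(seq) - 2, 3))


def classify_change(seq: str, pos: int, new_base: str, frame: int) -> str:
    """Label a single-nucleotide change relative to one reading frame."""
    mutated = seq[:pos] + new_base + seq[pos + 1:]
    same = translate(seq, frame) == translate(mutated, frame)
    return "synonymous" if same else "nonsynonymous"


if __name__ == "__main__":
    toy = "ATGCTTGAAC"  # invented sequence with two overlapping frames
    # T->C at index 5: CTT->CTC (Leu->Leu) in frame 0,
    # but TTG->TCG (Leu->Ser) in frame 1.
    print(classify_change(toy, 5, "C", frame=0))
    print(classify_change(toy, 5, "C", frame=1))
```

Because the same site is a third codon position in one frame and a second codon position in the other, counting it as a neutral synonymous site for either gene alone would misstate dN/dS, which is the motivation the abstract gives for an overlap-aware estimator.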
Affiliation(s)
- Chase W Nelson
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY; Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Zachary Ardern
- Microbial Ecology, ZIEL-Institute for Food & Health, Technische Universität München, Freising, Germany
- Xinzhu Wei
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI; Department of Integrative Biology and Statistics, University of California, Berkeley, CA
13
Jungreis I, Nelson CW, Ardern Z, Finkel Y, Krogan NJ, Sato K, Ziebuhr J, Stern-Ginossar N, Pavesi A, Firth AE, Gorbalenya AE, Kellis M. Conflicting and ambiguous names of overlapping ORFs in the SARS-CoV-2 genome: A homology-based resolution. Virology 2021; 558:145-151. [PMID: 33774510 PMCID: PMC7967279 DOI: 10.1016/j.virol.2021.02.013] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 11/26/2020] [Revised: 02/21/2021] [Accepted: 02/22/2021] [Indexed: 12/14/2022]
Abstract
At least six small alternative-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for these ORFs and their shorter isoforms, developed in consultation with the Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. We recommend calling the 39 codon Spike-overlapping ORF ORF2b; the 41, 57, and 22 codon ORF3a-overlapping ORFs ORF3c, ORF3d, and ORF3b; the 33 codon ORF3d isoform ORF3d-2; and the 97 and 73 codon Nucleocapsid-overlapping ORFs ORF9b and ORF9c. Finally, we document conflicting usage of the name ORF3b in 32 studies, and consequent erroneous inferences, stressing the importance of reserving identical names for homologs. We recommend that authors referring to these ORFs provide lengths and coordinates to minimize ambiguity caused by prior usage of alternative names.
Affiliation(s)
- Irwin Jungreis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
- Chase W Nelson
- Biodiversity Research Center, Academia Sinica, Taipei, 115, Taiwan; Institute for Comparative Genomics, American Museum of Natural History, New York City, NY, 10024, USA
- Zachary Ardern
- Chair of Microbial Ecology, Technical University of Munich, 85354, Germany
- Yaara Finkel
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 76100, Israel
- Nevan J Krogan
- Quantitative Biosciences Institute (QBI), University of California, San Francisco, CA, 94158, USA; Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA, 94158, USA; J. David Gladstone Institutes, San Francisco, CA, 94158, USA
- Kei Sato
- Division of Systems Virology, Department of Infectious Disease Control, Institute of Medical Science, The University of Tokyo, 1088639, Tokyo, Japan
- John Ziebuhr
- Institute of Medical Virology, Justus Liebig University Giessen, 35392, Giessen, Germany
- Noam Stern-Ginossar
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 76100, Israel
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Italy
- Andrew E Firth
- Division of Virology, Department of Pathology, Addenbrooke's Hospital, University of Cambridge, Cambridge, UK
- Alexander E Gorbalenya
- Department of Medical Microbiology, Leiden University Medical Center, 2300 RC, Leiden, the Netherlands; Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119899, Moscow, Russia
- Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
14
Tan X, Letendre JH, Collins JJ, Wong WW. Synthetic biology in the clinic: engineering vaccines, diagnostics, and therapeutics. Cell 2021; 184:881-898. [PMID: 33571426 PMCID: PMC7897318 DOI: 10.1016/j.cell.2021.01.017] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 11/17/2020] [Revised: 01/12/2021] [Accepted: 01/13/2021] [Indexed: 12/17/2022]
Abstract
Synthetic biology is a design-driven discipline centered on engineering novel biological functions through the discovery, characterization, and repurposing of molecular parts. Several synthetic biological solutions to critical biomedical problems are on the verge of widespread adoption and demonstrate the burgeoning maturation of the field. Here, we highlight applications of synthetic biology in vaccine development, molecular diagnostics, and cell-based therapeutics, emphasizing technologies approved for clinical use or in active clinical trials. We conclude by drawing attention to recent innovations in synthetic biology that are likely to have a significant impact on future applications in biomedicine.
Affiliation(s)
- Xiao Tan
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Division of Gastroenterology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA; Harvard Medical School, 25 Shattuck St., Boston, MA 02115, USA; Institute for Medical Engineering and Science, MIT, Cambridge, MA 02139, USA
- Justin H Letendre
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA; Biological Design Center, Boston University, Boston, MA 02215, USA
- James J Collins
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Institute for Medical Engineering and Science, MIT, Cambridge, MA 02139, USA; Department of Biological Engineering, MIT, Cambridge, MA 02139, USA; Synthetic Biology Center, MIT, 77 Massachusetts Ave., Cambridge, MA 02139, USA; Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Wilson W Wong
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA; Biological Design Center, Boston University, Boston, MA 02215, USA
15
Nelson CW, Ardern Z, Goldberg TL, Meng C, Kuo CH, Ludwig C, Kolokotronis SO, Wei X. Dynamically evolving novel overlapping gene as a factor in the SARS-CoV-2 pandemic. eLife 2020; 9:e59633. [PMID: 33001029 PMCID: PMC7655111 DOI: 10.7554/elife.59633] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 06/03/2020] [Accepted: 09/30/2020] [Indexed: 12/11/2022] Open
Abstract
Understanding the emergence of novel viruses requires an accurate and comprehensive annotation of their genomes. Overlapping genes (OLGs) are common in viruses and have been associated with pandemics but are still widely overlooked. We identify and characterize ORF3d, a novel OLG in SARS-CoV-2 that is also present in Guangxi pangolin-CoVs but not other closely related pangolin-CoVs or bat-CoVs. We then document evidence of ORF3d translation, characterize its protein sequence, and conduct an evolutionary analysis at three levels: between taxa (21 members of Severe acute respiratory syndrome-related coronavirus), between human hosts (3978 SARS-CoV-2 consensus sequences), and within human hosts (401 deeply sequenced SARS-CoV-2 samples). ORF3d has been independently identified and shown to elicit a strong antibody response in COVID-19 patients. However, it has been misclassified as the unrelated gene ORF3b, leading to confusion. Our results liken ORF3d to other accessory genes in emerging viruses and highlight the importance of OLGs.
MESH Headings
- Amino Acid Sequence
- Animals
- Antibodies, Viral/immunology
- Antibody Specificity
- Antigens, Viral/biosynthesis
- Antigens, Viral/genetics
- Antigens, Viral/immunology
- Betacoronavirus/genetics
- Betacoronavirus/pathogenicity
- Betacoronavirus/physiology
- COVID-19
- China/epidemiology
- Chiroptera/virology
- Coronavirus/genetics
- Coronavirus Infections/epidemiology
- Coronavirus Infections/virology
- Epitopes/genetics
- Epitopes/immunology
- Europe/epidemiology
- Eutheria/virology
- Evolution, Molecular
- Gene Expression Regulation, Viral
- Genes, Overlapping
- Genes, Viral
- Genetic Variation
- Haplotypes/genetics
- Host Specificity/genetics
- Humans
- Models, Molecular
- Mutation
- Open Reading Frames/genetics
- Pandemics
- Phylogeny
- Pneumonia, Viral/epidemiology
- Pneumonia, Viral/virology
- Protein Biosynthesis
- Protein Conformation
- RNA, Viral/genetics
- SARS-CoV-2
- Sequence Alignment
- Sequence Homology, Nucleic Acid
- Viral Proteins/genetics
- Viral Proteins/immunology
Affiliation(s)
- Chase W Nelson
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Institute for Comparative Genomics, American Museum of Natural History, New York, United States
- Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Tony L Goldberg
- Department of Pathobiological Sciences, University of Wisconsin-Madison, Madison, United States
- Global Health Institute, University of Wisconsin-Madison, Madison, United States
- Chen Meng
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich, Freising, Germany
- Chen-Hao Kuo
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Christina Ludwig
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich, Freising, Germany
- Sergios-Orestis Kolokotronis
- Institute for Comparative Genomics, American Museum of Natural History, New York, United States
- Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, United States
- Institute for Genomic Health, SUNY Downstate Health Sciences University, Brooklyn, United States
- Division of Infectious Diseases, Department of Medicine, SUNY Downstate Health Sciences University, Brooklyn, United States
- Xinzhu Wei
- Departments of Integrative Biology and Statistics, University of California, Berkeley, Berkeley, United States
- Departments of Computer Science, Human Genetics, and Computational Medicine, University of California, Los Angeles, Los Angeles, United States
16
Michel CJ, Mayer C, Poch O, Thompson JD. Characterization of accessory genes in coronavirus genomes. Virol J 2020; 17:131. [PMID: 32854725 PMCID: PMC7450977 DOI: 10.1186/s12985-020-01402-1] [Citation(s) in RCA: 110] [Impact Index Per Article: 27.5] [Received: 06/03/2020] [Accepted: 08/16/2020] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND: COVID-19 is caused by the SARS-CoV-2 virus, a novel member of the coronavirus (CoV) family. CoV genomes code for an ORF1a/ORF1ab polyprotein and four structural proteins widely studied as major drug targets. The genomes also contain a variable number of open reading frames (ORFs) coding for accessory proteins that are not essential for virus replication, but appear to have a role in pathogenesis. The accessory proteins have been less well characterized and are difficult to predict by classical bioinformatics methods.
METHODS: We propose a computational tool, GOFIX, to characterize potential ORFs in virus genomes. In particular, ORF coding potential is estimated by searching for enrichment in motifs of the X circular code, which is known to be over-represented in the reading frames of viral genes.
RESULTS: We applied GOFIX to study the SARS-CoV-2 and related genomes including SARS-CoV and SARS-like viruses from bat, civet and pangolin hosts, focusing on the accessory proteins. Our analysis provides evidence supporting the presence of overlapping ORFs 7b, 9b and 9c in all the genomes and thus helps to resolve some differences in current genome annotations. In contrast, we predict that ORF3b is not functional in all genomes. Novel putative ORFs were also predicted, including a truncated form of the ORF10 previously identified in SARS-CoV-2 and a little-known ORF overlapping the Spike protein in Civet-CoV and SARS-CoV.
CONCLUSIONS: Our findings contribute to characterizing sequence properties of accessory genes of SARS coronaviruses, and especially the newly acquired genes making use of overlapping reading frames.
Affiliation(s)
- Christian Jean Michel
- Laboratoire ICube, Department of Computer Science, CNRS, University of Strasbourg, F-67412 Strasbourg, France
- Claudine Mayer
- Laboratoire ICube, Department of Computer Science, CNRS, University of Strasbourg, F-67412 Strasbourg, France
- Unité de Microbiologie Structurale, Institut Pasteur, CNRS UMR 3528, 75724 Paris Cedex 15, France
- Université Paris Diderot, Sorbonne Paris Cité, 75724 Paris Cedex 15, France
- Olivier Poch
- Laboratoire ICube, Department of Computer Science, CNRS, University of Strasbourg, F-67412 Strasbourg, France
- Julie Dawn Thompson
- Laboratoire ICube, Department of Computer Science, CNRS, University of Strasbourg, F-67412 Strasbourg, France
17
Pavesi A. New insights into the evolutionary features of viral overlapping genes by discriminant analysis. Virology 2020; 546:51-66. [PMID: 32452417 PMCID: PMC7157939 DOI: 10.1016/j.virol.2020.03.007] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 03/20/2020] [Accepted: 03/29/2020] [Indexed: 12/18/2022]
Abstract
Overlapping genes originate by a mechanism of overprinting, in which nucleotide substitutions in a pre-existing frame induce the expression of a de novo protein from an alternative frame. In this study, I assembled a dataset of 319 viral overlapping genes, which included 82 overlaps whose expression is experimentally known and the respective 237 homologs. Principal component analysis revealed that overlapping genes have a common pattern of nucleotide and amino acid composition. Discriminant analysis separated overlapping from non-overlapping genes with an accuracy of 97%. When applied to overlapping genes with known genealogy, it separated ancestral from de novo frames with an accuracy close to 100%. This high discriminant power was crucial to computationally design variants of de novo viral proteins known to possess selective anticancer toxicity (apoptin) or protection against neurodegeneration (X protein), as well as to detect two new potential overlapping genes in the genome of the new coronavirus SARS-CoV-2.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area Delle Scienze 23/A, I-43124, Parma, Italy.
18
Dinan AM, Lukhovitskaya NI, Olendraite I, Firth AE. A case for a negative-strand coding sequence in a group of positive-sense RNA viruses. Virus Evol 2020; 6:veaa007. [PMID: 32064120 PMCID: PMC7010960 DOI: 10.1093/ve/veaa007] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/13/2022] Open
Abstract
Positive-sense single-stranded RNA viruses form the largest and most diverse group of eukaryote-infecting viruses. Their genomes comprise one or more segments of coding-sense RNA that function directly as messenger RNAs upon release into the cytoplasm of infected cells. Positive-sense RNA viruses are generally accepted to encode proteins solely on the positive strand. However, we previously identified a surprisingly long (∼1,000-codon) open reading frame (ORF) on the negative strand of some members of the family Narnaviridae which, together with RNA bacteriophages of the family Leviviridae, form a sister group to all other positive-sense RNA viruses. Here, we completed the genomes of three mosquito-associated narnaviruses, all of which have the long reverse-frame ORF. We systematically identified narnaviral sequences in public data sets from a wide range of sources, including arthropod, fungal, and plant transcriptomic data sets. Long reverse-frame ORFs are widespread in one clade of narnaviruses, where they frequently occupy >95 per cent of the genome. The reverse-frame ORFs correspond to a specific avoidance of CUA, UUA, and UCA codons (i.e. stop codon reverse complements) in the forward-frame RNA-dependent RNA polymerase ORF. However, absence of these codons cannot be explained by other factors such as inability to decode these codons or GC3 bias. Together with other analyses, we provide the strongest evidence yet of coding capacity on the negative strand of a positive-sense RNA virus. As these ORFs comprise some of the longest known overlapping genes, their study may be of broad relevance to understanding overlapping gene evolution and de novo origin of genes.
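The codon relationship mentioned in this abstract can be checked with a few lines of Python. The sketch below is illustrative only and is not taken from the cited study: the function names and the toy sequence are hypothetical, and it assumes the reverse frame is codon-aligned with the forward frame (sequence length a multiple of three). It derives the three forward codons whose reverse complements are stop codons and shows that each occurrence in the forward frame corresponds to a stop in the aligned reverse frame.

```python
# Illustrative sketch; names and the toy sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGU", "UGCA")
STOP_CODONS = ("UAA", "UAG", "UGA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def codons(rna: str, offset: int = 0) -> list[str]:
    """Split an RNA string into complete codons starting at `offset`."""
    return [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]


# Forward-frame codons whose reverse complement is a stop codon.
FORBIDDEN_FORWARD = {reverse_complement(stop) for stop in STOP_CODONS}


def reverse_frame_is_open(forward_orf: str) -> bool:
    """True if the codon-aligned reverse frame contains no stop codon.

    Assumes len(forward_orf) is a multiple of three, so that codon
    boundaries coincide on both strands.
    """
    return not any(c in FORBIDDEN_FORWARD for c in codons(forward_orf))


toy_orf = "AUGGCUCUAGAAUUAUCAGGC"  # made-up 7-codon sequence, not real data
forward_hits = [c for c in codons(toy_orf) if c in FORBIDDEN_FORWARD]
reverse_stops = [c for c in codons(reverse_complement(toy_orf)) if c in STOP_CODONS]

print(sorted(FORBIDDEN_FORWARD))  # the three avoided forward codons
print(forward_hits, reverse_stops)
```

In this toy sequence each avoided forward codon produces exactly one stop codon on the reverse strand, which is why a long reverse-frame ORF requires the forward ORF to avoid all three.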
Affiliation(s)
- Adam M Dinan
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Nina I Lukhovitskaya
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Ingrida Olendraite
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Andrew E Firth
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
19
DeRisi JL, Huber G, Kistler A, Retallack H, Wilkinson M, Yllanes D. An exploration of ambigrammatic sequences in narnaviruses. Sci Rep 2019; 9:17982. [PMID: 31784609 PMCID: PMC6884476 DOI: 10.1038/s41598-019-54181-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 09/26/2019] [Accepted: 11/11/2019] [Indexed: 11/09/2022] Open
Abstract
Narnaviruses have been described as positive-sense RNA viruses with a remarkably simple genome of ~3 kb, encoding only a highly conserved RNA-dependent RNA polymerase (RdRp). Many narnaviruses, however, are 'ambigrammatic' and harbour an additional uninterrupted open reading frame (ORF) covering almost the entire length of the reverse complement strand. No function has been described for this ORF, yet the absence of stops is conserved across diverse narnaviruses, and in every case the codons in the reverse ORF and the RdRp are aligned. The >3 kb ORF overlap on opposite strands, unprecedented among RNA viruses, motivates an exploration of the constraints imposed or alleviated by the codon alignment. Here, we show that only when the codon frames are aligned can all stop codons be eliminated from the reverse strand by synonymous single-nucleotide substitutions in the RdRp gene, suggesting a mechanism for de novo gene creation within a strongly conserved amino-acid sequence. It will be fascinating to explore what implications this coding strategy has for other aspects of narnavirus biology. Beyond narnaviruses, our rapidly expanding catalogue of viral diversity may yet reveal additional examples of this broadly-extensible principle for ambigrammatic-sequence development.
Affiliation(s)
- Joseph L DeRisi
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, California, USA
- Greg Huber
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Amy Kistler
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Hanna Retallack
- Department of Biochemistry and Biophysics, University of California, San Francisco, California, USA
- Michael Wilkinson
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- School of Mathematics and Statistics, The Open University, Walton Hall, Milton Keynes, MK7 6AA, England
- David Yllanes
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA