1
Lessons Learned and Yet-to-Be Learned on the Importance of RNA Structure in SARS-CoV-2 Replication. Microbiol Mol Biol Rev 2022; 86:e0005721. [PMID: 35862724 PMCID: PMC9491204 DOI: 10.1128/mmbr.00057-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
Abstract
SARS-CoV-2, the etiological agent responsible for the COVID-19 pandemic, is a member of the virus family Coronaviridae, known for relatively extensive (~30-kb) RNA genomes that not only encode numerous proteins but are also capable of forming elaborate structures. As highlighted in this review, these structures perform critical functions in various steps of the viral life cycle, ultimately impacting pathogenesis and transmissibility. We examine these elements in the context of coronavirus evolutionary history and future directions for curbing the spread of SARS-CoV-2 and other potential human coronaviruses. While we focus on structures supported by a variety of biochemical, biophysical, and/or computational methods, we also touch here on recent evidence for novel structures in both protein-coding and noncoding regions of the genome, including an assessment of the potential role for RNA structure in the controversial finding of SARS-CoV-2 integration in “long COVID” patients. This review aims to serve as a consolidation of previous work on coronaviruses and more recent investigations of SARS-CoV-2, emphasizing the need for improved understanding of the role of RNA structure in the evolution and adaptation of these human viruses.
2
Chesnokova E, Beletskiy A, Kolosov P. The Role of Transposable Elements of the Human Genome in Neuronal Function and Pathology. Int J Mol Sci 2022; 23:5847. [PMID: 35628657 PMCID: PMC9148063 DOI: 10.3390/ijms23105847] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/29/2022] [Revised: 05/17/2022] [Accepted: 05/19/2022] [Indexed: 12/13/2022] Open
Abstract
Transposable elements (TEs) have been extensively studied for decades. In recent years, the introduction of whole-genome and whole-transcriptome approaches, as well as single-cell resolution techniques, provided a breakthrough that uncovered TE involvement in host gene expression regulation underlying multiple normal and pathological processes. Of particular interest is increased TE activity in neuronal tissue, and specifically in the hippocampus, that was repeatedly demonstrated in multiple experiments. On the other hand, numerous neuropathologies are associated with TE dysregulation. Here, we provide a comprehensive review of literature about the role of TEs in neurons published over the last three decades. The first chapter of the present review describes known mechanisms of TE interaction with host genomes in general, with the focus on mammalian and human TEs; the second chapter provides examples of TE exaptation in normal neuronal tissue, including TE involvement in neuronal differentiation and plasticity; and the last chapter lists TE-related neuropathologies. We sought to provide specific molecular mechanisms of TE involvement in neuron-specific processes whenever possible; however, in many cases, only phenomenological reports were available. This underscores the importance of further studies in this area.
Affiliation(s)
- Ekaterina Chesnokova
- Laboratory of Cellular Neurobiology of Learning, Institute of Higher Nervous Activity and Neurophysiology of the Russian Academy of Sciences, 117485 Moscow, Russia; (A.B.); (P.K.)
3
Radecki P, Uppuluri R, Deshpande K, Aviran S. Accurate detection of RNA stem-loops in structurome data reveals widespread association with protein binding sites. RNA Biol 2021; 18:521-536. [PMID: 34606413 PMCID: PMC8677038 DOI: 10.1080/15476286.2021.1971382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022] Open
Abstract
RNA molecules are known to fold into specific structures which often play a central role in their functions and regulation. In silico folding of RNA transcripts, especially when assisted with structure profiling (SP) data, is capable of accurately elucidating relevant structural conformations. However, such methods scale poorly to the swaths of SP data generated by transcriptome-wide experiments, which are becoming more commonplace and advancing our understanding of RNA structure and its regulation at global and local levels. This has created a need for tools capable of rapidly deriving structural assessments from SP data in a scalable manner. One such tool we previously introduced that aims to process such data is patteRNA, a statistical learning algorithm capable of rapidly mining big SP datasets for structural elements. Here, we present a reformulation of patteRNA's pattern recognition scheme that sees significantly improved precision without major compromises to computational overhead. Specifically, we developed a data-driven logistic classifier which interprets patteRNA's statistical characterizations of SP data in addition to local sequence properties as measured with a nearest neighbour thermodynamic model. Application of the classifier to human structurome data reveals a marked association between detected stem-loops and RNA binding protein (RBP) footprints. The results of our application demonstrate that upwards of 30% of RBP footprints occur within loops of stable stem-loop elements. Overall, our work arrives at a rapid and accurate method for automatically detecting families of RNA structure motifs and demonstrates the functional relevance of identifying them transcriptome-wide.
Affiliation(s)
- Pierce Radecki
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Rahul Uppuluri
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Kaustubh Deshpande
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
- Sharon Aviran
- Biomedical Engineering Department and Genome Center, University of California, Davis, CA, USA
4
Jiang JC, Rothnagel JA, Upton KR. Widespread Exaptation of L1 Transposons for Transcription Factor Binding in Breast Cancer. Int J Mol Sci 2021; 22:5625. [PMID: 34070697 PMCID: PMC8199441 DOI: 10.3390/ijms22115625] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/30/2021] [Revised: 05/18/2021] [Accepted: 05/20/2021] [Indexed: 12/29/2022] Open
Abstract
L1 transposons occupy 17% of the human genome and are widely exapted for the regulation of human genes, particularly in breast cancer, where we have previously shown abundant cancer-specific transcription factor (TF) binding sites within the L1PA2 subfamily. In the current study, we performed a comprehensive analysis of TF binding activities in primate-specific L1 subfamilies and identified pervasive exaptation events amongst these evolutionarily related L1 transposons. By motif scanning, we predicted diverse and abundant TF binding potentials within the L1 transposons. We confirmed substantial TF binding activities in the L1 subfamilies using TF binding sites consolidated from an extensive collection of publicly available ChIP-seq datasets. Young L1 subfamilies (L1HS, L1PA2 and L1PA3) contributed abundant TF binding sites in MCF7 cells, primarily via their 5' UTR. This is expected as the L1 5' UTR hosts cis-regulatory elements that are crucial for L1 replication and mobilisation. Interestingly, the ancient L1 subfamilies, where 5' truncation was common, displayed comparable TF binding capacity through their 3' ends, suggesting an alternative exaptation mechanism in L1 transposons that was previously unnoticed. Overall, primate-specific L1 transposons were extensively exapted for TF binding in MCF7 breast cancer cells and are likely prominent genetic players modulating breast cancer transcriptional regulation.
Affiliation(s)
- Kyle R. Upton
- School of Chemistry and Molecular Biosciences, The University of Queensland, St. Lucia, QLD 4072, Australia; (J.-C.J.); (J.A.R.)
5
Bravo JI, Nozownik S, Danthi PS, Benayoun BA. Transposable elements, circular RNAs and mitochondrial transcription in age-related genomic regulation. Development 2020; 147:dev175786. [PMID: 32527937 PMCID: PMC10680986 DOI: 10.1242/dev.175786] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/19/2022]
Abstract
Our understanding of the molecular regulation of aging and age-related diseases is still in its infancy, requiring in-depth characterization of the molecular landscape shaping these complex phenotypes. Emerging classes of molecules with promise as aging modulators include transposable elements, circRNAs and the mitochondrial transcriptome. Analytical complexity means that these molecules are often overlooked, even though they exhibit strong associations with aging and, in some cases, may directly contribute to its progress. Here, we review the links between these novel factors and age-related phenotypes, and we suggest tools that can be easily incorporated into existing pipelines to better understand the aging process.
Affiliation(s)
- Juan I Bravo
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Graduate Program in the Biology of Aging, University of Southern California, Los Angeles, CA 90089, USA
- Séverine Nozownik
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Magistère européen de Génétique, Université Paris Diderot-Paris 7, Paris 75014, France
- Prakroothi S Danthi
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Bérénice A Benayoun
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Neuroscience Graduate Program, University of Southern California, Los Angeles, CA 90089, USA
- USC Norris Comprehensive Cancer Center, Epigenetics and Gene Regulation, Los Angeles, CA 90089, USA
- USC Stem Cell Initiative, Los Angeles, CA 90089, USA
6
Seibt KM, Schmidt T, Heitkam T. The conserved 3' Angio-domain defines a superfamily of short interspersed nuclear elements (SINEs) in higher plants. The Plant Journal: for Cell and Molecular Biology 2020; 101:681-699. [PMID: 31610059 DOI: 10.1111/tpj.14567] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/16/2019] [Revised: 09/13/2019] [Accepted: 09/17/2019] [Indexed: 06/10/2023]
Abstract
Repetitive sequences are ubiquitous components of eukaryotic genomes affecting genome size and evolution as well as gene regulation. Among them, short interspersed nuclear elements (SINEs) are non-coding retrotransposons usually shorter than 1000 bp. They contain only a few short conserved structural motifs, in particular an internal promoter derived from cellular RNAs and a mostly AT-rich 3' tail, whereas the remaining regions are highly variable. SINEs emerge and vanish during evolution, and often diversify into numerous families and subfamilies that are usually specific to only a limited number of species. In contrast, at the 3' end of multiple plant SINEs we detected the highly conserved 'Angio-domain'. This 37 bp segment defines the Angio-SINE superfamily, which encompasses 24 plant SINE families widely distributed across 13 orders within the plant kingdom. We retrieved 28 433 full-length Angio-SINE copies from genome assemblies of 46 plant species, frequently located in genes. Compensatory mutations in and adjacent to the Angio-domain imply selective restraints maintaining its RNA structure. Angio-SINE families share segmental sequence similarities, indicating a modular evolution with strong Angio-domain preservation. We suggest that the conserved domain contributes to the evolutionary success of Angio-SINEs either through structural interactions between SINE RNA and proteins that increase their transpositional efficiency, or by enhancing their accumulation in genes.
Affiliation(s)
- Kathrin M Seibt
- Faculty of Biology, Technische Universität Dresden, Zellescher Weg 20b, Dresden, 01217, Germany
- Thomas Schmidt
- Faculty of Biology, Technische Universität Dresden, Zellescher Weg 20b, Dresden, 01217, Germany
- Tony Heitkam
- Faculty of Biology, Technische Universität Dresden, Zellescher Weg 20b, Dresden, 01217, Germany
7
Shein A, Zaikin A, Poptsova M. Recognition of 3'-end L1, Alu, processed pseudogenes, and mRNA stem-loops in the human genome using sequence-based and structure-based machine-learning models. Sci Rep 2019; 9:7211. [PMID: 31076573 PMCID: PMC6510757 DOI: 10.1038/s41598-019-43403-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2019] [Accepted: 04/24/2019] [Indexed: 11/09/2022] Open
Abstract
The role of 3′-end stem-loops in retrotransposition was experimentally demonstrated for transposons of various species, where LINE-SINE retrotransposons share the same 3′-end sequences, containing a stem-loop. We have discovered that 62–68% of processed pseudogenes and mRNAs also have 3′-end stem-loops. We investigated the properties of 3′-end stem-loops of human L1s, Alus, processed pseudogenes and mRNAs that do not share the same sequences, but all have 3′-end stem-loops. We have built sequence-based and structure-based machine-learning models that are able to recognize 3′-end L1, Alu, processed pseudogene and mRNA stem-loops with high performance. The sequence-based models use only sequence information and capture compositional bias in 3′-ends. The structure-based models consider physical, chemical and geometrical properties of dinucleotides composing a stem and position-specific nucleotide content of a loop and a bulge. The most important parameters include shift, tilt, rise, and hydrophilicity. The obtained results clearly point to the existence of structural constraints for 3′-end stem-loops of L1 and Alu, which are probably important for transposition, and reveal the potential of mRNAs to be recognized by the L1 machinery. The proposed approach is applicable to a broader task of recognizing RNA (DNA) secondary structures. The constructed models are freely available on GitHub (https://github.com/AlexShein/transposons/).
Affiliation(s)
- Alexander Shein
- Laboratory of Bioinformatics, Big Data and Information Retrieval School, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
- Anton Zaikin
- Laboratory of Bioinformatics, Big Data and Information Retrieval School, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
- Maria Poptsova
- Laboratory of Bioinformatics, Big Data and Information Retrieval School, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
8
Abstract
Arc, a master regulator of synaptic plasticity, contains sequence elements that are evolutionarily related to retrotransposon Gag genes. Two related papers in this issue of Cell show that Arc retains retroviral-like capsid-forming ability and can transmit mRNA between cells in the nervous system, a process that may be important for synaptic function.
Affiliation(s)
- Nicholas F Parrish
- Section of Surgical Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Keizo Tomonaga
- Department of Virus Research, Institute for Frontier Life and Medical Sciences, Kyoto University, Kyoto, Japan; Department of Mammalian Regulatory Network, Graduate School of Biostudies, Kyoto University, Kyoto, Japan
9
Tobar-Tosse F, Veléz PE, Ocampo-Toro E, Moreno PA. Structure, clustering and functional insights of repeats configurations in the upstream promoter region of the human coding genes. BMC Genomics 2018; 19:862. [PMID: 30537933 PMCID: PMC6288848 DOI: 10.1186/s12864-018-5196-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/14/2022] Open
Abstract
Background: Repetitive DNA sequences (Repeats) are significant regions in the human genome that have a specific genomic distribution, structure, and several binding sites for genome architecture and function. Consequently, the possible configurations of Repeats in specific and dynamic regions like gene promoters could define footprints for molecular mechanisms, pathways, and cell function beyond their density in the genome. Here we explored the distribution of Repeats in the upstream promoter region of human coding genes with the aim of identifying specific configurations, clusters and the functional meaning of those elements. Our method includes structural descriptions, hierarchical clustering, pathway association, and functional enrichment analysis. Results: We report here several configurations of Repeats in the upstream promoter region (UPR), which define 2729 patterns for 80% of human coding genes. There are 47 types of Repeats in these configurations, of which the most frequent were Alu, Low_complexity, MIR, Simple_repeat, LINE/L2, LINE/L1, hAT-Charlie, and ERV1. The distribution, length, and high frequency of Repeats in the UPR define several patterns and clusters, where the minimum frequency of configuration among Repeats was higher than 0.7. We found those clusters to be associated with cellular pathways and ontologies; thus, it was plausible to relate groups of Repeats to specific functional insights. For example, pathways for Genetic Information Processing or Metabolism show particular groups of Repeats with specific configurations. Conclusion: Based on these findings, we propose that specific configurations of repetitive elements describe frequent patterns in the upstream promoter for sets of human coding genes, and that these correlate with specific and essential cell pathways and functions.
Electronic supplementary material: The online version of this article (10.1186/s12864-018-5196-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Fabian Tobar-Tosse
- Departamento de Ciencias Básicas de la Salud, Pontificia Universidad Javeriana Cali, Cali, Colombia
- Patricia E Veléz
- Departamento de Biología, FACNED, Universidad del Cauca, Popayán, Colombia
- Eliana Ocampo-Toro
- Especialización en Hematología y Oncología Clínica, Universidad Libre Seccional Cali, Cali, Colombia
- Pedro A Moreno
- Escuela de Ingeniería de Sistemas y Computación, Universidad del Valle, Cali, Colombia
10
Szafranski P, Kośmider E, Liu Q, Karolak JA, Currie L, Parkash S, Kahler SG, Roeder E, Littlejohn RO, DeNapoli TS, Shardonofsky FR, Henderson C, Powers G, Poisson V, Bérubé D, Oligny L, Michaud JL, Janssens S, De Coen K, Van Dorpe J, Dheedene A, Harting MT, Weaver MD, Khan AM, Tatevian N, Wambach J, Gibbs KA, Popek E, Gambin A, Stankiewicz P. LINE- and Alu-containing genomic instability hotspot at 16q24.1 associated with recurrent and nonrecurrent CNV deletions causative for ACDMPV. Hum Mutat 2018; 39:1916-1925. [PMID: 30084155 DOI: 10.1002/humu.23608] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/26/2018] [Revised: 08/01/2018] [Accepted: 08/02/2018] [Indexed: 01/20/2023]
Abstract
Transposable elements modify the human genome by inserting into new loci or by mediating homology-, microhomology-, or homeology-driven DNA recombination or repair, resulting in genomic structural variation. Alveolar capillary dysplasia with misalignment of pulmonary veins (ACDMPV) is a rare lethal neonatal developmental lung disorder caused by point mutations or copy-number variant (CNV) deletions of FOXF1 or its distant tissue-specific enhancer. Eighty-five percent of the 45 ACDMPV-causative CNV deletions whose junctions have been sequenced had at least one of their two breakpoints located in a retrotransposon, with more than half of them being Alu elements. We describe a novel ∼35 kb genomic instability hotspot at 16q24.1, involving two evolutionarily young LINE-1 (L1) elements, L1PA2 and L1PA3, flanking AluY, two AluSx, AluSx1, and AluJr elements. The occurrence of L1s at this location coincided with the branching out of the Homo-Pan-Gorilla clade, and was preceded by the insertion of AluSx, AluSx1, and AluJr. Our data show that, in addition to mediating recurrent CNVs, L1 and Alu retrotransposons can predispose the human genome to the formation of variably sized CNVs of both clinical and evolutionary relevance. Nonetheless, epigenetic or other genomic features of this locus might also contribute to its increased instability.
Affiliation(s)
- Przemyslaw Szafranski
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Ewelina Kośmider
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas; Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Qian Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Justyna A Karolak
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas; Department of Genetics and Pharmaceutical Microbiology, Poznan University of Medical Sciences, Poznan, Poland
- Lauren Currie
- Maritime Medical Genetics Service, IWK Health Centre, Halifax, Canada
- Sandhya Parkash
- Maritime Medical Genetics Service, IWK Health Centre, Halifax, Canada
- Stephen G Kahler
- Section of Genetics and Metabolism, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas
- Elizabeth Roeder
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas; Department of Pediatrics, Baylor College of Medicine, San Antonio, Texas
- Thomas S DeNapoli
- Department of Pathology, Children's Hospital of San Antonio, San Antonio, Texas
- Felix R Shardonofsky
- Pediatric Pulmonary Center, Children's Hospital of San Antonio, San Antonio, Texas
- Cody Henderson
- Department of Pediatrics, Baylor College of Medicine, San Antonio, Texas; Neonatal-Perinatal Medicine, Children's Hospital of San Antonio, San Antonio, Texas
- George Powers
- Department of Pediatrics, Baylor College of Medicine, San Antonio, Texas; Neonatal-Perinatal Medicine, Children's Hospital of San Antonio, San Antonio, Texas
- Sandra Janssens
- Center for Medical Genetics, Ghent University, Ghent, Belgium
- Kris De Coen
- Department of Neonatal Intensive Care, Ghent University, Ghent, Belgium
- Jo Van Dorpe
- Department of Pathology, Ghent University, Ghent, Belgium
- Amir M Khan
- McGovern Medical School at UTHealth, Houston, Texas
- Jennifer Wambach
- Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, Missouri
- Kathleen A Gibbs
- Children's Hospital of Philadelphia, and University of Pennsylvania, Philadelphia, Pennsylvania
- Edwina Popek
- Department of Pathology and Immunology, Baylor College of Medicine, Houston, Texas
- Anna Gambin
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Paweł Stankiewicz
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
11
Grechishnikova DA, Poptsova MS. The Physical and Geometric Properties of Human Transposon Stem–Loop Structures under Natural Selection. Biophysics (Nagoya-shi) 2017. [DOI: 10.1134/s0006350917060070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open