1
Torrillo PA, Lieberman TD. Reversions mask the contribution of adaptive evolution in microbiomes. eLife 2024; 13:e93146. [PMID: 39240756 PMCID: PMC11379459 DOI: 10.7554/elife.93146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 07/30/2024] [Indexed: 09/08/2024]
Abstract
When examining bacterial genomes for evidence of past selection, the results depend heavily on the mutational distance between chosen genomes. Even within a bacterial species, genomes separated by larger mutational distances exhibit stronger evidence of purifying selection as assessed by dN/dS, the normalized ratio of nonsynonymous to synonymous mutations. Here, we show that the classical interpretation of this scale dependence, weak purifying selection, leads to problematic mutation accumulation when applied to available gut microbiome data. We propose an alternative, adaptive reversion model with opposite implications for dynamical intuition and applications of dN/dS. Reversions that occur and sweep within-host populations are nearly guaranteed in microbiomes due to large population sizes, short generation times, and variable environments. Using analytical and simulation approaches, we show that adaptive reversion can explain the dN/dS decay given only dozens of locally fluctuating selective pressures, which is realistic in the context of Bacteroides genomes. The success of the adaptive reversion model argues for interpreting low values of dN/dS obtained from long timescales with caution as they may emerge even when adaptive sweeps are frequent. Our work thus inverts the interpretation of an old observation in bacterial evolution, illustrates the potential of mutational reversions to shape genomic landscapes over time, and highlights the importance of studying bacterial genomic evolution on short timescales.
Affiliation(s)
- Paul A Torrillo
- Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology, Cambridge, United States
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, United States
- Tami D Lieberman
- Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology, Cambridge, United States
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, United States
- Broad Institute of MIT and Harvard, Cambridge, United States
- Ragon Institute of MGH, MIT and Harvard, Cambridge, United States
2
Triebel S, Lamkiewicz K, Ontiveros N, Sweeney B, Stadler PF, Petrov AI, Niepmann M, Marz M. Comprehensive survey of conserved RNA secondary structures in full-genome alignment of Hepatitis C virus. Sci Rep 2024; 14:15145. [PMID: 38956134 PMCID: PMC11219754 DOI: 10.1038/s41598-024-62897-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 05/22/2024] [Indexed: 07/04/2024]
Abstract
Hepatitis C virus (HCV) is a plus-stranded RNA virus that often chronically infects liver hepatocytes and causes liver cirrhosis and cancer. These viruses replicate their genomes employing error-prone replicases. Thereby, they routinely generate a large 'cloud' of RNA genomes (quasispecies) which, by trial and error, comprehensively explore the sequence space available for functional RNA genomes that maintain the ability for efficient replication and immune escape. In this context, it is important to identify which RNA secondary structures in the sequence space of the HCV genome are conserved, likely due to functional requirements. Here, we provide the first genome-wide multiple sequence alignment (MSA) with the prediction of RNA secondary structures throughout all representative full-length HCV genomes. We selected 57 representative genomes by clustering all complete HCV genomes from the BV-BRC database based on k-mer distributions and dimension reduction and adding RefSeq sequences. We include annotations of previously recognized features for easy comparison to other studies. Our results indicate that mainly the core coding region, the C-terminal NS5A region, and the NS5B region contain secondary structure elements that are conserved beyond coding sequence requirements, indicating functionality on the RNA level. In contrast, the genome regions in between contain less highly conserved structures. The results provide a complete description of all conserved RNA secondary structures and make clear that functionally important RNA secondary structures are present in certain HCV genome regions but are largely absent from other regions. Full-genome alignments of all branches of Hepacivirus C are provided in the supplement.
Affiliation(s)
- Sandra Triebel
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743, Jena, Germany
- European Virus Bioinformatics Center, Friedrich Schiller University Jena, 07743, Jena, Germany
- Kevin Lamkiewicz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743, Jena, Germany
- European Virus Bioinformatics Center, Friedrich Schiller University Jena, 07743, Jena, Germany
- Nancy Ontiveros
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Blake Sweeney
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Peter F Stadler
- European Virus Bioinformatics Center, Friedrich Schiller University Jena, 07743, Jena, Germany
- Bioinformatics Group, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, 04107, Leipzig, Germany
- German Center for Integrative Biodiversity Research (iDiv), 04103, Leipzig, Germany
- Michael Niepmann
- Institute for Biochemistry, Justus-Liebig-University Giessen, 35392, Giessen, Germany
- Manja Marz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743, Jena, Germany
- European Virus Bioinformatics Center, Friedrich Schiller University Jena, 07743, Jena, Germany
- Leibniz Institute on Aging-Fritz Lipmann Institute, 07745, Jena, Germany
- German Center for Integrative Biodiversity Research (iDiv), 04103, Leipzig, Germany
- Michael Stifel Center Jena, Friedrich Schiller University Jena, 07743, Jena, Germany
- Cluster of Excellence Balance of the Microverse, Friedrich Schiller University Jena, 07743, Jena, Germany
3
Torrillo PA, Lieberman TD. Reversions mask the contribution of adaptive evolution in microbiomes. bioRxiv 2024:2023.09.14.557751. [PMID: 37745437 PMCID: PMC10515931 DOI: 10.1101/2023.09.14.557751] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/26/2023]
Abstract
When examining bacterial genomes for evidence of past selection, the results obtained depend heavily on the mutational distance between chosen genomes. Even within a bacterial species, genomes separated by larger mutational distances exhibit stronger evidence of purifying selection as assessed by dN/dS, the normalized ratio of nonsynonymous to synonymous mutations. Here, we show that the classical interpretation of this scale dependence, weak purifying selection, leads to problematic mutation accumulation when applied to available gut microbiome data. We propose an alternative, adaptive reversion model with exactly opposite implications for dynamical intuition and applications of dN/dS. Reversions that occur and sweep within-host populations are nearly guaranteed in microbiomes due to large population sizes, short generation times, and variable environments. Using analytical and simulation approaches, we show that adaptive reversion can explain the dN/dS decay given only dozens of locally fluctuating selective pressures, which is realistic in the context of Bacteroides genomes. The success of the adaptive reversion model argues for interpreting low values of dN/dS obtained from long timescales with caution, as they may emerge even when adaptive sweeps are frequent. Our work thus inverts the interpretation of an old observation in bacterial evolution, illustrates the potential of mutational reversions to shape genomic landscapes over time, and highlights the importance of studying bacterial genomic evolution on short timescales.
Affiliation(s)
- Paul A. Torrillo
- Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Tami D. Lieberman
- Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA 02139, USA
4
Rojas-Cruz AF, Bermúdez-Santana CI. Computational Prediction of RNA-RNA Interactions between Small RNA Tracks from Betacoronavirus Nonstructural Protein 3 and Neurotrophin Genes during Infection of an Epithelial Lung Cancer Cell Line: Potential Role of Novel Small Regulatory RNA. Viruses 2023; 15:1647. [PMID: 37631989 PMCID: PMC10458423 DOI: 10.3390/v15081647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 07/26/2023] [Accepted: 07/26/2023] [Indexed: 08/27/2023]
Abstract
Whether RNA-RNA interactions of cytoplasmic RNA viruses, such as Betacoronavirus, might end in the biogenesis of putative virus-derived small RNAs as miRNA-like molecules has been controversial. Moreover, whether RNA-RNA interactions of wild animal viruses may act as virus-derived small RNAs is unknown. Here, we address these issues in four ways. First, we use conserved RNA structures undergoing negative selection in the genomes of SARS-CoV, MERS-CoV, and SARS-CoV-2 circulating in different bat species, intermediate animals, and human hosts. Second, a systematic literature review was conducted to identify Betacoronavirus-targeting hsa-miRNAs involved in lung cell infection. Third, we employed sophisticated long-range RNA-RNA interactions to refine the seed sequence homology of hsa-miRNAs with conserved RNA structures. Fourth, we used high-throughput RNA sequencing of a Betacoronavirus-infected epithelial lung cancer cell line (Calu-3) to validate the results. We proposed nine potential virus-derived small RNAs: two vsRNAs in SARS-CoV (Bats: SB-vsRNA-ORF1a-3p; SB-vsRNA-S-5p), one vsRNA in MERS-CoV (Bats: MB-vsRNA-ORF1b-3p), and six vsRNAs in SARS-CoV-2 (Bats: S2B-vsRNA-ORF1a-5p; intermediate animals: S2I-vsRNA-ORF1a-5p; and humans: S2H-vsRNA-ORF1a-5p, S2H-vsRNA-ORF1a-3p, S2H-vsRNA-ORF1b-3p, S2H-vsRNA-ORF3a-3p), mainly encoded by nonstructural protein 3. Notably, Betacoronavirus-derived small RNAs targeted 74 differentially expressed genes in infected human cells, of which 55 upregulate the molecular mechanisms underlying acute respiratory distress syndrome (ARDS), and the 19 downregulated genes might be implicated in neurotrophin signaling impairment. These results reveal a novel small RNA-based regulatory mechanism involved in neuropathogenesis that must be further studied to validate its therapeutic use.
Affiliation(s)
- Alexis Felipe Rojas-Cruz
- Theoretical and Computational RNomics Group, Department of Biology, Faculty of Sciences, Universidad Nacional de Colombia, Bogotá 111321, Colombia
- Center of Excellence in Scientific Computing, Universidad Nacional de Colombia, Bogotá 111321, Colombia
- Clara Isabel Bermúdez-Santana
- Theoretical and Computational RNomics Group, Department of Biology, Faculty of Sciences, Universidad Nacional de Colombia, Bogotá 111321, Colombia
- Center of Excellence in Scientific Computing, Universidad Nacional de Colombia, Bogotá 111321, Colombia
5
Vu NT, Nguyen NBT, Ha HH, Nguyen LN, Luu LH, Dao HQ, Vu TT, Huynh HTT, Le HTT. Evolutionary analysis and expression profiling of the HSP70 gene family in response to abiotic stresses in tomato (Solanum lycopersicum). Sci Prog 2023; 106:368504221148843. [PMID: 36650980 PMCID: PMC10358566 DOI: 10.1177/00368504221148843] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/19/2023]
Abstract
Heat shock protein 70 (HSP70) genes play essential roles in guarding plants against abiotic stresses, including heat, drought, and salt. In this study, the SlHSP70 gene family in tomatoes has been characterized using bioinformatic tools. Twenty-five putative SlHSP70 genes were found in the tomato genome and classified into five subfamilies, with multi-subcellular localizations. Twelve pairs of gene duplications were identified, and segmental events were determined as the main factor for the gene family expansion. Based on public RNA-seq data, gene expression analysis identified the majority of genes expressed in the examined organelles. Further RNA-seq analysis and then quantitative RT-PCR validation showed that many SlHSP70 members are responsible for cellular feedback to heat, drought, and salt treatments, of which at least five genes might be potential key players in the stress response. Our results provided a thorough overview of the SlHSP70 gene family in the tomato, which may be useful for the evolutionary and functional analysis of SlHSP70 under abiotic stress conditions.
Affiliation(s)
- Nam Tuan Vu
- Department of Biotechnology, Graduate University of Science and Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Ngoc Bich Thi Nguyen
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Hanh Hong Ha
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Linh Nhat Nguyen
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Ly Han Luu
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Ha Quang Dao
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Trinh Thi Vu
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Hue Thu Thi Huynh
- Department of Biotechnology, Graduate University of Science and Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Hien Thu Thi Le
- Department of Biotechnology, Graduate University of Science and Technology, Vietnam Academy of Science and Technology, Hanoi, Vietnam
- Laboratory of Genome Biodiversity, Institute of Genome Research, Vietnam Academy of Science and Technology, Hanoi, Vietnam
6
Li Y, Baptista RP, Sateriale A, Striepen B, Kissinger JC. Analysis of Long Non-Coding RNA in Cryptosporidium parvum Reveals Significant Stage-Specific Antisense Transcription. Front Cell Infect Microbiol 2021; 10:608298. [PMID: 33520737 PMCID: PMC7840661 DOI: 10.3389/fcimb.2020.608298] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/20/2020] [Accepted: 11/26/2020] [Indexed: 12/13/2022]
Abstract
Cryptosporidium is a protist parasite that has been identified as the second leading cause of moderate to severe diarrhea in children younger than two and a significant cause of mortality worldwide. Cryptosporidium has a complex, obligate, intracellular but extracytoplasmic lifecycle in a single host. How genes are regulated in this parasite remains largely unknown. Long non-coding RNAs (lncRNAs) play critical regulatory roles, including gene expression across a broad range of organisms. Cryptosporidium lncRNAs have been reported to enter the host cell nucleus and affect the host response. However, no systematic study of lncRNAs in Cryptosporidium has been conducted to identify additional lncRNAs. In this study, we analyzed a C. parvum in vitro strand-specific RNA-seq developmental time series covering both asexual and sexual stages to identify lncRNAs associated with parasite development. In total, we identified 396 novel lncRNAs, mostly antisense, with 86% being differentially expressed. Surprisingly, nearly 10% of annotated mRNAs have an antisense transcript. lncRNAs occur most often at the 3' end of their corresponding sense mRNA. Putative lncRNA regulatory regions were identified and many appear to encode bidirectional promoters. A positive correlation between lncRNA and upstream mRNA expression was observed. Evolutionary conservation and expression of lncRNA candidates were observed between C. parvum, C. hominis and C. baileyi. Ten C. parvum protein-encoding genes with antisense transcripts have P. falciparum orthologs that also have antisense transcripts. Three C. parvum lncRNAs with exceptional properties (e.g., intron splicing) were experimentally validated using RT-PCR and RT-qPCR. This initial characterization of the C. parvum non-coding transcriptome facilitates further investigations into the roles of lncRNAs in parasite development and host-pathogen interactions.
Affiliation(s)
- Yiran Li
- Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Rodrigo P. Baptista
- Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA, United States
- Adam Sateriale
- Department of Pathobiology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Boris Striepen
- Department of Pathobiology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Jessica C. Kissinger
- Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA, United States
- Department of Genetics, University of Georgia, Athens, GA, United States
7
Mathias C, Garcia LE, Teixeira MD, Kohler AF, Marchi RD, Barazetti JF, Gradia DF, de Oliveira JC. Polymorphism of lncRNAs in breast cancer: Meta-analysis shows no association with susceptibility. J Gene Med 2020; 22:e3271. [PMID: 32889751 DOI: 10.1002/jgm.3271] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/15/2020] [Revised: 08/10/2020] [Accepted: 08/29/2020] [Indexed: 12/18/2022]
Abstract
BACKGROUND: Long non-coding RNAs (lncRNAs) have been the target of considerable attention for their roles in many biological processes. Only a small portion of lncRNAs are functionally characterized, and several approaches have been proposed for investigating the roles of these molecules, including how polymorphisms in lncRNA genomic sites may interfere with their function. Allele frequency variation in single nucleotide polymorphisms (SNPs), for example, has been associated with several diseases, including breast cancer (BC), the most common type of cancer in women. METHODS: In the present study, we performed a systematic review of lncRNA SNPs associated with BC and a meta-analysis of some lncRNA SNPs. We found 31 SNPs mapped in 12 lncRNAs associated with BC in 28 case-control studies. RESULTS: Our meta-analysis showed no significant association between BC susceptibility and the SNPs rs217727, rs3741219, rs2107425 and rs2839698 on H19 or rs920778, rs1899663, rs12826786 and rs4759314 on HOTAIR. CONCLUSIONS: The present analysis recognized the importance of extensive association studies, including different populations, and further evaluation of potential functional effects caused by lncRNA SNPs. Nevertheless, genetic variants such as SNPs in lncRNAs may play many other essential roles, although this field is still underexplored.
Affiliation(s)
- Carolina Mathias
- Department of Genetics, Federal University of Parana, Curitiba, PR, Brazil
- Leandro E Garcia
- Department of Genetics, Federal University of Parana, Curitiba, PR, Brazil
- Ana Flávia Kohler
- Department of Genetics, Federal University of Parana, Curitiba, PR, Brazil
- Rafael D Marchi
- Department of Genetics, Federal University of Parana, Curitiba, PR, Brazil
8
Variation Profile of the Orthotospovirus Genome. Pathogens 2020; 9:pathogens9070521. [PMID: 32610472 PMCID: PMC7400459 DOI: 10.3390/pathogens9070521] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/09/2020] [Revised: 06/26/2020] [Accepted: 06/26/2020] [Indexed: 12/13/2022]
Abstract
Orthotospoviruses are plant-infecting members of the family Tospoviridae (order Bunyavirales), have a broad host range and are vectored by polyphagous thrips in a circulative-propagative manner. Because diverse hosts and vectors impose heterogeneous selection constraints on viral genomes, the evolutionary arms races between hosts and their pathogens might be manifested as selection for rapid changes in key genes. These observations suggest that orthotospoviruses contain key genetic components that rapidly mutate to mediate host adaptation and vector transmission. Using complete genome sequences, we profiled genomic variation in orthotospoviruses. Results show that the three genomic segments contain hypervariable areas at homologous locations across species. Remarkably, the highest nucleotide variation mapped to the intergenic region of RNA segments S and M, which fold into a hairpin. Secondary structure analyses showed that the hairpin is a dynamic structure with multiple functional shapes formed by stems and loops, contains sites under positive selection and covariable sites. Accumulation and tolerance of mutations in the intergenic region is a general feature of orthotospoviruses and might mediate adaptation to host plants and insect vectors.
9
Tourasse NJ, Darfeuille F. Structural Alignment and Covariation Analysis of RNA Sequences. Bio Protoc 2020; 10:e3511. [PMID: 33654736 PMCID: PMC7842705 DOI: 10.21769/bioprotoc.3511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2019] [Revised: 12/29/2019] [Accepted: 12/29/2019] [Indexed: 11/02/2022]
Abstract
RNA molecules adopt defined structural conformations that are essential to exert their function. During the course of evolution, the structure of a given RNA can be maintained via compensatory base-pair changes that occur among covarying nucleotides in paired regions. Therefore, for comparative, structural, and evolutionary studies of RNA molecules, numerous computational tools have been developed to incorporate structural information into sequence alignments and a number of tools have been developed to study covariation. The bioinformatic protocol presented here explains how to use some of these tools to generate a secondary-structure-aware multiple alignment of RNA sequences and to annotate the alignment to examine the conservation and covariation of structural elements among the sequences.
Affiliation(s)
- Nicolas J. Tourasse
- ARNA Laboratory, INSERM U1212, CNRS UMR5320, University of Bordeaux, Bordeaux, France
- Fabien Darfeuille
- ARNA Laboratory, INSERM U1212, CNRS UMR5320, University of Bordeaux, Bordeaux, France