1
Beniston E, Skittrall JP. Locations and structures of influenza A virus packaging-associated signals and other functional elements via an in silico pipeline for predicting constrained features in RNA viruses. PLoS Comput Biol 2024; 20:e1012009. [PMID: 38648223 PMCID: PMC11034665 DOI: 10.1371/journal.pcbi.1012009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Accepted: 03/18/2024] [Indexed: 04/25/2024] [Open Access]
Abstract
Influenza A virus contains regions of its segmented genome associated with ability to package the segments into virions, but many such regions are poorly characterised. We provide detailed predictions of the key locations within these packaging-associated regions, and their structures, by applying a recently-improved pipeline for delineating constrained regions in RNA viruses and applying structural prediction algorithms. We find and characterise other known constrained regions within influenza A genomes, including the region associated with the PA-X frameshift, regions associated with alternative splicing, and constraint around the initiation motif for a truncated PB1 protein, PB1-N92, associated with avian viruses. We further predict the presence of constrained regions that have not previously been described. The extra characterisation our work provides allows investigation of these key regions for drug target potential, and points towards determinants of packaging compatibility between segments.
Affiliation(s)
- Emma Beniston
  - Department of Pathology, University of Cambridge, Cambridge, United Kingdom
2
Skittrall JP, Irigoyen N, Brierley I, Gog JR. A novel approach to finding conserved features in low-variability gene alignments characterises RNA motifs in SARS-CoV and SARS-CoV-2. Sci Rep 2023; 13:12079. [PMID: 37495730 PMCID: PMC10372003 DOI: 10.1038/s41598-023-39207-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 07/21/2023] [Indexed: 07/28/2023] [Open Access]
Abstract
Collections of genetic sequences belonging to related organisms contain information on the evolutionary constraints to which the organisms have been subjected. Heavily constrained regions can be investigated to understand their roles in an organism's life cycle, and drugs can be sought to disrupt these roles. In organisms with low genetic diversity, such as newly-emerged pathogens, it is key to obtain this information early to develop new treatments. Here, we present methods that ensure we can leverage all the information available in a low-signal, low-noise set of sequences, to find contiguous regions of relatively conserved nucleic acid. We demonstrate the application of these methods by analysing over 5 million genome sequences of the recently-emerged RNA virus SARS-CoV-2 and correlating these results with an analysis of 119 genome sequences of SARS-CoV. We propose the precise location of a previously described packaging signal, and discuss explanations for other regions of high conservation.
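As a rough illustration of what "contiguous regions of relatively conserved nucleic acid" means in practice, the sketch below scores each alignment column by the share of sequences carrying the most common base and then flags sliding windows whose mean score meets a threshold. This is a naive baseline written for this listing, not the method presented in the paper; the function names, the window size, and the threshold are all assumptions.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, for each column, the fraction of sequences sharing the most common character.

    `alignment` is a list of equal-length aligned sequences (hypothetical input format).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        scores.append(counts.most_common(1)[0][1] / len(alignment))
    return scores


def conserved_windows(scores, window, threshold):
    """Return start indices of windows whose mean conservation is >= threshold."""
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            hits.append(start)
    return hits


# Toy alignment: the first three columns are invariant, the last one varies.
toy = ["ACGT", "ACGA", "ACGC"]
toy_scores = column_conservation(toy)
toy_hits = conserved_windows(toy_scores, window=2, threshold=1.0)
```

In a low-variability data set almost every window would pass a fixed threshold like this, which is the kind of limitation the abstract says the paper's methods are designed to overcome.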
Affiliation(s)
- Jordan P Skittrall
  - Department of Pathology, Division of Virology, Addenbrooke's Hospital, University of Cambridge, Hills Road, Cambridge, CB2 0QQ, UK
- Nerea Irigoyen
  - Department of Pathology, Division of Virology, Addenbrooke's Hospital, University of Cambridge, Hills Road, Cambridge, CB2 0QQ, UK
- Ian Brierley
  - Department of Pathology, Division of Virology, Addenbrooke's Hospital, University of Cambridge, Hills Road, Cambridge, CB2 0QQ, UK
- Julia R Gog
  - Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WA, UK
3
Zhou X, Du Z, Huang X. A potential long-range RNA-RNA interaction in the HIV-1 RNA. J Biomol Struct Dyn 2023; 41:14968-14976. [PMID: 36863767 DOI: 10.1080/07391102.2023.2184639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Accepted: 02/19/2023] [Indexed: 03/04/2023]
Abstract
It is well-established that viral and cellular mRNAs alike harbour functional long-range intra-molecular RNA-RNA interactions. Despite the biological importance of such interactions, their identification and characterization remain challenging. Here we present a computational method for the identification of certain kinds of long-range intra-molecular RNA-RNA interactions involving the loop nucleotides of a hairpin loop. Using the computational method, we analysed 4272 HIV-1 genomic mRNAs. A potential long-range intra-molecular RNA-RNA interaction within the HIV-1 genomic RNA was identified. The long-range interaction is mediated by a kissing loop structure between two stem-loops of the previously reported SHAPE-based secondary structure of the entire HIV-1 genome. Structural modelling studies were carried out to show that the kissing loop structure not only is sterically feasible, but also contains a conserved RNA structural motif often found in compact RNA pseudoknots. The computational method should be generally applicable to the identification of potential long-range intra-molecular RNA-RNA interactions in any viral or cellular mRNA sequence. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Xia Zhou
  - School of Chemical and Biomolecular Sciences, Southern Illinois University at Carbondale, Carbondale, IL, USA
- Zhihua Du
  - School of Chemical and Biomolecular Sciences, Southern Illinois University at Carbondale, Carbondale, IL, USA
- Xiaolan Huang
  - School of Computing, Southern Illinois University at Carbondale, Carbondale, IL, USA
4
Xu G, Reboud J, Guo Y, Yang H, Gu H, Fan C, Qian X, Cooper JM. Programmable design of isothermal nucleic acid diagnostic assays through abstraction-based models. Nat Commun 2022; 13:1635. [PMID: 35347157 PMCID: PMC8960814 DOI: 10.1038/s41467-022-29101-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/14/2019] [Accepted: 02/24/2022] [Indexed: 02/07/2023] [Open Access]
Abstract
Accelerating the design of nucleic acid amplification methods remains a critical challenge in the development of molecular tools to identify biomarkers to diagnose both infectious and non-communicable diseases. Many of the principles that underpin these mechanisms are often complex and can require iterative optimisation. Here we focus on creating a generalisable isothermal nucleic acid amplification methodology, describing the systematic implementation of abstraction-based models for the algorithmic design and application of assays. We demonstrate the simplicity, ease and flexibility of our approach using a software tool that provides amplification schemes de novo, based upon a user-input target sequence. The abstraction of reaction network predicts multiple reaction pathways across different strategies, facilitating assay optimisation for specific applications, including the ready design of multiplexed tests for short nucleic acid sequence miRNAs or for difficult pathogenic targets, such as highly mutating viruses.
Affiliation(s)
- Gaolian Xu
  - Nano Biomedical Research Centre, Shanghai Jiao Tong University, Shanghai, 200030, China
- Julien Reboud
  - Division of Biomedical Engineering, James Watt School of Engineering, University of Glasgow, Oakfield Avenue, Glasgow, G12 8LT, UK
- Yunfei Guo
  - Nano Biomedical Research Centre, Shanghai Jiao Tong University, Shanghai, 200030, China
- Hao Yang
  - Nano Biomedical Research Centre, Shanghai Jiao Tong University, Shanghai, 200030, China
- Hongchen Gu
  - Nano Biomedical Research Centre, Shanghai Jiao Tong University, Shanghai, 200030, China
- Chunhai Fan
  - School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xiaohua Qian
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
- Jonathan M Cooper
  - Division of Biomedical Engineering, James Watt School of Engineering, University of Glasgow, Oakfield Avenue, Glasgow, G12 8LT, UK
5
Ekpenyong ME, Adegoke AA, Edoho ME, Inyang UG, Udo IJ, Ekaidem IS, Osang F, Uto NP, Geoffery JI. Collaborative Mining of Whole Genome Sequences for Intelligent HIV-1 Sub-Strain(s) Discovery. Curr HIV Res 2022; 20:163-183. [PMID: 35142269 DOI: 10.2174/1570162x20666220210142209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Revised: 11/30/2021] [Accepted: 12/20/2021] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Effective global antiretroviral vaccines and therapeutic strategies depend on the diversity, evolution, and epidemiology of their various strains as well as their transmission and pathogenesis. Most viral disease-causing particles are clustered into a taxonomy of subtypes to suggest pointers toward nucleotide-specific vaccines or therapeutic applications of clinical significance sufficient for sequence-specific diagnosis and homologous viral studies. These are very useful to formulate predictors to induce cross-resistance to some retroviral control drugs being used across study areas.
OBJECTIVE: This research proposed a collaborative framework of hybridized (Machine Learning and Natural Language Processing) techniques to discover hidden genome patterns and feature predictors, for HIV-1 genome sequences mining.
METHOD: 630 human HIV-1 genome sequences above 8500 bps were excavated from the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov) for 21 countries across different continents, Antarctica exempt. These sequences were transformed and learned using a self-organizing map (SOM). To discriminate emerging/new sub-strain(s), the HIV-1 reference genome was included as part of the input isolates/samples during the training. After training the SOM, component planes defining pattern clusters of the input datasets were generated, for cognitive knowledge mining and subsequent labelling of the datasets. Additional genome features including dinucleotide transmission recurrences, codon recurrences, and mutation recurrences, were finally extracted from the raw genomes to construct output classification targets for supervised learning.
RESULTS: SOM training explains the inherent pattern diversity of HIV-1 genomes as well as inter- and intra-country transmissions in which mobility might play an active role, as corroborated by literature. Nine sub-strains were discovered after disassembling the SOM correlation hunting matrix space attributed to disparate clusters. Cognitive knowledge mining separated similar pattern clusters bounded by a certain degree of correlation range, discovered by the SOM. A Kruskal-Wallis rank-sum test and Wilcoxon rank-sum test showed statistically significant variations in dinucleotide, codon, and mutation patterns.
CONCLUSION: Results of the discovered sub-strains and response clusters visualizations corroborate existing literature, with significant haplotype variations. The proposed framework would assist in the development of decision support systems for easy contact tracing, infectious disease surveillance, and studying the progressive evolution of the reference HIV-1 genome.
Affiliation(s)
- Moses E Ekpenyong
  - Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
  - Centre for Research and Development, University of Uyo, Uyo, Nigeria
- Anthony A Adegoke
  - Department of Microbiology, Faculty of Science, University of Uyo, Uyo, Nigeria
- Mercy E Edoho
  - Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
- Udoinyang G Inyang
  - Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
- Ifiok J Udo
  - Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
- Itemobong S Ekaidem
  - Department of Chemical Pathology, College of Health Sciences, University of Uyo, Uyo, Nigeria
- Francis Osang
  - Department of Computer Science, Faculty of Science, National Open University, Abuja, Nigeria
- Nseobong P Uto
  - School of Mathematics and Statistics, University of St Andrews, Scotland, United Kingdom
- Joseph I Geoffery
  - Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
6
Liao H, Cai D, Sun Y. VirStrain: a strain identification tool for RNA viruses. Genome Biol 2022; 23:38. [PMID: 35101081 PMCID: PMC8801933 DOI: 10.1186/s13059-022-02609-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/21/2021] [Accepted: 01/12/2022] [Indexed: 12/18/2022] [Open Access]
Abstract
Viruses change constantly during replication, leading to high intra-species diversity. Although many changes are neutral or deleterious, some can confer on the virus different biological properties such as better adaptability. In addition, viral genotypes often have associated metadata, such as host residence, which can help with inferring viral transmission during pandemics. Thus, subspecies analysis can provide important insights into virus characterization. Here, we present VirStrain, a tool taking short reads as input with viral strain composition as output. We rigorously test VirStrain on multiple simulated and real virus sequencing datasets. VirStrain outperforms the state-of-the-art tools in both sensitivity and accuracy.
Affiliation(s)
- Herui Liao
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, China
- Dehan Cai
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, China
- Yanni Sun
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, China
7
Jordan-Paiz A, Franco S, Martinez MA. Reducing HIV-1 env gene CpG frequency increases the replication capacity of the HXB2 virus strain. Virus Res 2022; 310:198685. [PMID: 35041864 DOI: 10.1016/j.virusres.2022.198685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Revised: 01/12/2022] [Accepted: 01/13/2022] [Indexed: 11/27/2022]
Abstract
Synonymous replacement of CpG dinucleotides in the HIV-1 envelope (env) coding region has been correlated with evasion of the antiviral activity of the zinc-finger antiviral protein (ZAP). We aimed to explore the effect of depleting HIV-1 env CpG dinucleotides by synonymous substitution on ex vivo viral replication capacity. To this end, we eliminated 11 env CpG dinucleotides through synonymous substitutions in the CXCR4-tropic HXB2 strain. The replication kinetics in MT-4 cells and peripheral blood mononuclear cells (PBMCs) of the WT and synonymously recoded mutant viruses were indistinguishable. However, virus competition assays in MT4 cells between the WT and recoded viruses showed that the mutant with fewer CpG dinucleotides quickly overgrew the WT virus. These results demonstrate that a reduction in HIV-1 env CpG dinucleotide frequency can improve viral replication capacity in cell culture. Our results support the previous observation that the frequency of CpGs in the HIV-1 env region correlates with differences in clinical progression rates in infected individuals.
Affiliation(s)
- Ana Jordan-Paiz
  - IrsiCaixa, Hospital Universitari Germans Trias i Pujol, Universitat Autònoma de Barcelona (UAB), 08916 Badalona, Spain
- Sandra Franco
  - IrsiCaixa, Hospital Universitari Germans Trias i Pujol, Universitat Autònoma de Barcelona (UAB), 08916 Badalona, Spain
- Miguel Angel Martinez
  - IrsiCaixa, Hospital Universitari Germans Trias i Pujol, Universitat Autònoma de Barcelona (UAB), 08916 Badalona, Spain
8
Kmiec D, Nchioua R, Sherrill-Mix S, Stürzel CM, Heusinger E, Braun E, Gondim MVP, Hotter D, Sparrer KMJ, Hahn BH, Sauter D, Kirchhoff F. CpG Frequency in the 5' Third of the env Gene Determines Sensitivity of Primary HIV-1 Strains to the Zinc-Finger Antiviral Protein. mBio 2020; 11:e02903-19. [PMID: 31937644 PMCID: PMC6960287 DOI: 10.1128/mbio.02903-19] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 10/31/2019] [Accepted: 11/27/2019] [Indexed: 02/07/2023] [Open Access]
Abstract
CpG dinucleotide suppression has been reported to allow HIV-1 to evade inhibition by the zinc-finger antiviral protein (ZAP). Here, we show that primate lentiviruses display marked differences in CpG frequencies across their genome, ranging from 0.44% in simian immunodeficiency virus SIVwrc from Western red colobus to 2.3% in SIVmon infecting mona monkeys. Moreover, functional analyses of a large panel of human and simian immunodeficiency viruses revealed that the magnitude of CpG suppression does not correlate with their susceptibility to ZAP. However, we found that the number of CpG dinucleotides within a region of ∼700 bases at the 5' end of the env gene determines ZAP sensitivity of primary HIV-1 strains but not of HIV-2. Increased numbers of CpGs in this region were associated with reduced env mRNA expression and viral protein production. ZAP sensitivity profiles of chimeric simian-human immunodeficiency viruses (SHIVs) expressing different HIV-1 env genes were highly similar to those of the corresponding HIV-1 strains. The frequency of CpGs in the identified env region correlated with differences in clinical progression rates. Thus, the CpG frequency in a specific part of env, rather than the overall genomic CpG content, governs the susceptibility of HIV-1 to ZAP and might affect viral pathogenicity in vivo.
IMPORTANCE: Evasion of the zinc-finger antiviral protein (ZAP) may drive CpG dinucleotide suppression in HIV-1 and many other viral pathogens but the viral determinants of ZAP sensitivity are poorly defined. Here, we examined CpG suppression and ZAP sensitivity in a large number of primate lentiviruses and demonstrate that their genomic frequency of CpGs varies substantially and does not correlate with ZAP sensitivity. We further show that the number of CpG residues in a defined region at the 5' end of the env gene together with structural features plays a key role in HIV-1 susceptibility to ZAP and correlates with differences in clinical progression rates in HIV-1-infected individuals. Our identification of a specific part of env as a major determinant of HIV-1 susceptibility to ZAP restriction provides a basis for future studies of the underlying inhibitory mechanisms and their potential relevance in the pathogenesis of AIDS.
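The two quantities this abstract contrasts, genome-wide CpG frequency and the CpG count within a defined region, can be made concrete with a short sketch. The abstract does not state how the percentages were normalised, so dividing by the number of dinucleotide positions is an assumption here, as are the function names and the toy sequence.

```python
def cpg_count(seq: str) -> int:
    """Count CpG dinucleotides, i.e. a C immediately followed by a G."""
    s = seq.upper()
    return sum(1 for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G")


def cpg_frequency_percent(seq: str) -> float:
    """CpG count as a percentage of all dinucleotide positions (assumed normalisation)."""
    positions = len(seq) - 1
    if positions <= 0:
        return 0.0
    return 100.0 * cpg_count(seq) / positions


def cpg_in_region(seq: str, start: int, length: int) -> int:
    """Count CpGs lying entirely inside seq[start:start + length] (0-based start)."""
    return cpg_count(seq[start:start + length])


# Toy sequence, not real viral data.
toy = "ACGCGT"
toy_total = cpg_count(toy)
toy_percent = cpg_frequency_percent(toy)
toy_region = cpg_in_region(toy, 0, 3)
```

With a real env sequence, calling `cpg_in_region` with a length of about 700 at the gene's 5' end would give the regional count the abstract refers to; the exact coordinates would need to come from the paper.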
Affiliation(s)
- Dorota Kmiec
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm, Germany
- Rayhane Nchioua
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm, Germany
- Scott Sherrill-Mix
  - Department of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Department of Microbiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Christina M Stürzel
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm, Germany
- Elena Heusinger
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm, Germany
- Elisabeth Braun
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm, Germany
- Marcos V P Gondim
  - Department of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Department of Microbiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Dominik Hotter
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm, Germany
- Beatrice H Hahn
  - Department of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Department of Microbiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Daniel Sauter
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm, Germany
- Frank Kirchhoff
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm, Germany