1
Zhang C, Forsdyke DR. Potential Achilles heels of SARS-CoV-2 are best displayed by the base order-dependent component of RNA folding energy. Comput Biol Chem 2021; 94:107570. [PMID: 34500325 PMCID: PMC8410225 DOI: 10.1016/j.compbiolchem.2021.107570] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/02/2021] [Revised: 08/29/2021] [Accepted: 08/30/2021] [Indexed: 11/29/2022]
Abstract
The base order-dependent component of folding energy has revealed a highly conserved region in HIV-1 genomes that associates with RNA structure. This corresponds to a packaging signal that is recognized by the nucleocapsid domain of the Gag polyprotein. Long viewed as a potential HIV-1 "Achilles heel," the signal can be targeted by a new antiviral compound. Although SARS-CoV-2 differs in many respects from HIV-1, the same technology displays regions with a high base order-dependent folding energy component, which are also highly conserved. This indicates structural invariance (SI) sustained by natural selection. While the regions are often also protein-encoding (e.g. NSP3, ORF3a), we suggest that their nucleic acid level functions can be considered potential "Achilles heels" for SARS-CoV-2, perhaps susceptible to therapies like those envisaged for AIDS. The ribosomal frameshifting element scored well, but higher SI scores were obtained in other regions, including those encoding NSP13 and the nucleocapsid (N) protein.
Affiliation(s)
- Chiyu Zhang
- Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
- Donald R Forsdyke
- Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario K7L3N6, Canada.
2
Neutralism versus selectionism: Chargaff's second parity rule, revisited. Genetica 2021; 149:81-88. [PMID: 33880685 PMCID: PMC8057000 DOI: 10.1007/s10709-021-00119-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/04/2021] [Accepted: 04/09/2021] [Indexed: 11/03/2022]
Abstract
Of Chargaff's four "rules" on DNA base frequencies, the functional interpretation of his second parity rule (PR2) is the most contentious. Thermophile base compositions (GC%) were taken by Galtier and Lobry (1997) as favoring Sueoka's neutral PR2 hypothesis over Forsdyke's selective PR2 hypothesis, namely that mutations improving local within-species recombination efficiency had generated a genome-wide potential for the strands of duplex DNA to separate and initiate recombination through the "kissing" of the tips of stem-loops. However, following Chargaff's GC rule, base composition mainly reflects a species-specific, genome-wide, evolutionary pressure. GC% could not have consistently followed the dictates of temperature, since it plays fundamental roles in both sustaining species integrity and, through primarily neutral genome-wide mutation, fostering speciation. Evidence for a local within-species recombination-initiating role of base order was obtained with a novel technology that masked the contribution of base composition to nucleic acid folding energy. Forsdyke's results were consistent with his PR2 hypothesis, appeared to resolve some root problems in biology and provided a theoretical underpinning for alignment-free taxonomic analyses using relative oligonucleotide frequencies (k-mer analysis). Moreover, consistent with Chargaff's cluster rule, discovery of the thermoadaptive role of the "purine-loading" of open reading frames made less tenable the Galtier-Lobry anti-selectionist arguments.
3
Forsdyke DR. Exons and Introns. Evol Bioinform Online 2016. [DOI: 10.1007/978-3-319-28755-3_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
4
Hu X, Karasev AV, Brown CJ, Lorenzen JH. Sequence characteristics of potato virus Y recombinants. J Gen Virol 2009; 90:3033-3041. [PMID: 19692546 DOI: 10.1099/vir.0.014142-0] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Indexed: 11/18/2022]
Abstract
Potato virus Y (PVY) is one of the most economically important plant pathogens. The PVY genome has a high degree of genetic variability and is also subject to recombination. New recombinants have been reported in many countries since the 1980s, but the origin of these recombinant strains and the physical and evolutionary mechanisms driving their emergence are not clear at the moment. The replicase-mediated template-switching model is considered the most likely mechanism for forming new RNA virus recombinants. Two factors, RNA secondary structure (especially stem-loop structures) and AU-rich regions, have been reported to affect recombination in this model. In this study, we investigated the influence of these two factors on PVY recombination from two perspectives: their distribution along the whole genome and differences between regions flanking the recombination junctions (RJs). Based on their distributions, only a few identified RJs in PVY genomes were located in lower negative FORS-D, i.e. having greater secondary-structure potential and higher AU-content regions, but most RJs had more negative FORS-D values upstream and/or higher AU content downstream. Our whole-genome analyses showed that RNA secondary structures and/or AU-rich regions at some sites may have affected PVY recombination, but in general they were not the main forces driving PVY recombination.
Affiliation(s)
- Xiaojun Hu
- Bioinformatics and Computational Biology Program, University of Idaho, Moscow, ID 83844, USA
- Department of Plant, Soil, and Entomological Sciences, University of Idaho, Moscow, ID 83844, USA
- Alexander V Karasev
- Bioinformatics and Computational Biology Program, University of Idaho, Moscow, ID 83844, USA
- Department of Plant, Soil, and Entomological Sciences, University of Idaho, Moscow, ID 83844, USA
- Celeste J Brown
- Department of Biological Sciences, University of Idaho, Moscow, ID 83844, USA
- Bioinformatics and Computational Biology Program, University of Idaho, Moscow, ID 83844, USA
- Jim H Lorenzen
- International Institute of Tropical Agriculture, Kampala, Uganda
- Bioinformatics and Computational Biology Program, University of Idaho, Moscow, ID 83844, USA
5
Microsatellites that violate Chargaff's second parity rule have base order-dependent asymmetries in the folding energies of complementary DNA strands and may not drive speciation. J Theor Biol 2008; 254:168-77. [DOI: 10.1016/j.jtbi.2008.05.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 04/07/2008] [Revised: 05/16/2008] [Accepted: 05/16/2008] [Indexed: 11/21/2022]
6
Zhang C, Ding N, Wei JF. Different sliding window sizes and inappropriate subtype references result in discordant mosaic maps and breakpoint locations of HIV-1 CRFs. Infect Genet Evol 2008; 8:693-7. [PMID: 18482874 DOI: 10.1016/j.meegid.2008.04.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 12/21/2007] [Revised: 03/31/2008] [Accepted: 04/01/2008] [Indexed: 11/16/2022]
Abstract
Different sliding window sizes and inappropriate subtype references are often selected when identifying HIV-1 recombination, which results in discordant recombination maps even for the same HIV-1 recombinant and affects the tracking of the epidemic of HIV-1 recombinants. Here, we re-analyzed 11 previously characterized HIV-1 CRFs using SimPlot software (version 3.5) with several sliding window sizes (200, 250, 300, 350 and 400 nt), each moving in steps of 10 nt. We found that the crossovers determined under 250 and 350 nt windows, and especially under the 300 nt window, are significantly closer to the hypothetical breakpoint than crossovers obtained under 200 and 400 nt windows (P < 0.01). These results suggest that a 300 nt window is the preferred choice for HIV-1 recombination analysis. In addition, instead of one bootscan analysis, three bootscanning plots with sliding window sizes of 250, 300 and 350 nt are recommended. The comparison between crossovers determined under different moving steps showed that a small moving step (e.g. 10 nt) is better than a larger step (e.g. 50 nt) (P < 0.05), suggesting that a small moving step should be used in bootscan analysis. Moreover, we found that inappropriate use of subtype references in bootscan analysis resulted in misleading recombination maps. HIV-1 strains prevailing in the same geographic areas as HIV-1 inter-subtype recombinants are believed to have a chance to participate in recombination events. When HIV-1 reference strains from recombinant-prevailing areas were applied, the identified recombination patterns were well supported by phylogenetic analyses. Therefore, in bootscan analysis, HIV-1 subtype references should be selected from recombinant-prevailing areas.
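The window and step settings discussed in this abstract can be illustrated with a minimal sketch. It only enumerates window coordinates along an alignment; it does not reproduce SimPlot or bootscanning, and the function name `sliding_windows` and its half-open coordinate convention are assumptions made for illustration.

```python
def sliding_windows(seq_len, window=300, step=10):
    """Yield (start, end) half-open coordinates of every full window.

    Defaults follow the 300 nt window and 10 nt step favored in the
    abstract; the function itself is a hypothetical helper.
    """
    for start in range(0, seq_len - window + 1, step):
        yield start, start + window


# Compare how many windows a small and a large step produce
# over a hypothetical 9,000 nt alignment.
fine = list(sliding_windows(9000, window=300, step=10))
coarse = list(sliding_windows(9000, window=300, step=50))
print(len(fine), len(coarse))
```

A smaller step yields more overlapping windows, which is one plausible reason crossover positions can be located with finer resolution.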
Affiliation(s)
- Chiyu Zhang
- Department of Biochemistry and Molecular Biology, Jiangsu University School of Medical Technology, 301 Xuefu Road, Zhenjiang, Jiangsu 212013, China.
7
Zhang CY, Wei JF, Wu JS, Xu WR, Sun X, He SH. Evaluation of FORS-D analysis: a comparison with the statistically significant stem-loop potential. Biochem Genet 2007; 46:29-40. [PMID: 17955360 DOI: 10.1007/s10528-007-9126-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 03/11/2007] [Accepted: 05/26/2007] [Indexed: 11/28/2022]
Abstract
The stem-loop potential of a nucleic acid segment (expressed as a FONS value) decomposes into base composition-dependent and base order-dependent components. The latter, expressed as a FORS-D value, is derived by subtracting the value of the base composition-dependent component (FORS-M) from the FONS value. FORS-D analysis is the use of FORS-D values to estimate the potential of local base order to contribute to a stem-loop structure, and it has been used to investigate the relationship between stem-loop structure and other selective pressures on genomes. In the present study, we evaluated the reliability of FORS-D analysis by comparing it with statistically significant stem-loop potential, another robust method developed by Le and Maizel for examining stem-loop structure. We found that FORS-M values calculated using 10 randomized sequences are as reliable as those calculated using 100 randomized sequences. The resulting FORS-D values have a trend and distribution similar to those of statistically significant stem-loop potential, implying that FORS-D analysis is as reliable as the latter in measuring the distribution of base order-dependent stem-loop potential. Since the calculation of the FORS-M values is time consuming, the integrated program Bodslp, which we developed, should be a convenient tool for large-scale FORS-D analysis. The results also suggest that for some purposes the online program SigStb developed by Le and Maizel may be used as an alternative tool for FORS-D analysis.
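The decomposition described in this abstract (FORS-D = FONS minus FORS-M, with FORS-M taken as the mean energy of base-shuffled versions of the window) can be sketched as follows. This is an illustration under stated assumptions, not the Bodslp implementation: `toy_energy` is a hypothetical stand-in for a real folding-energy routine and has no thermodynamic meaning, and the names `fors_d`, `energy_fn`, `n_shuffles` and `seed` are invented here.

```python
import random
from statistics import mean


def toy_energy(seq):
    """Hypothetical stand-in for a folding-energy routine.

    Scores -1 for each complementary pair between position i and its
    mirror position; this is NOT a thermodynamic model.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    return -float(sum((seq[i], seq[n - 1 - i]) in pairs for i in range(n // 2)))


def fors_d(seq, energy_fn=toy_energy, n_shuffles=10, seed=0):
    """Return (FONS, FORS-M, FORS-D) for one sequence window."""
    rng = random.Random(seed)
    fons = energy_fn(seq)  # energy of the natural base order
    shuffled_energies = []
    for _ in range(n_shuffles):
        bases = list(seq)
        rng.shuffle(bases)  # keeps base composition, destroys base order
        shuffled_energies.append(energy_fn("".join(bases)))
    fors_m = mean(shuffled_energies)  # base composition-dependent component
    return fons, fors_m, fons - fors_m  # last value is FORS-D


fons, fors_m, d = fors_d("GGGGAAAACCCC")
print(fons, fors_m, d)
```

The default of 10 shuffles mirrors the abstract's finding that 10 randomized sequences performed as reliably as 100; in practice a real energy function would be passed as `energy_fn`.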
Affiliation(s)
- Chi-Yu Zhang
- Department of Biochemistry and Molecular Biology, Jiangsu University School of Medical Technology, Zhenjiang, Jiangsu, P.R. China.
8
Forsdyke DR. Calculation of folding energies of single-stranded nucleic acid sequences: conceptual issues. J Theor Biol 2007; 248:745-53. [PMID: 17698086 DOI: 10.1016/j.jtbi.2007.07.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 05/30/2007] [Revised: 07/05/2007] [Accepted: 07/09/2007] [Indexed: 12/16/2022]
Abstract
The stability of a folded single-stranded nucleic acid depends on the composition and order of its constituent bases and may be assessed by taking into account the pairing energies of its constituent dinucleotides. To assess the possible biological significance of a computed structure, Maizel and coworkers in the 1980s compared the energy of folding of a natural single-stranded RNA sequence with the energies of several versions of the same sequence produced by shuffling base order. However, in the 2000s many took as self-evident the view that shuffling at the mononucleotide level (single bases) was conceptually wrong and should be replaced by shuffling at the level of dinucleotides (retaining pairs of adjacent bases). Folding energies then became indistinguishable from those of corresponding shuffled sequences and doubt was cast on the importance of secondary structures. Nevertheless, some continued productively to employ the single base shuffling approach, the justification for which is the topic of this paper. Because dinucleotide pairing energies are needed to calculate structure, it does not follow that shuffling should not disrupt dinucleotides. Base shuffling allows determination of the relative contributions of base composition and base order to total folding energy. The potential for secondary structure arises from pressures acting at both DNA and RNA levels, and is abundant throughout genomes, with a probable primary role in recombination. Within a gene the potential can often be accommodated, and base order and composition work together (values have the same negative sign) in contributing to total folding energy. But sometimes protein-coding pressure on base order conflicts with the pressure for secondary structure and the values have opposite signs. Total folding energy can be deemed of potential biological significance when the average of several readings is significantly less than zero.
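The distinction drawn in this abstract between single-base and dinucleotide shuffling can be made concrete with a small sketch. It shows that a mononucleotide shuffle preserves base composition while generally changing dinucleotide frequencies; the helper names `shuffle_bases` and `dinucleotide_counts` and the example sequence are assumptions chosen for illustration.

```python
import random
from collections import Counter


def shuffle_bases(seq, seed=None):
    """Shuffle single bases: composition is kept, base order is randomized."""
    rng = random.Random(seed)
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def dinucleotide_counts(seq):
    """Count overlapping pairs of adjacent bases."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


seq = "GGGGGGAAAAAACCCCCC"  # hypothetical example sequence
shuffled = shuffle_bases(seq, seed=1)

print(Counter(seq) == Counter(shuffled))  # base composition is preserved
print(dinucleotide_counts(seq))
print(dinucleotide_counts(shuffled))      # dinucleotide profile usually differs
```

A dinucleotide-preserving shuffle would hold the second profile fixed as well, which is the alternative the abstract argues against for separating composition from order.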
Affiliation(s)
- Donald R Forsdyke
- Department of Biochemistry, Queen's University, Kingston, Ontario, Canada K7L3N6.
9
Minin VN, Dorman KS, Fang F, Suchard MA. Phylogenetic mapping of recombination hotspots in human immunodeficiency virus via spatially smoothed change-point processes. Genetics 2006; 175:1773-85. [PMID: 17194781 PMCID: PMC1855141 DOI: 10.1534/genetics.106.066258] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 11/18/2022]
Abstract
We present a Bayesian framework for inferring spatial preferences of recombination from multiple putative recombinant nucleotide sequences. Phylogenetic recombination detection has been an active area of research for the last 15 years. However, only recently have attempts been made to summarize information from several instances of recombination. We propose a hierarchical model that allows for simultaneous inference of recombination breakpoint locations and spatial variation in recombination frequency. The dual multiple change-point model for phylogenetic recombination detection resides at the lowest level of our hierarchy under the umbrella of a common prior on breakpoint locations. The hierarchical prior allows for information about spatial preferences of recombination to be shared among individual data sets. To overcome the sparseness of breakpoint data, dictated by the modest number of available recombinant sequences, we a priori impose a biologically relevant correlation structure on recombination location log odds via a Gaussian Markov random field hyperprior. To examine the capabilities of our model to recover spatial variation in recombination frequency, we simulate recombination from a predefined distribution of breakpoint locations. We then proceed with the analysis of 42 human immunodeficiency virus (HIV) intersubtype gag recombinants and identify a putative recombination hotspot.
Affiliation(s)
- Vladimir N Minin
- Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
10
Forsdyke DR. Exons and Introns. Evol Bioinform Online 2006. [DOI: 10.1007/978-0-387-33419-6_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]