1
|
Xiao Y, Huang H, Chen Y, Zheng S, Chen J, Zou Z, Mehmood N, Ullah I, Liao X, Wang J. Insight on genetic features prevalent in five Ipomoea species using comparative codon pattern analysis reveals differences in major codons and reduced GC content at the 5’ end of CDS. Biochem Biophys Res Commun 2023; 657:92-99. [PMID: 37001285 DOI: 10.1016/j.bbrc.2023.03.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 03/10/2023] [Accepted: 03/10/2023] [Indexed: 03/30/2023]
Abstract
Ipomoea plants possess important commercial, medicinal, and ornamental value. Molecular and morphological studies have confirmed that most species of this genus exhibit similar phenotypes but complex phylogenetic relationships. To date, limited information is available on these evolutionary relationships. In this study, a systematic analysis of diverse Ipomoea species was used to elucidate the relationships within this genus. To this end, we analyzed codon usage bias (CUB) in five Ipomoea species using indices such as the effective number of codons (ENC) and GC content at the third synonymous codon position (GC3s). Three types of plots, namely ENC-GC3s, parity rule 2 (PR2), and neutrality plots, were employed to identify the factors determining CUB, and the frequencies of hydrogen bonds and nucleotides were calculated to dissect changes in GC content at the 5'-end of the coding sequence. Our results showed little difference in CUB among the five species, with a reduction in hydrogen bond content at the 5'-end (and similar changes in cytosine frequency). In addition, the optimal codons of Ipomoea aquatica ended with G or C, unlike those of the other four species, which ended with A or T. These results may be useful for exploring the evolutionary relationships among this group and for understanding the reasons for the variation among Ipomoea species.
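As an illustration of the GC3s index named in this abstract, the following is a minimal sketch and not the authors' pipeline. It assumes the common convention that GC3s is the fraction of G or C at the third position of codons that have synonymous alternatives, so Met (ATG), Trp (TGG), and stop codons are excluded. The function name `gc3s` and the toy sequence are hypothetical.

```python
# Codons without a synonymous alternative (Met, Trp) and stop codons are excluded.
# This exclusion set is an assumption based on the usual definition of GC3s.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds: str) -> float:
    """Return the fraction of synonymous codons whose third base is G or C."""
    seq = cds.upper().replace("U", "T")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # Keep only unambiguous codons that belong to a synonymous family.
    usable = [c for c in codons if c not in EXCLUDED and set(c) <= set("ACGT")]
    if not usable:
        raise ValueError("no synonymous codons found")
    return sum(c[2] in "GC" for c in usable) / len(usable)


# Hypothetical toy coding sequence, not taken from the paper.
toy_cds = "ATGGCTGCCAAAGGGTGGTAA"
print(gc3s(toy_cds))
```

In the toy sequence only GCT, GCC, AAA, and GGG are counted, and two of the four end in G or C.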
|
2
|
Li G, Shi L, Zhang L, Xu B. Componential usage patterns in dengue 4 viruses reveal their better evolutionary adaptation to humans. Front Microbiol 2022; 13:935678. [PMID: 36204606 PMCID: PMC9530264 DOI: 10.3389/fmicb.2022.935678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 08/22/2022] [Indexed: 11/15/2022] Open
Abstract
There have been outbreaks of at least four types of dengue in the past few years, and the evolutionary characteristics of dengue viruses have aroused great concern. The present study examines the evolutionary characteristics of dengue 4 viruses based on their base usage and codon usage patterns. The effective number of codons and relative synonymous codon usage (RSCU) values of four types of dengue viruses were calculated. The Kullback–Leibler (K–L) divergences of relative synonymous codon usage from dengue viruses to humans and the Kullback–Leibler divergences of amino acid usage patterns from dengue viruses to humans were calculated to explore the adaptation levels of dengue viruses. The results suggested that: (1) codon adaptation in dengue 4 viruses occurred through an evolutionary process from 1956 to 2021, (2) overall relative synonymous codon usage values of dengue 4 viruses showed more similarities to humans than those of other subtypes of dengue viruses, and (3) the smaller Kullback–Leibler divergence of amino acid usage and relative synonymous codon usage from dengue viruses to humans indicated that the dengue 4 viruses adapted to human hosts better. All results indicated that both mutation pressure and natural selection pressure contributed more obviously to the codon usage pattern of dengue 4 viruses than to those of other subtypes, and that the dengue 4 viruses adapted to human hosts better than other types of dengue viruses during their evolutionary process.
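The two quantities this abstract relies on, RSCU and the Kullback–Leibler divergence, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: it uses the standard genetic code, skips amino acids absent from a sequence, and normalizes each RSCU vector to sum to one (with a small pseudocount) before computing the divergence. The abstract does not specify these normalization details, and the function names `rscu` and `kl_divergence` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """RSCU = observed codon count / mean count over its synonymous family."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    values = {}
    for aa in set(AMINO) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined, so skip it
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


def kl_divergence(p_rscu: dict, q_rscu: dict, pseudo: float = 1e-6) -> float:
    """K-L divergence from P to Q over codons present in both RSCU tables.

    Normalizing to probabilities and adding a pseudocount are assumptions
    made for this sketch.
    """
    codons = sorted(set(p_rscu) & set(q_rscu))
    p = [p_rscu[c] + pseudo for c in codons]
    q = [q_rscu[c] + pseudo for c in codons]
    p_sum, q_sum = sum(p), sum(q)
    return sum((a / p_sum) * log((a / p_sum) / (b / q_sum)) for a, b in zip(p, q))


# Hypothetical toy sequences standing in for a viral gene and a host gene.
virus = rscu("GCTGCTGCTGCC")
host = rscu("GCTGCCGCAGCG")
print(virus["GCT"], kl_divergence(virus, host))
```

A smaller divergence means the first codon usage profile is closer to the second, which is the sense in which the abstract compares viral and human usage.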
Affiliation(s)
- Gun Li
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'an Technological University, Xi'an, China
- *Correspondence: Gun Li
- Liang Shi
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'an Technological University, Xi'an, China
- Key Laboratory of Analytical Chemistry for Life Science of Shaanxi Province, School of Chemistry and Chemical Engineering, Shaanxi Normal University, Xi'an, China
- Liang Shi
- Liang Zhang
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'an Technological University, Xi'an, China
- Bingyi Xu
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'an Technological University, Xi'an, China
|
3
|
Zhu X, Zhang H, Tang Q. pyRSD-CoEv: A python package for selective sweep detection and co-evolutionary gene cluster identification. Anim Genet 2021; 53:161-165. [PMID: 34729801 DOI: 10.1111/age.13151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2021] [Revised: 09/14/2021] [Accepted: 10/13/2021] [Indexed: 11/26/2022]
Abstract
Genes undergo distinct selective sweeps, and also interact and coevolve, forming the bases of complex phenotypic traits. Therefore, the identification of genes that coevolve or are under artificial selective sweeps is of great importance. However, previous computational methods have been designed for either populations of closely related breeds or individuals of distinct species. Approaches intended specifically for closely related individuals without replicates (i.e. each breed/strain is represented by only one individual) are long overdue. We present a free, powerful, open source package, pyRSD-CoEv, that allows the identification of genes undergoing coevolution and/or selection-based sweeps. pyRSD-CoEv includes two main analysis workflows for genomic variant data: (i) the identification of selective sweeps using relative homozygous single nucleotide variant density (RSD); and (ii) the identification of coevolutionary gene clusters based on correlated evolutionary rates. The Python package pyRSD-CoEv is written in Python 3.7 and is freely available on GitHub at https://github.com/QianZiTang/pyRSD-CoEv. It runs on Linux.
Affiliation(s)
- Xingxing Zhu
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Haoyu Zhang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Qianzi Tang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
|
4
|
Wisotsky SR, Kosakovsky Pond SL, Shank SD, Muse SV. Synonymous Site-to-Site Substitution Rate Variation Dramatically Inflates False Positive Rates of Selection Analyses: Ignore at Your Own Peril. Mol Biol Evol 2020; 37:2430-2439. [PMID: 32068869 PMCID: PMC7403620 DOI: 10.1093/molbev/msaa037] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/11/2022] Open
Abstract
Most molecular evolutionary studies of natural selection maintain the decades-old assumption that synonymous substitution rate variation (SRV) across sites within genes occurs at levels that are either nonexistent or negligible. However, numerous studies challenge this assumption from a biological perspective and show that SRV is comparable in magnitude to that of nonsynonymous substitution rate variation. We evaluated the impact of this assumption on methods for inferring selection at the molecular level by incorporating SRV into an existing method (BUSTED) for detecting signatures of episodic diversifying selection in genes. Using simulated data we found that failing to account for even moderate levels of SRV in selection testing is likely to produce intolerably high false positive rates. To evaluate the effect of the SRV assumption on actual inferences we compared results of tests with and without the assumption in an empirical analysis of over 13,000 Euteleostomi (bony vertebrate) gene alignments from the Selectome database. This exercise reveals that close to 50% of positive results (i.e., evidence for selection) in empirical analyses disappear when SRV is modeled as part of the statistical analysis and are thus candidates for being false positives. The results from this work add to a growing literature establishing that tests of selection are much more sensitive to certain model assumptions than previously believed.
Affiliation(s)
- Sadie R Wisotsky
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Stephen D Shank
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Spencer V Muse
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC
- Department of Statistics, North Carolina State University, Raleigh, NC
|
5
|
Beaulieu JM, O’Meara BC, Zaretzki R, Landerer C, Chai J, Gilchrist MA. Population Genetics Based Phylogenetics Under Stabilizing Selection for an Optimal Amino Acid Sequence: A Nested Modeling Approach. Mol Biol Evol 2019; 36:834-851. [PMID: 30521036 PMCID: PMC6445302 DOI: 10.1093/molbev/msy222] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022] Open
Abstract
We present a new phylogenetic approach, selection on amino acids and codons (SelAC), whose substitution rates are based on a nested model linking protein expression to population genetics. Unlike simpler codon models that assume a single substitution matrix for all sites, our model more realistically represents the evolution of protein-coding DNA under the assumption of consistent, stabilizing selection using a cost-benefit approach. This cost-benefit approach allows us to generate a set of 20 optimal amino acid-specific matrix families using just a handful of parameters and naturally links the strength of stabilizing selection to protein synthesis levels, which we can estimate. Using a yeast data set of 100 orthologs for 6 taxa, we find SelAC fits the data much better than popular models by 10^4-10^5 Akaike information criterion units adjusted for small sample bias. Our results also indicated that nested, mechanistic models better predict observed data patterns, highlighting the improvement in biological realism in amino acid sequence evolution that our model provides. Additional parameters estimated by SelAC indicate that a large amount of nonphylogenetic, but biologically meaningful, information can be inferred from existing data. For example, SelAC prediction of gene-specific protein synthesis rates correlates well with both empirical (r=0.33-0.48) and other theoretical predictions (r=0.45-0.64) for multiple yeast species. SelAC also provides estimates of the optimal amino acid at each site. Finally, because SelAC is a nested approach based on clearly stated biological assumptions, future modifications, such as including shifts in the optimal amino acid sequence within or across lineages, are possible.
Affiliation(s)
- Jeremy M Beaulieu
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR
- Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville, TN
- National Institute for Mathematical and Biological Synthesis, Knoxville, TN
- Brian C O’Meara
- Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville, TN
- National Institute for Mathematical and Biological Synthesis, Knoxville, TN
- Cedric Landerer
- Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville, TN
- National Institute for Mathematical and Biological Synthesis, Knoxville, TN
- Juanjuan Chai
- National Institute for Mathematical and Biological Synthesis, Knoxville, TN
- Suite 1039, White Plains, NY
- Michael A Gilchrist
- Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville, TN
- National Institute for Mathematical and Biological Synthesis, Knoxville, TN
|
6
|
Abstract
Populations evolve as mutations arise in individual organisms and, through hereditary transmission, may become "fixed" (shared by all individuals) in the population. Most mutations are lethal or have negative fitness consequences for the organism. Others have essentially no effect on organismal fitness and can become fixed through the neutral stochastic process known as random drift. However, mutations may also produce a selective advantage that boosts their chances of reaching fixation. Regions of genomes where new mutations are beneficial, rather than neutral or deleterious, tend to evolve more rapidly due to positive selection. Genes involved in immunity and defense are a well-known example; rapid evolution in these genes presumably occurs because new mutations help organisms to prevail in evolutionary "arms races" with pathogens. In recent years genome-wide scans for selection have enlarged our understanding of the genome evolution of various species. In this chapter, we will focus on methods to detect selection on the genome. In particular, we will discuss probabilistic models and how they have changed with the advent of new genome-wide data now available.
Affiliation(s)
- Carolin Kosiol
- Centre of Biological Diversity, School of Biology, University of St Andrews, Fife, UK.
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria.
- Maria Anisimova
- Institute of Applied Simulation, School of Life Sciences and Facility Management, Zurich University of Applied Sciences (ZHAW), Wädenswil, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
7
|
Gun L, Haixian P, Yumiao R, Han T, Jingqi L, Liguang Z. Codon usage characteristics of PB2 gene in influenza A H7N9 virus from different host species. Infect Genet Evol 2018; 65:430-435. [PMID: 30179716 DOI: 10.1016/j.meegid.2018.08.028] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/06/2017] [Revised: 08/02/2018] [Accepted: 08/30/2018] [Indexed: 12/13/2022]
Abstract
The influenza A H7N9 virus is a highly contagious virus that infected only poultry before early 2013. After that time, it caused widespread human infections in China and posed a great public health threat to Southeast Asia. The coding gene for polymerase basic protein 2 (PB2) in influenza A H7N9 virus encodes the PB2 protein, which is a part of the RNA polymerase. The enzyme lacks a proofreading function during its own replication process, so the mutation frequency of influenza A H7N9 virus genes, including the PB2 gene, is high. To investigate the codon usage characteristics of the PB2 gene, gene sequences from 12 kinds of poultry were downloaded from the gene bank (NCBI), and their codon usage characteristics, such as the effective number of codons (ENC), the evolutionary relationship of the sequences, the codon adaptation index (CAI), the correspondence analysis (COA), the relative synonymous codon usage (RSCU), and their PR2 bias, were compared and studied. The results showed that there is a low codon usage bias in the PB2 gene. The differences between the codon usage of the PB2 gene from the 12 kinds of poultry were then compared and their potential applications discussed. These results could lay a foundation for further study of the evolution of H7N9.
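The PR2-bias analysis mentioned in this abstract is usually drawn as a point per gene with coordinates G3/(G3+C3) and A3/(A3+T3). The sketch below assumes that convention and restricts counting to the third positions of four-fold degenerate codon families. The abstract does not state these details, so both the restriction and the function name `pr2_point` are assumptions made for illustration.

```python
from collections import Counter

# First two bases of the four-fold degenerate families in the standard code:
# Ala, Gly, Pro, Thr, Val, plus the four-fold blocks of Arg, Leu, and Ser.
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CG", "CT", "TC"}


def pr2_point(cds: str) -> tuple:
    """Return (G3/(G3+C3), A3/(A3+T3)) at four-fold degenerate third positions."""
    seq = cds.upper().replace("U", "T")
    third = Counter(
        seq[i + 2]
        for i in range(0, len(seq) - len(seq) % 3, 3)
        if seq[i:i + 2] in FOURFOLD_PREFIXES
    )
    gc = third["G"] + third["C"]
    at = third["A"] + third["T"]
    if gc == 0 or at == 0:
        raise ValueError("not enough four-fold degenerate codons to compute PR2")
    return third["G"] / gc, third["A"] / at


# Hypothetical toy sequence; a point at (0.5, 0.5) would indicate no PR2 bias.
print(pr2_point("GCGGCGGCCGCAGCTGCT"))
```

Points away from the center (0.5, 0.5) suggest unequal use of G versus C or A versus T at the third position.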
Affiliation(s)
- Li Gun
- Department of Biomedical Engineering, School of Electronics and Information Engineering, Xi'an Technological University, Xi'an, Shaanxi Province, China.
- Pan Haixian
- Department of Biomedical Engineering, School of Electronics and Information Engineering, Xi'an Technological University, Xi'an, Shaanxi Province, China
- Ren Yumiao
- Department of Biomedical Engineering, School of Electronics and Information Engineering, Xi'an Technological University, Xi'an, Shaanxi Province, China
- Tian Han
- Department of Biomedical Engineering, School of Electronics and Information Engineering, Xi'an Technological University, Xi'an, Shaanxi Province, China
- Lu Jingqi
- Department of Biomedical Engineering, School of Electronics and Information Engineering, Xi'an Technological University, Xi'an, Shaanxi Province, China
- Zhang Liguang
- Department of Biomedical Engineering, School of Electronics and Information Engineering, Xi'an Technological University, Xi'an, Shaanxi Province, China
|
8
|
Accelerating Wright-Fisher Forward Simulations on the Graphics Processing Unit. G3: Genes, Genomes, Genetics 2017; 7:3229-3236. [PMID: 28768689 PMCID: PMC5592947 DOI: 10.1534/g3.117.300103] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/09/2023]
Abstract
Forward Wright–Fisher simulations are powerful in their ability to model complex demography and selection scenarios, but suffer from slow execution on the Central Processor Unit (CPU), thus limiting their usefulness. However, the single-locus Wright–Fisher forward algorithm is exceedingly parallelizable, with many steps that are so-called “embarrassingly parallel,” consisting of a vast number of individual computations that are all independent of each other and thus capable of being performed concurrently. The rise of modern Graphics Processing Units (GPUs) and programming languages designed to leverage the inherent parallel nature of these processors have allowed researchers to dramatically speed up many programs that have such high arithmetic intensity and intrinsic concurrency. The presented GPU Optimized Wright–Fisher simulation, or “GO Fish” for short, can be used to simulate arbitrary selection and demographic scenarios while running over 250-fold faster than its serial counterpart on the CPU. Even modest GPU hardware can achieve an impressive speedup of over two orders of magnitude. With simulations so accelerated, one can not only do quick parametric bootstrapping of previously estimated parameters, but also use simulated results to calculate the likelihoods and summary statistics of demographic and selection models against real polymorphism data, all without restricting the demographic and selection scenarios that can be modeled or requiring approximations to the single-locus forward algorithm for efficiency. Further, as many of the parallel programming techniques used in this simulation can be applied to other computationally intensive algorithms important in population genetics, GO Fish serves as an exciting template for future research into accelerating computation in evolution. GO Fish is part of the Parallel PopGen Package available at: http://dl42.github.io/ParallelPopGen/.
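To make the single-locus forward algorithm concrete, here is a minimal serial sketch in plain Python. It is not GO Fish and does not use the GPU. It assumes a diploid population of constant size, genic selection with relative fitness 1+s, and binomial drift sampled one gene copy at a time. The function name `wright_fisher` and all parameter values are hypothetical.

```python
import random


def wright_fisher(n_diploid: int, p0: float, s: float, generations: int, seed: int = 0) -> list:
    """Track one allele's frequency through selection and drift each generation."""
    rng = random.Random(seed)
    n_copies = 2 * n_diploid
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Deterministic selection step: the allele has relative fitness 1 + s.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Drift step: each of the 2N gene copies is drawn independently,
        # which is the kind of work that parallelizes trivially.
        count = sum(rng.random() < p_sel for _ in range(n_copies))
        p = count / n_copies
        trajectory.append(p)
    return trajectory


# Hypothetical parameters: 50 diploids, starting frequency 0.1, s = 0.05.
print(wright_fisher(50, 0.1, 0.05, 20, seed=1))
```

The independent draws inside the drift step, and the independence of separate loci or replicates, are what the abstract means by "embarrassingly parallel"; a GPU implementation would distribute that work across threads rather than loop over it.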
|