1. Nielsen BF, Berrig C, Grenfell BT, Andreasen V. One hundred years of influenza A evolution. Theor Popul Biol 2024; 159:25-34. [PMID: 39094981 DOI: 10.1016/j.tpb.2024.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 07/05/2024] [Accepted: 07/30/2024] [Indexed: 08/04/2024]
Abstract
Leveraging the simplicity of nucleotide mismatch distributions, we provide an intuitive window into the evolution of the human influenza A 'nonstructural' (NS) gene segment. In an analysis suggested by the eminent Danish biologist Freddy B. Christiansen, we illustrate the existence of a continuous genetic "backbone" of influenza A NS sequences, steadily increasing in nucleotide distance to the 1918 root over more than a century. The 2009 influenza A/H1N1 pandemic represents a clear departure from this enduring genetic backbone. Utilizing nucleotide distance maps and phylogenetic analyses, we illustrate remaining uncertainties regarding the origin of the 2009 pandemic, highlighting the complexity of influenza evolution. The NS segment is interesting precisely because it experiences less pervasive positive selection, and departs less strongly from neutral evolution, than, for example, the HA antigen. Consequently, sudden deviations from neutral diversification can indicate changes in other genes via the hitchhiking effect. Our approach employs two measures based on nucleotide mismatch counts to analyze the evolutionary dynamics of the NS gene segment: the rooted Hamming map, which records distances between a reference sequence and all other sequences over time, and the unrooted temporal Hamming distribution, which captures the distribution of genotypic distances between simultaneously circulating viruses, thereby revealing patterns of nucleotide diversity and epi-evolutionary dynamics.
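The two mismatch-count measures named in this abstract can be illustrated with a minimal sketch. It assumes pre-aligned, equal-length sequences and treats "simultaneously circulating" as "sampled in the same year"; the function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def rooted_hamming_map(root: str, samples: list[tuple[int, str]]) -> list[tuple[int, int]]:
    """(year, distance to the root sequence) for every sampled sequence."""
    return [(year, hamming(root, seq)) for year, seq in samples]


def temporal_hamming_distribution(samples: list[tuple[int, str]]) -> dict[int, Counter]:
    """Per-year histogram of pairwise distances (assumption: same year = co-circulating)."""
    by_year: dict[int, list[str]] = {}
    for year, seq in samples:
        by_year.setdefault(year, []).append(seq)
    return {
        year: Counter(hamming(a, b) for a, b in combinations(seqs, 2))
        for year, seqs in by_year.items()
    }


# Toy data, invented purely for illustration.
root = "ACGTACGT"
samples = [
    (1934, "ACGTACGA"),
    (1934, "ACGAACGT"),
    (1968, "TCGAACGA"),
    (1968, "TCGAACGT"),
]
rooted = rooted_hamming_map(root, samples)
unrooted = temporal_hamming_distribution(samples)
```

Plotting `rooted` as distance against year gives the rooted view, while each `Counter` in `unrooted` is one time slice of the unrooted distribution.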
Affiliation(s)
- Bjarke Frost Nielsen
  - High Meadows Environmental Institute, Princeton University, Princeton, NJ, United States of America; Department of Science and Environment, Roskilde University, Roskilde, Denmark; Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark.
- Christian Berrig
  - Department of Science and Environment, Roskilde University, Roskilde, Denmark.
- Bryan T Grenfell
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, United States of America.
- Viggo Andreasen
  - Department of Science and Environment, Roskilde University, Roskilde, Denmark.
2. Murat Ş. Potential role of peste des petits ruminants virus in small ruminant abortions. Vet J 2024; 306:106185. [PMID: 38908779 DOI: 10.1016/j.tvjl.2024.106185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 05/30/2024] [Accepted: 06/19/2024] [Indexed: 06/24/2024]
Abstract
The aim of the present study was to investigate the frequency, genetic variability, and phylogeny of the peste des petits ruminants virus (PPRV) in ovine and caprine fetuses. During 2014 and 2017, a total of 1054 embryos/fetuses were collected in Turkey. A real-time RT-PCR assay was used for the detection of PPRV RNA. Genetic characterization and phylogenetic analysis of the PPRV field isolates were conducted by sequencing fusion (F) protein and nucleoprotein (N) gene segments. Samples were also collected from ewes (n = 83) and nanny goats (n = 3) that had aborted and whose embryos/fetuses were found to be PPRV positive. PPRV-positive embryos/fetuses were also tested for the presence of Listeria monocytogenes, Campylobacter spp., Coxiella burnetii, Chlamydophila abortus, Brucella spp., Akabane virus, Aino virus, bluetongue virus, border disease virus, bovine viral diarrhea virus, Cache Valley virus, and Schmallenberg virus. PPRV RNA was detected in 123 (11.7%) of the 1054 embryos/fetuses, 78 of the 83 (94%) ewes, and 3 (100%) nanny goats. Border disease virus RNA and Chlamydophila abortus DNA were detected in 7 and 12 PPRV-positive sheep fetuses, respectively, while other bacterial and viral agents were not detected. Phylogenetically, the field isolates in this study belong to lineage IV, and compared to other strains of lineage IV considered in this study, they showed 1 and 5 new amino acid substitutions in the F and N gene sequences, respectively. The results of the study suggest that PPRV plays an important role in abortion. Therefore, PPRV needs to be taken into consideration in sheep and goat abortions.
Affiliation(s)
- Ş Murat
  - Department of Virology, Veterinary Faculty, Necmettin Erbakan University, Ereğli, 42310 Konya, Turkey.
3. Shi C, Xie Y, Guan D, Qin G. Transcriptomic Analysis Reveals Adaptive Evolution and Conservation Implications for the Endangered Magnolia lotungensis. Genes (Basel) 2024; 15:787. [PMID: 38927723 PMCID: PMC11203017 DOI: 10.3390/genes15060787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 06/03/2024] [Accepted: 06/13/2024] [Indexed: 06/28/2024] Open
Abstract
Magnolia lotungensis is an extremely endangered endemic tree in China. To elucidate the genetic basis of M. lotungensis, we performed a comprehensive transcriptome analysis using a sample combining the plant's bark, leaves, and flowers. De novo transcriptome assembly yielded 177,046 transcripts and 42,518 coding sequences. Notably, we identified 796 species-specific genes enriched in organelle gene regulation and defense responses. A codon usage bias analysis revealed that mutation bias appears to be the primary driver of selection in shaping the species' genetic architecture. An evolutionary analysis based on dN/dS values of paralogous and orthologous gene pairs indicated a predominance of purifying selection, suggesting strong evolutionary constraints on most genes. A comparative transcriptomic analysis with Magnolia sinica identified approximately 1000 ultra-conserved genes, enriched in essential cellular processes such as transcriptional regulation, protein synthesis, and genome stability. Interestingly, only 511 rapidly evolving genes under positive selection were detected in comparison with M. sinica and Magnolia kuangsiensis. These genes were enriched in metabolic processes associated with adaptation to specific environments, potentially limiting the species' ability to expand its range. Our findings contribute to understanding the genetic architecture of M. lotungensis and suggest that an insufficient number of adaptive genes contributes to its endangered status.
Affiliation(s)
- Chenyu Shi
  - Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, Hechi University, Hechi 546300, China
  - Guangxi Collaborative Innovation Center of Modern Sericulture and Silk, Hechi University, Hechi 546300, China
- Yanjun Xie
  - Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, Hechi University, Hechi 546300, China
  - Guangxi Collaborative Innovation Center of Modern Sericulture and Silk, Hechi University, Hechi 546300, China
- Delong Guan
  - Guangxi Collaborative Innovation Center of Modern Sericulture and Silk, Hechi University, Hechi 546300, China
  - School of Chemistry and Bioengineering, Hechi University, Hechi 546300, China
- Guole Qin
  - Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, Hechi University, Hechi 546300, China
  - Guangxi Collaborative Innovation Center of Modern Sericulture and Silk, Hechi University, Hechi 546300, China
4. Gorlov IP, Gorlova OY, Tsavachidis S, Amos CI. Strength of selection in lung tumors correlates with clinical features better than tumor mutation burden. Sci Rep 2024; 14:12732. [PMID: 38831004 PMCID: PMC11148192 DOI: 10.1038/s41598-024-63468-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 05/29/2024] [Indexed: 06/05/2024] Open
Abstract
Single nucleotide substitutions are the most common type of somatic mutation in cancer genomes. The goal of this study was to use publicly available somatic mutation data to quantify negative and positive selection in individual lung tumors and to test how the strength of directional and absolute selection is associated with clinical features. The analysis found significant variation in the strength of selection (both negative and positive) among tumors, with median selection tending to be negative even though tumors with strong positive selection also exist. Strength of selection, estimated as the density of missense mutations relative to the density of silent mutations, showed only a weak correlation with tumor mutation burden. In the "all histology together" analysis, we found that absolute strength of selection was strongly correlated with all clinically relevant features analyzed. In the histology-stratified analysis, selection was strongest in small cell lung cancer. Selection in adenocarcinoma was somewhat higher than in squamous cell carcinoma. The study suggests that somatic mutation-based quantification of directional and absolute selection in individual tumors can be a useful biomarker of tumor aggressiveness.
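The density-ratio estimate described in this abstract can be sketched as follows. The abstract gives only the ratio of missense to silent mutation density, so the log2 scaling, the reading of "directional" as the signed value and "absolute" as its magnitude, and the function and parameter names are all assumptions made for illustration.

```python
import math


def selection_strength(
    n_missense: int,
    n_silent: int,
    missense_sites: int,
    silent_sites: int,
) -> tuple[float, float]:
    """Return (directional, absolute) selection for one tumor.

    The ratio compares missense mutations per missense site with silent
    mutations per silent site. Taking log2 of that ratio is an assumption
    of this sketch: negative values suggest negative selection, positive
    values suggest positive selection, and zero matches neutrality.
    """
    if min(n_missense, n_silent, missense_sites, silent_sites) <= 0:
        raise ValueError("all counts must be positive for a log ratio")
    # Equivalent to (n_missense / missense_sites) / (n_silent / silent_sites).
    ratio = (n_missense * silent_sites) / (n_silent * missense_sites)
    directional = math.log2(ratio)
    return directional, abs(directional)


# Hypothetical tumor: missense density is half the silent density.
directional, absolute = selection_strength(15, 10, 3000, 1000)
```

Under this reading, two tumors with equally strong negative and positive selection share the same absolute value while differing in sign.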
Affiliation(s)
- Ivan P Gorlov
  - Institute for Clinical and Translational Research, Baylor College of Medicine, One Baylor Plaza, Mailstop: BCM451, Houston, TX, 77030, USA.
- Olga Y Gorlova
  - Institute for Clinical and Translational Research, Baylor College of Medicine, One Baylor Plaza, Mailstop: BCM451, Houston, TX, 77030, USA
- Spyridon Tsavachidis
  - Institute for Clinical and Translational Research, Baylor College of Medicine, One Baylor Plaza, Mailstop: BCM451, Houston, TX, 77030, USA
- Christopher I Amos
  - Institute for Clinical and Translational Research, Baylor College of Medicine, One Baylor Plaza, Mailstop: BCM451, Houston, TX, 77030, USA
5. Joseph J. Increased Positive Selection in Highly Recombining Genes Does not Necessarily Reflect an Evolutionary Advantage of Recombination. Mol Biol Evol 2024; 41:msae107. [PMID: 38829800 PMCID: PMC11173204 DOI: 10.1093/molbev/msae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/08/2024] [Accepted: 05/28/2024] [Indexed: 06/05/2024] Open
Abstract
It is commonly thought that the long-term advantage of meiotic recombination is to dissipate genetic linkage, allowing natural selection to act independently on different loci. It is thus theoretically expected that genes with higher recombination rates evolve under more effective selection. On the other hand, recombination is often associated with GC-biased gene conversion (gBGC), which theoretically interferes with selection by promoting the fixation of deleterious GC alleles. To test these predictions, several studies assessed whether selection was more effective in highly recombining genes (due to dissipation of genetic linkage) or less effective (due to gBGC), assuming a fixed distribution of fitness effects (DFE) for all genes. In this study, I directly derive the DFE from a gene's evolutionary history (shaped by mutation, selection, drift, and gBGC) under empirical fitness landscapes. I show that genes that have experienced high levels of gBGC are less fit and thus have more opportunities for beneficial mutations. Only a small decrease in the genome-wide intensity of gBGC leads to the fixation of these beneficial mutations, particularly in highly recombining genes. This results in increased positive selection in highly recombining genes that is not caused by more effective selection. Additionally, I show that the death of a recombination hotspot can lead to a higher dN/dS than its birth, but with substitution patterns biased towards AT, and only at selected positions. Controlling for a substitution bias towards GC is therefore not sufficient to rule out the contribution of gBGC to signatures of accelerated evolution. Finally, although gBGC does not affect the fixation probability of GC-conservative mutations, I show that by altering the DFE, gBGC can also significantly affect nonsynonymous GC-conservative substitution patterns.
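The interaction between selection and gBGC described here can be made concrete with a small sketch. It uses a common population-genetic approximation, assumed here rather than taken from the paper: gBGC acts like a directional force that adds a scaled coefficient B to AT-to-GC mutations, subtracts it from GC-to-AT mutations, and leaves GC-conservative mutations untouched. The function names and numeric values are hypothetical.

```python
import math


def relative_fixation_rate(scaled_coefficient: float) -> float:
    """Fixation probability relative to a neutral mutation.

    Uses the standard diffusion approximation S / (1 - exp(-S)),
    where S is the scaled coefficient (4 * Ne * s for selection alone).
    """
    if abs(scaled_coefficient) < 1e-12:
        return 1.0  # neutral limit
    return scaled_coefficient / (1.0 - math.exp(-scaled_coefficient))


def effective_coefficient(selection: float, gbgc: float, mutation_type: str) -> float:
    """Combine scaled selection with scaled gBGC (assumed additive model)."""
    shift = {
        "AT_to_GC": gbgc,        # favored by gBGC
        "GC_to_AT": -gbgc,       # disfavored by gBGC
        "GC_conservative": 0.0,  # unaffected by gBGC
    }[mutation_type]
    return selection + shift


# A mildly deleterious AT-to-GC mutation (S = -2) under gBGC of B = 3.
without_gbgc = relative_fixation_rate(effective_coefficient(-2.0, 0.0, "AT_to_GC"))
with_gbgc = relative_fixation_rate(effective_coefficient(-2.0, 3.0, "AT_to_GC"))
```

In this toy setting the deleterious GC allele fixes faster than a neutral mutation once gBGC is switched on, which is the interference with selection that the abstract refers to.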
Affiliation(s)
- Julien Joseph
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne, France
6. Kotari I, Kosiol C, Borges R. The Patterns of Codon Usage between Chordates and Arthropods are Different but Co-evolving with Mutational Biases. Mol Biol Evol 2024; 41:msae080. [PMID: 38667829 PMCID: PMC11108087 DOI: 10.1093/molbev/msae080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 03/22/2024] [Accepted: 04/15/2024] [Indexed: 05/22/2024] Open
Abstract
Different frequencies amongst codons that encode the same amino acid (i.e. synonymous codons) have been observed in multiple species. Studies focused on uncovering the forces that drive such codon usage showed that a combined effect of mutational biases and translational selection works to produce different frequencies of synonymous codons. However, only a few studies have been able to measure and distinguish between these forces, which may leave similar traces on the coding regions. Here, we have developed a codon model that allows the disentangling of mutation, selection on amino acids and synonymous codons, and GC-biased gene conversion (gBGC), which we applied to an extensive dataset of 415 chordates and 191 arthropods. We found that chordates need 15 more synonymous codon categories than arthropods to explain the empirical codon frequencies, which suggests that the extent of codon usage can vary greatly between animal phyla. Moreover, methylation at CpG sites seems to partially explain these patterns of codon usage in chordates but not in arthropods. Despite the differences between the two phyla, our findings demonstrate that in both, GC-rich codons are disfavored when mutations are GC-biased, and the opposite is true when mutations are AT-biased. This indicates that selection on the genomic coding regions might act primarily to stabilize its GC/AT content on a genome-wide level. Our study shows that the degree of synonymous codon usage varies considerably among animals, but is likely governed by a common underlying dynamic.
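As a simple illustration of what "different frequencies amongst synonymous codons" means in practice, the sketch below computes relative synonymous codon usage (RSCU) for a coding sequence. RSCU is a widely used descriptive statistic, swapped in here for illustration; it is not the codon model developed in the paper. The sketch assumes the standard genetic code and an in-frame DNA sequence, and the example sequence is invented.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds: str) -> dict[str, float]:
    """Relative synonymous codon usage for an in-frame coding sequence.

    For each amino acid observed in the sequence, a codon's RSCU is its
    count divided by the mean count over all codons for that amino acid,
    so 1.0 means "used exactly as often as expected without bias".
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    values: dict[str, float] = {}
    for aa in set(AMINO_ACIDS) - {"*"}:
        synonymous = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in synonymous)
        if total == 0:
            continue  # amino acid absent from this sequence
        mean = total / len(synonymous)
        for c in synonymous:
            values[c] = counts[c] / mean
    return values


# Invented example: four alanine codons, three of them GCT.
example = rscu("GCTGCTGCTGCC")
```

Here `example["GCT"]` exceeds 1 while the unused alanine codons score 0, the kind of skew that mutation bias, selection, or gBGC could each produce.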
Affiliation(s)
- Ioanna Kotari
  - Institut für Populationsgenetik, University of Veterinary Medicine, Veterinärplatz 1, Vienna 1210, Austria
  - Vienna Graduate School of Population Genetics, Vienna, Austria
- Carolin Kosiol
  - Centre for Biological Diversity, School of Biology, University of St Andrews, Fife KY16 9TH, UK
- Rui Borges
  - Institut für Populationsgenetik, University of Veterinary Medicine, Veterinärplatz 1, Vienna 1210, Austria
7. de Jong MJ, van Oosterhout C, Hoelzel AR, Janke A. Moderating the neutralist-selectionist debate: exactly which propositions are we debating, and which arguments are valid? Biol Rev Camb Philos Soc 2024; 99:23-55. [PMID: 37621151 DOI: 10.1111/brv.13010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 08/04/2023] [Accepted: 08/07/2023] [Indexed: 08/26/2023]
Abstract
Half a century after its foundation, the neutral theory of molecular evolution continues to attract controversy. The debate has been hampered by the coexistence of different interpretations of the core proposition of the neutral theory, the 'neutral mutation-random drift' hypothesis. In this review, we trace the origins of these ambiguities and suggest potential solutions. We highlight the difference between the original, the revised and the nearly neutral hypothesis, and re-emphasise that none of them equates to the null hypothesis of strict neutrality. We distinguish the neutral hypothesis of protein evolution, the main focus of the ongoing debate, from the neutral hypotheses of genomic and functional DNA evolution, which for many species are generally accepted. We advocate a further distinction between a narrow and an extended neutral hypothesis (of which the latter posits that random non-conservative amino acid substitutions can cause non-ecological phenotypic divergence), and we discuss the implications for evolutionary biology beyond the domain of molecular evolution. We furthermore point out that the debate has widened from its initial focus on point mutations, and also concerns the fitness effects of large-scale mutations, which can alter the dosage of genes and regulatory sequences. We evaluate the validity of neutralist and selectionist arguments and find that the tested predictions, apart from being sensitive to violation of underlying assumptions, are often derived from the null hypothesis of strict neutrality, or equally consistent with the opposing selectionist hypothesis, except when assuming molecular panselectionism. Our review aims to facilitate a constructive neutralist-selectionist debate, and thereby to contribute to answering a key question of evolutionary biology: what proportions of amino acid and nucleotide substitutions and polymorphisms are adaptive?
Affiliation(s)
- Menno J de Jong
  - Senckenberg Biodiversity and Climate Research Institute (SBiK-F), Georg-Voigt-Strasse 14-16, Frankfurt am Main, 60325, Germany
- Cock van Oosterhout
  - Centre for Ecology, Evolution and Conservation, University of East Anglia, Norwich Research Park, Norwich, NR4 7TJ, UK
- A Rus Hoelzel
  - Department of Biosciences, Durham University, South Road, Durham, DH1 3LE, UK
- Axel Janke
  - Senckenberg Biodiversity and Climate Research Institute (SBiK-F), Georg-Voigt-Strasse 14-16, Frankfurt am Main, 60325, Germany
  - Institute for Ecology, Evolution and Diversity, Goethe University, Max-von-Laue-Strasse 9, Frankfurt am Main, 60438, Germany
  - LOEWE-Centre for Translational Biodiversity Genomics (TBG), Senckenberg Nature Research Society, Georg-Voigt-Straße 14-16, Frankfurt am Main, 60325, Germany
8. Xu KL, Zhang ZM, Fang WL, Wang YD, Jin HY, Wei F, Ma SC. Comparative analyses of complete chloroplast genomes reveal interspecific difference and intraspecific variation of Tripterygium genus. Frontiers in Plant Science 2024; 14:1288943. [PMID: 38264022 PMCID: PMC10803662 DOI: 10.3389/fpls.2023.1288943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 12/14/2023] [Indexed: 01/25/2024]
Abstract
The genus Tripterygium is of great medicinal value and has attracted much attention in taxonomic studies using morphological and molecular methods. In this study, we assembled 12 chloroplast genomes of Tripterygium to reveal interspecific differences and intraspecific variation. The sequence length (156,692-157,061 bp) and structure of the Tripterygium chloroplast genomes were conserved. Comparative analyses revealed abundant variable regions for further study. We also identified the ndhB gene as being under positive selection through adaptive evolution analysis. Phylogenetic analyses based on 15 chloroplast genomes supported the monophyly of Tripterygium hypoglaucum and a potential sister relationship between Tripterygium wilfordii and Tripterygium regelii. Molecular dating analysis indicated that the divergence time within Tripterygium was approximately 5.99 Ma (95% HPD = 3.11-8.68 Ma). The results of our study provide new insights into the taxonomy, evolutionary process, and phylogenetic reconstruction of Tripterygium using complete plastid genomes.
Affiliation(s)
- Kai-Ling Xu
  - Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, National Institutes for Food and Drug Control, Beijing, China
- Zhong-Mou Zhang
  - Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, National Institutes for Food and Drug Control, Beijing, China
  - School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Wen-Liang Fang
  - Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, National Institutes for Food and Drug Control, Beijing, China
- Ya-Dan Wang
  - Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, National Institutes for Food and Drug Control, Beijing, China
- Hong-Yu Jin
  - Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, National Institutes for Food and Drug Control, Beijing, China
- Feng Wei
  - Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, National Institutes for Food and Drug Control, Beijing, China
- Shuang-Cheng Ma
  - Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, National Institutes for Food and Drug Control, Beijing, China
9. Bloom JD, Neher RA. Fitness effects of mutations to SARS-CoV-2 proteins. Virus Evol 2023; 9:vead055. [PMID: 37727875 PMCID: PMC10506532 DOI: 10.1093/ve/vead055] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 07/20/2023] [Revised: 08/08/2023] [Accepted: 08/22/2023] [Indexed: 09/21/2023] Open
Abstract
Knowledge of the fitness effects of mutations to SARS-CoV-2 can inform assessment of new variants, design of therapeutics resistant to escape, and understanding of the functions of viral proteins. However, experimentally measuring effects of mutations is challenging: we lack tractable lab assays for many SARS-CoV-2 proteins, and comprehensive deep mutational scanning has been applied to only two SARS-CoV-2 proteins. Here, we develop an approach that leverages millions of publicly available SARS-CoV-2 sequences to estimate effects of mutations. We first calculate how many independent occurrences of each mutation are expected to be observed along the SARS-CoV-2 phylogeny in the absence of selection. We then compare these expected observations to the actual observations to estimate the effect of each mutation. These estimates correlate well with deep mutational scanning measurements. For most genes, synonymous mutations are nearly neutral, stop-codon mutations are deleterious, and amino acid mutations have a range of effects. However, some viral accessory proteins are under little to no selection. We provide interactive visualizations of effects of mutations to all SARS-CoV-2 proteins (https://jbloomlab.github.io/SARS2-mut-fitness/). The framework we describe is applicable to any virus for which the number of available sequences is sufficiently large that many independent occurrences of each neutral mutation are observed.
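The expected-versus-actual comparison described in this abstract can be sketched in a few lines. The abstract does not give the estimator itself, so the log-ratio form, the pseudocount, and the function name below are assumptions for illustration; the counts are invented.

```python
import math


def fitness_effect(actual: float, expected: float, pseudocount: float = 0.5) -> float:
    """Log ratio of observed to expected independent occurrences of a mutation.

    Values near zero suggest a roughly neutral mutation, strongly negative
    values suggest a deleterious one. The pseudocount (an assumed value)
    keeps the estimate finite when a mutation is never observed.
    """
    if actual < 0 or expected < 0 or pseudocount <= 0:
        raise ValueError("counts must be non-negative and pseudocount positive")
    return math.log((actual + pseudocount) / (expected + pseudocount))


# Invented counts: a mutation expected 20 times along the tree.
neutral_like = fitness_effect(actual=20, expected=20)
never_seen = fitness_effect(actual=0, expected=20)
```

The estimate is only informative when `expected` is large enough, which matches the abstract's caveat that many independent occurrences of each neutral mutation must be observed.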
Affiliation(s)
- Jesse D Bloom
  - Basic Sciences and Computational Biology, Fred Hutchinson Cancer Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
  - Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, 1100 Fairview Ave N, Seattle, WA 98109, USA
- Richard A Neher
  - Biozentrum, University of Basel, Spitalstrasse 41, Basel 4056, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
10. N’Guessan A, Kailasam S, Mostefai F, Poujol R, Grenier JC, Ismailova N, Contini P, De Palma R, Haber C, Stadler V, Bourque G, Hussin JG, Shapiro BJ, Fritz JH, Piccirillo CA. Selection for immune evasion in SARS-CoV-2 revealed by high-resolution epitope mapping and sequence analysis. iScience 2023; 26:107394. [PMID: 37599818 PMCID: PMC10433132 DOI: 10.1016/j.isci.2023.107394] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/05/2022] [Revised: 02/10/2023] [Accepted: 07/10/2023] [Indexed: 08/22/2023] Open
Abstract
Here, we exploit a deep serological profiling strategy coupled with an integrated computational framework for the analysis of SARS-CoV-2 humoral immune responses. Applying a high-density peptide array (HDPA) spanning the entire proteomes of SARS-CoV-2 and endemic human coronaviruses allowed us to identify B cell epitopes and relate them to their evolutionary and structural properties. We identify hotspots of pre-existing immunity and cross-reactive epitopes that contribute to increasing the overall humoral immune response to SARS-CoV-2. Using a public dataset of over 38,000 viral genomes from the early phase of the pandemic, capturing both inter- and within-host genetic viral diversity, we determined the evolutionary profile of epitopes and the differences across proteins, waves, and SARS-CoV-2 variants. Lastly, we show that mutations in spike and nucleocapsid epitopes are under stronger selection between patients than within patients, suggesting that most of the selective pressure for immune evasion occurs upon transmission between hosts.
Affiliation(s)
- Arnaud N’Guessan
  - Department of Microbiology and Immunology, McGill University, Montréal, QC, Canada
  - McGill Genome Centre, McGill University, Montréal, QC, Canada
- Senthilkumar Kailasam
  - Canadian Center for Computational Genomics, Montréal, QC, Canada
  - Department of Human Genetics, McGill University, Montréal, QC, Canada
  - Dahdaleh Institute of Genomic Medicine (DIgM), McGill University, Montréal, QC, Canada
- Fatima Mostefai
  - Research Centre, Montreal Heart Institute, Montreal, QC, Canada
  - Département de Biochimie et Médecine Moléculaire, Université de Montréal, Montréal, QC, Canada
- Raphaël Poujol
  - Research Centre, Montreal Heart Institute, Montreal, QC, Canada
- Nailya Ismailova
  - Department of Microbiology and Immunology, McGill University, Montréal, QC, Canada
  - McGill University Research Center on Complex Traits (MRCCT), McGill University, Montréal, QC, Canada
  - Dahdaleh Institute of Genomic Medicine (DIgM), McGill University, Montréal, QC, Canada
- Paola Contini
  - Department of Internal Medicine, University of Genoa and IRCCS IST-Ospedale San Martino, Genoa, Italy
- Raffaele De Palma
  - Department of Internal Medicine, University of Genoa and IRCCS IST-Ospedale San Martino, Genoa, Italy
- Guillaume Bourque
  - Canadian Center for Computational Genomics, Montréal, QC, Canada
  - Department of Human Genetics, McGill University, Montréal, QC, Canada
  - Dahdaleh Institute of Genomic Medicine (DIgM), McGill University, Montréal, QC, Canada
- Julie G. Hussin
  - Research Centre, Montreal Heart Institute, Montreal, QC, Canada
  - Département de Médecine, Université de Montréal, Montréal, QC, Canada
- B. Jesse Shapiro
  - Department of Microbiology and Immunology, McGill University, Montréal, QC, Canada
  - McGill Genome Centre, McGill University, Montréal, QC, Canada
  - Dahdaleh Institute of Genomic Medicine (DIgM), McGill University, Montréal, QC, Canada
- Jörg H. Fritz
  - Department of Microbiology and Immunology, McGill University, Montréal, QC, Canada
  - McGill University Research Center on Complex Traits (MRCCT), McGill University, Montréal, QC, Canada
  - Dahdaleh Institute of Genomic Medicine (DIgM), McGill University, Montréal, QC, Canada
- Ciriaco A. Piccirillo
  - Department of Microbiology and Immunology, McGill University, Montréal, QC, Canada
  - McGill University Research Center on Complex Traits (MRCCT), McGill University, Montréal, QC, Canada
  - Infectious Diseases and Immunity in Global Health Program of the Research Institute of McGill Health Center, Montréal, QC, Canada
  - Dahdaleh Institute of Genomic Medicine (DIgM), McGill University, Montréal, QC, Canada
11. Bukur T, Riesgo-Ferreiro P, Sorn P, Gudimella R, Hausmann J, Rösler T, Löwer M, Schrörs B, Sahin U. CoVigator-A Knowledge Base for Navigating SARS-CoV-2 Genomic Variants. Viruses 2023; 15:1391. [PMID: 37376690 DOI: 10.3390/v15061391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 06/15/2023] [Accepted: 06/16/2023] [Indexed: 06/29/2023] Open
Abstract
BACKGROUND: The outbreak of the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) resulted in the global COVID-19 pandemic. The urgency for an effective SARS-CoV-2 vaccine has led to the development of the first series of vaccines at unprecedented speed. The discovery of SARS-CoV-2 spike-glycoprotein mutants, however, and with it the potential for escape from vaccine-induced protection and for increased infectivity, demonstrates the continuing importance of monitoring SARS-CoV-2 mutations to enable early detection and tracking of genomic variants of concern.
RESULTS: We developed the CoVigator tool with three components: (1) a knowledge base that collects new SARS-CoV-2 genomic data, processes it, and stores the results; (2) a comprehensive variant calling pipeline; (3) an interactive dashboard highlighting the most relevant findings. The knowledge base routinely downloads and processes virus genome assemblies or raw sequencing data from the COVID-19 Data Portal (C19DP) and the European Nucleotide Archive (ENA), respectively. The results of variant calling are visualized through the dashboard in the form of tables and customizable graphs, making it a versatile tool for tracking SARS-CoV-2 variants. We put a special emphasis on the identification of intrahost mutations and make available to the community what is, to the best of our knowledge, the largest dataset on SARS-CoV-2 intrahost mutations. In the spirit of open data, all CoVigator results are available for download. The CoVigator dashboard is accessible via covigator.tron-mainz.de.
CONCLUSIONS: With increasing worldwide demand for genome surveillance to track the spread of SARS-CoV-2, CoVigator will be a valuable resource providing an up-to-date list of mutations, which can be incorporated into global efforts.
Affiliation(s)
- Thomas Bukur
  - TRON-Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz Gemeinnützige GmbH, 55131 Mainz, Germany
- Pablo Riesgo-Ferreiro
  - TRON-Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz Gemeinnützige GmbH, 55131 Mainz, Germany
- Patrick Sorn
  - TRON-Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz Gemeinnützige GmbH, 55131 Mainz, Germany
- Ranganath Gudimella
  - TRON-Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz Gemeinnützige GmbH, 55131 Mainz, Germany
- Johannes Hausmann
  - TRON-Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz Gemeinnützige GmbH, 55131 Mainz, Germany
- Thomas Rösler
  - TRON-Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz Gemeinnützige GmbH, 55131 Mainz, Germany
- Martin Löwer
  - TRON-Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz Gemeinnützige GmbH, 55131 Mainz, Germany
- Barbara Schrörs
  - TRON-Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz Gemeinnützige GmbH, 55131 Mainz, Germany
- Ugur Sahin
  - BioNTech SE, 55131 Mainz, Germany
  - Research Center for Immunotherapy (FZI), University Medical Center of the Johannes Gutenberg University Mainz, 55099 Mainz, Germany
12
|
Xie H, Hu J, Wang Y, Wang X. Identification of the matrix metalloproteinase (MMP) gene family in Japanese flounder (Paralichthys olivaceus): Involved in immune response regulation to temperature stress and Edwardsiella tarda infection. Fish & Shellfish Immunology 2023:108878. [PMID: 37271328 DOI: 10.1016/j.fsi.2023.108878] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/16/2023] [Revised: 05/30/2023] [Accepted: 06/01/2023] [Indexed: 06/06/2023]
Abstract
The matrix metalloproteinase (MMP) gene family is responsible for regulating the degradation of extracellular matrix (ECM) proteins, which are important for physiological processes such as wound healing, tissue remodeling, and stress response. Although MMPs have been studied in many species, their role in the immune response of Japanese flounder (Paralichthys olivaceus) is still not fully understood. This study conducted a comprehensive analysis of MMPs in flounder, including gene structures, evolutionary relationships, conserved domains, molecular evolution, and expression patterns. The analysis revealed that MMP genes could be grouped into 17 subfamilies and were evolutionarily conserved and functionally constrained. Meanwhile, MMP genes were found to be expressed at different embryonic and larval stages and might act as sentinels in healthy tissues. Furthermore, expression profiling showed that MMPs had diverse functions under environmental stress, with 60% (9/15) and 73% (11/15) of MMPs showing differential expression patterns under temperature stress and Edwardsiella tarda (E. tarda) infection, respectively. These findings provide a useful resource for understanding the immune functions of MMP genes in Japanese flounder.
Affiliation(s)
- Huihui Xie
- National Engineering Research Laboratory of Marine Biotechnology and Engineering, Ningbo University, China; Key Laboratory of Aquacultural Biotechnology (Ningbo University), Ministry of Education, China; Collaborative Innovation Center for Zhejiang Marine High-efficiency and Healthy Aquaculture, Ningbo University, China; Key Laboratory of Marine Biotechnology of Zhejiang Province, Ningbo University, Ningbo, China; Key Laboratory of Green Mariculture (Co-construction By Ministry and Province), Ministry of Agriculture and Rural, Ningbo University, China
- Jiabao Hu
- National Engineering Research Laboratory of Marine Biotechnology and Engineering, Ningbo University, China; Key Laboratory of Aquacultural Biotechnology (Ningbo University), Ministry of Education, China; Collaborative Innovation Center for Zhejiang Marine High-efficiency and Healthy Aquaculture, Ningbo University, China; Key Laboratory of Marine Biotechnology of Zhejiang Province, Ningbo University, Ningbo, China; Key Laboratory of Green Mariculture (Co-construction By Ministry and Province), Ministry of Agriculture and Rural, Ningbo University, China; School of Civil & Environmental Engineering and Geography Science, Ningbo University, Ningbo, China
- Yajun Wang
- National Engineering Research Laboratory of Marine Biotechnology and Engineering, Ningbo University, China; Key Laboratory of Aquacultural Biotechnology (Ningbo University), Ministry of Education, China; Collaborative Innovation Center for Zhejiang Marine High-efficiency and Healthy Aquaculture, Ningbo University, China; Key Laboratory of Marine Biotechnology of Zhejiang Province, Ningbo University, Ningbo, China; Key Laboratory of Green Mariculture (Co-construction By Ministry and Province), Ministry of Agriculture and Rural, Ningbo University, China.
- Xubo Wang
- National Engineering Research Laboratory of Marine Biotechnology and Engineering, Ningbo University, China; Key Laboratory of Aquacultural Biotechnology (Ningbo University), Ministry of Education, China; Collaborative Innovation Center for Zhejiang Marine High-efficiency and Healthy Aquaculture, Ningbo University, China; Key Laboratory of Marine Biotechnology of Zhejiang Province, Ningbo University, Ningbo, China; Key Laboratory of Green Mariculture (Co-construction By Ministry and Province), Ministry of Agriculture and Rural, Ningbo University, China.
|
13
|
Guo S, Gao W, Zeng M, Liu F, Yang Q, Chen L, Wang Z, Jin Y, Xiang P, Chen H, Wen Z, Shi Q, Song Z. Characterization of TLR1 and expression profiling of TLR signaling pathway related genes in response to Aeromonas hydrophila challenge in hybrid yellow catfish (Pelteobagrus fulvidraco ♀ × P. vachelli ♂). Front Immunol 2023; 14:1163781. [PMID: 37056759 PMCID: PMC10086376 DOI: 10.3389/fimmu.2023.1163781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Accepted: 03/17/2023] [Indexed: 03/30/2023] Open
Abstract
Toll-like receptor 1 (TLR1) mediates the innate immune response to a variety of microbes by recognizing cell wall components (such as bacterial lipoproteins) in mammals. However, the detailed molecular mechanism by which TLR1 contributes to pathogen immunity in the representative hybrid yellow catfish (Pelteobagrus fulvidraco ♀ × P. vachelli ♂) has not been well studied. In the present study, we identified the TLR1 gene from the hybrid yellow catfish, and comparative synteny data from multiple species further confirmed that the TLR1 gene is highly conserved in teleosts. Phylogenetic analysis revealed distinguishable TLR1s in diverse taxa, suggesting that the evolution of TLR1 proteins is consistent with that of the species. Structural prediction indicated that the three-dimensional structures of TLR1 proteins are relatively conserved among different taxa. Positive selection analysis showed that purifying selection dominated the evolutionary process of TLR1s and the TLR1-TIR domain in both vertebrates and invertebrates. Expression pattern analysis based on tissue distribution showed that TLR1 was mainly transcribed in the gonad, gallbladder and kidney, and the mRNA levels of TLR1 in kidney were remarkably up-regulated after Aeromonas hydrophila stimulation, indicating that TLR1 participates in the inflammatory responses to exogenous pathogen infection in hybrid yellow catfish. Homologous sequence alignment and chromosomal location indicated that the TLR signaling pathway is highly conserved in the hybrid yellow catfish. The expression patterns of TLR signaling pathway related genes (TLR1-TLR2-MyD88-FADD-Caspase 8) were consistent after pathogen stimulation, revealing that the TLR signaling pathway is triggered and activated after A. hydrophila infection. Our findings lay a solid foundation for better understanding the immune roles of TLR1 in teleosts, and provide basic data for developing strategies to control disease outbreaks in hybrid yellow catfish.
Affiliation(s)
- Shengtao Guo
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Wenxue Gao
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Mengsha Zeng
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Fenglin Liu
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Qingzhuoma Yang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Lei Chen
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Zesong Wang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Yanjun Jin
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Peng Xiang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Hanxi Chen
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Zhengyong Wen
- Key Laboratory of Sichuan for Fishes Conservation and Utilization in the Upper Reaches of the Yangtze River, College of Life Science, Neijiang Normal University, Neijiang, China
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China
- *Correspondence: Zhengyong Wen; Qiong Shi; Zhaobin Song
- Qiong Shi
- Key Laboratory of Sichuan for Fishes Conservation and Utilization in the Upper Reaches of the Yangtze River, College of Life Science, Neijiang Normal University, Neijiang, China
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen, China
- *Correspondence: Zhengyong Wen; Qiong Shi; Zhaobin Song
- Zhaobin Song
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- *Correspondence: Zhengyong Wen; Qiong Shi; Zhaobin Song
|
14
|
Latrille T, Rodrigue N, Lartillot N. Genes and sites under adaptation at the phylogenetic scale also exhibit adaptation at the population-genetic scale. Proc Natl Acad Sci U S A 2023; 120:e2214977120. [PMID: 36897968 PMCID: PMC10089192 DOI: 10.1073/pnas.2214977120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/26/2022] [Accepted: 02/11/2023] [Indexed: 03/12/2023] Open
Abstract
Adaptation in protein-coding sequences can be detected from multiple sequence alignments across species or alternatively by leveraging polymorphism data within a population. Across species, quantification of the adaptive rate relies on phylogenetic codon models, classically formulated in terms of the ratio of nonsynonymous over synonymous substitution rates. Evidence of an accelerated nonsynonymous substitution rate is considered a signature of pervasive adaptation. However, because of the background of purifying selection, these models are potentially limited in their sensitivity. Recent developments have led to more sophisticated mutation-selection codon models aimed at making a more detailed quantitative assessment of the interplay between mutation, purifying, and positive selection. In this study, we conducted a large-scale exome-wide analysis of placental mammals with mutation-selection models, assessing their performance at detecting proteins and sites under adaptation. Importantly, mutation-selection codon models are based on a population-genetic formalism and thus are directly comparable to the McDonald and Kreitman test at the population level to quantify adaptation. Taking advantage of this relationship between phylogenetic and population genetics analyses, we integrated divergence and polymorphism data across the entire exome for 29 populations across 7 genera and showed that proteins and sites detected to be under adaptation at the phylogenetic scale are also under adaptation at the population-genetic scale. Altogether, our exome-wide analysis shows that phylogenetic mutation-selection codon models and the population-genetic test of adaptation can be reconciled and are congruent, paving the way for integrative models and analyses across individuals and populations.
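The population-level quantity that the McDonald and Kreitman comparison targets can be illustrated with a minimal sketch. It assumes the widely used estimator alpha = 1 - (Ds * Pn) / (Dn * Ps) for the proportion of adaptive nonsynonymous substitutions. The function name `mk_alpha` and the counts in the usage lines are hypothetical; this is not the authors' pipeline, whose mutation-selection codon models are considerably more elaborate.

```python
def mk_alpha(dn, ds, pn, ps):
    """Estimate alpha, the proportion of adaptive nonsynonymous substitutions.

    dn, ds: nonsynonymous / synonymous fixed differences between species.
    pn, ps: nonsynonymous / synonymous polymorphisms within the population.
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    # Under strict neutrality Dn/Ds is expected to equal Pn/Ps, giving alpha = 0.
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, chosen only to show the arithmetic.
alpha = mk_alpha(dn=20, ds=40, pn=5, ps=40)
print(alpha)  # 0.75
```

A positive value indicates an excess of nonsynonymous divergence relative to polymorphism, while a negative value is usually read as a sign of segregating slightly deleterious variants.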
Affiliation(s)
- Thibault Latrille
- Université de Lyon, Université Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie et Biologie Evolutive, UMR5558, 69100 Villeurbanne, France
- École Normale Supérieure de Lyon, Université de Lyon, 69342 Lyon, France
- Department of Computational Biology, Université de Lausanne, 1015 Lausanne, Switzerland
- Nicolas Rodrigue
- Department of Biology, Institute of Biochemistry, and School of Mathematics and Statistics, Carleton University, K1S 5B6 Ottawa, Canada
- Nicolas Lartillot
- Université de Lyon, Université Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie et Biologie Evolutive, UMR5558, 69100 Villeurbanne, France
|
15
|
Evaluating the Potential Fitness Effects of Chinook Salmon (Oncorhynchus tshawytscha) Aquaculture Using Non-Invasive Population Genomic Analyses of MHC Nucleotide Substitution Spectra. Animals (Basel) 2023; 13:ani13040593. [PMID: 36830380 PMCID: PMC9951711 DOI: 10.3390/ani13040593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 01/27/2023] [Accepted: 02/06/2023] [Indexed: 02/11/2023] Open
Abstract
Genetic diversity plays a vital role in the adaptability of salmon to changing environmental conditions that can introduce new selective pressures on populations. Variability among local subpopulations may increase the chance that certain advantageous genes are passed down to future generations to mitigate susceptibility to novel diseases, warming oceans, loss of genetic stocks, and ocean acidification. Class I and II genes of the major histocompatibility complex (MHC) are crucial for the fitness of Chinook salmon due to the role they play in disease and pathogen resistance. The objective of this study was to assess the DNA sequence variability among wild and hatchery populations of Alaskan Chinook salmon at the class I α1 and class II β1 exons of the MHC. We hypothesized that the 96 wild samples taken from the Deshka River would display greater levels of observed heterozygosity (Ho) relative to expected heterozygosity (He), suggesting that individuals with similar phenotypes mate with one another more frequently than would be expected under random mating patterns. Conversely, since no mate selection occurs in the William Jack Hernandez Sport Fish hatchery, we would not expect to see this discrepancy (He = Ho) in the 96 hatchery fish tested in this study. Alternatively, we hypothesized that post-mating selection, rather than mate selection, is driving higher levels of observed heterozygosity. If this is the case, we will observe higher than expected levels of heterozygosity among hatchery salmon. Both populations displayed higher levels of observed heterozygosity than expected heterozygosity at the Class I and II loci, but genetic differentiation between the spatially distinct communities was minimal. Class I sequences showed evidence of balancing selection, despite high rates of non-synonymous substitutions observed, specifically at the peptide binding regions of both MHC genes.
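The comparison of observed (Ho) and expected (He) heterozygosity that the hypotheses above rest on can be sketched as follows. This is a minimal single-locus illustration, assuming diploid genotypes given as allele pairs and the standard Hardy-Weinberg expectation He = 1 - sum(p_i^2) without a small-sample correction. The function name `heterozygosity` and the genotypes are hypothetical and are not data from the study.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Ho, He) for one locus from a list of diploid genotypes.

    Each genotype is a pair of allele labels, e.g. ("A", "B").
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under random mating: 1 minus the sum of squared allele frequencies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    he = 1.0 - sum((count / total) ** 2 for count in counts.values())
    return ho, he


# Hypothetical genotypes at a single locus.
sample = [("A", "B"), ("A", "B"), ("A", "A"), ("B", "C")]
ho, he = heterozygosity(sample)
print(ho, he)
```

In this toy sample Ho exceeds He, the same direction of departure that the abstract reports for both populations.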
|
16
|
Guo S, Zeng M, Gao W, Li F, Wei X, Shi Q, Wen Z, Song Z. Toll-like Receptor 3 in the Hybrid Yellow Catfish (Pelteobagrus fulvidraco ♀ × P. vachelli ♂): Protein Structure, Evolution and Immune Response to Exogenous Aeromonas hydrophila and Poly (I:C) Stimuli. Animals (Basel) 2023; 13:ani13020288. [PMID: 36670828 PMCID: PMC9854889 DOI: 10.3390/ani13020288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2022] [Revised: 01/08/2023] [Accepted: 01/10/2023] [Indexed: 01/17/2023] Open
Abstract
As a major mediator of the cellular response to viral infection in mammals, Toll-like receptor 3 (TLR3) has been shown to respond to double-stranded RNA (dsRNA). However, the molecular mechanism by which TLR3 functions in the viral infection response in teleosts remains to be investigated. In this study, the Toll-like receptor 3 gene of the hybrid yellow catfish was identified and characterized by comparative genomics. Furthermore, multiple sequence alignment, genomic synteny and phylogenetic analysis suggested that the homologous TLR3 genes were unique to teleosts. Gene structure analysis showed that five exons and four introns were common components of TLR3s in the 12 examined species, and interestingly the third exon in teleosts had the same length of 194 bp. Genomic synteny analysis indicated that TLR3s were highly conserved in various teleosts, with similar organizations of gene arrangement. De novo predictions showed that TLR3s were horseshoe-shaped in multiple taxa, except in birds (which showed a round-shaped structure). Phylogenetic topology showed that the evolution of TLR3 was consistent with the evolution of the studied species. Selection analysis showed that the evolution rates of TLR3 proteins were usually higher than those of TLR3-TIR domains, indicating that the latter were more conserved. Tissue distribution analysis showed that TLR3s were widely distributed in the 12 tested tissues, with the highest transcription in liver and intestine. In addition, the transcription levels of TLR3 were significantly increased in immune-related tissues after stimulation with exogenous Aeromonas hydrophila and poly (I:C). Molecular docking showed that TLR3 in teleosts forms a complex with poly (I:C). In summary, our present results suggest that TLR3 is a pattern recognition receptor (PRR) gene in the immune response to pathogen infections in hybrid yellow catfish.
Affiliation(s)
- Shengtao Guo
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Mengsha Zeng
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Wenxue Gao
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Fan Li
- Key Laboratory of Sichuan Province for Fishes Conservation and Utilization in the Upper Reaches of the Yangtze River, Neijiang Normal University, Neijiang 641100, China
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
- Xiuying Wei
- Key Laboratory of Sichuan Province for Fishes Conservation and Utilization in the Upper Reaches of the Yangtze River, Neijiang Normal University, Neijiang 641100, China
- Qiong Shi
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen 518083, China
- Zhengyong Wen
- Key Laboratory of Sichuan Province for Fishes Conservation and Utilization in the Upper Reaches of the Yangtze River, Neijiang Normal University, Neijiang 641100, China
- Correspondence: Z.W.; Z.S.
- Zhaobin Song
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Correspondence: Z.W.; Z.S.
|
17
|
Duchemin L, Lanore V, Veber P, Boussau B. Evaluation of Methods to Detect Shifts in Directional Selection at the Genome Scale. Mol Biol Evol 2022; 40:6889995. [PMID: 36510704 PMCID: PMC9940701 DOI: 10.1093/molbev/msac247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Revised: 10/24/2022] [Accepted: 10/26/2022] [Indexed: 12/15/2022] Open
Abstract
Identifying the footprints of selection in coding sequences can inform about the importance and function of individual sites. Analyses of the ratio of nonsynonymous to synonymous substitutions (dN/dS) have been widely used to pinpoint changes in the intensity of selection, but cannot distinguish them from changes in the direction of selection, that is, changes in the fitness of specific amino acids at a given position. A few methods that rely on amino-acid profiles to detect changes in directional selection have been designed, but their performances have not been well characterized. In this paper, we investigate the performance of six of these methods. We evaluate them on simulations along empirical phylogenies in which transition events have been annotated and compare their ability to detect sites that have undergone changes in the direction or intensity of selection to that of a widely used dN/dS approach, codeml's branch-site model A. We show that all methods have reduced performance in the presence of biased gene conversion but not CpG hypermutability. The best profile method, Pelican, a new implementation of Tamuri AU, Hay AJ, Goldstein RA. (2009. Identifying changes in selective constraints: host shifts in influenza. PLoS Comput Biol. 5(11):e1000564), performs as well as codeml in a range of conditions except for detecting relaxations of selection, and performs better when tree length increases, or in the presence of persistent positive selection. It is fast, enabling genome-scale searches for site-wise changes in the direction of selection associated with phenotypic changes.
Affiliation(s)
- Vincent Lanore
- Laboratoire de Biométrie et Biologie Evolutive, Univ Lyon, Univ Lyon 1, CNRS, VetAgro Sup, UMR5558, Villeurbanne, France
- Philippe Veber
- Laboratoire de Biométrie et Biologie Evolutive, Univ Lyon, Univ Lyon 1, CNRS, VetAgro Sup, UMR5558, Villeurbanne, France
|
18
|
Li NK, Corander J, Grad YH, Chang HH. Discovering recent selection forces shaping the evolution of dengue viruses based on polymorphism data across geographic scales. Virus Evol 2022; 8:veac108. [PMID: 36601300 PMCID: PMC9789396 DOI: 10.1093/ve/veac108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 09/23/2022] [Accepted: 11/28/2022] [Indexed: 11/30/2022] Open
Abstract
Incomplete selection makes it challenging to infer selection on genes at short time scales, especially for microorganisms, due to stronger linkage between loci. However, in many cases, the selective force changes with environment, time, or other factors, and it is of great interest to understand selective forces at this level to answer relevant biological questions. We developed a new method that uses the change in dN/dS, instead of the absolute value of dN/dS, to infer the dominating selective force based on sequence data across geographical scales. If a gene was under positive selection, dN/dS was expected to increase through time, whereas if a gene was under negative selection, dN/dS was expected to decrease through time. Assuming that the migration rate decreased and the divergence time between samples increased from between-continent, within-continent different-country, to within-country level, dN/dS of a gene dominated by positive selection was expected to increase with increasing geographical scales, and the opposite trend was expected in the case of negative selection. Motivated by the McDonald-Kreitman (MK) test, we developed a pairwise MK test to assess the statistical significance of detected trends in dN/dS. Application of the method to a global sample of dengue virus genomes identified multiple significant signatures of selection in both the structural and non-structural proteins. Because this method does not require allele frequency estimates and uses synonymous mutations for comparison, it is less prone to sampling error, providing a way to infer selection forces within species using publicly available genomic data from locations over broad geographical scales.
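For orientation, the classic McDonald-Kreitman test that motivated the pairwise variant asks whether nonsynonymous and synonymous counts are distributed differently between two classes of changes, which is a 2x2 contingency problem. The sketch below is not the authors' pairwise MK test. It is a pure-Python two-sided Fisher's exact test applied to a hypothetical table, with the function name `fisher_exact_two_sided` and all counts being assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with all margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical MK-style table: rows are two classes of changes (e.g. divergence
# and polymorphism), columns are nonsynonymous and synonymous counts.
p_value = fisher_exact_two_sided(12, 20, 4, 38)
print(p_value)
```

A small p-value indicates that the ratio of nonsynonymous to synonymous changes differs between the two rows, which is the kind of departure the MK framework interprets as evidence of selection.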
Affiliation(s)
- Nien-Kung Li
- Department of Life Science & Institute of Bioinformatics and Structural Biology, National Tsing Hua University, 101, Section 2, Kuang-Fu Road, Hsinchu 300044, Taiwan
- Jukka Corander
- Helsinki Institute for Information Technology, Department of Mathematics and Statistics, University of Helsinki, Yliopistonkatu 3, Helsinki 00014, Finland
- Department of Biostatistics, University of Oslo, Domus Medica Gaustad Sognsvannsveien 9, Oslo 0372, Norway
- Parasites and Microbes, The Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
|
19
|
Han P, Qiao Y, He J, Men Y, Liu Y, Liu X, Wang X. Identification and functional analysis of dual-specificity phosphatases (DUSP) genes in Japanese flounder (Paralichthys olivaceus) against temperature and Edwardsiella tarda stress. Fish & Shellfish Immunology 2022; 130:453-461. [PMID: 36162775 DOI: 10.1016/j.fsi.2022.09.051] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/10/2022] [Revised: 09/16/2022] [Accepted: 09/20/2022] [Indexed: 06/16/2023]
Abstract
Dual-specificity phosphatases (DUSPs) are not only key regulators that dephosphorylate and inactivate mitogen-activated protein kinases (MAPKs), but also play a crucial role in the immune response. However, the role of DUSP genes in Japanese flounder (PoDUSPs) is still unclear. In this study, 28 DUSP genes in Japanese flounder were identified and classified based on the whole genome database. Phylogenetic analysis and protein structure analysis revealed that DUSPs had highly conserved domains in teleosts. Molecular evolution analysis indicated that the PoDUSP genes were conserved during evolution and were functionally constrained. Meanwhile, PoDUSP genes were found to be expressed at different embryonic and larval stages and might act as sentinels in healthy organisms. Furthermore, the expression profiles of PoDUSP genes after temperature stress and Edwardsiella tarda (E. tarda) infection were determined in Japanese flounder for the first time, and the results demonstrated that Podusp1, Podusp2 and Podusp16 were more responsive to temperature variation, whereas Podusp1 and Podusp6 were more responsive to E. tarda infection. In summary, our results provide useful resources for understanding the immune roles of DUSP genes in flatfish.
Affiliation(s)
- Ping Han
- Key Laboratory of Aquacultural Biotechnology (Ningbo University), Ministry of Education, Ningbo, Zhejiang, China; Key Laboratory of Marine Genetics and Breeding, Ministry of Education, Ocean University of China, Qingdao, Shandong, China.
- Yingjie Qiao
- Key Laboratory of Marine Genetics and Breeding, Ministry of Education, Ocean University of China, Qingdao, Shandong, China.
- Jiayi He
- Key Laboratory of Marine Genetics and Breeding, Ministry of Education, Ocean University of China, Qingdao, Shandong, China.
- Yu Men
- Key Laboratory of Marine Genetics and Breeding, Ministry of Education, Ocean University of China, Qingdao, Shandong, China.
- Yuxiang Liu
- Key Laboratory of Marine Genetics and Breeding, Ministry of Education, Ocean University of China, Qingdao, Shandong, China.
- Xiumei Liu
- College of Life Sciences, Yantai University, Yantai, 264005, China.
- Xubo Wang
- Key Laboratory of Aquacultural Biotechnology (Ningbo University), Ministry of Education, Ningbo, Zhejiang, China.
|
20
|
Aziz R, Sen P, Beura PK, Das S, Tula D, Dash M, Namsa ND, Deka RC, Feil EJ, Satapathy SS, Ray SK. Incorporation of transition to transversion ratio and nonsense mutations, improves the estimation of the number of synonymous and non-synonymous sites in codons. DNA Res 2022; 29:6654588. [PMID: 35920776 PMCID: PMC9358017 DOI: 10.1093/dnares/dsac023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Indexed: 11/12/2022] Open
Abstract
A common approach to estimating the strength and direction of selection acting on protein-coding sequences is to calculate the dN/dS ratio. The method has been widely used, and its application has been critically reviewed many times since its proposal by Nei and Gojobori in 1986. However, the method is still evolving to account for non-uniform substitution rates and pretermination codons. In our study of SNPs in 586 genes across 156 Escherichia coli strains, synonymous polymorphism in 2-fold degenerate codons was higher than that in 4-fold degenerate codons, which could be attributed to the difference between transition (Ti) and transversion (Tv) substitution rates, where the average rate of a transition is, in general, four times more than that of a transversion. We considered both the Ti/Tv ratio and nonsense mutations in pretermination codons to improve estimates of synonymous (S) and non-synonymous (NS) sites. The accuracy of estimating dN/dS has been improved by considering the Ti/Tv ratio and nonsense substitutions in pretermination codons. We showed that applying the modified approach based on the Ti/Tv ratio and pretermination codons results in higher values of dN/dS in 29 common genes of equal reading frames between E. coli and Salmonella enterica. This study emphasizes the robustness of amino acid composition with varying codon degeneracy, as well as the importance of pretermination codons, when calculating dN/dS values.
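The site-counting step that the abstract refers to can be illustrated with a short sketch. With `kappa=1.0` it follows the unweighted Nei and Gojobori counting of synonymous sites per codon under the standard genetic code. The `kappa` weighting of transitions is only an illustrative assumption and is not the authors' exact correction, and the separate treatment of nonsense mutations in pretermination codons is not attempted here. The name `synonymous_sites` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def synonymous_sites(codon, kappa=1.0):
    """Number of synonymous sites (between 0 and 3) in a sense codon.

    kappa=1.0 reproduces the unweighted count; kappa>1 weights transitions
    more heavily than transversions (an illustrative assumption).
    """
    aa = CODON_TABLE[codon]
    s = 0.0
    for pos in range(3):
        syn = total = 0.0
        for base in BASES:
            if base == codon[pos]:
                continue
            weight = kappa if (codon[pos], base) in TRANSITIONS else 1.0
            mutant = codon[:pos] + base + codon[pos + 1:]
            total += weight
            # Changes to a stop codon are counted as non-synonymous here.
            if CODON_TABLE[mutant] == aa:
                syn += weight
        s += syn / total
    return s


# A 2-fold degenerate codon (TTT, Phe) and a 4-fold degenerate codon (GGT, Gly).
for codon in ("TTT", "GGT"):
    print(codon, synonymous_sites(codon), synonymous_sites(codon, kappa=4.0))
```

The 2-fold degenerate codon gains synonymous sites as transitions are up-weighted, because its only synonymous change is a transition, whereas the count for the 4-fold degenerate codon does not depend on `kappa`.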
Affiliation(s)
- Ruksana Aziz
- Department of Molecular Biology and Biotechnology, Tezpur University, Tezpur, 784028 Assam, India
- Piyali Sen
- Department of Computer Science and Engineering, Tezpur University, Tezpur, 784028 Assam, India
- Pratyush Kumar Beura
- Department of Molecular Biology and Biotechnology, Tezpur University, Tezpur, 784028 Assam, India
- Saurav Das
- Department of Molecular Biology and Biotechnology, Tezpur University, Tezpur, 784028 Assam, India
- Debapriya Tula
- TCS Innovation, Tata Consultancy Services, Hyderabad, 500081 Telangana, India
- Madhusmita Dash
- Department of Electronics and Communication Engineering, NIT, Papum Pare, 791113 Arunachal Pradesh, India
- Nima Dondu Namsa
- Department of Molecular Biology and Biotechnology, Tezpur University, Tezpur, 784028 Assam, India
- Center for Multidisciplinary Research, Tezpur University, Tezpur, 784028 Assam, India
- Ramesh Chandra Deka
- Center for Multidisciplinary Research, Tezpur University, Tezpur, 784028 Assam, India
- Department of Chemical Sciences, Tezpur University, Tezpur, 784028 Assam, India
- Edward J Feil
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath BA2 7AY, UK
- Siddhartha Sankar Satapathy
- Department of Computer Science and Engineering, Tezpur University, Tezpur, 784028 Assam, India
- Center for Multidisciplinary Research, Tezpur University, Tezpur, 784028 Assam, India
- Suvendra Kumar Ray
- Department of Molecular Biology and Biotechnology, Tezpur University, Tezpur, 784028 Assam, India
- Center for Multidisciplinary Research, Tezpur University, Tezpur, 784028 Assam, India
|
21
|
Cope AL, Shah P. Intragenomic variation in non-adaptive nucleotide biases causes underestimation of selection on synonymous codon usage. PLoS Genet 2022; 18:e1010256. [PMID: 35714134 PMCID: PMC9246145 DOI: 10.1371/journal.pgen.1010256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Revised: 06/30/2022] [Accepted: 05/13/2022] [Indexed: 11/20/2022] Open
Abstract
Patterns of non-uniform usage of synonymous codons vary across genes in an organism and between species across all domains of life. This codon usage bias (CUB) is due to a combination of non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most models quantify the effects of mutation bias and selection on CUB assuming uniform mutational and other non-adaptive forces across the genome. However, non-adaptive nucleotide biases can vary within a genome due to processes such as biased gene conversion (BGC), potentially obfuscating signals of selection on codon usage. Moreover, genome-wide estimates of non-adaptive nucleotide biases are lacking for non-model organisms. We combine an unsupervised learning method with a population genetics model of synonymous coding sequence evolution to assess the impact of intragenomic variation in non-adaptive nucleotide bias on quantification of natural selection on synonymous codon usage across 49 Saccharomycotina yeasts. We find that in the absence of a priori information, unsupervised learning can be used to identify genes evolving under different non-adaptive nucleotide biases. We find that the impact of intragenomic variation in non-adaptive nucleotide bias varies widely, even among closely-related species. We show that the overall strength and direction of translational selection can be underestimated by failing to account for intragenomic variation in non-adaptive nucleotide biases. Interestingly, genes falling into clusters identified by machine learning are also physically clustered across chromosomes. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable non-adaptive nucleotide biases on codon frequencies.
Affiliation(s)
- Alexander L. Cope
- Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America
- Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, United States of America
- Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, United States of America
- Premal Shah
- Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America
- Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, United States of America
22
Chakraborty C, Sharma AR, Bhattacharya M, Lee SS. A Detailed Overview of Immune Escape, Antibody Escape, Partial Vaccine Escape of SARS-CoV-2 and Their Emerging Variants With Escape Mutations. Front Immunol 2022; 13:801522. [PMID: 35222380 PMCID: PMC8863680 DOI: 10.3389/fimmu.2022.801522] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Received: 10/25/2021] [Accepted: 01/05/2022] [Indexed: 01/08/2023] Open
Abstract
The infective SARS-CoV-2 is prone to immune escape. Significant variants of SARS-CoV-2 with substantial mutations and immune escape properties are emerging over time. Simultaneously, the vaccination drive against this virus is in progress worldwide. However, vaccine evasion has been noted for some of the newly emerging variants. Our review provides an overview of the emerging variants' immune escape and vaccine escape ability. We have illustrated a broad view of viral evolution, variants, and immune escape ability. Subsequently, different immune escape approaches of SARS-CoV-2 have been discussed. The innate immune escape strategies adopted by SARS-CoV-2 that are discussed include IFN-I production dysregulation, cytokine-related immune escape, immune escape associated with dendritic cell function and macrophages, natural killer cell and neutrophil related immune escape, PRR-associated immune evasion, and NLRP3 inflammasome-associated immune evasion. We have also discussed the significant mutations related to emerging variants and immune escape, such as mutations in the RBD region (N439K, L452R, E484K, N501Y, K444R) and other parts (D614G, P681R) of the S-glycoprotein. Mutations in other locations such as NSP1, NSP3, NSP6, ORF3, and ORF8 have also been discussed. Finally, we have illustrated the emerging variants' partial vaccine (BioNTech/Pfizer mRNA/Oxford-AstraZeneca/BBIBP-CorV/ZF2001/Moderna mRNA/Johnson & Johnson vaccine) escape ability. This review will help readers gain in-depth knowledge of the immune escape, antibody escape, and partial vaccine escape ability of the virus, assist in controlling the current pandemic, and help prepare for the next.
Affiliation(s)
- Chiranjib Chakraborty
- Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, India
- Ashish Ranjan Sharma
- Institute for Skeletal Aging and Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, South Korea
- Sang-Soo Lee
- Institute for Skeletal Aging and Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, South Korea
23
Latrille T, Lartillot N. An Improved Codon Modeling Approach for Accurate Estimation of the Mutation Bias. Mol Biol Evol 2022; 39:6503505. [PMID: 35021218 PMCID: PMC8831783 DOI: 10.1093/molbev/msac005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022] Open
Abstract
Phylogenetic codon models are routinely used to characterize selective regimes in coding sequences. Their parametric design, however, is still a matter of debate, in particular concerning the question of how to account for differing nucleotide frequencies and substitution rates. This problem relates to the fact that nucleotide composition in protein-coding sequences is the result of the interactions between mutation and selection. In particular, because of the structure of the genetic code, the nucleotide composition differs between the three coding positions, with the third position showing a more extreme composition. Yet, phylogenetic codon models do not correctly capture this phenomenon and instead predict that the nucleotide composition should be the same for all three positions. Alternatively, some models allow for different nucleotide rates at the three positions, an approach conflating the effects of mutation and selection on nucleotide composition. In practice, it results in inaccurate estimation of the strength of selection. Conceptually, the problem comes from the fact that phylogenetic codon models do not correctly capture the fixation bias acting against the mutational pressure at the mutation–selection equilibrium. To address this problem and to more accurately identify mutation rates and selection strength, we present an improved codon modeling approach where the fixation rate is not seen as a scalar, but as a tensor. This approach gives an accurate representation of how mutation and selection oppose each other at equilibrium and yields a reliable estimate of the mutational process, while disentangling the mean fixation probabilities prevailing in different mutational directions.
Affiliation(s)
- T Latrille
- CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, Villeurbanne, F-69622, France; École Normale Supérieure de Lyon, Université de Lyon, Université Lyon 1, Lyon, France
- N Lartillot
- CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, Villeurbanne, F-69622, France
24
Gao F, Nan F, Feng J, Lv J, Liu Q, Liu X, Xie S. Comparative morphological, physiological, biochemical and genomic studies reveal novel genes of Dunaliella bioculata and D. quartolecta in response to salt stress. Mol Biol Rep 2021; 49:1749-1761. [PMID: 34813000 DOI: 10.1007/s11033-021-06984-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2021] [Accepted: 11/18/2021] [Indexed: 11/24/2022]
Abstract
BACKGROUND Salinity is an essential abiotic stress in plants. Dunaliella is a genus of high-salt-tolerant microalgae. The present study aimed to compare the characteristics of D. bioculata and D. quartolecta at different levels and to investigate novel genes responding to salt stress. METHODS AND RESULTS High chlorophyll contents were detected in D. bioculata on the 35th day of salt stress, while high lipid and carotenoid contents were detected in D. quartolecta via morphological and biochemical analyses. Physiological analysis showed that D. quartolecta cells had a smaller increase in osmotic potential, a smaller decrease in the Na+/K+ ratio and photochemical efficiency (Fv/Fm), and a lower relative conductivity than D. bioculata cells. The genomic lengths of D. quartolecta and D. bioculata were 396,013,629 bp (scaffold N50 = 1954 bp) and 427,667,563 bp (scaffold N50 = 3093 bp), respectively, as determined by high-throughput sequencing and de novo assembly. Altogether, 25,751 and 26,620 genes were predicted in their genomes by annotation analysis with various biodatabases. The D. bioculata genome showed more segmental duplication events via collinearity analysis. More single nucleotide polymorphisms and insertion-deletion variants were detected in the D. bioculata genome. Bioinformatics analysis indicated that both algae, which showed a close phylogenetic relationship, may have undergone positive selection. A total of 382 and 85 novel genes were screened in D. bioculata and D. quartolecta, with 138 and 51 enriched KEGG pathways, respectively. Unlike the novel genes adh1, hprA and serA, the relative expression of livF and phbB in D. bioculata was markedly downregulated as salinity increased, as determined by qPCR analysis. The relative expression of leuB, asd, pstC and proA in D. quartolecta was markedly upregulated with the same salinity increase. CONCLUSION Dunaliella quartolecta is more halophilic than D. bioculata, with more effective responses to high salt stress based on the multiphase comparative data.
Affiliation(s)
- Fan Gao
- School of Life Science, Shanxi Key Laboratory for Research and Development of Regional Plants, Shanxi University, No. 92 Wucheng Road, Taiyuan, 030006, China
- Fangru Nan
- School of Life Science, Shanxi Key Laboratory for Research and Development of Regional Plants, Shanxi University, No. 92 Wucheng Road, Taiyuan, 030006, China
- Jia Feng
- School of Life Science, Shanxi Key Laboratory for Research and Development of Regional Plants, Shanxi University, No. 92 Wucheng Road, Taiyuan, 030006, China
- Junping Lv
- School of Life Science, Shanxi Key Laboratory for Research and Development of Regional Plants, Shanxi University, No. 92 Wucheng Road, Taiyuan, 030006, China
- Qi Liu
- School of Life Science, Shanxi Key Laboratory for Research and Development of Regional Plants, Shanxi University, No. 92 Wucheng Road, Taiyuan, 030006, China
- Xudong Liu
- School of Life Science, Shanxi Key Laboratory for Research and Development of Regional Plants, Shanxi University, No. 92 Wucheng Road, Taiyuan, 030006, China
- Shulian Xie
- School of Life Science, Shanxi Key Laboratory for Research and Development of Regional Plants, Shanxi University, No. 92 Wucheng Road, Taiyuan, 030006, China.
25
Tamuri AU, Dos Reis M. A mutation-selection model of protein evolution under persistent positive selection. Mol Biol Evol 2021; 39:6409866. [PMID: 34694387 PMCID: PMC8760937 DOI: 10.1093/molbev/msab309] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022] Open
Abstract
We use first principles of population genetics to model the evolution of proteins under persistent positive selection (PPS). PPS may occur when organisms are subjected to persistent environmental change, during adaptive radiations, or in host–pathogen interactions. Our mutation–selection model indicates that protein evolution under PPS is an irreversible Markov process, and thus proteins under PPS show a strongly asymmetrical distribution of selection coefficients among amino acid substitutions. Our model shows that the criterion ω>1 (where ω is the ratio of nonsynonymous over synonymous codon substitution rates) for detecting positive selection is conservative and indeed arbitrary, because in real proteins many mutations are highly deleterious and are removed by selection even at positively selected sites. We use a penalized-likelihood implementation of the PPS model to successfully detect PPS in plant RuBisCO and influenza HA proteins. By directly estimating selection coefficients at protein sites, our inference procedure bypasses the need for using ω as a surrogate measure of selection and improves our ability to detect molecular adaptation in proteins.
Affiliation(s)
- Asif U Tamuri
- Centre for Advanced Research Computing, University College London, Gower St, London, WC1E 6BT, UK; EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Mario Dos Reis
- School of Biological and Behavioural Sciences, Queen Mary University of London, Mile End Road, London, E1 4NS, UK
26
Stark TL, Liberles DA. Characterizing Amino Acid Substitution with Complete Linkage of Sites on a Lineage. Genome Biol Evol 2021; 13:6377338. [PMID: 34581792 PMCID: PMC8557849 DOI: 10.1093/gbe/evab225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/17/2021] [Indexed: 11/16/2022] Open
Abstract
Amino acid substitution models are commonly used for phylogenetic inference, for ancestral sequence reconstruction, and for the inference of positive selection. All commonly used models explicitly assume that each site evolves independently, an assumption that is violated by both linkage and protein structural and functional constraints. We introduce two new models for amino acid substitution which incorporate linkage between sites, each based on the (population-genetic) Moran model. The first model is a generalized population process tracking arbitrarily many sites which undergo mutation, with individuals replaced according to their fitnesses. This model provides a reasonably complete framework for simulations but is numerically and analytically intractable. We also introduce a second model which includes several simplifying assumptions but for which some theoretical results can be derived. We analyze the simplified model to determine conditions where linkage is likely to have meaningful effects on sitewise substitution probabilities, as well as conditions under which the effects are likely to be negligible. These findings are an important step in the generation of tractable phylogenetic models that parameterize selective coefficients for amino acid substitution while accounting for linkage of sites leading to both hitchhiking and background selection.
Affiliation(s)
- Tristan L Stark
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, USA
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, USA
27
Youssef N, Susko E, Roger AJ, Bielawski JP. Shifts in amino acid preferences as proteins evolve: A synthesis of experimental and theoretical work. Protein Sci 2021; 30:2009-2028. [PMID: 34322924 PMCID: PMC8442975 DOI: 10.1002/pro.4161] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/13/2021] [Revised: 07/19/2021] [Accepted: 07/26/2021] [Indexed: 11/08/2022]
Abstract
Amino acid preferences vary across sites and time. While variation across sites is widely accepted, the extent and frequency of temporal shifts are contentious. Our understanding of the drivers of amino acid preference change is incomplete: To what extent are temporal shifts driven by adaptive versus nonadaptive evolutionary processes? We review phenomena that cause preferences to vary (e.g., evolutionary Stokes shift, contingency, and entrenchment) and clarify how they differ. To determine the extent and prevalence of shifted preferences, we review experimental and theoretical studies. Analyses of natural sequence alignments often detect decreases in homoplasy (convergence and reversions) rates, and variation in replacement rates with time, signals that are consistent with temporally changing preferences. While approaches inferring shifts in preferences from patterns in natural alignments are valuable, they are indirect since multiple mechanisms (both adaptive and nonadaptive) could lead to the observed signal. Alternatively, site-directed mutagenesis experiments allow for a more direct assessment of shifted preferences. They corroborate evidence from multiple sequence alignments, revealing that the preference for an amino acid at a site varies depending on the background sequence. However, shifts in preferences are usually minor in magnitude and sites with significantly shifted preferences are low in frequency. The small yet consistent perturbations in preferences could, nevertheless, jeopardize the accuracy of inference procedures, which assume constant preferences. We conclude by discussing if and how such shifts in preferences might influence widely used time-homogeneous inference procedures and potential ways to mitigate such effects.
Affiliation(s)
- Noor Youssef
- Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Edward Susko
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Andrew J. Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Joseph P. Bielawski
- Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
28
Latrille T, Lartillot N. Quantifying the impact of changes in effective population size and expression level on the rate of coding sequence evolution. Theor Popul Biol 2021; 142:57-66. [PMID: 34563555 DOI: 10.1016/j.tpb.2021.09.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Revised: 09/08/2021] [Accepted: 09/11/2021] [Indexed: 02/07/2023]
Abstract
Molecular sequences are shaped by selection, where the strength of selection relative to drift is determined by effective population size (Ne). Populations with high Ne are expected to undergo stronger purifying selection, and consequently to show a lower substitution rate for selected mutations relative to the substitution rate for neutral mutations (ω). However, computational models based on biophysics of protein stability have suggested that ω can also be independent of Ne. Together, the response of ω to changes in Ne depends on the specific mapping from sequence to fitness. Importantly, an increase in protein expression level has been found empirically to result in a decrease of ω, an observation predicted by theoretical models assuming selection for protein stability. Here, we derive a theoretical approximation for the response of ω to changes in Ne and expression level, under an explicit genotype-phenotype-fitness map. The method is generally valid for additive traits and log-concave fitness functions. We applied these results to proteins undergoing selection for their conformational stability and corroborated our findings with simulations under more complex models. We predict a weak response of ω to changes in either Ne or expression level, which are interchangeable. Based on empirical data, we propose that fitness based on conformational stability may not be a sufficient mechanism to explain the empirically observed variation in ω across species. Other aspects of protein biophysics might be explored, such as protein-protein interactions, which can lead to a stronger response of ω to changes in Ne.
Affiliation(s)
- T Latrille
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, F-69622 Villeurbanne, France; École Normale Supérieure de Lyon, Université de Lyon, Université Lyon 1, Lyon, France.
- N Lartillot
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, F-69622 Villeurbanne, France
29
Catania F, Ujvari B, Roche B, Capp JP, Thomas F. Bridging Tumorigenesis and Therapy Resistance With a Non-Darwinian and Non-Lamarckian Mechanism of Adaptive Evolution. Front Oncol 2021; 11:732081. [PMID: 34568068 PMCID: PMC8462274 DOI: 10.3389/fonc.2021.732081] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/28/2021] [Accepted: 08/25/2021] [Indexed: 12/13/2022] Open
Abstract
Although neo-Darwinian (and less often Lamarckian) dynamics are regularly invoked to interpret cancer's multifarious molecular profiles, they shine little light on how tumorigenesis unfolds and often fail to fully capture the frequency and breadth of resistance mechanisms. This uncertainty frames one of the most problematic gaps between science and practice in modern times. Here, we offer a theory of adaptive cancer evolution, which builds on a molecular mechanism that lies outside neo-Darwinian and Lamarckian schemes. This mechanism coherently integrates non-genetic and genetic changes, ecological and evolutionary time scales, and shifts the spotlight away from positive selection towards purifying selection, genetic drift, and the creative-disruptive power of environmental change. The surprisingly simple use-it or lose-it rationale of the proposed theory can help predict molecular dynamics during tumorigenesis. It also provides simple rules of thumb that should help improve therapeutic approaches in cancer.
Affiliation(s)
- Francesco Catania
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Beata Ujvari
- Centre for Integrative Ecology, School of Life and Environmental Sciences, Deakin University, Deakin, VIC, Australia
- Benjamin Roche
- CREEC/CANECEV, MIVEGEC (CREES), Centre de Recherches Ecologiques et Evolutives sur le Cancer, University of Montpellier, CNRS, IRD, Montpellier, France
- Jean-Pascal Capp
- Toulouse Biotechnology Institute, University of Toulouse, INSA, CNRS, INRAE, Toulouse, France
- Frédéric Thomas
- CREEC/CANECEV, MIVEGEC (CREES), Centre de Recherches Ecologiques et Evolutives sur le Cancer, University of Montpellier, CNRS, IRD, Montpellier, France
30
Latrille T, Lanore V, Lartillot N. Inferring long-term effective population size with Mutation-Selection Models. Mol Biol Evol 2021; 38:4573-4587. [PMID: 34191010 PMCID: PMC8476147 DOI: 10.1093/molbev/msab160] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022] Open
Abstract
Mutation–selection phylogenetic codon models are grounded on population genetics first principles and represent a principled approach for investigating the intricate interplay between mutation, selection, and drift. In their current form, mutation–selection codon models are entirely characterized by the collection of site-specific amino-acid fitness profiles. However, thus far, they have relied on the assumption of a constant genetic drift, translating into a unique effective population size (Ne) across the phylogeny, clearly an unrealistic assumption. This assumption can be alleviated by introducing variation in Ne between lineages. In addition to Ne, the mutation rate (μ) is susceptible to vary between lineages, and both should covary with life-history traits (LHTs). This suggests that the model should more globally account for the joint evolutionary process followed by all of these lineage-specific variables (Ne, μ, and LHTs). In this direction, we introduce an extended mutation–selection model jointly reconstructing in a Bayesian Monte Carlo framework the fitness landscape across sites and long-term trends in Ne, μ, and LHTs along the phylogeny, from an alignment of DNA coding sequences and a matrix of observed LHTs in extant species. The model was tested against simulated data and applied to empirical data in mammals, isopods, and primates. The reconstructed history of Ne in these groups appears to correlate with LHTs or ecological variables in a way that suggests that the reconstruction is reasonable, at least in its global trends. On the other hand, the range of variation in Ne inferred across species is surprisingly narrow. This last point suggests that some of the assumptions of the model, in particular concerning the assumed absence of epistatic interactions between sites, are potentially problematic.
Affiliation(s)
- T Latrille
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, F-69622, Villeurbanne, France; École Normale Supérieure de Lyon, Université de Lyon, Université Lyon 1, Lyon, France
- V Lanore
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, F-69622, Villeurbanne, France
- N Lartillot
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, F-69622, Villeurbanne, France
31
Rodrigue N, Latrille T, Lartillot N. A Bayesian Mutation-Selection Framework for Detecting Site-Specific Adaptive Evolution in Protein-Coding Genes. Mol Biol Evol 2021; 38:1199-1208. [PMID: 33045094 PMCID: PMC7947879 DOI: 10.1093/molbev/msaa265] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022] Open
Abstract
In recent years, codon substitution models based on the mutation–selection principle have been extended for the purpose of detecting signatures of adaptive evolution in protein-coding genes. However, the approaches used to date have either focused on detecting global signals of adaptive regimes (across the entire gene) or on contexts where experimentally derived, site-specific amino acid fitness profiles are available. Here, we present a Bayesian site-heterogeneous mutation–selection framework for site-specific detection of adaptive substitution regimes given a protein-coding DNA alignment. We offer implementations, briefly present simulation results, and apply the approach to a few real data sets. Our analyses suggest that the new approach shows greater sensitivity than traditional methods. However, more study is required to assess the impact of potential model violations on the method, and to gain a greater empirical sense of its behavior on a broader range of real data sets. We propose an outline of such a research program.
Affiliation(s)
- Nicolas Rodrigue
- Department of Biology, Institute of Biochemistry, and School of Mathematics and Statistics, Carleton University, Ottawa, Canada
- Thibault Latrille
- Université de Lyon, Université Lyon 1, CNRS; UMR 5558, Laboratoire de Biométrie et Biologie Évolutive, Villeurbanne, F-69622, France
- Nicolas Lartillot
- Université de Lyon, Université Lyon 1, CNRS; UMR 5558, Laboratoire de Biométrie et Biologie Évolutive, Villeurbanne, F-69622, France
32
Spielman SJ. Relative Model Fit Does Not Predict Topological Accuracy in Single-Gene Protein Phylogenetics. Mol Biol Evol 2021; 37:2110-2123. [PMID: 32191313 PMCID: PMC7306691 DOI: 10.1093/molbev/msaa075] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/12/2022] Open
Abstract
It is regarded as best practice in phylogenetic reconstruction to perform relative model selection to determine an appropriate evolutionary model for the data. This procedure ranks a set of candidate models according to their goodness of fit to the data, commonly using an information theoretic criterion. Users then specify the best-ranking model for inference. Although it is often assumed that better-fitting models translate to increased accuracy, recent studies have shown that the specific model employed may not substantially affect inferences. We examine whether there is a systematic relationship between relative model fit and topological inference accuracy in protein phylogenetics, using simulations and real sequences. Simulations employed site-heterogeneous mechanistic codon models that are distinct from protein-level phylogenetic inference models, allowing us to investigate how protein models perform when they are misspecified to the data, as will be the case for any real sequence analysis. We broadly find that phylogenies inferred across models with vastly different fits to the data produce highly consistent topologies. We additionally find that all models infer similar proportions of false-positive splits, raising the possibility that all available models of protein evolution are similarly misspecified. Moreover, we find that the parameter-rich GTR (general time reversible) model, whose amino acid exchangeabilities are free parameters, performs similarly to models with fixed exchangeabilities, although the inference precision associated with GTR models was not examined. We conclude that, although relative model selection may not hinder phylogenetic analysis on protein data, it may not offer specific predictable improvements and is not a reliable proxy for accuracy.
33
Ritchie AM, Stark TL, Liberles DA. Inferring the number and position of changes in selective regime in a non-equilibrium mutation-selection framework. BMC Ecol Evol 2021; 21:39. [PMID: 33691618 PMCID: PMC7944921 DOI: 10.1186/s12862-021-01770-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2020] [Accepted: 02/25/2021] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Recovering the historical patterns of selection acting on a protein coding sequence is a major goal of evolutionary biology. Mutation-selection models address this problem by explicitly modelling fixation rates as a function of site-specific amino acid fitness values. However, they are restricted in their utility for investigating directional evolution because they require prior knowledge of the locations of fitness changes in the lineages of a phylogeny. RESULTS We apply a modified mutation-selection methodology that relaxes assumptions of equilibrium and time-reversibility. Our implementation allows us to identify branches where adaptive or compensatory shifts in the fitness landscape have taken place, signalled by a change in amino acid fitness profiles. Through simulation and analysis of an empirical data set of β-lactamase genes, we test our ability to recover the position of adaptive events within the tree and successfully reconstruct initial codon frequencies and fitness profile parameters generated under the non-stationary model. CONCLUSION We demonstrate successful detection of selective shifts and identification of the affected branch on partitions of 300 codons or more. We successfully reconstruct fitness parameters and initial codon frequencies in simulated data and demonstrate that failing to account for non-equilibrium evolution can increase the error in fitness profile estimation. We also demonstrate reconstruction of plausible shifts in amino acid fitnesses in the bacterial β-lactamase family and discuss some caveats for interpretation.
Affiliation(s)
- Andrew M Ritchie
  - Department of Biology, Temple University, 1900 North 12th Street, Philadelphia, PA, USA
- Tristan L Stark
  - Department of Biology, Temple University, 1900 North 12th Street, Philadelphia, PA, USA
- David A Liberles
  - Department of Biology, Temple University, 1900 North 12th Street, Philadelphia, PA, USA

34
Dashti M, Alsaleh H, Eaaswarkhanth M, John SE, Nizam R, Melhem M, Hebbar P, Sharma P, Al-Mulla F, Thanaraj TA. Delineation of Mitochondrial DNA Variants From Exome Sequencing Data and Association of Haplogroups With Obesity in Kuwait. Front Genet 2021; 12:626260. [PMID: 33659027 PMCID: PMC7920096 DOI: 10.3389/fgene.2021.626260] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/05/2020] [Accepted: 01/13/2021] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND/OBJECTIVES Whole-exome sequencing is a valuable tool to determine genetic variations that are associated with rare and common health conditions. A limited number of studies demonstrated that mitochondrial DNA can be captured using whole-exome sequencing. Previous studies have suggested that mitochondrial DNA variants and haplogroup lineages are associated with obesity. Therefore, we investigated the role of mitochondrial variants and haplogroups contributing to the risk of obesity in Arabs in Kuwait using exome sequencing data. SUBJECTS/METHODS Indirect mitochondrial genomes were extracted from exome sequencing data from 288 unrelated native Arab individuals from Kuwait. The cohort was divided into obese [body mass index (BMI) ≥ 30 kg/m²] and non-obese (BMI < 30 kg/m²) groups. Mitochondrial variants were identified, and haplogroups were classified and compared with other sequencing technologies. Statistical analysis was performed to determine associations and identify mitochondrial variants and haplogroups affecting obesity. RESULTS Haplogroup R showed a protective effect on obesity [odds ratio (OR) = 0.311; P = 0.006], whereas haplogroup L individuals were at high risk of obesity (OR = 2.285; P = 0.046). Significant differences in mitochondrial variants between the obese and non-obese groups were mainly haplogroup-defining mutations and were involved in processes in energy generation. The majority of mitochondrial variants and haplogroups extracted from exome were in agreement with technical replica from Sanger and whole-genome sequencing. CONCLUSIONS This is the first study to utilize whole-exome data to extract entire mitochondrial haplogroups to study their association with obesity in an Arab population.
Affiliation(s)
- Mohammed Dashti
  - Genetics and Bioinformatics Department, Dasman Diabetes Institute, Kuwait City, Kuwait
- Hussain Alsaleh
  - Kuwait Identification DNA Laboratory, General Department of Criminal Evidence, Ministry of Interior, Kuwait City, Kuwait
- Sumi Elsa John
  - Genetics and Bioinformatics Department, Dasman Diabetes Institute, Kuwait City, Kuwait
- Rasheeba Nizam
  - Genetics and Bioinformatics Department, Dasman Diabetes Institute, Kuwait City, Kuwait
- Motasem Melhem
  - Genetics and Bioinformatics Department, Dasman Diabetes Institute, Kuwait City, Kuwait
- Prashantha Hebbar
  - Genetics and Bioinformatics Department, Dasman Diabetes Institute, Kuwait City, Kuwait
- Prem Sharma
  - Department Special Services Facilities, Dasman Diabetes Institute, Kuwait City, Kuwait
- Fahd Al-Mulla
  - Genetics and Bioinformatics Department, Dasman Diabetes Institute, Kuwait City, Kuwait

35
Del Amparo R, Branco C, Arenas J, Vicens A, Arenas M. Analysis of selection in protein-coding sequences accounting for common biases. Brief Bioinform 2021; 22:6105943. [PMID: 33479739 DOI: 10.1093/bib/bbaa431] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 11/16/2020] [Revised: 12/17/2020] [Accepted: 12/22/2020] [Indexed: 12/16/2022] Open
Abstract
The evolution of protein-coding genes is usually driven by selective processes, which favor some evolutionary trajectories over others, optimizing the subsequent protein stability and activity. The analysis of selection in this type of genetic data is broadly performed with the metric nonsynonymous/synonymous substitution rate ratio (dN/dS). However, most of the well-established methodologies to estimate this metric make crucial assumptions, such as lack of recombination or invariable codon frequencies along genes, which can bias the estimation. Here, we review the most relevant biases in the dN/dS estimation and provide a detailed guide to estimate this metric using state-of-the-art procedures that account for such biases, along with illustrative practical examples and recommendations. We also discuss the traditional interpretation of the estimated dN/dS emphasizing the importance of considering complementary biological information such as the role of the observed substitutions on the stability and function of proteins. This review is oriented to help evolutionary biologists that aim to accurately estimate selection in protein-coding sequences.
Affiliation(s)
- Roberto Del Amparo
  - CINBIO (Biomedical Research Center), University of Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain
- Catarina Branco
  - CINBIO (Biomedical Research Center), University of Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain
- Jesús Arenas
  - Unit of Microbiology and Immunology, University of Zaragoza, 50013 Zaragoza, Spain
- Alberto Vicens
  - CINBIO (Biomedical Research Center), University of Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain
- Miguel Arenas
  - CINBIO (Biomedical Research Center), University of Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain

36
Jones CT, Youssef N, Susko E, Bielawski JP. A Phenotype-Genotype Codon Model for Detecting Adaptive Evolution. Syst Biol 2021; 69:722-738. [PMID: 31730199 DOI: 10.1093/sysbio/syz075] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/07/2019] [Revised: 11/09/2019] [Accepted: 11/11/2019] [Indexed: 01/03/2023] Open
Abstract
A central objective in biology is to link adaptive evolution in a gene to structural and/or functional phenotypic novelties. Yet most analytic methods make inferences mainly from either phenotypic data or genetic data alone. A small number of models have been developed to infer correlations between the rate of molecular evolution and changes in a discrete or continuous life history trait. But such correlations are not necessarily evidence of adaptation. Here, we present a novel approach called the phenotype-genotype branch-site model (PG-BSM) designed to detect evidence of adaptive codon evolution associated with discrete-state phenotype evolution. An episode of adaptation is inferred under standard codon substitution models when there is evidence of positive selection in the form of an elevation in the nonsynonymous-to-synonymous rate ratio $\omega$ to a value $\omega > 1$. As it is becoming increasingly clear that $\omega > 1$ can occur without adaptation, the PG-BSM was formulated to infer an instance of adaptive evolution without appealing to evidence of positive selection. The null model makes use of a covarion-like component to account for general heterotachy (i.e., random changes in the evolutionary rate at a site over time). The alternative model employs samples of the phenotypic evolutionary history to test for phenomenological patterns of heterotachy consistent with specific mechanisms of molecular adaptation. These include 1) a persistent increase/decrease in $\omega$ at a site following a change in phenotype (the pattern) consistent with an increase/decrease in the functional importance of the site (the mechanism); and 2) a transient increase in $\omega$ at a site along a branch over which the phenotype changed (the pattern) consistent with a change in the site's optimal amino acid (the mechanism). 
Rejection of the null is followed by post hoc analyses to identify sites with strongest evidence for adaptation in association with changes in the phenotype as well as the most likely evolutionary history of the phenotype. Simulation studies based on a novel method for generating mechanistically realistic signatures of molecular adaptation show that the PG-BSM has good statistical properties. Analyses of real alignments show that site patterns identified post hoc are consistent with the specific mechanisms of adaptation included in the alternate model. Further simulation studies show that the covarion-like component of the PG-BSM plays a crucial role in mitigating recently discovered statistical pathologies associated with confounding by accounting for heterotachy-by-any-cause. [Adaptive evolution; branch-site model; confounding; mutation-selection; phenotype-genotype.].
Affiliation(s)
- Christopher T Jones
  - Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
- Noor Youssef
  - Department of Biology, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
- Edward Susko
  - Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
  - Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
- Joseph P Bielawski
  - Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
  - Department of Biology, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
  - Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada

37
Johnson MM, Wilke CO. Site-Specific Amino Acid Distributions Follow a Universal Shape. J Mol Evol 2020; 88:731-741. [PMID: 33230664 PMCID: PMC7717668 DOI: 10.1007/s00239-020-09976-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/05/2020] [Accepted: 11/17/2020] [Indexed: 11/25/2022]
Abstract
In many applications of evolutionary inference, a model of protein evolution needs to be fitted to the amino acid variation at individual sites in a multiple sequence alignment. Most existing models fall into one of two extremes: Either they provide a coarse-grained description that lacks biophysical realism (e.g., dN/dS models), or they require a large number of parameters to be fitted (e.g., mutation-selection models). Here, we ask whether a middle ground is possible: Can we obtain a realistic description of site-specific amino acid frequencies while severely restricting the number of free parameters in the model? We show that a distribution with a single free parameter can accurately capture the variation in amino acid frequency at most sites in an alignment, as long as we are willing to restrict our analysis to predicting amino acid frequencies by rank rather than by amino acid identity. This result holds equally well both in alignments of empirical protein sequences and of sequences evolved under a biophysically realistic all-atom force field. Our analysis reveals a near universal shape of the frequency distributions of amino acids. This insight has the potential to lead to new models of evolution that have both increased realism and a limited number of free parameters.
Affiliation(s)
- Mackenzie M Johnson
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX, 78712, USA
  - Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, TX, 78712, USA
- Claus O Wilke
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX, 78712, USA

38
Nizovoy P, Bellora N, Haridas S, Sun H, Daum C, Barry K, Grigoriev IV, Libkind D, Connell LB, Moliné M. Unique genomic traits for cold adaptation in Naganishia vishniacii, a polyextremophile yeast isolated from Antarctica. FEMS Yeast Res 2020; 21:6000217. [PMID: 33232451 DOI: 10.1093/femsyr/foaa056] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/18/2020] [Accepted: 10/15/2020] [Indexed: 12/15/2022] Open
Abstract
Cold environments impose challenges to organisms. Polyextremophile microorganisms can survive in these conditions thanks to an array of counteracting mechanisms. Naganishia vishniacii, a yeast species hitherto only isolated from McMurdo Dry Valleys, Antarctica, is an example of a polyextremophile. Here we present the first draft genomic sequence of N. vishniacii. Using comparative genomics, we unraveled unique characteristics of cold associated adaptations. 336 putative genes (total: 6183) encoding solute transfers and chaperones, among others, were absent in sister species. Among genes shared by N. vishniacii and its closest related species we found orthologs encompassing possible evidence of positive selection (dN/dS > 1). Genes associated with photoprotection were found in agreement with high solar irradiation exposure. Also genes coding for desaturases and genomic features associated with cold tolerance (i.e. trehalose synthesis and lipid metabolism) were explored. Finally, biases in amino acid usage (namely an enrichment of glutamine and a trend in proline reduction) were observed, possibly conferring increased protein flexibility. To the best of our knowledge, such a combination of mechanisms for cold tolerance has not been previously reported in fungi, making N. vishniacii a unique model for the study of the genetic basis and evolution of cold adaptation strategies.
Affiliation(s)
- Paula Nizovoy
  - Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC) - CONICET / Universidad Nacional del Comahue, San Carlos de Bariloche, Río Negro 8400, Argentina
- Nicolás Bellora
  - Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC) - CONICET / Universidad Nacional del Comahue, San Carlos de Bariloche, Río Negro 8400, Argentina
- Sajeet Haridas
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94598, USA
- Hui Sun
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94598, USA
- Chris Daum
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94598, USA
- Kerrie Barry
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94598, USA
- Igor V Grigoriev
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94598, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
- Diego Libkind
  - Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC) - CONICET / Universidad Nacional del Comahue, San Carlos de Bariloche, Río Negro 8400, Argentina
- Laurie B Connell
  - School of Marine Sciences, University of Maine, Orono, ME 04469, USA
- Martín Moliné
  - Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC) - CONICET / Universidad Nacional del Comahue, San Carlos de Bariloche, Río Negro 8400, Argentina

39
Youssef N, Susko E, Bielawski JP. Consequences of Stability-Induced Epistasis for Substitution Rates. Mol Biol Evol 2020; 37:3131-3148. [DOI: 10.1093/molbev/msaa151] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/18/2023] Open
Abstract
Do interactions between residues in a protein (i.e., epistasis) significantly alter evolutionary dynamics? If so, what consequences might they have on inference from traditional codon substitution models which assume site-independence for the sake of computational tractability? To investigate the effects of epistasis on substitution rates, we employed a mechanistic mutation-selection model in conjunction with a fitness framework derived from protein stability. We refer to this as the stability-informed site-dependent (S-SD) model and developed a new stability-informed site-independent (S-SI) model that captures the average effect of stability constraints on individual sites of a protein. Comparison of S-SI and S-SD offers a novel and direct method for investigating the consequences of stability-induced epistasis on protein evolution. We developed S-SI and S-SD models for three natural proteins and showed that they generate sequences consistent with real alignments. Our analyses revealed that epistasis tends to increase substitution rates compared with the rates under site-independent evolution. We then assessed the epistatic sensitivity of individual sites and discovered a counterintuitive effect: Highly connected sites were less influenced by epistasis relative to exposed sites. Lastly, we show that, despite the unrealistic assumptions, traditional models perform comparably well in the presence and absence of epistasis and provide reasonable summaries of average selection intensities. We conclude that epistatic models are critical to understanding protein evolutionary dynamics, but epistasis might not be required for reasonable inference of selection pressure when averaging over time and sites.
Affiliation(s)
- Noor Youssef
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Centre for Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
- Edward Susko
  - Centre for Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Joseph P Bielawski
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Centre for Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada

40
Northover DE, Shank SD, Liberles DA. Characterizing lineage-specific evolution and the processes driving genomic diversification in chordates. BMC Evol Biol 2020; 20:24. [PMID: 32046633 PMCID: PMC7011509 DOI: 10.1186/s12862-020-1585-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2019] [Accepted: 01/16/2020] [Indexed: 11/21/2022] Open
Abstract
Background Understanding the origins of genome content has long been a goal of molecular evolution and comparative genomics. By examining genome evolution through the guise of lineage-specific evolution, it is possible to make inferences about the evolutionary events that have given rise to species-specific diversification. Here we characterize the evolutionary trends found in chordate species using The Adaptive Evolution Database (TAED). TAED is a database of phylogenetically indexed gene families designed to detect episodes of directional or diversifying selection across chordates. Gene families within the database have been assessed for lineage-specific estimates of dN/dS and have been reconciled to the chordate species to identify retained duplicates. Gene families have also been mapped to the functional pathways and amino acid changes which occurred on high dN/dS lineages have been mapped to protein structures. Results An analysis of this exhaustive database has enabled a characterization of the processes of lineage-specific diversification in chordates. A pathway level enrichment analysis of TAED determined that pathways most commonly found to have elevated rates of evolution included those involved in metabolism, immunity, and cell signaling. An analysis of protein fold presence on proteins, after normalizing for frequency in the database, found common folds such as Rossmann folds, Jelly Roll folds, and TIM barrels were overrepresented on proteins most likely to undergo directional selection. A set of gene families which experience increased numbers of duplications within short evolutionary times are associated with pathways involved in metabolism, olfactory reception, and signaling. An analysis of protein secondary structure indicated more relaxed constraint in β-sheets and stronger constraint on α-helices, amidst a general preference for substitutions at exposed sites. 
Lastly a detailed analysis of the ornithine decarboxylase gene family, a key enzyme in the pathway for polyamine synthesis, revealed lineage-specific evolution along the lineage leading to Cetacea through rapid sequence evolution in a duplicate gene with amino acid substitutions causing active site rearrangement. Conclusion Episodes of lineage-specific evolution are frequent throughout chordate species. Both duplication and directional selection have played large roles in the evolution of the phylum. TAED is a powerful tool for facilitating this understanding of lineage-specific evolution.
Affiliation(s)
- David E Northover
  - Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- Stephen D Shank
  - Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- David A Liberles
  - Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
  - Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071, USA

41
Lee IPA, Andam CP. Pan-genome diversification and recombination in Cronobacter sakazakii, an opportunistic pathogen in neonates, and insights to its xerotolerant lifestyle. BMC Microbiol 2019; 19:306. [PMID: 31881843 PMCID: PMC6935241 DOI: 10.1186/s12866-019-1664-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/08/2019] [Accepted: 11/26/2019] [Indexed: 01/14/2023] Open
Abstract
Background Cronobacter sakazakii is an emerging opportunistic bacterial pathogen known to cause neonatal and pediatric infections, including meningitis, necrotizing enterocolitis, and bacteremia. Multiple disease outbreaks of C. sakazakii have been documented in the past few decades, yet little is known of its genomic diversity, adaptation, and evolution. Here, we analyzed the pan-genome characteristics and phylogenetic relationships of 237 genomes of C. sakazakii and 48 genomes of related Cronobacter species isolated from diverse sources. Results The C. sakazakii pan-genome contains 17,158 orthologous gene clusters, and approximately 19.5% of these constitute the core genome. Phylogenetic analyses reveal the presence of at least ten deep branching monophyletic lineages indicative of ancestral diversification. We detected enrichment of functions involved in proton transport and rotational mechanism in accessory genes exclusively found in human-derived strains. In environment-exclusive accessory genes, we detected enrichment for those involved in tryptophan biosynthesis and indole metabolism. However, we did not find significantly enriched gene functions for those genes exclusively found in food strains. The most frequently detected virulence genes are those that encode proteins associated with chemotaxis, enterobactin synthesis, ferrienterobactin transporter, type VI secretion system, galactose metabolism, and mannose metabolism. The genes fos which encodes resistance against fosfomycin, a broad-spectrum cell wall synthesis inhibitor, and mdf(A) which encodes a multidrug efflux transporter were found in nearly all genomes. We found that a total of 2991 genes in the pan-genome have had a history of recombination. Many of the most frequently recombined genes are associated with nutrient acquisition, metabolism and toxin production. 
Conclusions Overall, our results indicate that the presence of a large accessory gene pool, ability to switch between ecological niches, a diverse suite of antibiotic resistance, virulence and niche-specific genes, and frequent recombination partly explain the remarkable adaptability of C. sakazakii within and outside the human host. These findings provide critical insights that can help define the development of effective disease surveillance and control strategies for Cronobacter-related diseases.
Affiliation(s)
- Isaiah Paolo A Lee
  - Department of Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH, 03824, USA
- Cheryl P Andam
  - Department of Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH, 03824, USA

42
Jones CT, Youssef N, Susko E, Bielawski JP. Phenomenological Load on Model Parameters Can Lead to False Biological Conclusions. Mol Biol Evol 2019; 35:1473-1488. [PMID: 29596684 DOI: 10.1093/molbev/msy049] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/15/2022] Open
Abstract
When a substitution model is fitted to an alignment using maximum likelihood, its parameters are adjusted to account for as much site-pattern variation as possible. A parameter might therefore absorb a substantial quantity of the total variance in an alignment (or more formally, bring about a substantial reduction in the deviance of the fitted model) even if the process it represents played no role in the generation of the data. When this occurs, we say that the parameter estimate carries phenomenological load (PL). Large PL in a parameter estimate is a concern because it not only invalidates its mechanistic interpretation (if it has one) but also increases the likelihood that it will be found to be statistically significant. The problem of PL was not identified in the past because most off-the-shelf substitution models make simplifying assumptions that preclude the generation of realistic levels of variation. In this study, we use the more realistic mutation-selection framework as the basis of a generating model formulated to produce data that mimic an alignment of mammalian mitochondrial DNA. We show that a parameter estimate can carry PL when 1) the substitution model is underspecified and 2) the parameter represents a process that is confounded with other processes represented in the data-generating model. We then provide a method that can be used to identify signal for the process that a given parameter represents despite the existence of PL.
Affiliation(s)
- Christopher T Jones
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
- Noor Youssef
  - Department of Biology, Dalhousie University, Halifax, NS, Canada
- Edward Susko
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada

43
Dunn KA, Kenney T, Gu H, Bielawski JP. Improved inference of site-specific positive selection under a generalized parametric codon model when there are multinucleotide mutations and multiple nonsynonymous rates. BMC Evol Biol 2019; 19:22. [PMID: 30642241 PMCID: PMC6332903 DOI: 10.1186/s12862-018-1326-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/08/2018] [Accepted: 12/11/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND An excess of nonsynonymous substitutions, over neutrality, is considered evidence of positive Darwinian selection. Inference for proteins often relies on estimation of the nonsynonymous to synonymous ratio (ω = dN/dS) within a codon model. However, to ease computational difficulties, ω is typically estimated assuming an idealized substitution process where (i) all nonsynonymous substitutions have the same rate (regardless of impact on organism fitness) and (ii) instantaneous double and triple (DT) nucleotide mutations have zero probability (despite evidence that they can occur). It follows that estimates of ω represent an imperfect summary of the intensity of selection, and that tests based on the ω > 1 threshold could be negatively impacted. RESULTS We developed a general-purpose parametric (GPP) modelling framework for codons. This novel approach allows specification of all possible instantaneous codon substitutions, including multiple nonsynonymous rates (MNRs) and instantaneous DT nucleotide changes. Existing codon models are specified as special cases of the GPP model. We use GPP models to implement likelihood ratio tests for ω > 1 that accommodate MNRs and DT mutations. Through both simulation and real data analysis, we find that failure to model MNRs and DT mutations reduces power in some cases and inflates false positives in others. False positives under traditional M2a and M8 models were very sensitive to DT changes. This was exacerbated by the choice of frequency parameterization (GY vs. MG), with rates sometimes > 90% under MG. By including MNRs and DT mutations, accuracy and power was greatly improved under the GPP framework. However, we also find that over-parameterized models can perform less well, and this can contribute to degraded performance of LRTs. CONCLUSIONS We suggest GPP models should be used alongside traditional codon models. 
Further, all codon models should be deployed within an experimental design that includes (i) assessing robustness to model assumptions, and (ii) investigation of non-standard behaviour of MLEs. As the goal of every analysis is to avoid false conclusions, more work is needed on model selection methods that consider both the increase in fit engendered by a model parameter and the degree to which that parameter is affected by un-modelled evolutionary processes.
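The likelihood ratio tests mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a null and an alternative codon model that differ by two free parameters (as in the commonly used M1a versus M2a comparison) and that the test statistic is compared against a chi-square distribution with 2 degrees of freedom. The function names and the log-likelihood values are hypothetical and are not taken from the paper. As the abstract cautions, this asymptotic approximation may itself be unreliable when MLEs behave in non-standard ways.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Return the likelihood ratio statistic 2 * (lnL_alt - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df2(x: float) -> float:
    """Survival function of a chi-square distribution with 2 degrees of freedom.

    For df = 2 the survival function has the closed form exp(-x / 2).
    """
    return math.exp(-x / 2.0) if x > 0.0 else 1.0


def lrt_pvalue_df2(lnl_null: float, lnl_alt: float) -> float:
    """Approximate p-value of the test, assuming df = 2 (an assumption here)."""
    # A nested alternative should never fit worse; clamp numerical noise at zero.
    stat = max(lrt_statistic(lnl_null, lnl_alt), 0.0)
    return chi2_sf_df2(stat)


if __name__ == "__main__":
    # Hypothetical maximised log-likelihoods for a null and an alternative model.
    lnl_m1a, lnl_m2a = -1000.0, -995.0
    print("statistic:", lrt_statistic(lnl_m1a, lnl_m2a))
    print("p-value:", lrt_pvalue_df2(lnl_m1a, lnl_m2a))
```

The closed form for 2 degrees of freedom keeps the sketch free of external dependencies; a comparison with a different number of extra parameters would need a general chi-square survival function instead.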
Collapse
Affiliation(s)
- Katherine A. Dunn
- Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
| | - Toby Kenney
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
| | - Hong Gu
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
| | - Joseph P. Bielawski
- Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Centre for Comparative Genomics and Evolutionary Bioinformatics (CGEB) at Dalhousie University, Halifax, Canada
44
Looking for Darwin in Genomic Sequences: Validity and Success Depends on the Relationship Between Model and Data. Methods Mol Biol 2019; 1910:399-426. [PMID: 31278672 DOI: 10.1007/978-1-4939-9074-0_13] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
Abstract
Codon substitution models (CSMs) are commonly used to infer the history of natural selection for a set of protein-coding sequences, often with the explicit goal of detecting the signature of positive Darwinian selection. However, the validity and success of CSMs used in conjunction with the maximum likelihood (ML) framework is sometimes challenged with claims that the approach might too often support false conclusions. In this chapter, we use a case study approach to identify four legitimate statistical difficulties associated with inference of evolutionary events using CSMs. These include: (1) model misspecification, (2) low information content, (3) the confounding of processes, and (4) phenomenological load, or PL. While past criticisms of CSMs can be connected to these issues, the historical critiques were often misdirected, or overstated, because they failed to recognize that the success of any model-based approach depends on the relationship between model and data. Here, we explore this relationship and provide a candid assessment of the limitations of CSMs to extract historical information from extant sequences. To aid in this assessment, we provide a brief overview of: (1) a more realistic way of thinking about the process of codon evolution framed in terms of population genetic parameters, and (2) a novel presentation of the ML statistical framework. We then divide the development of CSMs into two broad phases of scientific activity and show that the latter phase is characterized by increases in model complexity that can sometimes negatively impact inference of evolutionary mechanisms. Such problems are not yet widely appreciated by the users of CSMs. These problems can be avoided by using a model that is appropriate for the data; but, understanding the relationship between the data and a fitted model is a difficult task. 
We argue that the only way to properly understand that relationship is to perform in silico experiments using a generating process that can mimic the data as closely as possible. The mutation-selection modeling framework (MutSel) is presented as the basis of such a generating process. We contend that if complex CSMs continue to be developed for testing explicit mechanistic hypotheses, then additional analyses such as those described here (e.g., penalized LRTs and estimation of PL) will need to be applied alongside the more traditional inferential methods.
45
Yohe LR, Liu L, Dávalos LM, Liberles DA. Protocols for the Molecular Evolutionary Analysis of Membrane Protein Gene Duplicates. Methods Mol Biol 2019; 1851:49-62. [PMID: 30298391 DOI: 10.1007/978-1-4939-8736-8_3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023]
Abstract
Gene duplication is an important process in the evolution of gene content in eukaryotic genomes. Understanding when gene duplicates contribute new molecular functions to genomes through molecular adaptation is one important goal in comparative genomics. In large gene families, however, characterizing adaptation and neofunctionalization across species is challenging, as models have traditionally quantified the timing of duplications without considering underlying gene trees. This protocol combines multiple approaches to detect adaptation in protein duplicates at a phylogenetic scale. We include a description of models for gene tree-species tree reconciliation that enable different types of inference, as well as a practical guide to their use. Although simulation-based approaches successfully detect shifts in the rate of duplication/retention, the conflation between the duplication and retention processes, the distinct trajectories of duplicates under non-, sub-, and neofunctionalization, as well as dosage effects offer hitherto unexplored analytical avenues. We introduce mathematical descriptions of these probabilities and offer a road map to computational implementation whose starting point is parsimony reconciliation. Sequence evolution information based on the ratio of nonsynonymous to synonymous nucleotide substitution rates (dN/dS) can be combined with duplicate survival probabilities to better predict the emergence of new molecular functions in retained duplicates. Together, these methods enable characterization of potentially adaptive candidate duplicates whose neofunctionalization may contribute to phenotypic divergence across species.
Affiliation(s)
- Laurel R Yohe
- Department of Geology & Geophysics, Yale University, New Haven, CT, USA.
- Liang Liu
- Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, GA, USA
- Liliana M Dávalos
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, USA
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, USA.
46
Hilton SK, Bloom JD. Modeling site-specific amino-acid preferences deepens phylogenetic estimates of viral sequence divergence. Virus Evol 2018; 4:vey033. [PMID: 30425841 PMCID: PMC6220371 DOI: 10.1093/ve/vey033] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open
Abstract
Molecular phylogenetics is often used to estimate the time since the divergence of modern gene sequences. For highly diverged sequences, such phylogenetic techniques sometimes estimate surprisingly recent divergence times. In the case of viruses, independent evidence indicates that the estimates of deep divergence times from molecular phylogenetics are sometimes too recent. This discrepancy is caused in part by inadequate models of purifying selection leading to branch-length underestimation. Here we examine the effect on branch-length estimation of using models that incorporate experimental measurements of purifying selection. We find that models informed by experimentally measured site-specific amino-acid preferences estimate longer deep branches on phylogenies of influenza virus hemagglutinin. This lengthening of branches is due to more realistic stationary states of the models, and is mostly independent of the branch-length extension from modeling site-to-site variation in amino-acid substitution rate. The branch-length extension from experimentally informed site-specific models is similar to that achieved by other approaches that allow the stationary state to vary across sites. However, the improvements from all of these site-specific but time homogeneous and site independent models are limited by the fact that a protein’s amino-acid preferences gradually shift as it evolves. Overall, our work underscores the importance of modeling site-specific amino-acid preferences when estimating deep divergence times—but also shows the inherent limitations of approaches that fail to account for how these preferences shift over time.
Affiliation(s)
- Sarah K Hilton
- Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center; Department of Genome Sciences, University of Washington, USA
- Jesse D Bloom
- Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center; Department of Genome Sciences, University of Washington, USA; Howard Hughes Medical Institute, Seattle, WA, USA
47
Spielman SJ, Kosakovsky Pond SL. Relative Evolutionary Rates in Proteins Are Largely Insensitive to the Substitution Model. Mol Biol Evol 2018; 35:2307-2317. [PMID: 29924340 PMCID: PMC6107055 DOI: 10.1093/molbev/msy127] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
The relative evolutionary rates at individual sites in proteins are informative measures of conservation or adaptation. Often used as evolutionarily aware conservation scores, relative rates reveal key functional or strongly selected residues. Estimating rates in a phylogenetic context requires specifying a protein substitution model, which is typically a phenomenological model trained on a large empirical data set. A strong emphasis has traditionally been placed on selecting the "best-fit" model, with the implicit understanding that suboptimal or otherwise ill-fitting models might bias inferences. However, the pervasiveness and degree of such bias has not been systematically examined. We investigated how model choice impacts site-wise relative rates in a large set of empirical protein alignments. We compared models designed for use on any general protein, models designed for specific domains of life, and the simple equal-rates Jukes Cantor-style model (JC). As expected, information theoretic measures showed overwhelming evidence that some models fit the data decidedly better than others. By contrast, estimates of site-specific evolutionary rates were impressively insensitive to the substitution model used, revealing an unexpected degree of robustness to potential model misspecification. A deeper examination of the fewer than 5% of sites for which model inferences differed in a meaningful way showed that the JC model could uniquely identify rapidly evolving sites that models with empirically derived exchangeabilities failed to detect. We conclude that relative protein rates appear robust to the applied substitution model, and any sensible model of protein evolution, regardless of its fit to the data, should produce broadly consistent evolutionary rates.
Affiliation(s)
- Stephanie J Spielman
- Department of Biology, Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Sergei L Kosakovsky Pond
- Department of Biology, Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
48
Using the Mutation-Selection Framework to Characterize Selection on Protein Sequences. Genes (Basel) 2018; 9:genes9080409. [PMID: 30104502 PMCID: PMC6115872 DOI: 10.3390/genes9080409] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2018] [Revised: 08/02/2018] [Accepted: 08/09/2018] [Indexed: 12/13/2022] Open
Abstract
When mutational pressure is weak, the generative process of protein evolution involves explicit probabilities of mutations of different types coupled to their conditional probabilities of fixation dependent on selection. Establishing this mechanistic modeling framework for the detection of selection has been a goal in the field of molecular evolution. Building on a mathematical framework proposed more than a decade ago, numerous methods have been introduced in an attempt to detect and measure selection on protein sequences. In this review, we discuss the structure of the original model, subsequent advances, and the series of assumptions that these models operate under.
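The coupling of mutation and fixation described in this abstract can be sketched numerically. The Python snippet below uses a form common in mutation-selection models, in which the substitution rate from state i to state j is the mutation rate multiplied by a relative fixation factor h(S) = S / (1 - exp(-S)), where S is the scaled selection coefficient of the mutation. Whether this matches the review's exact notation and scaling convention is an assumption, and the function names and numeric values are hypothetical.

```python
import math


def relative_fixation_rate(s_scaled: float) -> float:
    """h(S) = S / (1 - exp(-S)): fixation rate relative to a neutral mutation.

    h(0) = 1 (neutral), h(S) > 1 for beneficial and h(S) < 1 for deleterious
    mutations. Two algebraically equal forms are used to avoid overflow.
    """
    if abs(s_scaled) < 1e-8:
        # First-order Taylor expansion around S = 0.
        return 1.0 + s_scaled / 2.0
    if s_scaled > 0:
        return s_scaled / -math.expm1(-s_scaled)
    # For S < 0 use S * exp(S) / (exp(S) - 1), which cannot overflow.
    return s_scaled * math.exp(s_scaled) / math.expm1(s_scaled)


def substitution_rate(mutation_rate: float, s_scaled: float) -> float:
    """Substitution rate = mutation rate x relative fixation rate."""
    return mutation_rate * relative_fixation_rate(s_scaled)


# Hypothetical values: the same mutation rate under three selection regimes.
for s in (-2.0, 0.0, 2.0):
    print(s, substitution_rate(mutation_rate=1.0, s_scaled=s))
```

A useful sanity check on this form is the identity h(S) - h(-S) = S, which links the forward and reverse substitution rates between any two states.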
49
Vadivel K, Mageshbabu R, Sankar S, Jain A, Perumal V, Srikanth P, Ranjan GA, Nair A, Simoes EAF, Nandagopal B, Sridharan G. Detection of parvovirus B19 in selected high-risk patient groups & their phylogenetic & selection analysis. Indian J Med Res 2018; 147:391-399. [PMID: 29998875 PMCID: PMC6057248 DOI: 10.4103/ijmr.ijmr_241_16] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open
Abstract
Background & objectives: Human parvovirus B19 (B19V) is known to be associated with erythema infectiosum, commonly in children; aplastic crisis, especially in persons with underlying haemolytic disorders; hydrops fetalis in pregnancies; and arthritis. This cross-sectional study aimed to determine the presence of B19V infection in childhood febrile illnesses, in patients with arthropathies and in adult patients with end-stage renal disease (ESRD) on dialysis. The genetic diversity among the sequences was also analysed. Methods: A nested polymerase chain reaction (nPCR) assay targeting the VP1/VP2 region of B19V DNA was used to test 618 patients and 100 healthy controls. Phylogenetic analysis on nucleotide and amino acid sequences was carried out to compare our sequences with other Indian strains and global strains. Results: Among 618 samples tested, seven (1.13%) were found positive. The phylogenetic analysis revealed that all seven sequences belonged to genotype 1 and showed low genetic diversity. The clustering pattern of the seven sequences was similar both by nucleotide and by predicted amino acid sequences. The fixed effects likelihood analysis showed no positively or negatively selected sites. Interpretation & conclusions: Seven samples (4 from non-traumatic arthropathies, 2 from patients with ESRD and 1 from a febrile illness patient) were found positive by nPCR. When our seven sequences were compared with global strains, the closest neighbours were other Indian strains, followed by the Tunisian strains.
Affiliation(s)
- Kumaran Vadivel
- Sri Sakthi Amma Institute of Biomedical Research, Sri Narayani Hospital & Research Centre, Vellore, India
- Ramamurthy Mageshbabu
- Sri Sakthi Amma Institute of Biomedical Research, Sri Narayani Hospital & Research Centre, Vellore, India
- Sathish Sankar
- Sri Sakthi Amma Institute of Biomedical Research, Sri Narayani Hospital & Research Centre, Vellore, India
- Amita Jain
- Department of Microbiology, King George Medical University, Lucknow, India
- Vivekanandan Perumal
- Kusuma School of Biological Sciences, Indian Institute of Technology, New Delhi, India
- Padma Srikanth
- Department of Microbiology, Sri Ramachandra Medical College & Research Institute, Sri Ramachandra University, Chennai, India
- Aravindan Nair
- Sri Sakthi Amma Institute of Biomedical Research, Sri Narayani Hospital & Research Centre, Vellore, India
- Eric A F Simoes
- School of Medicine & Professor of Pediatrics, University of Colorado, Aurora, Colorado, USA
- Balaji Nandagopal
- Sri Sakthi Amma Institute of Biomedical Research, Sri Narayani Hospital & Research Centre, Vellore, India
- Gopalan Sridharan
- Sri Sakthi Amma Institute of Biomedical Research, Sri Narayani Hospital & Research Centre, Vellore, India
50
van der Kuyl AC, Vink M, Zorgdrager F, Bakker M, Wymant C, Hall M, Gall A, Blanquart F, Berkhout B, Fraser C, Cornelissen M. The evolution of subtype B HIV-1 tat in the Netherlands during 1985-2012. Virus Res 2018; 250:51-64. [PMID: 29654800 DOI: 10.1016/j.virusres.2018.04.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2018] [Revised: 04/05/2018] [Accepted: 04/05/2018] [Indexed: 12/18/2022]
Abstract
For the production of viral genomic RNA, HIV-1 is dependent on an early viral protein, Tat, which is required for high-level transcription. The quantity of viral RNA detectable in blood of HIV-1 infected individuals varies dramatically, and a factor involved could be the efficiency of Tat protein variants to stimulate RNA transcription. HIV-1 virulence, measured by set-point viral load, has been observed to increase over time in the Netherlands and elsewhere. Investigation of tat gene evolution in clinical isolates could reveal a role of Tat in this changing virulence. A dataset of 291 Dutch HIV-1 subtype B tat genes, derived from full-length HIV-1 genome sequences from samples obtained between 1985 and 2012, was used to analyse the evolution of Tat. Twenty-two patient-derived tat genes, and the control TatHXB2 were analysed for their capacity to stimulate expression of an LTR-luciferase reporter gene construct in diverse cell lines, as well as for their ability to complement a tat-defective HIV-1LAI clone. Analysis of 291 historical tat sequences from the Netherlands showed ample amino acid (aa) variation between isolates, although no specific mutations were selected for over time. Of note, however, the encoded protein varied its length over the years through the loss or gain of stop codons in the second exon. In transmission clusters, a selection against the shorter Tat86 ORF was apparent in favour of the more common Tat101 version, likely due to negative selection against Tat86 itself, although random drift, transmission bottlenecks, or linkage to other variants could also explain the observation. There was no correlation between Tat length and set-point viral load; however, the number of non-intermediate variants in our study was small. In addition, variation in the length of Tat did not significantly change its capacity to stimulate transcription. 
From 1985 to 2012, variation in the length of the HIV-1 subtype B tat gene is increasingly found in the Dutch epidemic. However, as Tat proteins did not differ significantly in their capacity to stimulate transcription elongation in vitro, the increased HIV-1 virulence seen in recent years could not be linked to an evolving viral Tat protein.
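The length variation described in this abstract arises from the gain or loss of stop codons, and the effect on the encoded protein can be illustrated with a short Python sketch. It counts the sense codons in a coding sequence up to the first in-frame stop codon. The sketch assumes the input is an already spliced, in-frame coding sequence; the function name and the toy sequences are hypothetical and are not real tat sequences or the authors' method.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_length_in_codons(cds: str) -> int:
    """Count sense codons from the start of `cds` up to the first in-frame stop.

    The stop codon itself is not counted. Trailing bases that do not form a
    complete codon are ignored. RNA input (U) is converted to DNA (T).
    """
    cds = cds.upper().replace("U", "T")
    count = 0
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


# Toy sequences: a single base change (TAA -> TAC) removes the stop codon
# and extends the open reading frame.
with_stop = "ATGGCATAAGGG"
stop_lost = "ATGGCATACGGG"
print(orf_length_in_codons(with_stop), orf_length_in_codons(stop_lost))
```

Applied to an alignment of coding sequences, a count like this would separate shorter from longer open reading frames, in the spirit of the Tat86 and Tat101 variants named in the abstract.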
Affiliation(s)
- Antoinette C van der Kuyl
- Laboratory of Experimental Virology, Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ, Amsterdam, The Netherlands.
- Monique Vink
- Laboratory of Experimental Virology, Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ, Amsterdam, The Netherlands
- Fokla Zorgdrager
- Laboratory of Experimental Virology, Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ, Amsterdam, The Netherlands
- Margreet Bakker
- Laboratory of Experimental Virology, Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ, Amsterdam, The Netherlands
- Chris Wymant
- Medical Research Council Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, W2 1PG, United Kingdom; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Matthew Hall
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Astrid Gall
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- François Blanquart
- Medical Research Council Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, W2 1PG, United Kingdom; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Ben Berkhout
- Laboratory of Experimental Virology, Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ, Amsterdam, The Netherlands
- Christophe Fraser
- Medical Research Council Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, W2 1PG, United Kingdom; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Marion Cornelissen
- Laboratory of Experimental Virology, Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ, Amsterdam, The Netherlands