Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ly-Trong N, Naser-Khdour S, Lanfear R, Minh BQ. AliSim: a fast and versatile phylogenetic sequence simulator for the genomic era. Mol Biol Evol 2022;39:6577219. [PMID: 35511713 PMCID: PMC9113491 DOI: 10.1093/molbev/msac092] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 11/14/2022] Open

Cited by Other Article(s)

Banos H, Wong TKF, Daneau J, Susko E, Minh BQ, Lanfear R, Brown MW, Eme L, Roger AJ. GTRpmix: A Linked General Time-Reversible Model for Profile Mixture Models. Mol Biol Evol 2024;41:msae174. [PMID: 39158305 PMCID: PMC11371462 DOI: 10.1093/molbev/msae174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 06/25/2024] [Accepted: 08/12/2024] [Indexed: 08/20/2024] Open

Abstract

Profile mixture models capture distinct biochemical constraints on the amino acid substitution process at different sites in proteins. These models feature a mixture of time-reversible models with a common matrix of exchangeabilities and distinct sets of equilibrium amino acid frequencies known as profiles. Combining the exchangeability matrix with each profile generates the matrix of instantaneous rates of amino acid exchange for that profile. Currently, empirically estimated exchangeability matrices (e.g. the LG matrix) are widely used for phylogenetic inference under profile mixture models. However, these were estimated using a single profile and are unlikely to be optimal for profile mixture models. Here, we describe the GTRpmix model that allows maximum likelihood estimation of a common exchangeability matrix under any profile mixture model. We show that exchangeability matrices estimated under profile mixture models differ from the LG matrix, dramatically improving model fit and topological estimation accuracy for empirical test cases. Because the GTRpmix model is computationally expensive, we provide two exchangeability matrices estimated from large concatenated phylogenomic supermatrices to be used for phylogenetic analyses. One, called Eukaryotic Linked Mixture (ELM), is designed for phylogenetic analysis of proteins encoded by nuclear genomes of eukaryotes, and the other, Eukaryotic and Archaeal Linked mixture (EAL), for reconstructing relationships between eukaryotes and Archaea. These matrices, combined with profile mixture models, fit data better and have improved topology estimation relative to the LG matrix combined with the same mixture models. Starting with version 2.3.1, IQ-TREE2 allows users to estimate linked exchangeabilities (i.e. amino acid exchange rates) under profile mixture models.
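
As an illustration of how one shared exchangeability matrix and several profiles yield one rate matrix per profile, the following is a minimal sketch that assumes the usual time-reversible construction, in which the off-diagonal rate from state i to state j is the exchangeability of the pair multiplied by the equilibrium frequency of j. It is not taken from the GTRpmix or IQ-TREE code; the function name `rate_matrix`, the 3-state toy alphabet, and all numeric values are hypothetical (real amino acid models use 20 states).

```python
from typing import List


def rate_matrix(exch: List[List[float]], profile: List[float]) -> List[List[float]]:
    """Build Q with Q[i][j] = exch[i][j] * profile[j] for i != j; each row sums to zero."""
    n = len(profile)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exch[i][j] * profile[j]
        # The diagonal entry is still 0.0 here, so this sums only the off-diagonal rates.
        q[i][i] = -sum(q[i])
    return q


# Hypothetical symmetric exchangeabilities shared by every mixture component.
EXCH = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 0.5],
    [2.0, 0.5, 0.0],
]

# Hypothetical profiles (equilibrium frequencies), one per mixture component.
PROFILES = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]

# One instantaneous rate matrix per profile, all linked through EXCH.
RATE_MATRICES = [rate_matrix(EXCH, p) for p in PROFILES]
```

Because the exchangeabilities are symmetric, each resulting matrix satisfies detailed balance with respect to its own profile, which is what makes every mixture component time-reversible.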

Redelings BD, Holmes I, Lunter G, Pupko T, Anisimova M. Insertions and Deletions: Computational Methods, Evolutionary Dynamics, and Biological Applications. Mol Biol Evol 2024;41:msae177. [PMID: 39172750 PMCID: PMC11385596 DOI: 10.1093/molbev/msae177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Revised: 07/02/2024] [Accepted: 07/09/2024] [Indexed: 08/24/2024] Open

Redmond AK. Acoelomorph flatworm monophyly is a long-branch attraction artefact obscuring a clade of Acoela and Xenoturbellida. Proc Biol Sci 2024;291:20240329. [PMID: 39288803 PMCID: PMC11407873 DOI: 10.1098/rspb.2024.0329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 06/27/2024] [Accepted: 07/30/2024] [Indexed: 09/19/2024] Open

Kar C, Raghavan R, Ummath A, Puthiyaalikom N, Idreesbabu KK, Sureshkumar S. Resolving fusilier puzzles: The identity of Squamosicaesio marri and Pterocaesio flavifasciata, and a new record of Flavicaesio suevica from the Western Indian Ocean. J Fish Biol 2024;105:993-997. [PMID: 38811354 DOI: 10.1111/jfb.15808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 04/24/2024] [Accepted: 05/09/2024] [Indexed: 05/31/2024]

Berv JS, Singhal S, Field DJ, Walker-Hale N, McHugh SW, Shipley JR, Miller ET, Kimball RT, Braun EL, Dornburg A, Parins-Fukuchi CT, Prum RO, Winger BM, Friedman M, Smith SA. Genome and life-history evolution link bird diversification to the end-Cretaceous mass extinction. Sci Adv 2024;10:eadp0114. [PMID: 39083615 PMCID: PMC11290531 DOI: 10.1126/sciadv.adp0114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Accepted: 06/28/2024] [Indexed: 08/02/2024]

Affiliation(s)

Jacob S. Berv: Department of Ecology and Evolutionary Biology, University of Michigan, 1105 North University Avenue, Biological Sciences Building, University of Michigan, Ann Arbor, MI 48109, USA; Museum of Paleontology, University of Michigan, 1105 North University Avenue, Biological Sciences Building, University of Michigan, Ann Arbor, MI 48109, USA; Museum of Zoology, University of Michigan, 1105 North University Avenue, Biological Sciences Building, University of Michigan, Ann Arbor, MI 48109, USA
Sonal Singhal: Department of Biology, California State University, Dominguez Hills, Carson, CA 90747, USA
Daniel J. Field: Department of Earth Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EQ, UK; Museum of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
Nathanael Walker-Hale: Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EA, UK
Sean W. McHugh: Department of Evolution, Ecology, and Population Biology, Washington University in St. Louis, St. Louis, MO 63130, USA
J. Ryan Shipley: Department of Forest Dynamics, Swiss Federal Institute for Forest, Snow, and Landscape Research WSL, Zürcherstrasse 111 8903, Birmensdorf, Switzerland
Eliot T. Miller: Center for Avian Population Studies, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, USA
Rebecca T. Kimball: Department of Biology, University of Florida, Gainesville, FL 32611, USA
Edward L. Braun: Department of Biology, University of Florida, Gainesville, FL 32611, USA
Alex Dornburg: Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
C. Tomomi Parins-Fukuchi: Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
Richard O. Prum: Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA; Peabody Museum of Natural History, Yale University, New Haven, CT 06520, USA
Benjamin M. Winger: Department of Ecology and Evolutionary Biology, University of Michigan, 1105 North University Avenue, Biological Sciences Building, University of Michigan, Ann Arbor, MI 48109, USA; Museum of Zoology, University of Michigan, 1105 North University Avenue, Biological Sciences Building, University of Michigan, Ann Arbor, MI 48109, USA
Matt Friedman: Museum of Paleontology, University of Michigan, 1105 North University Avenue, Biological Sciences Building, University of Michigan, Ann Arbor, MI 48109, USA; Department of Earth and Environmental Sciences, University of Michigan, 1100 North University Avenue, University of Michigan, Ann Arbor, MI 48109, USA
Stephen A. Smith: Department of Ecology and Evolutionary Biology, University of Michigan, 1105 North University Avenue, Biological Sciences Building, University of Michigan, Ann Arbor, MI 48109, USA

Suvorov A, Schrider DR. Reliable estimation of tree branch lengths using deep neural networks. PLoS Comput Biol 2024;20:e1012337. [PMID: 39102450 PMCID: PMC11326709 DOI: 10.1371/journal.pcbi.1012337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 08/15/2024] [Accepted: 07/18/2024] [Indexed: 08/07/2024] Open

Wong TKF, Cherryh C, Rodrigo AG, Hahn MW, Minh BQ, Lanfear R. MAST: Phylogenetic Inference with Mixtures Across Sites and Trees. Syst Biol 2024;73:375-391. [PMID: 38421146 PMCID: PMC11282360 DOI: 10.1093/sysbio/syae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Revised: 12/18/2023] [Accepted: 02/27/2024] [Indexed: 03/02/2024] Open

Abstract

Hundreds or thousands of loci are now routinely used in modern phylogenomic studies. Concatenation approaches to tree inference assume that there is a single topology for the entire dataset, but different loci may have different evolutionary histories due to incomplete lineage sorting (ILS), introgression, and/or horizontal gene transfer; even single loci may not be treelike due to recombination. To overcome this shortcoming, we introduce an implementation of a multi-tree mixture model that we call mixtures across sites and trees (MAST). This model extends a prior implementation by Boussau et al. (2009) by allowing users to estimate the weight of each of a set of pre-specified bifurcating trees in a single alignment. The MAST model allows each tree to have its own weight, topology, branch lengths, substitution model, nucleotide or amino acid frequencies, and model of rate heterogeneity across sites. We implemented the MAST model in a maximum-likelihood framework in the popular phylogenetic software, IQ-TREE. Simulations show that we can accurately recover the true model parameters, including branch lengths and tree weights for a given set of tree topologies, under a wide range of biologically realistic scenarios. We also show that we can use standard statistical inference approaches to reject a single-tree model when data are simulated under multiple trees (and vice versa). We applied the MAST model to multiple primate datasets and found that it can recover the signal of ILS in the Great Apes, as well as the asymmetry in minor trees caused by introgression among several macaque species. When applied to a dataset of 4 Platyrrhine species for which standard concatenated maximum likelihood (ML) and gene tree approaches disagree, we observe that MAST gives the highest weight (i.e., the largest proportion of sites) to the tree also supported by gene tree approaches. 
These results suggest that the MAST model is able to analyze a concatenated alignment using ML while avoiding some of the biases that come with assuming there is only a single tree. We discuss how the MAST model can be extended in the future.
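
To make the role of the tree weights concrete, here is a minimal sketch of a multi-tree mixture log-likelihood. It assumes the usual finite-mixture form, in which the likelihood of each site is the weighted sum of its likelihoods under each tree; it is not the IQ-TREE implementation. The function name `mixture_log_likelihood` and the toy numbers are hypothetical, and the per-site, per-tree likelihoods are taken as given rather than computed from an alignment.

```python
import math
from typing import List


def mixture_log_likelihood(site_likelihoods: List[List[float]], weights: List[float]) -> float:
    """Return sum over sites of log(sum over trees of weight * site likelihood).

    site_likelihoods[k][s] is the likelihood of site s under tree k.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("tree weights must sum to 1")
    n_sites = len(site_likelihoods[0])
    total = 0.0
    for s in range(n_sites):
        # Mix the per-tree likelihoods of this site using the tree weights.
        mixed = sum(w * per_tree[s] for w, per_tree in zip(weights, site_likelihoods))
        total += math.log(mixed)
    return total


# Hypothetical likelihoods for two trees and two sites.
SITE_LIKELIHOODS = [
    [0.2, 0.4],  # tree 1
    [0.4, 0.2],  # tree 2
]
EQUAL_WEIGHT_LNL = mixture_log_likelihood(SITE_LIKELIHOODS, [0.5, 0.5])
SINGLE_TREE_LNL = mixture_log_likelihood(SITE_LIKELIHOODS, [1.0, 0.0])
```

Setting one weight to 1 recovers the single-tree log-likelihood, which is why a single-tree model can be compared against the mixture with standard statistical tests.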

Fleming J, Eriksen PM, Struck TH. Scoutknife: A naïve, whole genome informed phylogenetic robusticity metric. F1000Res 2024;12:945. [PMID: 38799242 PMCID: PMC11128044 DOI: 10.12688/f1000research.139356.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/05/2024] [Indexed: 05/29/2024] Open

Abstract

Background: The phylogenetic bootstrap, first proposed by Felsenstein in 1985, is a critically important statistical method in assessing the robusticity of phylogenetic datasets. Core to its concept was the use of pseudo sampling - assessing the data by generating new replicates derived from the initial dataset that was used to generate the phylogeny. In this way, phylogenetic support metrics could overcome the lack of perfect, infinite data. With infinite data, however, it is possible to sample smaller replicates directly from the data to obtain both the phylogeny and its statistical robusticity in the same analysis. Due to the growth of whole genome sequencing, the depth and breadth of our datasets have greatly expanded and are set to only expand further. With genome-scale datasets comprising thousands of genes, we can now obtain a proxy for infinite data. Accordingly, we can potentially abandon the notion of pseudo sampling and instead randomly sample small subsets of genes from the thousands of genes in our analyses. Methods: We introduce Scoutknife, a jackknife-style subsampling implementation that generates 100 datasets by randomly sampling a small number of genes from an initial large-gene dataset to jointly establish both a phylogenetic hypothesis and assess its robusticity. We assess its effectiveness by using 18 previously published datasets and 100 simulation studies. Results: We show that Scoutknife is conservative and informative as to conflicts and incongruence across the whole genome, without the need for subsampling based on traditional model selection criteria. Conclusions: Scoutknife reliably achieves comparable results to selecting the best genes on both real and simulation datasets, while being resistant to the potential biases caused by selecting for model fit. As the amount of genome data grows, it becomes an even more exciting option to assess the robusticity of phylogenetic hypotheses.
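
The subsampling step described in the Methods can be sketched as follows. This is a minimal illustration, not the Scoutknife code: the function name `jackknife_gene_sets`, the gene identifiers, the seed, and the choice of 10 genes per replicate are all assumptions (the abstract specifies 100 replicates but only says "a small number of genes").

```python
import random
from typing import List, Sequence


def jackknife_gene_sets(
    genes: Sequence[str],
    n_replicates: int = 100,       # the abstract states 100 datasets
    genes_per_replicate: int = 10,  # assumed value; not given in the abstract
    seed: int = 0,
) -> List[List[str]]:
    """Draw n_replicates random gene subsets, each sampled without replacement."""
    rng = random.Random(seed)
    pool = list(genes)
    return [rng.sample(pool, genes_per_replicate) for _ in range(n_replicates)]


# Hypothetical genome-scale dataset of 2000 gene alignments, identified by name.
GENES = [f"gene_{i:04d}" for i in range(2000)]
REPLICATES = jackknife_gene_sets(GENES)
```

Each subset would then be concatenated and analysed separately, so that agreement across the replicate trees serves as the support measure.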

Efimenko B, Popadin K, Gunbin K. NeMu: a comprehensive pipeline for accurate reconstruction of neutral mutation spectra from evolutionary data. Nucleic Acids Res 2024;52:W108-W115. [PMID: 38795067 PMCID: PMC11223800 DOI: 10.1093/nar/gkae438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 04/23/2024] [Accepted: 05/09/2024] [Indexed: 05/27/2024] Open

Ecker N, Huchon D, Mansour Y, Mayrose I, Pupko T. A machine-learning-based alternative to phylogenetic bootstrap. Bioinformatics 2024;40:i208-i217. [PMID: 38940166 PMCID: PMC11211842 DOI: 10.1093/bioinformatics/btae255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open

Baños H, Susko E, Roger AJ. Is Over-parameterization a Problem for Profile Mixture Models? Syst Biol 2024;73:53-75. [PMID: 37843172 PMCID: PMC11129589 DOI: 10.1093/sysbio/syad063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 09/12/2023] [Accepted: 10/13/2023] [Indexed: 10/17/2023] Open

Abstract

Biochemical constraints on the admissible amino acids at specific sites in proteins lead to heterogeneity of the amino acid substitution process over sites in alignments. It is well known that phylogenetic models of protein sequence evolution that do not account for site heterogeneity are prone to long-branch attraction (LBA) artifacts. Profile mixture models were developed to model heterogeneity of preferred amino acids at sites via a finite distribution of site classes each with a distinct set of equilibrium amino acid frequencies. However, it is unknown whether the large number of parameters in such models associated with the many amino acid frequency vectors can adversely affect tree topology estimates because of over-parameterization. Here, we demonstrate theoretically that for long sequences, over-parameterization does not create problems for estimation with profile mixture models. Under mild conditions, tree, amino acid frequencies, and other model parameters converge to true values as sequence length increases, even when there are large numbers of components in the frequency profile distributions. Because large sample theory does not necessarily imply good behavior for shorter alignments we explore the performance of these models with short alignments simulated with tree topologies that are prone to LBA artifacts. We find that over-parameterization is not a problem for complex profile mixture models even when there are many amino acid frequency vectors. In fact, simple models with few site classes behave poorly. Interestingly, we also found that misspecification of the amino acid frequency vectors does not lead to increased LBA artifacts as long as the estimated cumulative distribution function of the amino acid frequencies at sites adequately approximates the true one. In contrast, misspecification of the amino acid exchangeability rates can severely negatively affect parameter estimation. 
Finally, we explore the effects of including in the profile mixture model an additional "F-class" representing the overall frequencies of amino acids in the data set. Surprisingly, the F-class does not help parameter estimation significantly and can decrease the probability of correct tree estimation, depending on the scenario, even though it tends to improve likelihood scores.

Bossert S, Pauly A, Danforth BN, Orr MC, Murray EA. Lessons from assembling UCEs: A comparison of common methods and the case of Clavinomia (Halictidae). Mol Ecol Resour 2024;24:e13925. [PMID: 38183389 DOI: 10.1111/1755-0998.13925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 12/08/2023] [Accepted: 12/21/2023] [Indexed: 01/08/2024]

Jiang Z, Zang W, Ericson PGP, Song G, Wu S, Feng S, Drovetski SV, Liu G, Zhang D, Saitoh T, Alström P, Edwards SV, Lei F, Qu Y. Gene flow and an anomaly zone complicate phylogenomic inference in a rapidly radiated avian family (Prunellidae). BMC Biol 2024;22:49. [PMID: 38413944 PMCID: PMC10900574 DOI: 10.1186/s12915-024-01848-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 02/15/2024] [Indexed: 02/29/2024] Open

Abstract

BACKGROUND

Resolving the phylogeny of rapidly radiating lineages presents a challenge when building the Tree of Life. An Old World avian family Prunellidae (Accentors) comprises twelve species that rapidly diversified at the Pliocene-Pleistocene boundary.

RESULTS

Here we investigate the phylogenetic relationships of all species of Prunellidae using a chromosome-level de novo assembly of Prunella strophiata and 36 high-coverage resequenced genomes. We use homologous alignments of thousands of exonic and intronic loci to build the coalescent and concatenated phylogenies and recover four different species trees. Topology tests show a large degree of gene tree-species tree discordance, but only 40-54% of intronic gene trees and 36-75% of exonic gene trees can be explained by incomplete lineage sorting and gene tree estimation errors. Estimated branch lengths for three successive internal branches in the inferred species trees suggest the existence of an empirical anomaly zone. The most common topology recovered for species in this anomaly zone was not similar to any coalescent or concatenated inference phylogenies, suggesting the presence of anomalous gene trees. However, this interpretation is complicated by the presence of gene flow because extensive introgression was detected among these species. When exploring tree topology distributions, introgression, and regional variation in recombination rate, we find that many autosomal regions contain signatures of introgression and thus may mislead phylogenetic inference. Conversely, the phylogenetic signal is concentrated in regions with low recombination rates, such as the Z chromosome, which are also more resistant to interspecific introgression.

CONCLUSIONS

Collectively, our results suggest that phylogenomic inference should consider the underlying genomic architecture to maximize the consistency of phylogenomic signal.

Affiliation(s)

Zhiyong Jiang: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
Wenqing Zang: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
Per G P Ericson: Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, Stockholm, SE-104 05, Sweden
Gang Song: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
Shaoyuan Wu: Jiangsu International Joint Center of Genomics, Jiangsu Key Laboratory of Phylogenomics & Comparative Genomics, School of Life Sciences, Jiangsu Normal University, Xuzhou, 221116, Jiangsu, China
Shaohong Feng: Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou, 310058, China; Liangzhu Laboratory, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China; Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, 314102, China
Sergei V Drovetski: National Museum of Natural History, Smithsonian Institution, Washington, DC, 20004, USA; Present address: U.S. Geological Survey, Eastern Ecological Science Center at Patuxent Research Refuge, Laurel, MD, 20708, USA
Gang Liu: Chinese Academy of Forestry, Institute of Ecological Conservation and Restoration, Beijing, 100091, China
Dezhi Zhang: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
Takema Saitoh: Yamashina Institute for Ornithology, Abiko, Chiba, Japan
Per Alström: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; Animal Ecology, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18 D, 752 36, Uppsala, Sweden
Scott V Edwards: Museum of Comparative Zoology and Department of Organismic & Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
Fumin Lei: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
Yanhua Qu: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, Stockholm, SE-104 05, Sweden

Trost J, Haag J, Höhler D, Jacob L, Stamatakis A, Boussau B. Simulations of Sequence Evolution: How (Un)realistic They Are and Why. Mol Biol Evol 2024;41:msad277. [PMID: 38124381 PMCID: PMC10768886 DOI: 10.1093/molbev/msad277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 11/17/2023] [Accepted: 12/08/2023] [Indexed: 12/23/2023] Open

Kar C, Mariyambi PC, Raghavan R, Sureshkumar S. Mitochondrial phylogeny of fusilier fishes (family Caesionidae) from the Laccadive archipelago reveals a new species and two new records from the Central Indian Ocean. J Fish Biol 2023;103:1445-1451. [PMID: 37667092 DOI: 10.1111/jfb.15553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2023] [Revised: 08/30/2023] [Accepted: 08/31/2023] [Indexed: 09/06/2023]

Smith ML, Hahn MW. Phylogenetic inference using generative adversarial networks. Bioinformatics 2023;39:btad543. [PMID: 37669126 PMCID: PMC10500083 DOI: 10.1093/bioinformatics/btad543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 08/25/2023] [Accepted: 09/04/2023] [Indexed: 09/07/2023] Open

Ly-Trong N, Barca GMJ, Minh BQ. AliSim-HPC: parallel sequence simulator for phylogenetics. Bioinformatics 2023;39:btad540. [PMID: 37656933 PMCID: PMC10534053 DOI: 10.1093/bioinformatics/btad540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 08/16/2023] [Accepted: 08/31/2023] [Indexed: 09/03/2023] Open

Rivera-Rivera CJ, Grbic D. CastNet: a systems-level sequence evolution simulator. BMC Bioinformatics 2023;24:247. [PMID: 37308829 DOI: 10.1186/s12859-023-05366-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 05/26/2023] [Indexed: 06/14/2023] Open

Abstract

BACKGROUND

Simulating DNA evolution has been done through coevolution-agnostic probabilistic frameworks for the past 3 decades. The most common implementation is by using the converse of the probabilistic approach used to infer phylogenies which, in the simplest form, simulates a single sequence at a time. However, biological systems are multi-genic, and gene products can affect each other's evolutionary paths through coevolution. These crucial evolutionary dynamics still remain to be simulated, and we believe that modelling them can lead to profound insights for comparative genomics.

RESULTS

Here we present CastNet, a genome evolution simulator that assumes each genome is a collection of genes with constantly evolving regulatory interactions in between them. The regulatory interactions produce a phenotype in the form of gene expression profiles, upon which fitness is calculated. A genetic algorithm is then used to evolve a population of such entities through a user-defined phylogeny. Importantly, the regulatory mutations are a response to sequence mutations, thus making a 1-1 relationship between the rate of evolution of sequences and of regulatory parameters. This is, to our knowledge, the first time the evolution of sequences and regulation have been explicitly linked in a simulation, despite there being a multitude of sequence evolution simulators, and a handful of models to simulate Gene Regulatory Network (GRN) evolution. In our test runs, we see a coevolutionary signal among genes that are active in the GRN, and neutral evolution in genes that are not included in the network, showing that selective pressures imposed on the regulatory output of the genes are reflected in their sequences.

CONCLUSION

We believe that CastNet represents a substantial step for developing new tools to study genome evolution, and more broadly, coevolutionary webs and complex evolving systems. This simulator also provides a new framework to study molecular evolution where sequence coevolution has a leading role.

Smith CH, Pinto BJ, Kirkpatrick M, Hillis DM, Pfeiffer JM, Havird JC. A tale of two paths: The evolution of mitochondrial recombination in bivalves with doubly uniparental inheritance. J Hered 2023;114:199-206. [PMID: 36897956 PMCID: PMC10212130 DOI: 10.1093/jhered/esad004] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/22/2022] [Accepted: 01/19/2023] [Indexed: 03/12/2023] Open

Fleming JF, Struck TH. nRCFV: a new, dataset-size-independent metric to quantify compositional heterogeneity in nucleotide and amino acid datasets. BMC Bioinformatics 2023;24:145. [PMID: 37046225 PMCID: PMC10099917 DOI: 10.1186/s12859-023-05270-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/07/2022] [Accepted: 04/04/2023] [Indexed: 04/14/2023] Open

Sukhorukov GA, Paramonov AI, Lisak OV, Kozlova IV, Bazykin GA, Neverov AD, Karan LS. The Baikal subtype of tick-borne encephalitis virus is evident of recombination between Siberian and Far-Eastern subtypes. PLoS Negl Trop Dis 2023;17:e0011141. [PMID: 36972237 PMCID: PMC10079218 DOI: 10.1371/journal.pntd.0011141] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/07/2022] [Revised: 04/06/2023] [Accepted: 02/06/2023] [Indexed: 03/29/2023] Open

Legall N, Salvador LCM. Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales. Front Microbiol 2022;13:787856. [PMID: 36160199 PMCID: PMC9489834 DOI: 10.3389/fmicb.2022.787856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Accepted: 08/08/2022] [Indexed: 11/28/2022] Open

Abstract

Mycobacterium bovis, a bacterial zoonotic pathogen responsible for the economically and agriculturally important livestock disease bovine tuberculosis (bTB), infects a broad mammalian host range worldwide. This characteristic has led to bidirectional transmission events between livestock and wildlife species as well as the formation of wildlife reservoirs, impacting the success of bTB control measures. Next Generation Sequencing (NGS) has transformed our ability to understand disease transmission events by tracking variant sites; however, the genomic signatures related to host adaptation following spillover, alongside the role of other genomic factors in the M. bovis transmission process, remain understudied. We analyzed publicly available M. bovis datasets collected from 700 hosts across three countries with bTB endemic regions (United Kingdom, United States, and New Zealand) to investigate whether genomic regions with high SNP density and/or selective sweep sites play a role in Mycobacterium bovis adaptation to new environments (e.g., at the host-species, geographical, and/or sub-population levels). A simulated M. bovis alignment was created to generate null distributions for defining genomic regions with high SNP counts and regions with evidence of selective sweeps. Random Forest (RF) models were used to investigate evolutionary metrics within the genomic regions of interest to determine which genomic processes were the best for classifying M. bovis across ecological scales. We identified 14 high SNP density regions and 132 selective sweep regions in the M. bovis genomes. Selective sweep regions were ranked as the most important in classifying M. bovis across the different scales in all RF models. SNP dense regions were found to have high importance in the badger and cattle specific RF models in classifying badger derived isolates from livestock derived ones. 
Additionally, the genes detected within these genomic regions harbor various pathogenic functions such as virulence and immunogenicity, membrane structure, host survival, and mycobactin production. The results of this study demonstrate how comparative genomics alongside machine learning approaches are useful to investigate further the nature of M. bovis host-pathogen interactions.
