1
Ceron-Noriega A, Schoonenberg VAC, Butter F, Levin M. AlexandrusPS: A User-Friendly Pipeline for the Automated Detection of Orthologous Gene Clusters and Subsequent Positive Selection Analysis. Genome Biol Evol 2023; 15:evad187. [PMID: 37831426 PMCID: PMC10612477 DOI: 10.1093/gbe/evad187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 09/26/2023] [Accepted: 10/06/2023] [Indexed: 10/14/2023] Open
Abstract
The detection of adaptive selection in a system approach considering all protein-coding genes allows for the identification of mechanisms and pathways that enabled adaptation to different environments. Currently, available programs for the estimation of positive selection signals can be divided into two groups. They are either easy to apply but can analyze only one gene family at a time, restricting system analysis; or they can handle larger cohorts of gene families, but require considerable prerequisite data such as orthology associations, codon alignments, phylogenetic trees, and proper configuration files. All these steps require extensive computational expertise, restricting this endeavor to specialists. Here, we introduce AlexandrusPS, a high-throughput pipeline that overcomes technical challenges when conducting transcriptome-wide positive selection analyses on large sets of nucleotide and protein sequences. The pipeline streamlines 1) the execution of an accurate orthology prediction as a precondition for positive selection analysis, 2) preparing and organizing configuration files for CodeML, 3) performing positive selection analysis using CodeML, and 4) generating an output that is easy to interpret, including all maximum likelihood and log-likelihood test results. The only input needed from the user is the CDS and peptide FASTA files of proteins of interest. The pipeline is provided in a Docker image, requiring no program or module installation, enabling the application of the pipeline in any computing environment. AlexandrusPS and its documentation are available via GitHub (https://github.com/alejocn5/AlexandrusPS).
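The log-likelihood tests mentioned in this abstract compare nested codon models. As a minimal sketch only (this is not AlexandrusPS code; the function name and the example log-likelihood values are hypothetical), the likelihood ratio statistic 2(lnL_alt - lnL_null) can be compared against a chi-square distribution, using the closed-form tail probabilities for 1 and 2 degrees of freedom:

```python
import math

def lrt_pvalue(lnl_null, lnl_alt, df):
    """Likelihood ratio test between two nested models.

    lnl_null and lnl_alt are log-likelihoods (hypothetical inputs here);
    df is the difference in free parameters. Only df = 1 or 2 is handled,
    because those cases have simple closed-form chi-square tails.
    """
    # The statistic cannot be negative for properly nested models.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 df
    elif df == 2:
        p = math.exp(-stat / 2.0)             # chi-square tail, 2 df
    else:
        raise ValueError("only df=1 or df=2 are handled in this sketch")
    return stat, p

# Illustrative values, not results from any of the cited studies.
stat, p = lrt_pvalue(-1000.0, -995.0, df=2)
```

For other degrees of freedom, a statistics library would normally supply the chi-square tail probability.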
Affiliation(s)
- Alejandro Ceron-Noriega
  - Institute of Molecular Biology (IMB), Quantitative Proteomics, Mainz, Germany
  - Institute of Human Genetics, University Medical Center of the Johannes Gutenberg University Mainz, Department of Human Genetics, Mainz, Germany
- Vivien A C Schoonenberg
  - Institute of Molecular Biology (IMB), Quantitative Proteomics, Mainz, Germany
  - Present address: Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
  - Present address: Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Department of Pathology, Harvard Medical School, Boston, Massachusetts, USA
- Falk Butter
  - Institute of Molecular Biology (IMB), Quantitative Proteomics, Mainz, Germany
  - Institute of Molecular Virology and Cell Biology, Friedrich-Loeffler-Institute, Greifswald, Germany
- Michal Levin
  - Institute of Molecular Biology (IMB), Quantitative Proteomics, Mainz, Germany
2
Cao T, Li Q, Huang Y, Li A. plotnineSeqSuite: a Python package for visualizing sequence data using ggplot2 style. BMC Genomics 2023; 24:585. [PMID: 37789265 PMCID: PMC10546746 DOI: 10.1186/s12864-023-09677-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/05/2023] [Accepted: 09/14/2023] [Indexed: 10/05/2023] Open
Abstract
BACKGROUND: Sequence logos have been an active area in the development of bioinformatics tools. ggseqlogo, written in R, has been the most popular API since it was published. With the rise of artificial intelligence and deep learning, Python is currently the most popular programming language, and bioinformaticians have begun to shift to it. Providing Python APIs similar to those in R can reduce the cost of learning a new programming language. Compared with ggplot2 in R, plotting frameworks in Python are not as easy to use. The appearance of plotnine (a Python implementation of ggplot2) makes it possible to unify the programming methods of bioinformatics visualization tools between R and Python.
RESULTS: Here, we introduce plotnineSeqSuite, a new plotnine-based Python package that provides a ggseqlogo-like API for programmatic drawing of sequence logos, sequence alignment diagrams and sequence histograms. It supports custom letters, color themes, and fonts. Moreover, the class for drawing layers is based on object-oriented design so that users can easily encapsulate and extend it.
CONCLUSIONS: plotnineSeqSuite is the first ggplot2-style package to implement visualization of sequence-related graphs in Python. It enhances the uniformity of programmatic plotting between R and Python. Compared with existing tools, plotnineSeqSuite supports a much more complete set of plot categories. The source code of plotnineSeqSuite can be obtained on GitHub (https://github.com/caotianze/plotnineseqsuite) and PyPI (https://pypi.org/project/plotnineseqsuite), and the documentation homepage is freely available on GitHub at https://caotianze.github.io/plotnineseqsuite/.
Affiliation(s)
- Tianze Cao
  - School of Mathematics, Hangzhou Normal University, Hangzhou, Zhejiang Province, China
- Qian Li
  - Department of Rehabilitation, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei Province, China
- Yuexia Huang
  - School of Mathematics, Hangzhou Normal University, Hangzhou, Zhejiang Province, China
- Anshui Li
  - Department of Statistics, Shaoxing University, Shaoxing, Zhejiang Province, China
3
Lu GH, Xu JL, Zhong MX, Li DL, Chen M, Li KT, Wang YQ. Cytochemical and comparative transcriptome analyses elucidate the formation and ecological adaptation of three types of pollen coat in Zingiberaceae. BMC Plant Biology 2022; 22:407. [PMID: 35987603 PMCID: PMC9392269 DOI: 10.1186/s12870-022-03796-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 08/08/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND: The ornate pollen surface of flowering plants has long fascinated and puzzled evolutionary biologists with its variety. Each pollen grain is contained within a pollen wall consisting of intine and exine, over which the lipoid pollen coat lies. The cytology and molecular biology of the development of the intine and exine components of the pollen wall are relatively well characterised. However, little is known about the pollen coat, which confers species specificity. We demonstrate three types of pollen coat in Zingiberaceae: a mucilage-like pollen coat and a gum-like pollen coat, along with a pollen coat more typical of angiosperms. The morphological differences between the three types of pollen coat and the related molecular mechanisms of their formation were studied using an integrative approach of cytology, RNA-seq and positive selection analysis.
RESULTS: Contrary to the 'typical' pollen coat, in ginger species with a mucilage-like (Caulokaempferia coenobialis, Cco) or gum-like (Hornstedtia hainanensis, Hhn) pollen coat, anther locular fluid was still present at the bicellular pollen (BCP) stage of development. Nevertheless, there were marked differences between these species: there were much lower levels of anther locular fluid in Hhn at the BCP stage and it contained less polysaccharide, but more lipid, than the locular fluid of Cco. The set of specific highly-expressed (SHE) genes in Cco was enriched in the 'polysaccharide metabolic process' annotation term, while 'fatty acid degradation' and 'metabolism of terpenoids and polyketides' were significantly enriched in SHE-Hhn.
CONCLUSIONS: Our cytological and comparative transcriptome analysis showed that different types of pollen coat depend on the residual amount and composition of anther locular fluid at the BCP stage. The genes involved in 'polysaccharide metabolism' and 'transport' in the development of a mucilage-like pollen coat and in 'lipid metabolism' and 'transport' in the development of a gum-like pollen coat probably evolved under positive selection in both cases. We suggest that the shift from a typical pollen coat to a gum-like or mucilage-like pollen coat in flowering plants is an adaptation to habitats with high humidity and scarcity of pollinators.
Affiliation(s)
- Guo-Hui Lu
  - Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
- Jia-Ling Xu
  - Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
- Man-Xiang Zhong
  - Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
- Dong-Li Li
  - Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
- Min Chen
  - Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
- Ke-Ting Li
  - Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
- Ying-Qiang Wang
  - Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
  - Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, School of Life Sciences, South China Normal University, Guangzhou, 510631, China
4
Steffen R, Ogoniak L, Grundmann N, Pawluchin A, Soehnlein O, Schmitz J. paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences. Genes (Basel) 2022; 13:1090. [PMID: 35741852 PMCID: PMC9222883 DOI: 10.3390/genes13061090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 02/05/2023] Open
Abstract
Evolution is change over time. Whereas neutral changes promoted by drift effects are most reliable for phylogenetic reconstructions, selection-relevant changes are of only limited use for reconstructing phylogenies. On the other hand, comparative analyses of neutral and selected changes in protein-coding DNA sequences (CDS) retrospectively tell us about episodes of constrained, relaxed, and adaptive evolution. The ratio of sites with nonsynonymous (amino acid altering) versus synonymous (not altering) mutations directly measures selection pressure and can be analysed by using the Phylogenetic Analysis by Maximum Likelihood (PAML) software package. We developed a CDS extractor for compiling protein-coding sequences (CDS-extractor) and parallel PAML (paPAML) to simplify, amplify, and accelerate selection analyses via parallel processing, including detection of negatively selected sites. paPAML compiles results of site, branch-site, and branch models and detects site-specific negative selection with the output of a codon list labelling significance values. The tool simplifies selection analyses for casual and inexperienced users and accelerates computation by a factor of up to the number of allocated computer threads. We then applied paPAML to examine the evolutionary impact on a new GINS Complex Subunit 3 exon, and neutrophil-associated as well as lysin and apolipoprotein genes. Compared with codeml (PAML version 4.9j) and HyPhy (HyPhy FEL version 2.5.26), all paPAML test runs performed with 10 computing threads led to identical selection pressure results, whereas the total selection analysis via paPAML, including all model comparisons, was about 3 to 5 times faster than the longest running codeml model and about 7 to 15 times faster than the entire processing time of these codeml runs.
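The speed-up described in this abstract comes from running independent model fits concurrently. The following is a minimal sketch of that general idea and not paPAML's actual implementation: `fit_model` is a hypothetical stand-in for launching one external model run, and the model names are illustrative labels only.

```python
from concurrent.futures import ThreadPoolExecutor

def fit_model(model_name):
    """Hypothetical placeholder for one independent model fit.

    A real pipeline would launch an external program here; this stub
    only returns a label so that the sketch stays runnable.
    """
    return f"{model_name}: done"

def run_models_in_parallel(model_names, threads=4):
    """Run independent fits concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fit_model, model_names))

results = run_models_in_parallel(["site", "branch", "branch-site"], threads=3)
```

Threads suit this pattern when each task mostly waits on an external process; the achievable speed-up is bounded by the number of workers and by the longest single task.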
Affiliation(s)
- Raphael Steffen
  - Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
- Lynn Ogoniak
  - Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, 48149 Münster, Germany
- Norbert Grundmann
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, 48149 Münster, Germany
- Anna Pawluchin
  - Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
- Oliver Soehnlein
  - Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
  - Department of Physiology and Pharmacology (FyFa), Karolinska Institutet, 17177 Stockholm, Sweden
- Jürgen Schmitz
  - Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
5
Forni G, Ruggieri AA, Piccinini G, Luchetti A. BASE: A novel workflow to integrate nonubiquitous genes in comparative genomics analyses for selection. Ecol Evol 2021; 11:13029-13035. [PMID: 34646450 PMCID: PMC8495783 DOI: 10.1002/ece3.7959] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/23/2021] [Revised: 06/29/2021] [Accepted: 07/12/2021] [Indexed: 11/07/2022] Open
Abstract
Inferring the selective forces that orthologous genes underwent across different lineages can help us understand the evolutionary processes that have shaped their extant diversity and the phenotypes they underlie. The most widespread metric to estimate the selection regimes of coding genes, across sites and phylogenies, is the ratio of nonsynonymous to synonymous substitutions (dN/dS, also known as ω). Modern sequencing technologies and the large amount of already available sequence data allow the retrieval of thousands of orthologous genes across large numbers of species. Nonetheless, the tools available to explore selection regimes are not designed to automatically process all genes, and their practical usage is often restricted to the single-copy genes found across all species considered (i.e., ubiquitous genes). This approach limits the scale of the analysis to a fraction of single-copy genes, which can be as much as an order of magnitude fewer than those not consistently found in all species considered (i.e., nonubiquitous genes). Here, we present a workflow named BASE that, leveraging the CodeML framework, eases the inference and interpretation of gene selection regimes in the context of comparative genomics. Although a number of bioinformatics tools have already been developed to facilitate this kind of analysis, BASE is the first to be specifically designed to allow the integration of nonubiquitous genes in a straightforward and reproducible manner. The workflow, along with all relevant documentation, is available at github.com/for-giobbe/BASE.
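As a rough illustration of how the dN/dS ratio (ω) mentioned in this abstract is read, the sketch below labels a selection regime from given dN and dS estimates. It is not part of BASE or CodeML; the helper name and the tolerance around 1 are hypothetical choices made for illustration, and real analyses rely on statistical model comparisons rather than a fixed cut-off.

```python
def classify_omega(dn, ds, tolerance=0.05):
    """Label a selection regime from dN and dS (hypothetical helper).

    The tolerance around 1 is an arbitrary illustrative choice,
    not a statistical test.
    """
    if ds <= 0:
        # The ratio is undefined without synonymous substitutions.
        return None, "undefined (dS = 0)"
    omega = dn / ds
    if omega > 1 + tolerance:
        label = "positive selection"
    elif omega < 1 - tolerance:
        label = "purifying selection"
    else:
        label = "approximately neutral"
    return omega, label

# Illustrative values only.
omega, label = classify_omega(0.02, 0.10)
```

Values of ω above 1 are conventionally read as a signal of positive selection, values near 1 as neutral evolution, and values below 1 as purifying selection.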
Affiliation(s)
- Giobbe Forni
  - BiGeA Department, University of Bologna, Bologna, Italy
6
Kakehashi R, Kurabayashi A. Patterns of Natural Selection on Mitochondrial Protein-Coding Genes in Lungless Salamanders: Relaxed Purifying Selection and Presence of Positively Selected Codon Sites in the Family Plethodontidae. Int J Genomics 2021; 2021:6671300. [PMID: 33928143 PMCID: PMC8053045 DOI: 10.1155/2021/6671300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2020] [Revised: 03/12/2021] [Accepted: 03/26/2021] [Indexed: 11/18/2022] Open
Abstract
There are two distinct lungless groups in caudate amphibians (salamanders and newts): the family Plethodontidae and the genus Onychodactylus, from the family Hynobiidae. Lunglessness is considered to have evolved in response to environmental and/or ecological adaptation with respect to oxygen requirements. We performed selection analyses on lungless salamanders to elucidate the selective patterns of mitochondrial protein-coding genes associated with lunglessness. The branch model and RELAX analyses revealed the occurrence of relaxed selection (an increase in the dN/dS ratio, i.e., the ω value) in most mitochondrial protein-coding genes of plethodontid salamander branches but not in those of Onychodactylus. Additional branch model and RELAX analyses indicated that direct-developing plethodontids showed the relaxed pattern for most mitochondrial genes, although metamorphosing plethodontids had fewer relaxed genes. Furthermore, aBSREL analysis detected positively selected codons in three plethodontid branches but not in Onychodactylus. One of these three branches corresponded to the most recent common ancestor, and the others corresponded with the most recent common ancestors of direct-developing branches within Hemidactyliinae. The positive selection of mitochondrial protein-coding genes in Plethodontidae is probably associated with the evolution of direct development.
Affiliation(s)
- Ryosuke Kakehashi
  - Faculty of Bio-Science, Nagahama Institute of Bio-Science and Technology, Shiga 526-0829, Japan
- Atsushi Kurabayashi
  - Faculty of Bio-Science, Nagahama Institute of Bio-Science and Technology, Shiga 526-0829, Japan
  - Unit for Environmental Sciences and Management, North-West University, Potchefstroom 2520, South Africa
7
GWideCodeML: A Python Package for Testing Evolutionary Hypotheses at the Genome-Wide Level. G3: Genes, Genomes, Genetics 2020; 10:4369-4372. [PMID: 33093185 PMCID: PMC7718741 DOI: 10.1534/g3.120.401874] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
One of the most widely used programs for detecting positive selection at the molecular level is codeml, which is implemented in the Phylogenetic Analysis by Maximum Likelihood (PAML) package. However, it has a limitation when it comes to genome-wide studies, as it runs on a gene-by-gene basis. Furthermore, the size of such studies depends on the number of orthologous genes the genomes have in common, and these are often restricted to instances where a one-to-one relationship is observed between the genomes. In this work, we present GWideCodeML, a Python package, which runs a genome-wide codeml with the option of parallelization. To maximize the number of analyzed genes, the package allows for a variable number of taxa in the alignments and will automatically prune the topology to fit each of them before running codeml.
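Pruning a species topology to the taxa present in one alignment, as described in this abstract, can be sketched as follows. This is an illustrative toy, not GWideCodeML's code: the nested-tuple tree representation, the function name, and the taxon labels are all assumptions made for the example.

```python
def prune_tree(tree, keep):
    """Prune a nested-tuple tree to the taxa in `keep`.

    Leaves are strings and internal nodes are tuples of subtrees.
    Nodes left with a single child are collapsed into that child.
    Returns None if no requested taxon remains.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [sub for sub in (prune_tree(child, keep) for child in tree)
            if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # collapse a node that lost all but one child
    return tuple(kept)

# Hypothetical five-taxon topology and an alignment holding three of them.
species_tree = (("A", "B"), ("C", ("D", "E")))
pruned = prune_tree(species_tree, {"A", "C", "E"})
```

Here `pruned` keeps only the relationships among the three retained taxa; branch lengths are ignored in this sketch.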
8
Borges R, Fonseca J, Gomes C, Johnson WE, O'Brien SJ, Zhang G, Gilbert MTP, Jarvis ED, Antunes A. Avian Binocularity and Adaptation to Nocturnal Environments: Genomic Insights from a Highly Derived Visual Phenotype. Genome Biol Evol 2020; 11:2244-2255. [PMID: 31386143 PMCID: PMC6735850 DOI: 10.1093/gbe/evz111] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Accepted: 05/20/2019] [Indexed: 01/04/2023] Open
Abstract
Typical avian eyes are phenotypically engineered for photopic vision (daylight). In contrast, the highly derived eyes of the barn owl (Tyto alba) are adapted for scotopic vision (dim light). The dramatic modifications distinguishing barn owl eyes from other birds include: 1) shifts in frontal orientation to improve binocularity, 2) rod-dominated retina, and 3) enlarged corneas and lenses. Some of these features parallel mammalian eye patterns, which are hypothesized to have initially evolved in nocturnal environments. Here, we used an integrative approach combining phylogenomics and functional phenotypes of 211 eye-development genes across 48 avian genomes representing most avian orders, including the stem lineage of the scotopic-adapted barn owl. Overall, we identified 25 eye-development genes that coevolved under intensified or relaxed selection in the retina, lens, cornea, and optic nerves of the barn owl. The agtpbp1 gene, which is associated with the survival of photoreceptor populations, was pseudogenized in the barn owl genome. Our results further revealed that barn owl retinal genes responsible for the maintenance, proliferation, and differentiation of photoreceptors experienced an evolutionary relaxation. Signatures of relaxed selection were also observed in the lens and cornea morphology-associated genes, suggesting that adaptive evolution in these structures was essentially structural. Four eye-development genes (ephb1, phactr4, prph2, and rs1) evolved in positive association with the orbit convergence in birds and under relaxed selection in the barn owl lineage, likely contributing to an increased reliance on binocular vision in the barn owl. Moreover, we found evidence of coevolutionary interactions among genes that are expressed in the retina, lens, and optic nerve, suggesting synergetic adaptive events. 
Our study disentangles the genomic changes governing the binocularity and low-light perception adaptations of barn owls to nocturnal environments while revealing the molecular mechanisms contributing to the shift from the typical avian photopic vision to the more-novel scotopic-adapted eye.
Affiliation(s)
- Rui Borges
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, Portugal
- João Fonseca
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Portugal
- Cidália Gomes
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Portugal
- Warren E Johnson
  - Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, Virginia
  - Walter Reed Biosystematics Unit, Smithsonian Institution, Suitland, Maryland
- Stephen J O'Brien
  - Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, Russia
  - Guy Harvey Oceanographic Center, Halmos College of Natural Sciences and Oceanography, Nova Southeastern University
- Guojie Zhang
  - Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Denmark
  - China National GeneBank, BGI-Shenzen, Shenzhen, China
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- M Thomas P Gilbert
  - Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Erich D Jarvis
  - Laboratory of Neurogenetics of Language, Rockefeller University
  - Howard Hughes Medical Institute, Chevy Chase, Maryland
- Agostinho Antunes
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, Portugal
9
Maldonado E, Antunes A. LMAP_S: Lightweight Multigene Alignment and Phylogeny eStimation. BMC Bioinformatics 2019; 20:739. [PMID: 31888452 PMCID: PMC6937843 DOI: 10.1186/s12859-019-3292-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/04/2019] [Accepted: 11/26/2019] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: Recent advances in genome sequencing technologies and the falling cost of high-throughput sequencing continue to give rise to a deluge of data available for downstream analyses. Among others, evolutionary biologists often make use of genomic data to uncover phenotypic diversity and adaptive evolution in protein-coding genes. Therefore, multiple sequence alignments (MSA) and phylogenetic trees (PT) need to be estimated with optimal results. However, the preparation of an initial dataset of multiple sequence file(s) (MSF) and the steps involved can be challenging when considering an extensive amount of data. Thus, it becomes necessary to develop a tool that removes potential sources of error and automates the time-consuming steps of a typical workflow, with high-throughput and optimal MSA and PT estimations.
RESULTS: We introduce LMAP_S (Lightweight Multigene Alignment and Phylogeny eStimation), a user-friendly command-line and interactive package, designed to handle an improved alignment and phylogeny estimation workflow: MSF preparation, MSA estimation, outlier detection, refinement, consensus, phylogeny estimation, comparison and editing. Within this workflow, file and directory organization, execution, and manipulation of information are automated, with minimal manual user intervention. LMAP_S was developed for the workstation multi-core environment and provides a unique advantage for processing multiple datasets. Our software proved to be efficient throughout the workflow, including the (unlimited) handling of more than 20 datasets.
CONCLUSIONS: We have developed a simple and versatile LMAP_S package enabling researchers to effectively estimate MSAs and PTs for multiple datasets in a high-throughput fashion. LMAP_S integrates more than 25 software tools, providing overall more than 65 algorithm choices distributed in five stages. At minimum, one FASTA file is required within a single input directory. To our knowledge, no other software combines MSA and phylogeny estimation with as many alternatives and provides means to find optimal MSAs and phylogenies. Moreover, we used a case study comparing methodologies that highlighted the usefulness of our software. LMAP_S has been developed as an open-source package, allowing its integration into more complex open-source bioinformatics pipelines. The LMAP_S package is released under the GPLv3 license and is freely available at https://lmap-s.sourceforge.io/.
Affiliation(s)
- Emanuel Maldonado
  - CIIMAR/CIMAR - Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208, Porto, Portugal
- Agostinho Antunes
  - CIIMAR/CIMAR - Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208, Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007, Porto, Portugal
10
The Vertebrate TLR Supergene Family Evolved Dynamically by Gene Gain/Loss and Positive Selection Revealing a Host–Pathogen Arms Race in Birds. Diversity (Basel) 2019. [DOI: 10.3390/d11080131] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 01/04/2023]
Abstract
The vertebrate toll-like receptor (TLR) supergene family is a first-line immune defense against viral and non-viral pathogens. Here, comparative evolutionary-genomics of 79 vertebrate species (8 mammals, 48 birds, 11 reptiles, 1 amphibian, and 11 fishes) revealed differential gain/loss of 26 TLRs, including 6 (TLR3, TLR7, TLR8, TLR14, TLR21, and TLR22) that originated early in vertebrate evolution before the diversification of Agnatha and Gnathostomata. Subsequent dynamic gene gain/loss led to lineage-specific diversification with TLR repertoires ranging from 8 subfamilies in birds to 20 in fishes. There were lineage-specific losses of TLR8-9 and TLR13 in birds and gains of TLR6 and TLR10-12 in mammals and of TLR19-20 and TLR23-27 in fishes. Among avian species, 5–10% of the sites were under positive selection (PS) (omega 1.5–2.5) with radical amino-acid changes likely affecting TLR structure/functionality. In non-viral TLR4, the 20 PS sites (posterior probability PP > 0.99) likely increased the ability to cope with diversified ligands (e.g., lipopolysaccharide and lipoteichoic acid). For viral TLR7, 23 PS sites (PP > 0.99) possibly improved recognition of highly variable viral ssRNAs. Rapid evolution of the TLR supergene family reflects the host–pathogen arms race and the coevolution of ligands/receptors, which follows the premise that birds have been important vectors of zoonotic pathogens and reservoirs for viruses.
11
Gao F, Chen C, Arab DA, Du Z, He Y, Ho SYW. EasyCodeML: A visual tool for analysis of selection using CodeML. Ecol Evol 2019; 9:3891-3898. [PMID: 31015974 PMCID: PMC6467853 DOI: 10.1002/ece3.5015] [Citation(s) in RCA: 244] [Impact Index Per Article: 48.8] [Received: 10/21/2018] [Revised: 02/01/2019] [Accepted: 02/11/2019] [Indexed: 01/01/2023] Open
Abstract
The genomic signatures of positive selection and evolutionary constraints can be detected by analyses of nucleotide sequences. One of the most widely used programs for this purpose is CodeML, part of the PAML package. Although a number of bioinformatics tools have been developed to facilitate the use of CodeML, these have various limitations. Here, we present a wrapper tool named EasyCodeML that provides a user-friendly graphical interface for using CodeML. EasyCodeML has a custom running mode in which parameters can be adjusted to meet different requirements. It also offers a preset running mode in which an evolutionary analysis pipeline and publication-quality tables can be exported by a single click. EasyCodeML allows visualized, interactive tree labelling, which greatly simplifies the use of the branch, branch-site, and clade models of selection. The program allows comparison of major codon-based models for analyses of selection. EasyCodeML is a stand-alone package that is supported in Windows, Mac, and Linux operating systems, and is freely available at https://github.com/BioEasy/EasyCodeML.
Collapse
Affiliation(s)
- Fangluan Gao
  - Fujian Key Laboratory of Plant Virology, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- Chengjie Chen
  - College of Horticulture, South China Agricultural University, Guangzhou, China
- Daej A. Arab
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- Zhenguo Du
  - Fujian Key Laboratory of Plant Virology, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, China
- Yehua He
  - College of Horticulture, South China Agricultural University, Guangzhou, China
- Simon Y. W. Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
|
12
|
Pavlovich SS, Lovett SP, Koroleva G, Guito JC, Arnold CE, Nagle ER, Kulcsar K, Lee A, Thibaud-Nissen F, Hume AJ, Mühlberger E, Uebelhoer LS, Towner JS, Rabadan R, Sanchez-Lockhart M, Kepler TB, Palacios G. The Egyptian Rousette Genome Reveals Unexpected Features of Bat Antiviral Immunity. Cell 2018; 173:1098-1110.e18. [PMID: 29706541 PMCID: PMC7112298 DOI: 10.1016/j.cell.2018.03.070] [Citation(s) in RCA: 161] [Impact Index Per Article: 26.8] [Received: 09/20/2017] [Revised: 01/22/2018] [Accepted: 03/27/2018] [Indexed: 12/27/2022]
Abstract
Bats harbor many viruses asymptomatically, including several notorious for causing extreme virulence in humans. To identify differences between antiviral mechanisms in humans and bats, we sequenced, assembled, and analyzed the genome of Rousettus aegyptiacus, a natural reservoir of Marburg virus and the only known reservoir for any filovirus. We found an expanded and diversified KLRC/KLRD family of natural killer cell receptors, MHC class I genes, and type I interferons, which dramatically differ from their functional counterparts in other mammals. Such concerted evolution of key components of bat immunity is strongly suggestive of novel modes of antiviral defense. An evaluation of the theoretical function of these genes suggests that an inhibitory immune state may exist in bats. Based on our findings, we hypothesize that tolerance of viral infection, rather than enhanced potency of antiviral defenses, may be a key mechanism by which bats asymptomatically host viruses that are pathogenic in humans.
Affiliation(s)
- Stephanie S Pavlovich
  - Department of Microbiology, Boston University School of Medicine, Boston, MA 02118, USA
- Sean P Lovett
  - Center for Genome Sciences, United States Army Research Institute of Infectious Diseases (USAMRIID), Frederick, MD 21702, USA
- Galina Koroleva
  - Center for Genome Sciences, United States Army Research Institute of Infectious Diseases (USAMRIID), Frederick, MD 21702, USA
- Jonathan C Guito
  - Viral Special Pathogens Branch, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
- Catherine E Arnold
  - Center for Genome Sciences, United States Army Research Institute of Infectious Diseases (USAMRIID), Frederick, MD 21702, USA
- Elyse R Nagle
  - Center for Genome Sciences, United States Army Research Institute of Infectious Diseases (USAMRIID), Frederick, MD 21702, USA
- Kirsten Kulcsar
  - Center for Genome Sciences, United States Army Research Institute of Infectious Diseases (USAMRIID), Frederick, MD 21702, USA
- Albert Lee
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20892, USA
- Adam J Hume
  - Department of Microbiology, Boston University School of Medicine, Boston, MA 02118, USA
  - National Emerging Infectious Diseases Laboratory, Boston University, Boston, MA 02118, USA
- Elke Mühlberger
  - Department of Microbiology, Boston University School of Medicine, Boston, MA 02118, USA
  - National Emerging Infectious Diseases Laboratory, Boston University, Boston, MA 02118, USA
- Luke S Uebelhoer
  - Viral Special Pathogens Branch, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
- Jonathan S Towner
  - Viral Special Pathogens Branch, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA
- Raul Rabadan
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Mariano Sanchez-Lockhart
  - Center for Genome Sciences, United States Army Research Institute of Infectious Diseases (USAMRIID), Frederick, MD 21702, USA
  - Department of Pathology and Microbiology, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Thomas B Kepler
  - Department of Microbiology, Boston University School of Medicine, Boston, MA 02118, USA
  - Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA
  - National Emerging Infectious Diseases Laboratory, Boston University, Boston, MA 02118, USA
- Gustavo Palacios
  - Center for Genome Sciences, United States Army Research Institute of Infectious Diseases (USAMRIID), Frederick, MD 21702, USA
|