1
Ayala-Berdon J, Medina-Bello KI. Torpor energetics are related to the interaction between body mass and climate in bats of the family Vespertilionidae. J Exp Biol 2024; 227:jeb246824. [PMID: 39206564 DOI: 10.1242/jeb.246824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Accepted: 08/16/2024] [Indexed: 09/04/2024]
Abstract
Torpor is an adaptive strategy allowing heterothermic animals to cope with energy limitations. In birds and mammals, intrinsic and extrinsic factors, such as body mass and ambient temperature, are the main variables influencing torpor use. A theoretical model of the relationship between metabolic rate during torpor and ambient temperature has been proposed. Nevertheless, no empirical attempts have been made to assess the model predictions under different climates. Using open-flow respirometry, we evaluated the ambient temperature at which bats entered torpor and when torpid metabolic rate reached its minimum, the reduction in metabolic rate below basal values, and minimum torpid metabolic rate in 11 bat species of the family Vespertilionidae with different body mass from warm and cold climates. We included data on the minimum torpid metabolic rate of five species we retrieved from the literature. We tested the effects using mixed-effect phylogenetic models. All models showed a significant interaction between body mass and climate. Smaller bats went into torpor and reached minimum torpid metabolic rates at warmer temperatures, showed a higher reduction in the metabolic rate below basal values, and presented lower torpid metabolic rates than larger ones. The slopes of the models were different for bats from different climates. These results are likely explained by differences in body mass and the metabolic rate of bats, which may favor larger bats expressing torpor in colder sites and smaller bats in the warmer ones. Further studies to assess torpor use in bats from different climates are proposed.
Affiliation(s)
- Jorge Ayala-Berdon
  - CONAHCYT, Universidad Autónoma de Tlaxcala, Carretera Tlaxcala-Puebla Km. 1.5, C.P. 90062, Tlaxcala de Xicohténcatl, Tlaxcala, México
- Kevin I Medina-Bello
  - Posgrado en Ciencias Biológicas, Centro Tlaxcala de Biología de la Conducta, Universidad Autónoma de Tlaxcala, Carretera Tlaxcala-Puebla Km. 1.5, C.P. 90062, Tlaxcala de Xicohténcatl, Tlaxcala, México
2
Doll Y, Koga H, Tsukaya H. Beyond stomatal development: SMF transcription factors as versatile toolkits for land plant evolution. Quantitative Plant Biology 2024; 5:e6. [PMID: 39220371 PMCID: PMC11363000 DOI: 10.1017/qpb.2024.6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 04/18/2024] [Accepted: 04/30/2024] [Indexed: 09/04/2024]
Abstract
As master transcription factors of stomatal development, SPEECHLESS, MUTE, and FAMA, collectively termed SMFs, are primary targets of molecular genetic analyses in the model plant Arabidopsis thaliana. Studies in other model systems identified SMF orthologs as key players in evolutionary developmental biology studies on stomata. However, recent studies on the astomatous liverwort Marchantia polymorpha revealed that the functions of these genes are not limited to stomatal development, but extend to other types of tissues, namely sporophytic setal and gametophytic epidermal tissues. These studies provide insightful examples of gene-regulatory network co-opting, and highlight SMFs and related transcription factors as general toolkits for novel trait evolution in land plant lineages. Here, we critically review recent literature on the SMF-like gene in M. polymorpha and discuss their implications for plant evolutionary biology.
Affiliation(s)
- Yuki Doll
  - Division of Biological Sciences, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
- Hiroyuki Koga
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- Hirokazu Tsukaya
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
3
Pouresmaeil M, Azizi-Dargahlou S. Investigation of CaMV-host co-evolution through synonymous codon pattern. J Basic Microbiol 2024; 64:e2300664. [PMID: 38436477 DOI: 10.1002/jobm.202300664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 01/20/2024] [Accepted: 02/10/2024] [Indexed: 03/05/2024]
Abstract
Cauliflower mosaic virus (CaMV) has a double-stranded DNA genome and is globally distributed. The phylogenetic tree of 121 CaMV isolates was categorized into two primary groups, with Iranian isolates showing the greatest genetic variation. Nucleotide A showed the highest percentage (36.95%) in the CaMV genome, and the dinucleotide odds ratio analysis revealed that the TC dinucleotide (1.34 ≥ 1.23) and the CG dinucleotide (0.63 ≤ 0.78) are overrepresented and underrepresented, respectively. Relative synonymous codon usage (RSCU) analysis confirmed codon usage bias in CaMV and its hosts. Brassica oleracea and Brassica rapa, among the susceptible hosts of CaMV, showed a codon adaptation index (CAI) value above 0.8. Additionally, relative codon deoptimization index (RCDI) results exhibited the highest degree of deoptimization in Raphanus sativus. These findings suggest that the genes of CaMV underwent codon adaptation with its hosts. Among the CaMV open reading frames (ORFs), genes that produce reverse transcriptase and virus coat proteins showed the highest CAI value of 0.83. These genes are crucial for the creation of new virion particles. The results confirm that CaMV co-evolved with its host to ensure the optimal expression of its genes in the hosts, allowing for easy infection and effective spread. To detect the force behind codon usage bias, an effective number of codons (ENC) plot and a neutrality plot were constructed. The results indicated that natural selection is the primary factor influencing CaMV codon usage bias.
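The dinucleotide odds ratio mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code assumes the common single-strand definition, rho_xy = f_xy / (f_x * f_y), and reuses the 1.23 and 0.78 cut-offs quoted above. The function names and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Cut-offs quoted in the abstract for over- and under-representation.
OVER_THRESHOLD = 1.23
UNDER_THRESHOLD = 0.78


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for one DNA strand.

    Assumption: overlapping dinucleotides counted on a single strand; the
    paper's exact definition may differ (e.g. a strand-symmetrized variant).
    """
    seq = seq.upper()
    base_counts = Counter(seq)
    pair_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_bases = len(seq)
    n_pairs = len(seq) - 1
    ratios = {}
    for pair, count in pair_counts.items():
        f_xy = count / n_pairs                 # observed dinucleotide frequency
        f_x = base_counts[pair[0]] / n_bases   # frequency of the first base
        f_y = base_counts[pair[1]] / n_bases   # frequency of the second base
        ratios[pair] = f_xy / (f_x * f_y)
    return ratios


def classify(ratio):
    """Label a ratio using the thresholds quoted in the abstract."""
    if ratio >= OVER_THRESHOLD:
        return "over"
    if ratio <= UNDER_THRESHOLD:
        return "under"
    return "neutral"


# Hypothetical toy sequence, for illustration only (not CaMV data).
toy_sequence = "ATATATCGAT"
for pair, rho in sorted(dinucleotide_odds_ratios(toy_sequence).items()):
    print(pair, round(rho, 2), classify(rho))
```

On a real genome the same function would be applied to the full sequence; very short inputs such as the toy string give noisy ratios.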
Affiliation(s)
- Mahin Pouresmaeil
  - Faculty of Agriculture and Natural Resources, University of Mohaghegh Ardabili, Ardabil, Iran
- Shahnam Azizi-Dargahlou
  - Agricultural Biotechnology, Seed and Plant Certification and Registration Institute, Ardabil Agricultural and Natural Resources Research Center, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
4
Maza-Márquez P, Lee MD, Bebout BM. Community ecology and functional potential of bacteria, archaea, eukarya and viruses in Guerrero Negro microbial mat. Sci Rep 2024; 14:2561. [PMID: 38297006 PMCID: PMC10831059 DOI: 10.1038/s41598-024-52626-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Accepted: 01/22/2024] [Indexed: 02/02/2024] Open
Abstract
In this study, the microbial ecology, potential environmental adaptive mechanisms, and the potential evolutionary interlinking of genes between bacterial, archaeal and viral lineages in Guerrero Negro (GN) microbial mat were investigated using metagenomic sequencing across a vertical transect at millimeter scale. The community composition based on unique genes comprised bacteria (98.01%), archaea (1.81%), eukarya (0.07%) and viruses (0.11%). A gene-focused analysis of bacteria, archaea, eukarya and viruses showed a vertical partition of the community. The greatest coverages of genes of bacteria and eukarya were detected in the first layers, while the highest coverages of genes of archaea and viruses were found in deeper layers. Many genes potentially related to adaptation to the local environment were detected, including genes associated with UV radiation, multidrug resistance, oxidative stress, heavy metals, salinity and desiccation. Those genes were found in bacterial, archaeal and viral lineages with 6477, 44, and 1 genes, respectively. The evolutionary histories of those genes were studied using phylogenetic analysis, showing an interlinking between domains in GN mat.
Affiliation(s)
- P Maza-Márquez
  - Exobiology Branch, NASA Ames Research Center, Moffett Field, CA, USA
  - University of Granada, Granada, Spain
- M D Lee
  - Exobiology Branch, NASA Ames Research Center, Moffett Field, CA, USA
  - Blue Marble Space Institute of Science, Seattle, WA, USA
- B M Bebout
  - Exobiology Branch, NASA Ames Research Center, Moffett Field, CA, USA
5
Garger D, Meinel M, Dietl T, Hillig C, Garzorz‐Stark N, Eyerich K, de Angelis MH, Eyerich S, Menden MP. The impact of the cardiovascular component and somatic mutations on ageing. Aging Cell 2023; 22:e13957. [PMID: 37608601 PMCID: PMC10577550 DOI: 10.1111/acel.13957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 06/27/2023] [Accepted: 07/20/2023] [Indexed: 08/24/2023] Open
Abstract
Mechanistic insight into ageing may empower prolonging the lifespan of humans; however, a complete understanding of this process is still lacking despite a plethora of ageing theories. In order to address this, we investigated the association of lifespan with eight phenotypic traits, that is, litter size, body mass, female and male sexual maturity, somatic mutation, heart, respiratory, and metabolic rate. In support of the somatic mutation theory, we analysed 15 mammalian species and their whole-genome sequencing deriving somatic mutation rate, which displayed the strongest negative correlation with lifespan. All remaining phenotypic traits showed almost equivalent strong associations across this mammalian cohort; however, resting heart rate explained additional variance in lifespan. Integrating somatic mutation and resting heart rate boosted the prediction of lifespan, thus highlighting that resting heart rate may either directly influence lifespan, or represent an epiphenomenon for additional lower-level mechanisms, for example, metabolic rate, that are associated with lifespan.
Affiliation(s)
- Daniel Garger
  - Computational Health Center, Helmholtz Munich, Neuherberg, Germany
  - Faculty of Biology, Ludwig Maximilian University, Martinsried, Germany
- Martin Meinel
  - Computational Health Center, Helmholtz Munich, Neuherberg, Germany
  - Faculty of Biology, Ludwig Maximilian University, Martinsried, Germany
  - Department of Dermatology and Allergy, Technical University of Munich, Munich, Germany
- Tamina Dietl
  - Computational Health Center, Helmholtz Munich, Neuherberg, Germany
  - Faculty of Biology, Ludwig Maximilian University, Martinsried, Germany
- Christina Hillig
  - Computational Health Center, Helmholtz Munich, Neuherberg, Germany
  - Department of Mathematics, Technical University of Munich, Munich, Germany
- Natalie Garzorz‐Stark
  - Department of Dermatology and Allergy, Technical University of Munich, Munich, Germany
  - Division of Dermatology and Venereology, Department of Medicine Solna, and Center for molecular medicine, Karolinska Institutet, Stockholm, Sweden
- Kilian Eyerich
  - Division of Dermatology and Venereology, Department of Medicine Solna, and Center for molecular medicine, Karolinska Institutet, Stockholm, Sweden
  - Department of Dermatology and Venerology, Medical School, University of Freiburg, Freiburg, Germany
- Martin Hrabě de Angelis
  - Institute of Experimental Genetics, Helmholtz Munich, Neuherberg, Germany
  - Chair of Experimental Genetics, TUM School of Life Sciences, Technical University Munich, Freising, Germany
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
- Stefanie Eyerich
  - Center for Allergy and Environment (ZAUM), Technical University Munich, Munich, Germany
  - Institute for Allergy Research, Helmholtz Munich, Neuherberg, Germany
- Michael P. Menden
  - Computational Health Center, Helmholtz Munich, Neuherberg, Germany
  - Faculty of Biology, Ludwig Maximilian University, Martinsried, Germany
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
  - Department of Biochemistry and Pharmacology, University of Melbourne, Parkville, Victoria, Australia
6
Aizenbud Y, Jaffe A, Wang M, Hu A, Amsel N, Nadler B, Chang JT, Kluger Y. Spectral top-down recovery of latent tree models. Information and Inference: A Journal of the IMA 2023; 12:iaad032. [PMID: 37593361 PMCID: PMC10431953 DOI: 10.1093/imaiai/iaad032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Revised: 03/24/2023] [Accepted: 06/24/2023] [Indexed: 08/19/2023]
Abstract
Modeling the distribution of high-dimensional data by a latent tree graphical model is a prevalent approach in multiple scientific domains. A common task is to infer the underlying tree structure, given only observations of its terminal nodes. Many algorithms for tree recovery are computationally intensive, which limits their applicability to trees of moderate size. For large trees, a common approach, termed divide-and-conquer, is to recover the tree structure in two steps. First, separately recover the structure of multiple, possibly random subsets of the terminal nodes. Second, merge the resulting subtrees to form a full tree. Here, we develop spectral top-down recovery (STDR), a deterministic divide-and-conquer approach to infer large latent tree models. Unlike previous methods, STDR partitions the terminal nodes in a non-random way, based on the Fiedler vector of a suitable Laplacian matrix related to the observed nodes. We prove that under certain conditions, this partitioning is consistent with the tree structure. This, in turn, leads to a significantly simpler merging procedure of the small subtrees. We prove that STDR is statistically consistent and bound the number of samples required to accurately recover the tree with high probability. Using simulated data from several common tree models in phylogenetics, we demonstrate that STDR has a significant advantage in terms of runtime, with improved or similar accuracy.
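The partitioning idea in this abstract, splitting terminal nodes by the sign of a Fiedler vector, can be sketched in a few lines. This is an illustration only and not the STDR implementation: it assumes a plain unnormalized graph Laplacian L = D - W built from a hypothetical similarity matrix, whereas the paper uses its own "suitable Laplacian", and it approximates the Fiedler vector with shifted power iteration in pure Python. The function name and the toy matrix are assumptions.

```python
def fiedler_partition(weights, iterations=500):
    """Split node indices in two by the sign of an approximate Fiedler vector.

    weights: symmetric, non-negative similarity matrix (list of lists) of a
    connected graph. Assumption: unnormalized Laplacian L = D - W.
    """
    n = len(weights)
    degrees = [sum(row) for row in weights]
    laplacian = [
        [(degrees[i] if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]
    # Eigenvalues of L lie in [0, 2 * max degree], so the small eigenvalues
    # of L become the dominant eigenvalues of (shift * I - L).
    shift = 2.0 * max(degrees)
    # Deterministic, zero-mean starting vector.
    vec = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iterations):
        new = [
            shift * vec[i] - sum(laplacian[i][j] * vec[j] for j in range(n))
            for i in range(n)
        ]
        # Project out the constant vector (eigenvalue 0 of L), then normalize.
        mean = sum(new) / n
        new = [x - mean for x in new]
        norm = sum(x * x for x in new) ** 0.5
        vec = [x / norm for x in new]
    left = [i for i in range(n) if vec[i] < 0]
    right = [i for i in range(n) if vec[i] >= 0]
    return left, right


# Hypothetical toy similarities: two tight groups {0,1,2} and {3,4,5}
# joined by weak cross-links.
toy_weights = [
    [0.0 if i == j else (1.0 if (i < 3) == (j < 3) else 0.01) for j in range(6)]
    for i in range(6)
]
print(fiedler_partition(toy_weights))
```

In practice a linear-algebra library would compute the eigenvector directly; power iteration is used here only to keep the sketch dependency-free.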
Affiliation(s)
- Yariv Aizenbud
  - Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA
- Ariel Jaffe
  - Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA
- Meng Wang
  - Department of Pathology, Yale University, New Haven, CT 06511, USA
- Amber Hu
  - Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA
- Noah Amsel
  - Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA
- Boaz Nadler
  - Department of Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel
- Joseph T Chang
  - Department of Statistics, Yale University, New Haven, CT 06520, USA
- Yuval Kluger
  - Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA
  - Department of Pathology, Yale University, New Haven, CT 06511, USA
  - Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA
7
Aldakhil T, Alshammari SO, Siraj B, El-Aarag B, Zarina S, Salehi D, Ahmed A. The structural characterization and bioactivity assessment of nonspecific lipid transfer protein 1 (nsLTP1) from caraway (Carum carvi) seeds. BMC Complement Med Ther 2023; 23:254. [PMID: 37474939 PMCID: PMC10357877 DOI: 10.1186/s12906-023-04083-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 07/11/2023] [Indexed: 07/22/2023] Open
Abstract
BACKGROUND: Carum carvi (caraway) of the Apiaceae family has been used in many cultures as a cooking spice and part of the folk medicine. Previous reports primarily focus on the medicinal properties of caraway seed essential oil and the whole seeds extract. However, no effort has been made to study caraway proteins and their potential pharmacological properties, including nonspecific lipid transfer protein (nsLTP), necessitating further research. The current study aimed to characterize nonspecific lipid transfer protein 1 (nsLTP1) from caraway seed, determine its three-dimensional structure, and analyze protein-ligand complex interactions through docking studies. We also evaluated nsLTP1 in vitro cytotoxic effect and antioxidant capacity. Additionally, nsLTP1 thermal and pH stability were investigated.
METHODS: Caraway nsLTP1 was purified using two-dimensional chromatography. The complete amino acid sequence of nsLTP1 was achieved by intact protein sequence for the first 20 residues and the overlapping digested peptides. The three-dimensional structure was predicted using MODELLER. Autodock Vina software was employed for docking fatty acids against caraway nsLTP1. Assessment of nsLTP1 cytotoxic activity was achieved by MTS assay, and the Trolox equivalent antioxidant capacity (TAC) was determined. Thermal and pH stability of the nsLTP1 was examined by circular dichroism (CD) spectroscopy.
RESULTS: Caraway nsLTP1 is composed of 91 residues and weighs 9652 Da. The three-dimensional structure of caraway nsLTP1 sequence was constructed based on searching known structures in the PDB. We chose nsLTP of Solanum melongena (PDB ID: 5TVI) as the modeling template with the highest identity among all other homologous proteins. Docking linolenic acid with caraway protein showed a maximum binding score of -3.6 kcal/mol. A preliminary screening of caraway nsLTP1 suppressed the proliferation of human breast cancer cell lines MDA-MB-231 and MCF-7 in a dose-dependent manner with an IC50 value of 52.93 and 44.76 μM, respectively. Also, nsLTP1 (41.4 μM) showed TAC up to 750.4 μM Trolox equivalent. Assessment of nsLTP1 demonstrated high thermal/pH stability.
CONCLUSION: To the best of our knowledge, this is the first study carried out on nsLTP1 from caraway seeds. We hereby report the sequence of nsLTP1 from caraway seeds and its possible interaction with respective fatty acids using in silico approach. Our data indicated that the protein had anticancer and antioxidant activities and was thermally stable.
Affiliation(s)
- Taibah Aldakhil
  - Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA, 92618, USA
  - Department of Pharmaceutical Chemistry, College of Pharmacy, Prince Sattam Bin Abdulaziz University, Al-Kharj, 16278, Saudi Arabia
- Saud O Alshammari
  - Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA, 92618, USA
  - Department of Plant Chemistry and Natural Products, Faculty of Pharmacy, Northern Border University, Arar, 91431, Saudi Arabia
- Bushra Siraj
  - Dr. Zafar H. Zaidi Center for Proteomics, University of Karachi, Karachi, Pakistan
- Bishoy El-Aarag
  - Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA, 92618, USA
  - Biochemistry Division, Chemistry Department, Faculty of Science, Menoufia University, Shebin El-Koom, 32512, Egypt
- Shamshad Zarina
  - Dr. Zafar H. Zaidi Center for Proteomics, University of Karachi, Karachi, Pakistan
- David Salehi
  - Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA, 92618, USA
- Aftab Ahmed
  - Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA, 92618, USA
8
Guarracino A, Buonaiuto S, de Lima LG, Potapova T, Rhie A, Koren S, Rubinstein B, Fischer C, Gerton JL, Phillippy AM, Colonna V, Garrison E. Recombination between heterologous human acrocentric chromosomes. Nature 2023; 617:335-343. [PMID: 37165241 PMCID: PMC10172130 DOI: 10.1038/s41586-023-05976-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 08/15/2022] [Accepted: 03/17/2023] [Indexed: 05/12/2023]
Abstract
The short arms of the human acrocentric chromosomes 13, 14, 15, 21 and 22 (SAACs) share large homologous regions, including ribosomal DNA repeats and extended segmental duplications [1,2]. Although the resolution of these regions in the first complete assembly of a human genome, the Telomere-to-Telomere Consortium's CHM13 assembly (T2T-CHM13), provided a model of their homology [3], it remained unclear whether these patterns were ancestral or maintained by ongoing recombination exchange. Here we show that acrocentric chromosomes contain pseudo-homologous regions (PHRs) indicative of recombination between non-homologous sequences. Utilizing an all-to-all comparison of the human pangenome from the Human Pangenome Reference Consortium [4] (HPRC), we find that contigs from all of the SAACs form a community. A variation graph [5] constructed from centromere-spanning acrocentric contigs indicates the presence of regions in which most contigs appear nearly identical between heterologous acrocentric chromosomes in T2T-CHM13. Except on chromosome 15, we observe faster decay of linkage disequilibrium in the pseudo-homologous regions than in the corresponding short and long arms, indicating higher rates of recombination [6,7]. The pseudo-homologous regions include sequences that have previously been shown to lie at the breakpoint of Robertsonian translocations [8], and their arrangement is compatible with crossover in inverted duplications on chromosomes 13, 14 and 21. The ubiquity of signals of recombination between heterologous acrocentric chromosomes seen in the HPRC draft pangenome suggests that these shared sequences form the basis for recurrent Robertsonian translocations, providing sequence and population-based confirmation of hypotheses first developed from cytogenetic studies 50 years ago [9].
Affiliation(s)
- Andrea Guarracino
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
  - Genomics Research Centre, Human Technopole, Milan, Italy
- Silvia Buonaiuto
  - Institute of Genetics and Biophysics, National Research Council, Naples, Italy
- Tamara Potapova
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Christian Fischer
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Vincenza Colonna
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
  - Institute of Genetics and Biophysics, National Research Council, Naples, Italy
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
9
Jacques F, Bolivar P, Pietras K, Hammarlund EU. Roadmap to the study of gene and protein phylogeny and evolution - A practical guide. PLoS One 2023; 18:e0279597. [PMID: 36827278 PMCID: PMC9955684 DOI: 10.1371/journal.pone.0279597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2022] [Accepted: 12/12/2022] [Indexed: 02/25/2023] Open
Abstract
Developments in sequencing technologies and the sequencing of an ever-increasing number of genomes have revolutionised studies of biodiversity and organismal evolution. This accumulation of data has been paralleled by the creation of numerous public biological databases through which the scientific community can mine the sequences and annotations of genomes, transcriptomes, and proteomes of multiple species. However, finding the appropriate databases and bioinformatic tools for a given inquiry or aim can be challenging. Here, we present a compilation of DNA and protein databases, as well as bioinformatic tools for phylogenetic reconstruction and a wide range of studies on molecular evolution. We provide a protocol for information extraction from biological databases and simple phylogenetic reconstruction using probabilistic and distance methods, facilitating the study of biodiversity and evolution at the molecular level for the broad scientific community.
Affiliation(s)
- Florian Jacques
  - Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
  - Lund Stem Cell Center, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Paulina Bolivar
  - Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Kristian Pietras
  - Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Emma U. Hammarlund
  - Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
  - Lund Stem Cell Center, Department of Laboratory Medicine, Lund University, Lund, Sweden
10
Abstract
Common culturing techniques and priorities bias our discovery towards specific traits that may not be representative of microbial diversity in nature. So far, these biases have not been systematically examined. To address this gap, here we use 116,884 publicly available metagenome-assembled genomes (MAGs, completeness ≥80%) from 203 surveys worldwide as a culture-independent sample of bacterial and archaeal diversity, and compare these MAGs to the popular RefSeq genome database, which heavily relies on cultures. We compare the distribution of 12,454 KEGG gene orthologs (used as trait proxies) in the MAGs and RefSeq genomes, while controlling for environment type (ocean, soil, lake, bioreactor, human, and other animals). Using statistical modeling, we then determine the conditional probabilities that a species is represented in RefSeq depending on its genetic repertoire. We find that the majority of examined genes are significantly biased for or against in RefSeq. Our systematic estimates of gene prevalences across bacteria and archaea in nature and gene-specific biases in reference genomes constitute a resource for addressing these issues in the future.
Affiliation(s)
- Sage Albright
  - Department of Biology, University of Oregon, Eugene, USA
- Stilianos Louca
  - Department of Biology, University of Oregon, Eugene, USA
  - Institute of Ecology and Evolution, University of Oregon, Eugene, USA
11
Orlandi KN, Phillips SR, Sailer ZR, Harman JL, Harms MJ. Topiary: Pruning the manual labor from ancestral sequence reconstruction. Protein Sci 2023; 32:e4551. [PMID: 36565302 PMCID: PMC9847077 DOI: 10.1002/pro.4551] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/22/2022] [Revised: 12/14/2022] [Accepted: 12/17/2022] [Indexed: 12/25/2022]
Abstract
Ancestral sequence reconstruction (ASR) is a powerful tool to study the evolution of proteins and thus gain deep insight into the relationships among protein sequence, structure, and function. A major barrier to its broad use is the complexity of the task: it requires multiple software packages, complex file manipulations, and expert phylogenetic knowledge. Here we introduce topiary, a software pipeline that aims to overcome this barrier. To use topiary, users prepare a spreadsheet with a handful of sequences. Topiary then: (1) Infers the taxonomic scope for the ASR study and finds relevant sequences by BLAST; (2) Does taxonomically informed sequence quality control and redundancy reduction; (3) Constructs a multiple sequence alignment; (4) Generates a maximum-likelihood gene tree; (5) Reconciles the gene tree to the species tree; (6) Reconstructs ancestral amino acid sequences; and (7) Determines branch supports. The pipeline returns annotated evolutionary trees, spreadsheets with sequences, and graphical summaries of ancestor quality. This is achieved by integrating modern phylogenetics software (Muscle5, RAxML-NG, GeneRax, and PastML) with online databases (NCBI and the Open Tree of Life). In this paper, we introduce non-expert readers to the steps required for ASR, describe the specific design choices made in topiary, provide a detailed protocol for users, and then validate the pipeline using datasets from a broad collection of protein families. Topiary is freely available for download: https://github.com/harmslab/topiary.
Affiliation(s)
- Kona N. Orlandi
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Biology, University of Oregon, Eugene, Oregon, USA
- Sophia R. Phillips
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
- Zachary R. Sailer
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
- Joseph L. Harman
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
- Michael J. Harms
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
12
Legall N, Salvador LCM. Selective sweep sites and SNP dense regions differentiate Mycobacterium bovis isolates across scales. Front Microbiol 2022; 13:787856. [PMID: 36160199 PMCID: PMC9489834 DOI: 10.3389/fmicb.2022.787856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Accepted: 08/08/2022] [Indexed: 11/28/2022] Open
Abstract
Mycobacterium bovis, a bacterial zoonotic pathogen responsible for the economically and agriculturally important livestock disease bovine tuberculosis (bTB), infects a broad mammalian host range worldwide. This characteristic has led to bidirectional transmission events between livestock and wildlife species as well as the formation of wildlife reservoirs, impacting the success of bTB control measures. Next Generation Sequencing (NGS) has transformed our ability to understand disease transmission events by tracking variant sites; however, the genomic signatures related to host adaptation following spillover, alongside the role of other genomic factors in the M. bovis transmission process, are understudied problems. We analyzed publicly available M. bovis datasets collected from 700 hosts across three countries with bTB endemic regions (United Kingdom, United States, and New Zealand) to investigate whether genomic regions with high SNP density and/or selective sweep sites play a role in M. bovis adaptation to new environments (e.g., at the host-species, geographical, and/or sub-population levels). A simulated M. bovis alignment was created to generate null distributions for defining genomic regions with high SNP counts and regions with evidence of selective sweeps. Random Forest (RF) models were used to investigate evolutionary metrics within the genomic regions of interest to determine which genomic processes were the best for classifying M. bovis across ecological scales. We identified 14 high SNP density regions and 132 selective sweep regions in the M. bovis genomes. Selective sweep regions were ranked as the most important in classifying M. bovis across the different scales in all RF models. SNP dense regions were found to have high importance in the badger- and cattle-specific RF models for distinguishing badger-derived isolates from livestock-derived ones. Additionally, the genes detected within these genomic regions harbor various pathogenic functions such as virulence and immunogenicity, membrane structure, host survival, and mycobactin production. The results of this study demonstrate how comparative genomics alongside machine learning approaches are useful for further investigating the nature of M. bovis host-pathogen interactions.
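The idea of defining SNP-dense regions against a simulated null distribution can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the fixed window size, the 0.99 quantile cutoff, and the function names `window_counts` and `dense_windows` are all assumptions made for illustration, and positions are taken to be 0-based.

```python
from typing import List, Sequence


def window_counts(snp_positions: Sequence[int], genome_length: int, window: int) -> List[int]:
    """Count SNPs in consecutive fixed-size windows (0-based positions)."""
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in snp_positions:
        if not 0 <= pos < genome_length:
            raise ValueError(f"position {pos} is outside the genome")
        counts[pos // window] += 1
    return counts


def dense_windows(observed: Sequence[int], simulated: Sequence[int],
                  genome_length: int, window: int = 1000,
                  quantile: float = 0.99) -> List[int]:
    """Return indices of windows whose observed SNP count exceeds the chosen
    quantile of per-window counts in the simulated (null) data.
    The window size and quantile are illustrative assumptions."""
    null = sorted(window_counts(simulated, genome_length, window))
    idx = min(len(null) - 1, int(quantile * len(null)))
    threshold = null[idx]
    obs = window_counts(observed, genome_length, window)
    return [i for i, count in enumerate(obs) if count > threshold]


# Toy data (hypothetical): a 50-bp "genome" split into 10-bp windows.
simulated_snps = [1, 12, 23, 34, 45]   # one SNP per window under the null
observed_snps = [0, 1, 2, 15, 40]      # three SNPs fall in the first window
flagged = dense_windows(observed_snps, simulated_snps, genome_length=50, window=10)
```

In this toy case only the first window exceeds the null threshold. A real analysis would draw the null from many simulated alignments rather than a single list of positions.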
Affiliation(s)
- Noah Legall
  - Interdisciplinary Disease Ecology Across Scales Research Traineeship Program, University of Georgia, Athens, GA, United States
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States
  - Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, United States
- Liliana C. M. Salvador
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States
  - Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, United States
  - Department of Infectious Diseases, College of Veterinary Medicine, University of Georgia, Athens, GA, United States
|
13
|
Tabatabaee Y, Sarker K, Warnow T. Quintet Rooting: rooting species trees under the multi-species coalescent model. Bioinformatics 2022; 38:i109-i117. [PMID: 35758805 PMCID: PMC9236578 DOI: 10.1093/bioinformatics/btac224] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022]
Abstract
Motivation: Rooted species trees are a basic model with multiple applications throughout biology, including understanding adaptation, biodiversity, phylogeography and co-evolution. Because most species tree estimation methods produce unrooted trees, methods for rooting these trees have been developed. However, most rooting methods either rely on prior biological knowledge or assume that evolution is close to clock-like, which is not usually the case. Furthermore, most prior rooting methods do not account for biological processes that create discordance between gene trees and species trees. Results: We present Quintet Rooting (QR), a method for rooting species trees based on a proof of identifiability of the rooted species tree under the multi-species coalescent model established by Allman, Degnan and Rhodes (J. Math. Biol., 2011). We show that QR is generally more accurate than other rooting methods, except under extreme levels of gene tree estimation error. Availability and implementation: Quintet Rooting is available in open source form at https://github.com/ytabatabaee/Quintet-Rooting. The simulated datasets used in this study are from a prior study and are available at https://www.ideals.illinois.edu/handle/2142/55319. The biological dataset used in this study is also from a prior study and is available at http://gigadb.org/dataset/101041. Contact: warnow@illinois.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yasamin Tabatabaee
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Kowshika Sarker
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
14
|
Hornok S, Szekeres S, Horváth G, Takács N, Bekő K, Kontschán J, Gyuranecz M, Tóth B, Sándor AD, Juhász A, Beck R, Farkas R. Diversity of tick species and associated pathogens on peri-urban wild boars – first report of the zoonotic Babesia cf. crassa from Hungary. Ticks Tick Borne Dis 2022; 13:101936. [DOI: 10.1016/j.ttbdis.2022.101936] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/26/2021] [Revised: 02/23/2022] [Accepted: 03/05/2022] [Indexed: 10/18/2022]
|
15
|
Louca S. The rates of global bacterial and archaeal dispersal. The ISME Journal 2022; 16:159-167. [PMID: 34282284 PMCID: PMC8692594 DOI: 10.1038/s41396-021-01069-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 03/01/2021] [Revised: 06/28/2021] [Accepted: 07/12/2021] [Indexed: 02/07/2023]
Abstract
The phylogenetic resolution at which microorganisms display geographic endemism, the rates at which they disperse at global scales, and the role of humans on global microbial dispersal are largely unknown. Answering these questions is necessary for interpreting microbial biogeography, ecology, and macroevolution and for predicting the spread of emerging pathogenic strains. To resolve these questions, I analyzed the geographic and evolutionary relationships between 36,795 bacterial and archaeal ("prokaryotic") genomes from ∼7000 locations around the world. I find clear signs of continental-scale endemism, including strong correlations between phylogenetic divergence and geographic distance. However, the phylogenetic scale at which endemism generally occurs is extremely small, and most "species" (defined by an average nucleotide identity ≥ 95%) and even closely related strains (average nucleotide identity ≥ 99.9%) are globally distributed. Human-associated lineages display faster dispersal rates than other terrestrial lineages; the average net distance between any two human-associated cell lineages diverging 50 years ago is roughly 580 km. These results suggest that many previously reported global-scale microbial biogeographical patterns are likely the result of recent or current environmental filtering rather than geographic endemism. For human-associated lineages, estimated transition rates between Europe and North America are particularly high, and much higher than for non-human associated terrestrial lineages, highlighting the role that human movement plays in global microbial dispersal. Dispersal was slowest for hot spring- and terrestrial subsurface-associated lineages, indicating that these environments may act as "isolated islands" of microbial evolution.
Affiliation(s)
- Stilianos Louca
  - Department of Biology, University of Oregon, Eugene, OR, USA
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
|
16
|
Fournier GP, Parsons CW, Cutts EM, Tamre E. Standard Candles for Dating Microbial Lineages. Methods Mol Biol 2022; 2569:41-74. [PMID: 36083443 DOI: 10.1007/978-1-0716-2691-7_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Molecular clock analyses are challenging for microbial phylogenies, due to a lack of fossil calibrations that can reliably provide absolute time constraints. An alternative source of temporal constraints for microbial groups is provided by the inheritance of proteins that are specific for the utilization of eukaryote-derived substrates, which have often been dispersed across the Tree of Life via horizontal gene transfer. In particular, animal, algal, and plant-derived substrates are often produced by groups with more precisely known divergence times, providing an older-bound on their availability within microbial environments. Therefore, these ages can serve as "standard candles" for dating microbial groups across the Tree of Life, expanding the reach of informative molecular clock investigations. Here, we formally develop the concept of substrate standard candles and describe how they can be propagated and applied using both microbial species trees and individual gene family phylogenies. We also provide detailed evaluations of several candidate standard candles and discuss their suitability in light of their often complex evolutionary and metabolic histories.
Affiliation(s)
- Gregory P Fournier
  - Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Chris W Parsons
  - Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Elise M Cutts
  - Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Erik Tamre
  - Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
|
17
|
Bhattacharya S, Nuttall PA. Phylogenetic Analysis Indicates That Evasin-Like Proteins of Ixodid Ticks Fall Into Three Distinct Classes. Front Cell Infect Microbiol 2021; 11:769542. [PMID: 34746035 PMCID: PMC8569228 DOI: 10.3389/fcimb.2021.769542] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/02/2021] [Accepted: 09/24/2021] [Indexed: 12/22/2022]
Abstract
Chemokines are structurally related proteins that activate leucocyte migration in response to injury or infection. Tick saliva contains chemokine-binding proteins, or evasins, which likely neutralize host chemokine function and inflammation. Biochemical characterisation of 50 evasins from Ixodes, Amblyomma and Rhipicephalus shows that they fall into two functional classes, A and B, with exclusive binding to either CC- or CXC-chemokines, respectively. The class A evasins EVA1 and EVA4 have a four-disulfide-bonded core, whereas the class B evasin EVA3 has a three-disulfide-bonded “knottin” structure. All 29 class B evasins have six cysteine residues conserved with EVA3, the arrangement of which defines a Cys6-motif. Nineteen of 21 class A evasins have eight cysteine residues conserved with EVA1/EVA4, the arrangement of which defines a Cys8-motif. Two class A evasins from Ixodes (IRI01, IHO01) have fewer than eight cysteines. Many evasin-like proteins have been identified in tick salivary transcriptomes, but their phylogenetic relationship with respect to biochemically characterized evasins is not clear. Here, using BLAST searches of tick transcriptomes with biochemically characterized evasins, we identify 292 class A and 157 class B evasins and evasin-like proteins from Prostriate (Ixodes) and Metastriate (Amblyomma, Dermacentor, Hyalomma, Rhipicephalus) ticks. Phylogenetic analysis shows that class A evasins/evasin-like proteins segregate into two classes, A1 and A2. Class A1 members are exclusive to Metastriate ticks, typically have a Cys8-motif, and include EVA1 and EVA4. Class A2 members are exclusive to Prostriate ticks, lack the Cys8-motif, and include IHO01 and IRI01. Class B evasins/evasin-like proteins are present in both Prostriate and Metastriate lineages, typically have a Cys6-motif, and include EVA3. Most evasins/evasin-like proteins in Metastriate ticks belong to class A1, whereas in Prostriate species they are predominantly class B. In keeping with this, the majority of biochemically characterized Metastriate evasins bind CC-chemokines, whereas the majority of Prostriate evasins bind CXC-chemokines. While the origin of the structurally dissimilar classes A1 and A2 is yet unresolved, these results suggest that class B evasin-like proteins arose before the divergence of Prostriate and Metastriate lineages and likely functioned to neutralize CXC-chemokines and support blood feeding.
Affiliation(s)
- Shoumo Bhattacharya
  - Division of Cardiovascular Medicine, Radcliffe Department of Medicine, Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
  - National Institute for Health Research (NIHR) Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals National Health Service (NHS) Foundation Trust, Oxford, United Kingdom
|
18
|
Jacob Machado D, Scott R, Guirales S, Janies DA. Fundamental evolution of all Orthocoronavirinae including three deadly lineages descendent from Chiroptera-hosted coronaviruses: SARS-CoV, MERS-CoV and SARS-CoV-2. Cladistics 2021; 37:461-488. [PMID: 34570933 PMCID: PMC8239696 DOI: 10.1111/cla.12454] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Accepted: 02/24/2021] [Indexed: 12/14/2022]
Abstract
The severe acute respiratory syndrome coronavirus (SARS-CoV) emerged in humans in 2002. Despite reports showing Chiroptera as the original animal reservoir of SARS-CoV, many argue that Carnivora-hosted viruses are the most likely origin. The emergence of the Middle East respiratory syndrome coronavirus (MERS-CoV) in 2012 also involves Chiroptera-hosted lineages. However, factors such as the lack of comprehensive phylogenies hamper our understanding of host shifts once MERS-CoV emerged in humans and Artiodactyla. Since 2019, the origin of SARS-CoV-2, the causative agent of coronavirus disease 2019 (COVID-19), has added to this episodic history of zoonotic transmission events. Here we introduce a phylogenetic analysis of 2006 unique and complete genomes of different lineages of Orthocoronavirinae. We used gene annotations to align orthologous sequences for total evidence analysis under the parsimony optimality criterion. Deltacoronavirus and Gammacoronavirus were set as outgroups to understand spillovers of Alphacoronavirus and Betacoronavirus among ten orders of animals. We corroborated that Chiroptera-hosted viruses are the sister group of SARS-CoV, SARS-CoV-2 and MERS-related viruses. Other zoonotic events were qualified and quantified to provide a comprehensive picture of the risk of coronavirus emergence among humans. Finally, we used a dataset of 250 SARS-CoV-2 genomes to elucidate the phylogenetic relationship between SARS-CoV-2 and Chiroptera-hosted coronaviruses.
Affiliation(s)
- Denis Jacob Machado
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9331 Robert D. Snyder Rd, Charlotte, NC 28223, USA
- Rachel Scott
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9331 Robert D. Snyder Rd, Charlotte, NC 28223, USA
- Sayal Guirales
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9331 Robert D. Snyder Rd, Charlotte, NC 28223, USA
- Daniel A. Janies
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9331 Robert D. Snyder Rd, Charlotte, NC 28223, USA
|
19
|
Evolutionary and functional analysis of an NRPS condensation domain integrates β-lactam, ᴅ-amino acid, and dehydroamino acid synthesis. Proc Natl Acad Sci U S A 2021; 118:2026017118. [PMID: 33893237 DOI: 10.1073/pnas.2026017118] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 11/18/2022]
Abstract
Nonribosomal peptide synthetases (NRPSs) are large, multidomain biosynthetic enzymes involved in the assembly-line-like synthesis of numerous peptide natural products. Among these are clinically useful antibiotics including three classes of β-lactams: the penicillins/cephalosporins, the monobactams, and the monocyclic nocardicins, as well as the vancomycin family of glycopeptides and the depsipeptide daptomycin. During NRPS synthesis, peptide bond formation is catalyzed by condensation (C) domains, which couple the nascent peptide with the next programmed amino acid of the sequence. A growing number of additional functions are linked to the activity of C domains. In the biosynthesis of the nocardicins, a specialized C domain prepares the embedded β-lactam ring from a serine residue. Here, we examine the evolutionary descent of this unique β-lactam-synthesizing C domain. Guided by its ancestry, we predict and demonstrate in vitro that this C domain alternatively performs peptide bond formation when a single stereochemical change is introduced into its peptide starting material. Remarkably, the function of the downstream thioesterase (TE) domain also changes. Natively, the TE directs C terminus epimerization prior to hydrolysis when the β-lactam is made but catalyzes immediate release of the alternative peptide. In addition, we investigate the roles of C-domain histidine residues in light of clade-specific sequence motifs, refining earlier mechanistic proposals of both β-lactam formation and canonical peptide synthesis. Finally, expanded phylogenetic analysis reveals unifying connections between β-lactam synthesis and allied C domains associated with the appearance of ᴅ-amino acid and dehydroamino acid residues in other NRPS-derived natural products.
|
20
|
Krushkal J, Negi S, Yee LM, Evans JR, Grkovic T, Palmisano A, Fang J, Sankaran H, McShane LM, Zhao Y, O'Keefe BR. Molecular genomic features associated with in vitro response of the NCI-60 cancer cell line panel to natural products. Mol Oncol 2021; 15:381-406. [PMID: 33169510 PMCID: PMC7858122 DOI: 10.1002/1878-0261.12849] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/02/2020] [Revised: 09/29/2020] [Accepted: 11/06/2020] [Indexed: 12/17/2022]
Abstract
Natural products remain a significant source of anticancer chemotherapeutics. The search for targeted drugs for cancer treatment includes consideration of natural products, which may provide new opportunities for antitumor cytotoxicity as single agents or in combination therapy. We examined the association of molecular genomic features in the well-characterized NCI-60 cancer cell line panel with in vitro response to treatment with 1302 small molecules which included natural products, semisynthetic natural product derivatives, and synthetic compounds based on a natural product pharmacophore from the Developmental Therapeutics Program of the US National Cancer Institute's database. These compounds were obtained from a variety of plant, marine, and microbial species. Molecular information utilized for the analysis included expression measures for 23059 annotated transcripts, lncRNAs, and miRNAs, and data on protein-changing single nucleotide variants in 211 cancer-related genes. We found associations of expression of multiple genes including SLFN11, CYP2J2, EPHX1, GPC1, ELF3, and MGMT involved in DNA damage repair, NOTCH family members, ABC and SLC transporters, and both mutations in tyrosine kinases and BRAF V600E with NCI-60 responses to specific categories of natural products. Hierarchical clustering identified groups of natural products, which correlated with a specific mechanism of action. Specifically, several natural product clusters were associated with SLFN11 gene expression, suggesting that potential action of these compounds may involve DNA damage. The associations between gene expression or genome alterations of functionally relevant genes with the response of cancer cells to natural products provide new information about potential mechanisms of action of these identified clusters of compounds with potentially similar biological effects. This information will assist in future drug discovery and in design of new targeted cancer chemotherapy agents.
Affiliation(s)
- Julia Krushkal
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, Rockville, MD, USA
- Simarjeet Negi
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, Rockville, MD, USA
- Laura M. Yee
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, Rockville, MD, USA
- Jason R. Evans
  - Natural Products Branch, Developmental Therapeutics Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, Frederick, MD, USA
- Tanja Grkovic
  - Natural Products Support Group, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Alida Palmisano
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, Rockville, MD, USA
  - General Dynamics Information Technology (GDIT), Falls Church, VA, USA
- Jianwen Fang
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, Rockville, MD, USA
- Hari Sankaran
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, Rockville, MD, USA
- Lisa M. McShane
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, Rockville, MD, USA
- Yingdong Zhao
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, Rockville, MD, USA
- Barry R. O'Keefe
  - Natural Products Branch, Developmental Therapeutics Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, Frederick, MD, USA
  - Molecular Targets Program, Center for Cancer Research, National Cancer Institute, Frederick, MD, USA
|
21
|
de Bernadi Schneider A, Jacob Machado D, Guirales S, Janies DA. FLAVi: An Enhanced Annotator for Viral Genomes of Flaviviridae. Viruses 2020; 12:E892. [PMID: 32824044 PMCID: PMC7472247 DOI: 10.3390/v12080892] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/07/2020] [Revised: 08/11/2020] [Accepted: 08/11/2020] [Indexed: 12/15/2022]
Abstract
Responding to the ongoing and severe public health threat of viruses of the family Flaviviridae, including dengue, hepatitis C, West Nile, yellow fever, and Zika, demands a greater understanding of how these viruses emerge and spread. Updated phylogenies are central to this understanding. Most cladograms of Flaviviridae focus on specific lineages and ignore outgroups, hampering the efficacy of the analysis to test ingroup monophyly and relationships. This is due to the lack of annotated Flaviviridae genomes, whose gene content varies among genera. This variation makes analysis without partitioning difficult. Therefore, we developed an annotation pipeline for the genera of Flaviviridae (Flavivirus, Hepacivirus, Pegivirus, and Pestivirus), named "Fast Loci Annotation of Viruses" (FLAVi; http://flavi-web.com/), that combines ab initio and homology-based strategies. FLAVi recovered 100% of the genes in Flavivirus and Hepacivirus genomes. In Pegivirus and Pestivirus, annotation efficiency was 100% except for one partition each. There were no false positives. The combined phylogenetic analysis of multiple genes made possible by annotation has clear impacts on the tree topology compared to phylogenies that we inferred without outgroups or data partitioning. The final tree is largely congruent with previous hypotheses and adds evidence supporting the close phylogenetic relationship between dengue and Zika.
Affiliation(s)
- Adriano de Bernadi Schneider
  - AntiViral Research Center, Department of Medicine, University of California San Diego, San Diego, CA 92103, USA
- Denis Jacob Machado
  - Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Sayal Guirales
  - Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Daniel A. Janies
  - Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
22
|
Wade T, Rangel LT, Kundu S, Fournier GP, Bansal MS. Assessing the accuracy of phylogenetic rooting methods on prokaryotic gene families. PLoS One 2020; 15:e0232950. [PMID: 32413061 PMCID: PMC7228096 DOI: 10.1371/journal.pone.0232950] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/01/2019] [Accepted: 04/24/2020] [Indexed: 12/18/2022]
Abstract
Almost all standard phylogenetic methods for reconstructing gene trees result in unrooted trees; yet, many of the most useful applications of gene trees require that the gene trees be correctly rooted. As a result, several computational methods have been developed for inferring the root of unrooted gene trees. However, the accuracy of such methods has never been systematically evaluated on prokaryotic gene families, where horizontal gene transfer is often one of the dominant evolutionary events driving gene family evolution. In this work, we address this gap by conducting a thorough comparative evaluation of five different rooting methods using large collections of both simulated and empirical prokaryotic gene trees. Our simulation study is based on 6000 true and reconstructed gene trees on 100 species and characterizes the rooting accuracy of the four methods under 36 different evolutionary conditions and 3 levels of gene tree reconstruction error. The empirical study is based on a large, carefully designed data set of 3098 gene trees from 504 bacterial species (406 Alphaproteobacteria and 98 Cyanobacteria) and reveals insights that supplement those gleaned from the simulation study. Overall, this work provides several valuable insights into the accuracy of the considered methods that will help inform the choice of rooting methods to use when studying microbial gene family evolution. Among other findings, this study identifies parsimonious Duplication-Transfer-Loss (DTL) rooting and Minimal Ancestor Deviation (MAD) rooting as two of the most accurate gene tree rooting methods for prokaryotes and specifies the evolutionary conditions under which these methods are most accurate, demonstrates that DTL rooting is highly sensitive to high evolutionary rates and gene tree error, and that rooting methods based on branch-lengths are generally robust to gene tree reconstruction error.
Affiliation(s)
- Taylor Wade
  - Department of Computer Science & Engineering, University of Connecticut, Storrs, CT, United States of America
- L. Thiberio Rangel
  - Department of Earth, Atmospheric & Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Soumya Kundu
  - Department of Computer Science & Engineering, University of Connecticut, Storrs, CT, United States of America
- Gregory P. Fournier
  - Department of Earth, Atmospheric & Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Mukul S. Bansal
  - Department of Computer Science & Engineering, University of Connecticut, Storrs, CT, United States of America
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, United States of America
|
23
|
Stadler PF, Geiß M, Schaller D, López Sánchez A, González Laffitte M, Valdivia DI, Hellmuth M, Hernández Rosales M. From pairs of most similar sequences to phylogenetic best matches. Algorithms Mol Biol 2020; 15:5. [PMID: 32308731 PMCID: PMC7147060 DOI: 10.1186/s13015-020-00165-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/16/2019] [Accepted: 03/26/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Many of the commonly used methods for orthology detection start from mutually most similar pairs of genes (reciprocal best hits) as an approximation for evolutionarily most closely related pairs of genes (reciprocal best matches). This approximation of best matches by best hits becomes exact for ultrametric dissimilarities, i.e., under the Molecular Clock Hypothesis. It fails, however, whenever there are large lineage-specific rate variations among paralogous genes. In practice, this introduces a high level of noise into the input data for best-hit-based orthology detection methods. RESULTS: If additive distances between genes are known, then evolutionarily most closely related pairs can be identified by considering certain quartets of genes, provided that in each quartet the outgroup relative to the remaining three genes is known. A priori knowledge of the underlying species phylogeny greatly facilitates the identification of the required outgroup. Although the workflow remains a heuristic, since the correct outgroup cannot be determined reliably in all cases, simulations with lineage-specific biases and rate asymmetries show that nearly perfect results can be achieved. In a realistic setting, where distance data have to be estimated from sequence data and hence are noisy, it is still possible to obtain highly accurate sets of best matches. CONCLUSION: Improvements of tree-free orthology assessment methods can be expected from a combination of the accurate inference of best matches reported here and recent mathematical advances in the understanding of (reciprocal) best match graphs and orthology relations. AVAILABILITY: Accompanying software is available at https://github.com/david-schaller/AsymmeTree.
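The starting point described in the background, reciprocal best hits, can be sketched in a few lines of Python. This is only an illustration of the best-hit approximation, not the quartet-based best-match inference or the AsymmeTree software; the function name `reciprocal_best_hits`, the gene labels, and the toy dissimilarities are hypothetical, and ties are resolved by keeping the first pair encountered.

```python
from typing import Dict, Set, Tuple


def reciprocal_best_hits(dist: Dict[Tuple[str, str], float]) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) where b is the least dissimilar gene to a in species B
    and a is the least dissimilar gene to b in species A.
    `dist` maps (gene_in_A, gene_in_B) to a dissimilarity value."""
    best_for_a: Dict[str, Tuple[str, float]] = {}
    best_for_b: Dict[str, Tuple[str, float]] = {}
    for (a, b), d in dist.items():
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    # Keep a pair only if the best hit is mutual.
    return {(a, b) for a, (b, _) in best_for_a.items() if best_for_b[b][0] == a}


# Toy dissimilarities (hypothetical) between two genes in each of two species.
toy = {
    ("a1", "b1"): 0.1, ("a1", "b2"): 0.5,
    ("a2", "b1"): 0.3, ("a2", "b2"): 0.2,
}
pairs = reciprocal_best_hits(toy)
```

Because the result depends only on which dissimilarity is smallest, a paralog that evolves unusually fast or slowly can change the pairing without any change in the true gene history, which is the failure mode the abstract attributes to lineage-specific rate variation.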
Affiliation(s)
- Peter F. Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
  - Competence Center for Scalable Data Services and Solutions Dresden/Leipzig, Interdisciplinary Center for Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv), and Leipzig Research Center for Civilization Diseases, Universität Leipzig, Augustusplatz 12, 04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, 1090 Vienna, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Ciudad Universitaria, 111321 Bogotá, D.C., Colombia
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
- Manuela Geiß
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
  - Software Competence Center Hagenberg GmbH, Softwarepark 21, 4232 Hagenberg, Austria
- David Schaller
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, 04107 Leipzig, Germany
- Alitzel López Sánchez
  - CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO, México
- Marcos González Laffitte
  - CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO, México
- Dulce I. Valdivia
  - Departamento de Ingeniería Genética, Centro de Investigación y de Estudios Avanzados del IPN (CINVESTAV), Km. 9.6 Libramiento Norte Carretera Irapuato-León, 36821 Irapuato, GTO, México
- Marc Hellmuth
  - School of Computing, University of Leeds, E C Stoner Building, Leeds, LS2 9JT, UK
- Maribel Hernández Rosales
  - CONACYT-Instituto de Matemáticas, UNAM Juriquilla, Blvd. Juriquilla 3001, 76230 Juriquilla, Querétaro, QRO, México
|
24
|
Hornok S, Horváth G, Takács N, Farkas R, Szőke K, Kontschán J. Molecular evidence of a badger-associated Ehrlichia sp., a Candidatus Neoehrlichia lotoris-like genotype and Anaplasma marginale in dogs. Ticks Tick Borne Dis 2018; 9:1302-1309. [DOI: 10.1016/j.ttbdis.2018.05.012] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/14/2018] [Revised: 05/10/2018] [Accepted: 05/21/2018] [Indexed: 12/20/2022]
|
25
|
Wainaina JM, De Barro P, Kubatko L, Kehoe MA, Harvey J, Karanja D, Boykin LM. Global phylogenetic relationships, population structure and gene flow estimation of Trialeurodes vaporariorum (Greenhouse whitefly). Bull Entomol Res 2018; 108:5-13. [PMID: 28532532 DOI: 10.1017/s0007485317000360] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 06/07/2023]
Abstract
Trialeurodes vaporariorum (Westwood, 1856) (Greenhouse whitefly) is an agricultural pest of global importance. It is associated with damage to plants during feeding and subsequent virus transmission. Yet, global phylogenetic relationships, population structure, and estimation of the rates of gene flow within this whitefly species remain largely unexplored. In this study, we obtained and filtered 227 GenBank records of mitochondrial cytochrome c oxidase I (mtCOI) sequences of T. vaporariorum, across various global locations to obtain a final set of 217 GenBank records. We further amplified and sequenced a ~750 bp fragment of mtCOI from an additional 31 samples collected from Kenya in 2014. Based on a total of 248 mtCOI sequences, we identified 16 haplotypes, with extensive overlap across all countries. Population structure analysis did not suggest population differentiation. Phylogenetic analysis indicated the 2014 Kenyan collection of samples clustered with a single sequence from the Netherlands to form a well-supported clade (denoted clade 1a) nested within the total set of sequences (denoted clade 1). Pairwise distances between sequences show greater sequence divergence between clades than within clades. In addition, analysis using migrate-n gave evidence for recent gene flow between the two groups. Overall, we find that T. vaporariorum forms a single large group, with evidence of further diversification consisting primarily of Kenyan sequences and one sequence from the Netherlands forming a well-supported clade.
Affiliation(s)
- J M Wainaina
- The University of Western Australia, Australian Research Council Centre of Excellence in Plant Energy Biology and School of Molecular Sciences, Crawley, Perth 6009, Western Australia, Australia
- P De Barro
- CSIRO, GPO Box 2583, Brisbane QLD 4001, Australia
- L Kubatko
- The Ohio State University, 12th Avenue, Columbus, Ohio, USA
- M A Kehoe
- Department of Agriculture and Food Western Australia, South Perth WA 6151, Australia
- J Harvey
- Feed the Future Innovation Lab for the Reduction of Post-Harvest Loss, Kansas State University, Manhattan, Kansas, USA
- D Karanja
- Kenya Agriculture and Livestock Research Organization (KARLO), Box 340-90100, Machakos, Kenya
- L M Boykin
- The University of Western Australia, Australian Research Council Centre of Excellence in Plant Energy Biology and School of Molecular Sciences, Crawley, Perth 6009, Western Australia, Australia
|
26
|
Wang IN, Yeh WB, Lin NS. Phylogeography and Coevolution of Bamboo Mosaic Virus and Its Associated Satellite RNA. Front Microbiol 2017; 8:886. [PMID: 28588562 PMCID: PMC5440514 DOI: 10.3389/fmicb.2017.00886] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 01/10/2017] [Accepted: 05/02/2017] [Indexed: 11/17/2022]
Abstract
Bamboo mosaic virus (BaMV), a plant potexvirus, has been found only in infected bamboo species. It is frequently associated with a large, linear single-stranded satellite RNA (satBaMV) that encodes a non-structural protein. Decades of collecting across a wide geographic area in Asia have accumulated a sizable number of BaMV and satBaMV isolates. In this study, we reconstructed the BaMV phylogeny and satBaMV phylogeny with partial coat protein gene sequences and partial genomic sequences, respectively. The evolutionary relationships allowed us to infer the phylogeography of BaMV and satBaMV on the Asian continent and its outlying islands. The BaMV phylogeny suggests that the BaMV isolates from Taiwan, unsurprisingly, are most likely derived from China. Interestingly, the newly available satBaMV isolates from China were found to be most closely related to the previously established Clade III, which is found in India. The general pattern of clustering along the China/India and Taiwan divide led us to hypothesize that the Taiwan Strait has been a physical barrier to gene flow in the past evolutionary history of both BaMV and satBaMV. Lastly, cophylogeny analyses revealed a complex association pattern between BaMV and satBaMV isolates from China. In general, closely related BaMV sequences tend to carry closely related satBaMV sequences as well; but instances of mismatching with distantly related satBaMV isolates were also found. We hypothesize plausible scenarios of infection and superinfection of bamboo hosts that may be responsible for the observed association pattern. However, a more systematic sampling throughout the geographic distribution of various bamboo species is needed to unambiguously establish the origin, movement, and evolution of BaMV and satBaMV.
Affiliation(s)
- Ing-Nang Wang
- Department of Biological Sciences, University at Albany, Albany, NY, United States
- Wen-Bin Yeh
- Department of Entomology, National Chung Hsing University, Taichung, Taiwan
- Na-Sheng Lin
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
|