1
Walt HK, Ahn SJ, Hoffmann FG. Horizontally transferred glycoside hydrolase 26 may aid hemipteran insects in plant tissue digestion. Mol Phylogenet Evol 2024; 198:108134. [PMID: 38901473 DOI: 10.1016/j.ympev.2024.108134] [Received: 04/15/2024] [Revised: 05/28/2024] [Accepted: 06/15/2024] [Indexed: 06/22/2024]
Abstract
Glycoside hydrolases are enzymes that break down complex carbohydrates into simple sugars by catalyzing the hydrolysis of glycosidic bonds. There have been multiple instances of adaptive horizontal gene transfer of genes belonging to various glycoside hydrolase families from microbes to insects, as glycoside hydrolases can metabolize constituents of the carbohydrate-rich plant cell wall. In this study, we characterize the horizontal transfer of a gene from the glycoside hydrolase family 26 (GH26) from bacteria to insects of the order Hemiptera. Our phylogenies trace the horizontal gene transfer to the common ancestor of the superfamilies Pentatomoidea and Lygaeoidea, which include stink bugs and seed bugs. After horizontal transfer, the gene was assimilated into the insect genome as indicated by the gain of an intron, and a eukaryotic signal peptide. Subsequently, the gene has undergone independent losses and expansions in copy number in multiple lineages, suggesting an adaptive role of GH26s in some insects. Finally, we measured tissue-level gene expression of multiple stink bugs and the large milkweed bug using publicly available RNA-seq datasets. We found that the GH26 genes are highly expressed in tissues associated with plant digestion, especially in the principal salivary glands of the stink bugs. Our results are consistent with the hypothesis that this horizontally transferred GH26 was co-opted by the insect to aid in plant tissue digestion and that this HGT event was likely adaptive.
Affiliation(s)
- Hunter K Walt
- Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
- Seung-Joon Ahn
- Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
- Federico G Hoffmann
- Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA; Institute for Genomics, Biotechnology and Biocomputing, Mississippi State University, Mississippi State, MS 39762, USA.
2
Ahel J, Pandey A, Schwaiger M, Mohn F, Basters A, Kempf G, Andriollo A, Kaaij L, Hess D, Bühler M. ChAHP2 and ChAHP control diverse retrotransposons by complementary activities. Genes Dev 2024; 38:554-568. [PMID: 38960717 PMCID: PMC11293393 DOI: 10.1101/gad.351769.124] [Received: 05/14/2024] [Accepted: 06/07/2024] [Indexed: 07/05/2024]
Abstract
Retrotransposon control in mammals is an intricate process that is effectuated by a broad network of chromatin regulatory pathways. We previously discovered ChAHP, a protein complex with repressive activity against short interspersed element (SINE) retrotransposons that is composed of the transcription factor ADNP, chromatin remodeler CHD4, and HP1 proteins. Here we identify ChAHP2, a protein complex homologous to ChAHP, in which ADNP is replaced by ADNP2. ChAHP2 is predominantly targeted to endogenous retroviruses (ERVs) and long interspersed elements (LINEs) via HP1β-mediated binding of H3K9 trimethylated histones. We further demonstrate that ChAHP also binds these elements in a manner mechanistically equivalent to that of ChAHP2 and distinct from DNA sequence-specific recruitment at SINEs. Genetic ablation of ADNP2 alleviates ERV and LINE1 repression, which is synthetically exacerbated by additional depletion of ADNP. Together, our results reveal that the ChAHP and ChAHP2 complexes function to control both nonautonomous and autonomous retrotransposons by complementary activities, further adding to the complexity of mammalian transposon control.
Affiliation(s)
- Josip Ahel
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- Aparna Pandey
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- Michaela Schwaiger
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- Fabio Mohn
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- Anja Basters
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- Georg Kempf
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- Aude Andriollo
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- University of Basel, Basel 4003, Switzerland
- Lucas Kaaij
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- Daniel Hess
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- Marc Bühler
- Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
- University of Basel, Basel 4003, Switzerland
3
Walt HK, King JG, Towles TB, Ahn SJ, Hoffmann FG. Comparative Genomics and the Salivary Transcriptome of the Redbanded Stink Bug Shed Light on Its High Damage Potential to Soybean. Genome Biol Evol 2024; 16:evae121. [PMID: 38864488 PMCID: PMC11226756 DOI: 10.1093/gbe/evae121] [Received: 12/15/2023] [Revised: 05/28/2024] [Accepted: 06/05/2024] [Indexed: 06/13/2024] [Open Access]
Abstract
The redbanded stink bug, Piezodorus guildinii (Westwood) (Hemiptera: Pentatomidae), is a significant soybean pest in the Americas, which inflicts more physical damage on soybean than other native stink bugs. Studies suggest that its heightened impact is attributed to the aggressive digestive properties of its saliva. Despite its agricultural importance, the factors driving its greater ability to degrade plant tissues have remained unexplored in a genomic evolutionary context. In this study, we hypothesized that lineage-specific gene family expansions have increased the copy number of digestive genes expressed in the salivary glands. To investigate this, we annotated a previously published genome assembly of the redbanded stink bug, performed a comparative genomic analysis on 11 hemipteran species, and reconstructed patterns of gene duplication, gain, and loss in the redbanded stink bug. We also performed RNA-seq on the redbanded stink bug's salivary tissues, along with the rest of the body without salivary glands. We identified hundreds of differentially expressed salivary genes, including a subset lost in other stink bug lineages, but retained and expressed in the redbanded stink bug's salivary glands. These genes were significantly enriched with protein families involved in proteolysis, potentially explaining the redbanded stink bug's heightened damage to soybeans. Contrary to our hypothesis, we found no support for an enrichment of duplicated digestive genes that are also differentially expressed in the salivary glands of the redbanded stink bug. Nonetheless, these results provide insight into the evolution of this important crop pest, establishing a link between its genomic history and its agriculturally important physiology.
Affiliation(s)
- Hunter K Walt
- Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
- Jonas G King
- Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
- Tyler B Towles
- Macon Ridge Research Station, Louisiana State University, Winnsboro, LA 71295, USA
- Seung-Joon Ahn
- Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
- Federico G Hoffmann
- Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
- Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
4
Caña-Bozada VH, Huerta-Ocampo JÁ, Bojórquez-Velázquez E, Elizalde-Contreras JM, May ER, Morales-Serna FN. Proteomic analysis of Neobenedenia sp. and Rhabdosynochus viridisi (Monogenea, Monopisthocotylea): Insights into potential vaccine targets and diagnostic markers for finfish aquaculture. Vet Parasitol 2024; 329:110196. [PMID: 38763120 DOI: 10.1016/j.vetpar.2024.110196] [Received: 12/10/2023] [Revised: 04/22/2024] [Accepted: 05/07/2024] [Indexed: 05/21/2024]
Abstract
Monogeneans are parasitic flatworms that represent a significant threat to the aquaculture industry. Species like Neobenedenia melleni (Capsalidae) and Rhabdosynochus viridisi (Diplectanidae) have been identified as causing diseases in farmed fish. In the past years, molecular research on monogeneans of the subclass Monopisthocotylea has focused on the generation of genomic and transcriptomic information and the identification in silico of some protein families of veterinary interest. Proteomic analysis has been suggested as a powerful tool to investigate proteins in parasites and identify potential targets for vaccine development and diagnosis. To date, the proteomic dataset for monogeneans has been restricted to a species of the subclass Polyopisthocotylea, while in monopisthocotyleans there is no proteomic data. In this study, we present the first proteomic data on two monopisthocotylean species, Neobenedenia sp. and R. viridisi, obtained from three distinct sample types: tissue, excretory-secretory products (ESPs), and eggs. A total of 1691 and 1846 expressed proteins were identified in Neobenedenia sp. and R. viridisi, respectively. The actin family was the largest protein family, followed by the tubulin family and the heat shock protein 70 (HSP70) family. We focused mainly on ESPs because they are important to modulate the host immune system. We identified proteins of the actin, tubulin, HSP70 and HSP90 families in both tissue and ESPs, which have been recognized for their antigenic activities in parasitic flatworms. Furthermore, our study uncovered the presence of proteins within ESPs, such as annexin, calcium-binding protein, fructose bisphosphate aldolase, glutamate dehydrogenase, myoferlin, and paramyosin, that are targets for immunodiagnostic and vaccine development and hold paramount relevance in veterinary medicine. 
This study expands our knowledge of monogeneans and identifies proteins that, in other platyhelminths, are potential targets for vaccines and drug discovery.
Affiliation(s)
- Eliel Ruiz May
- Instituto de Ecología, A.C., Xalapa, Veracruz 91070, Mexico
- Francisco N Morales-Serna
- Instituto de Ciencias del Mar y Limnología, Universidad Nacional Autónoma de México, Mazatlán, Sinaloa 82040, Mexico
5
Liu M, Wang C, Huo L, Cao J, Mao X, He Z, Hu C, Sun H, Deng W, He W, Chen Y, Gu M, Liao J, Guo N, He X, Wu Q, Chen J, Zhang L, Wang X, Shang C, Dong J. Complexin-1 enhances ultrasound neurotransmission in the mammalian auditory pathway. Nat Genet 2024; 56:1503-1515. [PMID: 38834904 DOI: 10.1038/s41588-024-01781-z] [Received: 11/19/2022] [Accepted: 04/25/2024] [Indexed: 06/06/2024]
Abstract
Unlike megabats, which rely on well-developed vision, microbats use ultrasonic echolocation to navigate and locate prey. To study ultrasound perception, here we compared the auditory cortices of microbats and megabats by constructing reference genomes and single-nucleus atlases for four species. We found that parvalbumin (PV)+ neurons exhibited evident cross-species differences and could respond to ultrasound signals, whereas their silencing severely affected ultrasound perception in the mouse auditory cortex. Moreover, megabat PV+ neurons expressed low levels of complexins (CPLX1-CPLX4), which can facilitate neurotransmitter release, while microbat PV+ neurons highly expressed CPLX1, which improves neurotransmission efficiency. Further perturbation of Cplx1 in PV+ neurons impaired ultrasound perception in the mouse auditory cortex. In addition, CPLX1 functioned in other parts of the auditory pathway in microbats but not megabats and exhibited convergent evolution between echolocating microbats and whales. Altogether, we conclude that CPLX1 expression throughout the entire auditory pathway can enhance mammalian ultrasound neurotransmission.
Affiliation(s)
- Meiling Liu
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China
- Changliang Wang
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Lifang Huo
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China
- Jie Cao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China
- Xiuguang Mao
- School of Ecological and Environmental Sciences, East China Normal University, Shanghai, China
- Ziqing He
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Chuanxia Hu
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China
- Haijian Sun
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Wenjun Deng
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Weiya He
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Yifu Chen
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Meifeng Gu
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Jiayu Liao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Ning Guo
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Xiangyang He
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Sciences, Guangzhou, China
- Qian Wu
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Jiekai Chen
- CAS Key Laboratory of Regenerative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
- Libiao Zhang
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Sciences, Guangzhou, China.
- Xiaoqun Wang
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China.
- Congping Shang
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China.
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China.
- Ji Dong
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China.
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China.
6
Rossier V, Train C, Nevers Y, Robinson-Rechavi M, Dessimoz C. Matreex: Compact and Interactive Visualization for Scalable Studies of Large Gene Families. Genome Biol Evol 2024; 16:evae100. [PMID: 38742690 PMCID: PMC11149776 DOI: 10.1093/gbe/evae100] [Received: 11/15/2023] [Revised: 04/17/2024] [Accepted: 05/03/2024] [Indexed: 05/16/2024] [Open Access]
Abstract
Studying gene family evolution strongly benefits from insightful visualizations. However, the ever-growing number of sequenced genomes is leading to increasingly larger gene families, which challenges existing gene tree visualizations. Indeed, most of them present users with a dilemma: display complete but intractable gene trees, or collapse subtrees, thereby hiding their children's information. Here, we introduce Matreex, a new dynamic tool to scale up the visualization of gene families. Matreex's key idea is to use "phylogenetic" profiles, which are dense representations of gene repertoires, to minimize the information loss when collapsing subtrees. We illustrate Matreex's usefulness with three biological applications. First, we demonstrate on the MutS family the power of combining gene trees and phylogenetic profiles to delve into precise evolutionary analyses of large multicopy gene families. Second, by displaying 22 intraflagellar transport gene families across 622 species cumulating 5,500 representatives, we show how Matreex can be used to automate large-scale analyses of gene presence-absence. Notably, we report for the first time the complete loss of intraflagellar transport in the myxozoan Thelohanellus kitauei. Finally, using the textbook example of visual opsins, we show Matreex's potential to create easily interpretable figures for teaching and outreach. Matreex is available from the Python Package Index (pip install Matreex) with the source code and documentation available at https://github.com/DessimozLab/matreex.
Affiliation(s)
- Victor Rossier
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Comparative Genomics, Lausanne, Switzerland
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Clement Train
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Yannis Nevers
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Comparative Genomics, Lausanne, Switzerland
- Marc Robinson-Rechavi
- SIB Swiss Institute of Bioinformatics, Comparative Genomics, Lausanne, Switzerland
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Christophe Dessimoz
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Comparative Genomics, Lausanne, Switzerland
7
Caña-Bozada VH, Ovando-Vázquez C, Flores-Méndez LC, Martínez-Brown JM, Morales-Serna FN. Identifying potential drug targets in the kinomes of two monogenean species. Helminthologia 2024; 61:142-150. [PMID: 39040804 PMCID: PMC11260314 DOI: 10.2478/helm-2024-0020] [Received: 09/12/2023] [Accepted: 05/24/2024] [Indexed: 07/24/2024] [Open Access]
Abstract
Protein kinases are enzymes involved in essential biological processes such as signal transduction, transcription, metabolism, and the cell cycle. Human kinases are targets for several drugs approved by the US Food and Drug Administration. Therefore, the identification and classification of kinases in other organisms, including pathogenic parasites, is an interesting subject of study. Monogeneans are platyhelminths, mainly ectoparasites, capable of causing health problems in farmed fish. Although some genomes and transcriptomes are available for monogenean species, their full repertoire of kinases is unknown. The aim of this study was to identify and classify the putative kinases in the transcriptomes of two monogeneans, Rhabdosynochus viridisi and Scutogyrus longicornis, and then to predict potential monogenean drug targets (MDTs) and selective inhibitor drugs using computational approaches. Monogenean kinases having orthologs in the lethal phenotype of C. elegans but not in fish or humans were considered MDTs. A total of 160 and 193 kinases were identified in R. viridisi and S. longicornis, respectively. Of these, 22 kinases, belonging mainly to the major groups CAMK, AGC, and TK, were classified as MDTs, five of which were evaluated further. Molecular docking analysis indicated that dihydroergotamine, ergotamine, and lomitapide have the highest affinity for the kinases BRSK and MEKK1. These well-known drugs could be evaluated in future studies for potential repurposing as anti-monogenean agents. The present study contributes valuable data for the development of new antiparasitic candidates for finfish aquaculture.
Affiliation(s)
- V. H. Caña-Bozada
- Centro de Investigación en Alimentación y Desarrollo, A.C., Mazatlán, Sinaloa 82112, Mexico
- C. Ovando-Vázquez
- Centro Nacional de Supercómputo, Instituto Potosino de Investigación Científica y Tecnológica, San Luis Potosí 78216, Mexico
- Consejo Nacional de Humanidades, Ciencias y Tecnologías (CONAHCYT), Ciudad de México, Mexico
- L. C. Flores-Méndez
- Centro de Investigación en Alimentación y Desarrollo, A.C., Mazatlán, Sinaloa 82112, Mexico
- Present address: Universidad Autónoma de Occidente, Unidad Regional Mazatlán, Mazatlán, 82100, Sinaloa, Mexico
- J. M. Martínez-Brown
- Centro de Investigación en Alimentación y Desarrollo, A.C., Mazatlán, Sinaloa 82112, Mexico
- F. N. Morales-Serna
- Instituto de Ciencias del Mar y Limnología, Universidad Nacional Autónoma de México, Mazatlán 82040, Sinaloa, Mexico
8
Marlétaz F, Timoshevskaya N, Timoshevskiy VA, Parey E, Simakov O, Gavriouchkina D, Suzuki M, Kubokawa K, Brenner S, Smith JJ, Rokhsar DS. The hagfish genome and the evolution of vertebrates. Nature 2024; 627:811-820. [PMID: 38262590 PMCID: PMC10972751 DOI: 10.1038/s41586-024-07070-3] [Received: 04/17/2023] [Accepted: 01/15/2024] [Indexed: 01/25/2024]
Abstract
As the only surviving lineages of jawless fishes, hagfishes and lampreys provide a crucial window into early vertebrate evolution [1-3]. Here we investigate the complex history, timing and functional role of genome-wide duplications [4-7] and programmed DNA elimination [8,9] in vertebrates in the light of a chromosome-scale genome sequence for the brown hagfish Eptatretus atami. Combining evidence from syntenic and phylogenetic analyses, we establish a comprehensive picture of vertebrate genome evolution, including an auto-tetraploidization (1RV) that predates the early Cambrian cyclostome-gnathostome split, followed by a mid-late Cambrian allo-tetraploidization (2RJV) in gnathostomes and a prolonged Cambrian-Ordovician hexaploidization (2RCY) in cyclostomes. Subsequently, hagfishes underwent extensive genomic changes, with chromosomal fusions accompanied by the loss of genes that are essential for organ systems (for example, genes involved in the development of eyes and in the proliferation of osteoclasts); these changes account, in part, for the simplification of the hagfish body plan [1,2]. Finally, we characterize programmed DNA elimination in hagfish, identifying protein-coding genes and repetitive elements that are deleted from somatic cell lineages during early development. The elimination of these germline-specific genes provides a mechanism for resolving genetic conflict between soma and germline by repressing germline and pluripotency functions, paralleling findings in lampreys [10,11]. Reconstruction of the early genomic history of vertebrates provides a framework for further investigations of the evolution of cyclostomes and jawed vertebrates.
Affiliation(s)
- Ferdinand Marlétaz
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK.
- Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan.
- Elise Parey
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
- Oleg Simakov
- Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Department for Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
- Daria Gavriouchkina
- Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- UK Dementia Research Institute, University College London, London, UK
- Masakazu Suzuki
- Department of Science, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Kaoru Kubokawa
- Ocean Research Institute, The University of Tokyo, Tokyo, Japan
- Sydney Brenner
- Comparative and Medical Genomics Laboratory, Institute of Molecular and Cell Biology, A*STAR, Biopolis, Singapore, Singapore
- Jeramiah J Smith
- Department of Biology, University of Kentucky, Lexington, KY, USA.
- Daniel S Rokhsar
- Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan.
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA.
- Chan Zuckerberg Biohub, San Francisco, CA, USA.
9
Thiébaut A, Altenhoff AM, Campli G, Glover N, Dessimoz C, Waterhouse RM. DrosOMA: the Drosophila Orthologous Matrix browser. F1000Res 2024; 12:936. [PMID: 38434623 PMCID: PMC10905159 DOI: 10.12688/f1000research.135250.2] [Accepted: 01/12/2024] [Indexed: 03/05/2024] [Open Access]
Abstract
Background: Comparative genomic analyses to delineate gene evolutionary histories inform the understanding of organismal biology by characterising gene and gene family origins, trajectories, and dynamics, as well as enabling the tracing of speciation, duplication, and loss events, and facilitating the transfer of gene functional information across species. Genomic data are available for an increasing number of species from the genus Drosophila; however, a dedicated resource exploiting these data to provide the research community with browsable results from genus-wide orthology delineation has been lacking. Methods: Using the OMA Orthologous Matrix orthology inference approach and browser deployment framework, we catalogued orthologues across a selected set of Drosophila species with high-quality annotated genomes. We developed and deployed a dedicated instance of the OMA browser to facilitate intuitive exploration, visualisation, and downloading of the genus-wide orthology delineation results. Results: DrosOMA - the Drosophila Orthologous Matrix browser, accessible from https://drosoma.dcsr.unil.ch/ - presents the results of orthology delineation for 36 drosophilids from across the genus and four outgroup dipterans. It enables querying and browsing of the orthology data through a feature-rich web interface, with gene-view, orthologous group-view, and genome-view pages, including comprehensive gene name and identifier cross-references together with available functional annotations and protein domain architectures, as well as tools to visualise local and global synteny conservation. Conclusions: The DrosOMA browser demonstrates the deployability of the OMA browser framework for building user-friendly orthology databases with dense sampling of a selected taxonomic group. It provides the Drosophila research community with a tailored resource of browsable results from genus-wide orthology delineation.
Affiliation(s)
- Antonin Thiébaut
- Department of Ecology and Evolution, SIB Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Adrian M. Altenhoff
- Department of Computer Science, SIB Swiss Institute of Bioinformatics, ETH Zurich, Zurich, Switzerland
- Giulia Campli
- Department of Ecology and Evolution, SIB Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Natasha Glover
- Department of Computational Biology, SIB Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Christophe Dessimoz
- Department of Computational Biology, SIB Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Robert M. Waterhouse
- Department of Ecology and Evolution, SIB Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
|
10
|
Pechmann S. Single-cell expression predicts neuron-specific protein homeostasis networks. Open Biol 2024; 14:230386. [PMID: 38262604 PMCID: PMC10805596 DOI: 10.1098/rsob.230386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2023] [Accepted: 11/17/2023] [Indexed: 01/25/2024] Open
Abstract
The protein homeostasis network keeps proteins in their correct shapes and avoids unwanted aggregation. In turn, the accumulation of aberrantly misfolded proteins has been directly associated with the onset of ageing-associated neurodegenerative diseases such as Alzheimer's and Parkinson's. However, a detailed and rational understanding of how protein homeostasis is achieved in health, and how it can be targeted for therapeutic intervention in disease, is still missing. Here, large-scale single-cell expression data from the Allen Brain Map are analysed to investigate the transcriptional regulation of the core protein homeostasis network across the human brain. Remarkably, distinct expression profiles suggest specialized protein homeostasis networks with systematic adaptations in excitatory neurons, inhibitory neurons and non-neuronal cells. Moreover, several chaperones and ubiquitin ligases are found to be transcriptionally coregulated with genes important for synapse formation and maintenance, thus linking protein homeostasis to the regulation of neuronal function. Finally, evolutionary analyses highlight the conservation of an elevated interaction density in the chaperone network, suggesting that one of the most exciting aspects of chaperone action may yet be discovered in their collective action at the systems level. More generally, our work highlights the power of computational analyses for breaking down complexity and gaining complementary insights into fundamental biological problems.
|
11
|
Nikolaidis M, Oliver SG, Amoutzias GD. pyPGCF: A Python Software for Phylogenomic Analysis, Species Demarcation, Identification of Core, and Fingerprint Proteins of Bacterial Genomes That Are Important for Plants. Methods Mol Biol 2024; 2788:139-155. [PMID: 38656512 DOI: 10.1007/978-1-0716-3782-1_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
This computational protocol describes how to use pyPGCF, a Python software package that runs in the Linux environment, to analyze bacterial genomes and perform: (i) phylogenomic analysis, (ii) species demarcation, (iii) identification of the core proteins of a bacterial genus and its individual species, (iv) identification of species-specific fingerprint proteins that are found in all strains of a species and, at the same time, are absent from all other species of the genus, (v) functional annotation of the core and fingerprint proteins with eggNOG, and (vi) identification of secondary metabolite biosynthetic gene clusters (smBGCs) with antiSMASH. This software has already been used to analyze bacterial genera and species that are important for plants (e.g., Pseudomonas, Bacillus, Streptomyces). In addition, we provide a test dataset and example commands showing how to analyze 165 genomes from 55 species of the genus Bacillus. The main advantages of pyPGCF are that: (i) it uses adjustable orthology cut-offs, (ii) it identifies species-specific fingerprints, and (iii) its computational cost scales linearly with the number of genomes being analyzed. Therefore, pyPGCF is able to deal with a very large number of bacterial genomes, in reasonable timescales, using widely available levels of computing power.
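The distinction between core proteins (point iii) and species-specific fingerprint proteins (point iv) can be illustrated with a small set-based sketch. This is not pyPGCF's code or interface: the function name, the input format (a mapping from strain to the set of orthologous-group identifiers it carries, plus a strain-to-species mapping) and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def core_and_fingerprints(strain_groups, strain_species):
    """Return (genus_core, fingerprints_by_species).

    strain_groups:  dict strain -> set of orthologous-group IDs (hypothetical input format)
    strain_species: dict strain -> species name
    """
    by_species = defaultdict(list)
    for strain, groups in strain_groups.items():
        by_species[strain_species[strain]].append(groups)

    # Species core: groups present in every strain of the species.
    species_core = {sp: set.intersection(*sets) for sp, sets in by_species.items()}
    # Species pan: groups present in at least one strain of the species.
    species_pan = {sp: set.union(*sets) for sp, sets in by_species.items()}

    # Genus core: groups present in every strain of every species.
    genus_core = set.intersection(*species_core.values())

    # Fingerprint: in all strains of one species, in no strain of any other species.
    fingerprints = {}
    for sp, core in species_core.items():
        seen_elsewhere = set()
        for other_sp, pan in species_pan.items():
            if other_sp != sp:
                seen_elsewhere |= pan
        fingerprints[sp] = core - seen_elsewhere
    return genus_core, fingerprints


# Toy data: two species with two strains each (made up for illustration).
strain_groups = {
    "A1": {"g1", "g2", "g3"},
    "A2": {"g1", "g2", "g4"},
    "B1": {"g1", "g4", "g5"},
    "B2": {"g1", "g5"},
}
strain_species = {"A1": "sp_A", "A2": "sp_A", "B1": "sp_B", "B2": "sp_B"}
genus_core, fingerprints = core_and_fingerprints(strain_groups, strain_species)
print(genus_core, fingerprints)
```

In the toy data, g4 is not a fingerprint of either species because it is missing from one strain of each, which is the kind of case the "found in all strains" condition excludes.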
Affiliation(s)
- Marios Nikolaidis
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, Greece
| | - Stephen G Oliver
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Grigorios D Amoutzias
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, Greece.
|
12
|
Dylus D, Altenhoff A, Majidian S, Sedlazeck FJ, Dessimoz C. Inference of phylogenetic trees directly from raw sequencing reads using Read2Tree. Nat Biotechnol 2024; 42:139-147. [PMID: 37081138 PMCID: PMC10791578 DOI: 10.1038/s41587-023-01753-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/18/2022] [Accepted: 03/16/2023] [Indexed: 04/22/2023]
Abstract
Current methods for inference of phylogenetic trees require running complex pipelines at substantial computational and labor costs, with additional constraints in sequencing coverage, assembly and annotation quality, especially for large datasets. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes and bypasses traditional steps in phylogeny inference, such as genome assembly, annotation and all-versus-all sequence comparisons, while retaining accuracy. In a benchmark encompassing a broad variety of datasets, Read2Tree is 10-100 times faster than assembly-based approaches and in most cases more accurate, the exception being when sequencing coverage is high and reference species very distant. Here, to illustrate the broad applicability of the tool, we reconstruct a yeast tree of life of 435 species spanning 590 million years of evolution. We also apply Read2Tree to >10,000 Coronaviridae samples, accurately classifying highly diverse animal samples and near-identical severe acute respiratory syndrome coronavirus 2 sequences on a single tree. The speed, accuracy and versatility of Read2Tree enable comparative genomics at scale.
Affiliation(s)
- David Dylus
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- F. Hoffmann-La Roche Ltd, Immunology, Infectious Disease, and Ophthalmology (I2O), Roche Pharmaceutical Research and Early Development (pRED), Basel, Switzerland
| | - Adrian Altenhoff
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Computer Science, ETH, Zurich, Switzerland
| | - Sina Majidian
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Computer Science, Rice University, Houston, TX, USA.
| | - Christophe Dessimoz
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.
- Department of Computer Science, University College London, London, UK.
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK.
|
13
|
Carhuaricra-Huaman D, Setubal JC. Protein-Coding Gene Families in Prokaryote Genome Comparisons. Methods Mol Biol 2024; 2802:33-55. [PMID: 38819555 DOI: 10.1007/978-1-0716-3838-5_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
The identification of orthologous genes is relevant for comparative genomics, phylogenetic analysis, and functional annotation. There are many computational tools for the prediction of orthologous groups as well as web-based resources that offer orthology datasets for download and online analysis. This chapter presents a simple and practical guide to the process of orthologous group prediction, using a dataset of 10 prokaryotic proteomes as an example. The orthology methods covered are OrthoMCL, COGtriangles, OrthoFinder2, and OMA. The authors compare the number of orthologous groups predicted by these various methods, and present a brief workflow for the functional annotation and reconstruction of phylogenies from inferred single-copy orthologous genes. The chapter also demonstrates how to explore two orthology databases: eggNOG6 and OrthoDB.
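Selecting single-copy orthologous genes for phylogeny reconstruction amounts to a filter over the predicted groups. The sketch below is only an illustration of that filter, not the chapter's workflow or the output format of any of the tools named: the group layout (group ID mapped to a list of genome and gene pairs), the function name and the toy identifiers are assumptions.

```python
from collections import Counter


def single_copy_groups(groups, genomes):
    """Keep only groups that contain exactly one gene from every genome.

    groups:  dict group_id -> list of (genome, gene) pairs (hypothetical format)
    genomes: iterable of all genome names in the comparison
    """
    expected = set(genomes)
    selected = {}
    for group_id, members in groups.items():
        counts = Counter(genome for genome, _gene in members)
        present_everywhere = set(counts) == expected
        no_paralogs = all(n == 1 for n in counts.values())
        if present_everywhere and no_paralogs:
            selected[group_id] = sorted(members)
    return selected


# Toy example with three genomes (identifiers are made up).
genomes = ["E", "B", "P"]
groups = {
    "OG1": [("E", "e1"), ("B", "b1"), ("P", "p1")],             # single copy, kept
    "OG2": [("E", "e2"), ("E", "e3"), ("B", "b2"), ("P", "p2")],  # paralogs in E, dropped
    "OG3": [("E", "e4"), ("B", "b3")],                          # missing in P, dropped
}
single_copy = single_copy_groups(groups, genomes)
print(list(single_copy))
```

Comparing `len(single_copy)` across methods on the same proteomes is one simple way to see how differently the tools partition genes into groups.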
Affiliation(s)
- Dennis Carhuaricra-Huaman
- Programa de Pós-Graduação Interunidades em Bioinformática, Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo, SP, Brazil
- Research Group in Biotechnology Applied to Animal Health, Production and Conservation (SANIGEN), Laboratory of Biology and Molecular Genetics, Faculty of Veterinary Medicine, Universidad Nacional Mayor de San Marcos, Lima, Peru
| | - João Carlos Setubal
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, Brazil.
|
14
|
Benigno V, Carraro N, Sarton-Lohéac G, Romano-Bertrand S, Blanc DS, van der Meer JR. Diversity and evolution of an abundant ICEclc family of integrative and conjugative elements in Pseudomonas aeruginosa. mSphere 2023; 8:e0051723. [PMID: 37902330 PMCID: PMC10732049 DOI: 10.1128/msphere.00517-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 09/24/2023] [Indexed: 10/31/2023] Open
Abstract
IMPORTANCE: Microbial populations swiftly adapt to changing environments through horizontal gene transfer. While the mechanisms of gene transfer are well known, the impact of environmental conditions on the selection of transferred gene functions remains less clear. We investigated ICEs, specifically the ICEclc-type, in Pseudomonas aeruginosa clinical isolates. Our findings revealed co-evolution between ICEs and their hosts, with ICE transfers occurring within strains. Gene functions carried by ICEs are positively selected, including potential virulence factors and heavy metal resistance. Comparison to publicly available P. aeruginosa genomes unveiled widespread antibiotic-resistance determinants within ICEclc clades. Thus, the ubiquitous ICEclc family significantly contributes to P. aeruginosa's adaptation and fitness in diverse environments.
Affiliation(s)
- Valentina Benigno
- Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
| | - Nicolas Carraro
- Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
| | - Garance Sarton-Lohéac
- Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
| | - Sara Romano-Bertrand
- Hydrosciences Montpellier, IRD, CNRS, University of Montpellier, Hospital Hygiene and Infection Control Team, University Hospital of Montpellier, Montpellier, France
| | - Dominique S. Blanc
- Prevention and Infection Control Unit, Infectious Diseases Service, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
|
15
|
Klemm P, Stadler PF, Lechner M. Proteinortho6: pseudo-reciprocal best alignment heuristic for graph-based detection of (co-)orthologs. Front Bioinform 2023; 3:1322477. [PMID: 38152702 PMCID: PMC10751348 DOI: 10.3389/fbinf.2023.1322477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 11/06/2023] [Indexed: 12/29/2023] Open
Abstract
Proteinortho is a widely used tool to predict (co-)orthologous groups of genes for any set of species. It finds application in comparative and functional genomics, phylogenomics, and evolutionary reconstructions. With a rapidly increasing number of available genomes, the demand for large-scale predictions is also growing. In this contribution, we evaluate and implement major algorithmic improvements that significantly enhance the speed of the analysis without reducing precision. Graph-based detection of (co-)orthologs is typically based on a reciprocal best alignment heuristic that requires an all vs. all comparison of proteins from all species under study. The initial identification of similar proteins is accelerated by introducing an alternative search tool along with a revised search strategy, the pseudo-reciprocal best alignment heuristic, that reduces the number of required sequence comparisons by one-half. The clustering algorithm was reworked to efficiently decompose very large clusters and accelerate processing. Proteinortho6 reduces the overall processing time by an order of magnitude compared to its predecessor while maintaining its small memory footprint and good predictive quality.
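For orientation, the plain reciprocal best alignment rule that the abstract takes as its starting point can be sketched in a few lines. This is not Proteinortho's implementation and does not attempt the pseudo-reciprocal variant; the score dictionaries, function names and identifiers below are hypothetical, and ties are broken arbitrarily in favour of the first hit seen.

```python
def best_hits(scores):
    """Map each query to its best-scoring target.

    scores: dict (query, target) -> alignment score for one search direction
            (hypothetical input; a real pipeline would parse search-tool output).
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _score) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy scores for two species A and B (made up for illustration).
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 120, ("a3", "b1"): 90}
b_vs_a = {("b1", "a1"): 195, ("b1", "a3"): 88, ("b2", "a2"): 118}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Note that this rule needs both search directions, A against B and B against A, which is why an all vs. all comparison is costly and why halving the number of comparisons matters at scale.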
Affiliation(s)
- Paul Klemm
- Center for Synthetic Microbiology (SYNMIKRO), Philipps-Universität Marburg, Marburg, Germany
| | - Peter F. Stadler
- Bioinformatics Group, Institute of Computer Science and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Max-Planck-Institute for Mathematics in the Sciences, Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
- Facultad de Ciencias, Universidad National de Colombia, Bogotá, Colombia
- Santa Fe Institute, Santa Fe, NM, United States
| | - Marcus Lechner
- Center for Synthetic Microbiology (SYNMIKRO), Philipps-Universität Marburg, Marburg, Germany
|
16
|
Singleton M, Eisen M. Leveraging genomic redundancy to improve inference and alignment of orthologous proteins. G3 (Bethesda) 2023; 13:jkad222. [PMID: 37770067 PMCID: PMC10700111 DOI: 10.1093/g3journal/jkad222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 09/11/2023] [Accepted: 09/19/2023] [Indexed: 10/03/2023]
Abstract
Identifying protein sequences with common ancestry is a core task in bioinformatics and evolutionary biology. However, methods for inferring and aligning such sequences in annotated genomes have not kept pace with the increasing scale and complexity of the available data. Thus, in this work, we implemented several improvements to the traditional methodology that more fully leverage the redundancy of closely related genomes and the organization of their annotations. Two highlights include the application of the more flexible k-clique percolation algorithm for identifying clusters of orthologous proteins and the development of a novel technique for removing poorly supported regions of alignments with a phylogenetic hidden Markov model (phylo-HMM). In making the latter, we wrote a fully documented Python package Homomorph that implements standard HMM algorithms and created a set of tutorials to promote its use by a wide audience. We applied the resulting pipeline to a set of 33 annotated Drosophila genomes, generating 22,813 orthologous groups and 8,566 high-quality alignments.
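The k-clique percolation idea mentioned above can be illustrated on a toy graph: a community is the union of k-node cliques that can be reached from one another through cliques sharing k-1 nodes. The following is a brute-force sketch for small graphs only, not the authors' pipeline; treating nodes as proteins and edges as pairwise hits is an assumption for illustration, as are all names and data.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_cliques(adj, k):
    """Enumerate all k-node cliques by brute force (toy graphs only)."""
    return [frozenset(c) for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def k_clique_communities(adj, k):
    """Merge k-cliques that share k-1 nodes; return communities as sorted lists."""
    cliques = k_cliques(adj, k)
    parent = list(range(len(cliques)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    merged = {}
    for i, clique in enumerate(cliques):
        merged.setdefault(find(i), set()).update(clique)
    return sorted(sorted(nodes) for nodes in merged.values())


# Two triangles sharing an edge, a third triangle, and a single bridging edge.
edges = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p2", "p4"), ("p3", "p4"),
         ("p4", "p5"), ("p5", "p6"), ("p5", "p7"), ("p6", "p7")]
communities = k_clique_communities(build_adjacency(edges), k=3)
print(communities)
```

With k=3 the bridging edge between p4 and p5 does not merge the two clusters, since it belongs to no triangle; this tolerance of missing edges within a cluster, without chaining through single links, is what makes the method more flexible than requiring complete cliques.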
Affiliation(s)
- Marc Singleton
- Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA 94720, USA
| | - Michael Eisen
- Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA 94720, USA
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA
|
17
|
He Z, He W, Hu C, Liao J, Deng W, Sun H, Huang Q, Chen W, Zhang L, Liu M, Dong J. Cross-species comparison illuminates the importance of iron homeostasis for splenic anti-immunosenescence. Aging Cell 2023; 22:e13982. [PMID: 37681451 PMCID: PMC10652311 DOI: 10.1111/acel.13982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 08/21/2023] [Accepted: 08/23/2023] [Indexed: 09/09/2023] Open
Abstract
Although immunosenescence may result in increased morbidity and mortality, many mammals have evolved effective immune coping strategies to extend their lifespans. Thus, the immune systems of long-lived mammals present unique models to study healthy longevity. To identify the molecular clues of anti-immunosenescence, we first built a high-quality reference genome for a long-lived myotis bat, and then compared three long-lived mammals (i.e., bat, naked mole rat, and human) versus the short-lived mammal, mouse, in splenic immune cells at single-cell resolution. A close relationship between B:T cell ratio and immunosenescence was detected, as the B:T cell ratio was much higher in mouse than in long-lived mammals and significantly increased during aging. Importantly, we identified several iron-related genes that could resist immunosenescence changes, especially the iron chaperone, PCBP1, which was upregulated in long-lived mammals but dramatically downregulated during aging in all splenic immune cell types. In support of this, immune cells of mouse spleens contained more free iron than those of bat spleens, suggesting a higher level of ROS-induced damage in mouse. PCBP1 downregulation during aging was also detected in hepatic but not pulmonary immune cells, which is consistent with the crucial roles of spleen and liver in organismal iron recycling. Furthermore, PCBP1 perturbation in immune cell lines resulted in cellular iron dyshomeostasis and senescence. Finally, we identified two transcription factors that could regulate PCBP1 during aging. Together, our findings highlight the importance of iron homeostasis in splenic anti-immunosenescence, and provide unique insight for improving human healthspan.
Affiliation(s)
- Ziqing He
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Faculty of Health Sciences, University of Macau, Macau, China
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China
| | - Weiya He
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Faculty of Health Sciences, University of Macau, Macau, China
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China
| | - Chuanxia Hu
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
| | - Jiayu Liao
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
| | - Wenjun Deng
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
| | - Haijian Sun
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Faculty of Health Sciences, University of Macau, Macau, China
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China
| | - Qingpei Huang
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
| | - Weilue Chen
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
| | - Libiao Zhang
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Sciences, Guangzhou, China
| | - Meiling Liu
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
| | - Ji Dong
- GMU‐GIBH Joint School of Life Sciences, The Guangdong‐Hong Kong‐Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou, China
|
18
|
Rodríguez-López M, Bordin N, Lees J, Scholes H, Hassan S, Saintain Q, Kamrad S, Orengo C, Bähler J. Broad functional profiling of fission yeast proteins using phenomics and machine learning. eLife 2023; 12:RP88229. [PMID: 37787768 PMCID: PMC10547477 DOI: 10.7554/elife.88229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023] Open
Abstract
Many proteins remain poorly characterized even in well-studied organisms, presenting a bottleneck for research. We applied phenomics and machine-learning approaches with Schizosaccharomyces pombe for broad cues on protein functions. We assayed colony-growth phenotypes to measure the fitness of deletion mutants for 3509 non-essential genes in 131 conditions with different nutrients, drugs, and stresses. These analyses exposed phenotypes for 3492 mutants, including 124 mutants of 'priority unstudied' proteins conserved in humans, providing varied functional clues. For example, over 900 proteins were newly implicated in the resistance to oxidative stress. Phenotype-correlation networks suggested roles for poorly characterized proteins through 'guilt by association' with known proteins. For complementary functional insights, we predicted Gene Ontology (GO) terms using machine learning methods exploiting protein-network and protein-homology data (NET-FF). We obtained 56,594 high-scoring GO predictions, of which 22,060 also featured high information content. Our phenotype-correlation data and NET-FF predictions showed a strong concordance with existing PomBase GO annotations and protein networks, with integrated analyses revealing 1675 novel GO predictions for 783 genes, including 47 predictions for 23 priority unstudied proteins. Experimental validation identified new proteins involved in cellular aging, showing that these predictions and phenomics data provide a rich resource to uncover new protein functions.
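The "guilt by association" step can be pictured as correlating phenotype profiles: a poorly characterized gene whose deletion mutant behaves like that of a known gene across conditions is a candidate for a related role. The sketch below uses a plain Pearson correlation with a fixed cut-off; the study's actual scoring and network construction are not described here, so the threshold, function names and fitness values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-constant inputs)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def associated_genes(profiles, known, query, threshold=0.8):
    """Return (gene, r) for known genes correlating with the query at or above threshold."""
    hits = []
    for gene in known:
        r = pearson(profiles[query], profiles[gene])
        if r >= threshold:
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: -hit[1])


# Made-up fitness scores of deletion mutants across five conditions.
profiles = {
    "unstudied": [1.0, 0.2, 0.9, 0.1, 0.5],
    "geneA":     [0.9, 0.3, 1.0, 0.2, 0.5],  # similar response pattern
    "geneB":     [0.1, 0.9, 0.2, 1.0, 0.5],  # opposite response pattern
}
hits = associated_genes(profiles, known=["geneA", "geneB"], query="unstudied")
print([gene for gene, _r in hits])
```

With 131 conditions per mutant, as in the study, such profiles carry far more signal than this five-condition toy, but the principle of ranking known genes by profile similarity is the same.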
Affiliation(s)
- María Rodríguez-López
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
| | - Nicola Bordin
- University College London, Institute of Structural and Molecular Biology, London, United Kingdom
| | - Jon Lees
- University College London, Institute of Structural and Molecular Biology, London, United Kingdom
- University of Bristol, Bristol, United Kingdom
| | - Harry Scholes
- University College London, Institute of Structural and Molecular Biology, London, United Kingdom
| | - Shaimaa Hassan
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
- Helwan University, Faculty of Pharmacy, Cairo, Egypt
| | - Quentin Saintain
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
| | - Stephan Kamrad
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
| | - Christine Orengo
- University College London, Institute of Structural and Molecular Biology, London, United Kingdom
| | - Jürg Bähler
- University College London, Institute of Healthy Ageing and Department of Genetics, Evolution & Environment, London, United Kingdom
|
19
|
Rubert DP, Braga MDV. Efficient gene orthology inference via large-scale rearrangements. Algorithms Mol Biol 2023; 18:14. [PMID: 37770945 PMCID: PMC10540461 DOI: 10.1186/s13015-023-00238-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 08/17/2023] [Indexed: 09/30/2023] Open
Abstract
BACKGROUND: Recently we developed a gene orthology inference tool based on genome rearrangements (Journal of Bioinformatics and Computational Biology 19:6, 2021). Given a set of genomes, our method first computes all pairwise gene similarities. It then runs pairwise ILP comparisons to compute optimal gene matchings, which minimize, by taking the similarities into account, the weighted rearrangement distance between the analyzed genomes (a problem that is NP-hard). The gene matchings are then integrated into gene families in the final step. This ILP includes an optimal capping that connects each end of a linear segment of one genome to an end of a linear segment in the other genome, producing an exponential increase of the search space. RESULTS: In this work, we design and implement a heuristic capping algorithm that replaces the optimal capping by clustering (based on their gene content intersections) the linear segments into [Formula: see text] subsets, whose ends are capped independently. Furthermore, in each subset, instead of allowing all possible connections, we let only the ends of content-related segments be connected. Although there is no guarantee that m is much bigger than one, and with the possible side effect of resulting in sub-optimal instead of optimal gene matchings, the heuristic works very well in practice, in terms of both speed and the quality of the computed solutions. Our experiments on primate and fruit fly genomes show two positive results. First, for complete assemblies of five primates the version with heuristic capping reports orthologies that are very similar to the orthologies computed by the version of our tool with optimal capping. Second, we were able to efficiently analyze fruit fly genomes with incomplete assemblies distributed in hundreds or even thousands of contigs, obtaining gene families that are very similar to [Formula: see text] families. Indeed, our tool inferred a higher number of complete cliques, with a higher intersection with [Formula: see text], when compared to gene families computed by other inference tools. We added a post-processing step for refining, with the aid of the [Formula: see text] algorithm, our ambiguous families (those with more than one gene per genome), further improving the accuracy of our results. Our approach is implemented in a pipeline incorporating the pre-computation of gene similarities and the post-processing refinement of ambiguous families with [Formula: see text]. Both the original version with optimal capping and the new modified version with heuristic capping can be downloaded, together with their detailed documentation, at https://gitlab.ub.uni-bielefeld.de/gi/FFGC or as a Conda package at https://anaconda.org/bioconda/ffgc.
Affiliation(s)
- Diego P Rubert
- Faculdade de Computação, Universidade Federal de Mato Grosso do Sul, Campo Grande, Brazil
- Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
| | - Marília D V Braga
- Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany.
|
20
|
Langschied F, Leisegang MS, Brandes RP, Ebersberger I. ncOrtho: efficient and reliable identification of miRNA orthologs. Nucleic Acids Res 2023; 51:e71. [PMID: 37260093 PMCID: PMC10359484 DOI: 10.1093/nar/gkad467] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/06/2022] [Revised: 05/04/2023] [Accepted: 05/30/2023] [Indexed: 06/02/2023] Open
Abstract
MicroRNAs (miRNAs) are post-transcriptional regulators that fine-tune gene expression via translational repression or degradation of their target mRNAs. Despite their functional relevance, frameworks for the scalable and accurate detection of miRNA orthologs are missing. Consequently, there is still no comprehensive picture of how miRNAs and their associated regulatory networks have evolved. Here we present ncOrtho, a synteny-informed pipeline for the targeted search of miRNA orthologs in unannotated genome sequences. ncOrtho matches miRNA annotations from multi-tissue transcriptomes in precision, while scaling to the analysis of hundreds of custom-selected species. The presence-absence pattern of orthologs to 266 human miRNA families across 402 vertebrate species reveals four bursts of miRNA acquisition, of which the most recent event occurred in the last common ancestor of higher primates. miRNA families are rarely modified or lost, but notable exceptions for both events exist. miRNA co-ortholog numbers faithfully indicate lineage-specific whole genome duplications, and miRNAs are powerful markers for phylogenomic analyses. Their exceptionally low genetic diversity makes them suitable to resolve clades where the phylogenetic signal is blurred by incomplete lineage sorting of ancestral alleles. In summary, ncOrtho makes it possible to routinely consider miRNAs in evolutionary analyses that were thus far reserved for protein-coding genes.
Affiliation(s)
- Felix Langschied
- Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt, Germany
| | - Matthias S Leisegang
- Institute for Cardiovascular Physiology, Goethe University, Frankfurt, Germany
- German Center of Cardiovascular Research (DZHK), Partner site RheinMain, Frankfurt, Germany
| | - Ralf P Brandes
- Institute for Cardiovascular Physiology, Goethe University, Frankfurt, Germany
- German Center of Cardiovascular Research (DZHK), Partner site RheinMain, Frankfurt, Germany
| | - Ingo Ebersberger
- Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt, Germany
- Senckenberg Biodiversity and Climate Research Centre (S-BIK-F), Frankfurt am Main, Germany
- LOEWE Centre for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
|
21
|
Huete SG, Benaroudj N. The Arsenal of Leptospira Species against Oxidants. Antioxidants (Basel) 2023; 12:1273. [PMID: 37372003 DOI: 10.3390/antiox12061273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Revised: 06/01/2023] [Accepted: 06/08/2023] [Indexed: 06/29/2023] Open
Abstract
Reactive oxygen species (ROS) are byproducts of oxygen metabolism produced by virtually all organisms living in an oxic environment. ROS are also produced by phagocytic cells in response to microorganism invasion. These highly reactive molecules can damage cellular constituents (proteins, DNA, and lipids) and exhibit antimicrobial activities when present in sufficient amounts. Consequently, microorganisms have evolved defense mechanisms to counteract ROS-induced oxidative damage. Leptospira are diderm bacteria from the Spirochaetes phylum. This genus is diverse, encompassing both free-living non-pathogenic bacteria and pathogenic species responsible for leptospirosis, a widespread zoonotic disease. All leptospires are exposed to ROS in the environment, but only pathogenic species are well-equipped to sustain the oxidative stress encountered inside their hosts during infection. Importantly, this ability plays a pivotal role in Leptospira virulence. In this review, we describe the ROS encountered by Leptospira in their different ecological niches and outline the repertoire of defense mechanisms identified so far in these bacteria to scavenge deadly ROS. We also review the mechanisms controlling the expression of these antioxidant systems and recent advances in understanding the contribution of Peroxide Stress Regulators in Leptospira adaptation to oxidative stress.
Affiliation(s)
- Samuel G Huete
- Institut Pasteur, Université Paris Cité, Biologie des Spirochètes, CNRS UMR 6047, F-75015 Paris, France
- Nadia Benaroudj
- Institut Pasteur, Université Paris Cité, Biologie des Spirochètes, CNRS UMR 6047, F-75015 Paris, France
|
22
|
Gonzalez BC, González VL, Martínez A, Worsaae K, Osborn KJ. A transcriptome-based phylogeny for Polynoidae (Annelida: Aphroditiformia). Mol Phylogenet Evol 2023:107811. [PMID: 37169231 DOI: 10.1016/j.ympev.2023.107811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 03/31/2023] [Accepted: 05/05/2023] [Indexed: 05/13/2023]
Abstract
Polynoidae is the most diverse radiation of Aphroditiformia and one of the most successful groups of all Annelida in terms of diversity and habitats colonized. With such an unmatched diversity, phylogenetic investigations have struggled to understand their evolutionary relationships. Previous phylogenetic analyses have slowly increased taxon sampling and employed methodologies, but despite their diversity and biological importance, large genomic sampling is limited. To investigate the internal relationships within Polynoidae, we conducted the first phylogenomic analyses of the group based on 12 transcriptomes collected from species inhabiting a broad array of habitats, including shallow and deep waters, as well as hydrothermal vents, anchialine caves and the midwater. Our phylogenomic analyses of Polynoidae recovered congruent tree topologies representing the clades Polynoinae, Macellicephalinae and Lepidonotopodinae. Members of Polynoinae and Macellicephalinae clustered in well supported and independent clades. In contrast, Lepidonotopodinae taxa were always recovered nested within Macellicephalinae. Though our sampling only covers a small proportion of the species known for Polynoidae, our results provide a robust phylogenomic framework to build from, emphasizing previously hypothesized relationships between Macellicephalinae and Lepidonotopodinae taxa, while providing new insights on the origin of enigmatic cave and pelagic lineages.
Affiliation(s)
- Brett C Gonzalez
- Smithsonian Institution, National Museum of Natural History, Department of Invertebrate Zoology, P.O. Box 37012, Washington D.C., USA.
- Vanessa L González
- Global Genome Initiative, National Museum of Natural History, Smithsonian Institution, P.O. Box 37012, Washington, D.C., USA
- Alejandro Martínez
- Molecular Ecology Group (MEG), Water Research Institute (IRSA), National Research Council of Italy (CNR), Largo Tonolli 50, 28922 Pallanza, Italy
- Katrine Worsaae
- Marine Biological Section, Department of Biology, University of Copenhagen, Universitetsparken 4, Copenhagen Ø, Denmark
- Karen J Osborn
- Smithsonian Institution, National Museum of Natural History, Department of Invertebrate Zoology, P.O. Box 37012, Washington D.C., USA; Monterey Bay Aquarium Research Institute, 7700 Sandholdt Road, Moss Landing, CA 95039, USA
|
23
|
Dosch J, Bergmann H, Tran V, Ebersberger I. FAS: assessing the similarity between proteins using multi-layered feature architectures. Bioinformatics 2023; 39:btad226. [PMID: 37084276 PMCID: PMC10185405 DOI: 10.1093/bioinformatics/btad226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 02/23/2023] [Accepted: 04/13/2023] [Indexed: 04/23/2023] Open
Abstract
MOTIVATION: Protein sequence comparison is a fundamental element in the bioinformatics toolkit. When sequences are annotated with features such as functional domains, transmembrane domains, low complexity regions or secondary structure elements, the resulting feature architectures allow better informed comparisons. However, many existing schemes for scoring architecture similarities cannot cope with features arising from multiple annotation sources. Those that do fall short in the resolution of overlapping and redundant feature annotations. RESULTS: Here, we introduce FAS, a scoring method that integrates features from multiple annotation sources in a directed acyclic architecture graph. Redundancies are resolved as part of the architecture comparison by finding the paths through the graphs that maximize the pair-wise architecture similarity. In a large-scale evaluation on more than 10 000 human-yeast ortholog pairs, architecture similarities assessed with FAS are consistently more plausible than those obtained using e-values to resolve overlaps or leaving overlaps unresolved. Three case studies demonstrate the utility of FAS on architecture comparison tasks: benchmarking of orthology assignment software, identification of functionally diverged orthologs, and diagnosing protein architecture changes stemming from faulty gene predictions. With the help of FAS, feature architecture comparisons can now be routinely integrated into these and many other applications. AVAILABILITY AND IMPLEMENTATION: FAS is available as a Python package: https://pypi.org/project/greedyFAS/.
Affiliation(s)
- Julian Dosch
- Applied Bioinformatics Group, Goethe University Frankfurt, Faculty of Biosciences, Institute of Cell Biology and Neuroscience, Frankfurt, 60438, Germany
- Holger Bergmann
- Applied Bioinformatics Group, Goethe University Frankfurt, Faculty of Biosciences, Institute of Cell Biology and Neuroscience, Frankfurt, 60438, Germany
- Vinh Tran
- Applied Bioinformatics Group, Goethe University Frankfurt, Faculty of Biosciences, Institute of Cell Biology and Neuroscience, Frankfurt, 60438, Germany
- Ingo Ebersberger
- Applied Bioinformatics Group, Goethe University Frankfurt, Faculty of Biosciences, Institute of Cell Biology and Neuroscience, Frankfurt, 60438, Germany
- Senckenberg Biodiversity and Climate Research Centre (S-BIKF), Frankfurt, 60325, Germany
- LOEWE Centre for Translational Biodiversity Genomics (TBG), Frankfurt, 60325, Germany
|
24
|
Gu L, Xia C, Yang S, Yang G. The adaptive evolution of cancer driver genes. BMC Genomics 2023; 24:215. [PMID: 37098512 PMCID: PMC10131384 DOI: 10.1186/s12864-023-09301-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/11/2022] [Accepted: 04/08/2023] [Indexed: 04/27/2023] Open
Abstract
BACKGROUND: Cancer is a life-threatening disease in humans; yet, cancer genes are frequently reported to be under positive selection. This suggests an evolutionary-genetic paradox in which cancer evolves as a secondary product of selection in human beings. However, systematic investigation of the evolution of cancer driver genes is sparse. RESULTS: Using comparative genomics analysis, population genetics analysis and computational molecular evolutionary analysis, the evolution of 568 cancer driver genes of 66 cancer types was evaluated at two levels: selection on the early evolution of humans (long timescale selection in the human lineage during primate evolution, i.e., millions of years), and recent selection in modern human populations (~100,000 years). Results showed that eight cancer genes covering 11 cancer types were under positive selection in the human lineage (long timescale selection), and 35 cancer genes covering 47 cancer types were under positive selection in modern human populations (recent selection). Moreover, SNPs associated with thyroid cancer in three thyroid cancer driver genes (CUX1, HERC2 and RGPD3) were under positive selection in East Asian and European populations, consistent with the high incidence of thyroid cancer in these populations. CONCLUSIONS: These findings suggest that cancer can evolve, in part, as a by-product of adaptive changes in humans. Different SNPs at the same locus can be under different selection pressures in different populations, and thus should be considered in precision medicine, especially for targeted medicine in specific populations.
Affiliation(s)
- Langyu Gu
- State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, 510275, China.
- Canwei Xia
- Ministry of Education Key Laboratory for Biodiversity and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
- Shiyu Yang
- The Affiliated Brain Hospital, Guangzhou Medical University, Guangzhou, 510180, Guangdong, China
- Guofen Yang
- Department of Gynecology, First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510060, Guangdong, China.
|
25
|
Marlétaz F, Timoshevskaya N, Timoshevskiy V, Simakov O, Parey E, Gavriouchkina D, Suzuki M, Kubokawa K, Brenner S, Smith J, Rokhsar DS. The hagfish genome and the evolution of vertebrates. bioRxiv 2023:2023.04.17.537254. [PMID: 37131617 PMCID: PMC10153176 DOI: 10.1101/2023.04.17.537254] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 05/04/2023]
Abstract
As the only surviving lineages of jawless fishes, hagfishes and lampreys provide a critical window into early vertebrate evolution. Here, we investigate the complex history, timing, and functional role of genome-wide duplications in vertebrates in the light of a chromosome-scale genome of the brown hagfish Eptatretus atami. Using robust chromosome-scale (paralogon-based) phylogenetic methods, we confirm the monophyly of cyclostomes, document an auto-tetraploidization (1RV) that predated the origin of crown group vertebrates ~517 Mya, and establish the timing of subsequent independent duplications in the gnathostome and cyclostome lineages. Some 1RV gene duplications can be linked to key vertebrate innovations, suggesting that this early genomewide event contributed to the emergence of pan-vertebrate features such as neural crest. The hagfish karyotype is derived by numerous fusions relative to the ancestral cyclostome arrangement preserved by lampreys. These genomic changes were accompanied by the loss of genes essential for organ systems (eyes, osteoclast) that are absent in hagfish, accounting in part for the simplification of the hagfish body plan; other gene family expansions account for hagfishes' capacity to produce slime. Finally, we characterise programmed DNA elimination in somatic cells of hagfish, identifying protein-coding and repetitive elements that are deleted during development. As in lampreys, the elimination of these genes provides a mechanism for resolving genetic conflict between soma and germline by repressing germline/pluripotency functions. Reconstruction of the early genomic history of vertebrates provides a framework for further exploration of vertebrate novelties.
Affiliation(s)
- Ferdinand Marlétaz
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
- Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Oleg Simakov
- Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Department of Molecular Evolution and Development, University of Vienna, Vienna, Austria
- Elise Parey
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
- Daria Gavriouchkina
- Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Present address: UK Dementia Research Institute, University College London, London, UK
- Masakazu Suzuki
- Department of Science, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Kaoru Kubokawa
- Ocean Research Institute, The University of Tokyo, Tokyo, Japan
- Sydney Brenner
- Comparative and Medical Genomics Laboratory, Institute of Molecular and Cell Biology, A*STAR, Biopolis, Singapore 138673, Singapore
- Deceased
- Jeramiah Smith
- Department of Biology, University of Kentucky, Lexington, KY, USA
- Daniel S Rokhsar
- Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
|
26
|
Watanabe T, Kure A, Horiike T. OrthoPhy: A Program to Construct Ortholog Data Sets Using Taxonomic Information. Genome Biol Evol 2023; 15:7044703. [PMID: 36799928 PMCID: PMC9991595 DOI: 10.1093/gbe/evad026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 01/30/2023] [Accepted: 02/13/2023] [Indexed: 02/18/2023] Open
Abstract
Species phylogenetic trees represent the evolutionary processes of organisms, and they are fundamental in evolutionary research. Therefore, new methods have been developed to obtain more reliable species phylogenetic trees. A highly reliable method is the construction of an ortholog data set based on sequence information of genes, which is then used to infer the species phylogenetic tree. However, although methods for constructing an ortholog data set for species phylogenetic analysis have been developed, they cannot remove some paralogs, which is necessary for reliable species phylogenetic inference. To address the limitations of current methods, we developed OrthoPhy, a program that excludes paralogs and constructs highly accurate ortholog data sets using taxonomic information dividing analyzed species into monophyletic groups. OrthoPhy can remove paralogs, detecting inconsistencies between taxonomic information and phylogenetic trees of candidate ortholog groups clustered by sequence similarity. Performance tests using evolutionary simulated sequences and real sequences of 40 bacteria revealed that the precision of ortholog inference by OrthoPhy is higher than that of existing programs. Additionally, the phylogenetic analysis of species was more accurate when performed using ortholog data sets constructed by OrthoPhy than that performed using data sets constructed by existing programs. Furthermore, we performed a benchmark test of the Quest for Orthologs using real sequence data and found that the concordance rate between the phylogenetic trees of orthologs inferred by OrthoPhy and those of species was higher than the rates obtained by other ortholog inference programs. 
Therefore, ortholog data sets constructed using OrthoPhy enabled a more accurate phylogenetic analysis of species than those constructed using the existing programs, and OrthoPhy can be used for the phylogenetic analysis of species even for distantly related species that have experienced many evolutionary events.
Affiliation(s)
- Tomoaki Watanabe
- United Graduate School of Agricultural Science, Gifu University, Gifu, Japan
- Akinori Kure
- Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Tokumasa Horiike
- Department of Bioresource Sciences, Shizuoka University, Shizuoka, Japan
|
27
|
Caña-Bozada V, Robinson MW, Hernández-Mena DI, Morales-Serna FN. Exploring Evolutionary Relationships within Neodermata Using Putative Orthologous Groups of Proteins, with Emphasis on Peptidases. Trop Med Infect Dis 2023; 8:tropicalmed8010059. [PMID: 36668966 PMCID: PMC9860727 DOI: 10.3390/tropicalmed8010059] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/13/2022] [Revised: 01/10/2023] [Accepted: 01/11/2023] [Indexed: 01/14/2023] Open
Abstract
The phylogenetic relationships within Neodermata were examined based on putative orthologous groups of proteins (OGPs) from 11 species of Monogenea, Trematoda, and Cestoda. The dataset included OGPs from BUSCO and OMA. Additionally, peptidases were identified and evaluated as phylogenetic markers. Phylogenies were inferred using the maximum likelihood method. A network analysis and a hierarchical grouping analysis of the principal components (HCPC) of orthologous groups of peptidases were performed. The phylogenetic analyses showed the monopisthocotylean monogeneans as the sister-group of cestodes, and the polyopisthocotylean monogeneans as the sister-group of trematodes. However, the sister-group relationship between Monopisthocotylea and Cestoda was not statistically well supported. The network analysis and HCPC also showed a cluster formed by polyopisthocotyleans and trematodes. The present study supports the non-monophyly of Monogenea. An analysis of mutation rates indicated that secreted peptidases and inhibitors, and those with multiple copies, are under positive selection pressure, which could explain the expansion of some families such as C01, C19, I02, and S01. Whilst not definitive, our study presents another point of view in the discussion of the evolution of Neodermata, and we hope that our data drive further discussion and debate on this intriguing topic.
Affiliation(s)
- Víctor Caña-Bozada
- Centro de Investigación en Alimentación y Desarrollo, Mazatlán 82112, Mexico
- Mark W. Robinson
- School of Biological Sciences, Queen’s University Belfast, 19 Chlorine Gardens, Belfast BT9 5DL, UK
- David I. Hernández-Mena
- Centro de Investigación y de Estudios Avanzados, Instituto Politécnico Nacional, Unidad Mérida, Mérida 97310, Mexico
- Francisco N. Morales-Serna
- Instituto de Ciencias del Mar y Limnología, Universidad Nacional Autónoma de México, Mazatlán 82040, Mexico
- Correspondence:
|
28
|
Nesterenko M, Miroliubov A. From head to rootlet: comparative transcriptomic analysis of a rhizocephalan barnacle Peltogaster reticulata (Crustacea: Rhizocephala). F1000Res 2023; 11:583. [PMID: 36447930 PMCID: PMC9664023 DOI: 10.12688/f1000research.110492.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/04/2023] [Indexed: 01/11/2023] Open
Abstract
Background: Rhizocephalan barnacles stand out in the diverse world of metazoan parasites. The body of a rhizocephalan female is modified beyond revealing any recognizable morphological features, consisting of the interna, a system of rootlets, and the externa, a sac-like reproductive body. Moreover, rhizocephalans have an outstanding ability to control their hosts, literally turning them into "zombies". Despite all these amazing traits, there are no genomic or transcriptomic data about any Rhizocephala. Methods: We collected transcriptomes from four body parts of an adult female rhizocephalan Peltogaster reticulata: the externa, and the main, growing, and thoracic parts of the interna. We used all prepared data for the de novo assembly of the reference transcriptome. Next, a set of encoded proteins was determined, the expression levels of protein-coding genes in different parts of the parasite's body were calculated and lists of enriched bioprocesses were identified. We also in silico identified and analyzed sets of potential excretory / secretory proteins. Finally, we applied phylostratigraphy and evolutionary transcriptomics approaches to our data. Results: The assembled reference transcriptome included transcripts of 12,620 protein-coding genes and was the first for any rhizocephalan. Based on the results obtained, the spatial heterogeneity of protein-coding gene expression in different regions of the adult female body of P. reticulata was established. The results of both transcriptomic analysis and histological studies indicated the presence of germ-like cells in the lumen of the interna. The potential molecular basis of the interaction between the nervous system of the host and the parasite's interna was also determined. Given the prolonged expression of development-associated genes, we suggest that rhizocephalans "got stuck in their metamorphosis", even at the reproductive stage. 
Conclusions: The results of the first comparative transcriptomic analysis for Rhizocephala not only clarified but also expanded the existing ideas about the biology of these extraordinary parasites.
Affiliation(s)
- Maksim Nesterenko
- Department of Invertebrate Zoology, St Petersburg State University, St Petersburg, 199034, Russian Federation
- Laboratory of parasitic worms and protists, Zoological Institute of Russian Academy of Sciences, St Petersburg, 199034, Russian Federation
- Aleksei Miroliubov
- Laboratory of parasitic worms and protists, Zoological Institute of Russian Academy of Sciences, St Petersburg, 199034, Russian Federation
|
29
|
Benavides LR, Edgecombe GD, Giribet G. Re-evaluating and dating myriapod diversification with phylotranscriptomics under a regime of dense taxon sampling. Mol Phylogenet Evol 2023; 178:107621. [PMID: 36116731 DOI: 10.1016/j.ympev.2022.107621] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/16/2022] [Revised: 08/17/2022] [Accepted: 08/18/2022] [Indexed: 12/14/2022]
Abstract
Recent transcriptomic studies of myriapod phylogeny have been based on relatively small datasets with <40 myriapod terminals and variably supported or contradicted the traditional morphological groupings of Progoneata and Dignatha. Here we amassed a large dataset of 104 myriapod terminals, including multiple species for each of the four myriapod classes. Across the tree, most nodes are stable and well supported. Most analyses across a range of gene occupancy levels provide moderate to strong support for a deep split of Myriapoda into Symphyla + Pauropoda (=Edafopoda) and an uncontradicted grouping of Chilopoda + Diplopoda (=Pectinopoda nov.), as in other recent transcriptome-based analyses; no analysis recovers Progoneata or Dignatha as clades. As in all recent multi-locus and phylogenomic studies, chilopod interrelationships resolve with Craterostigmus excluded from Amalpighiata rather than uniting with other centipedes with maternal brood care in Phylactometria. Diplopod ordinal interrelationships are largely congruent with morphology-based classifications. Chilognathan clades that are not invariably advocated by morphologists include Glomerida + Glomeridesmida, such that the volvation-related characters of pill millipedes may be convergent, and Stemmiulida + Polydesmida more closely allied to Juliformia than to Callipodida + Chordeumatida. The latter relationship implies homoplasy in spinnerets and contradicts Nematophora. A time-tree with nodes calibrated by 25 myriapod and six outgroup fossil terminals recovers Cambrian-Ordovician divergences for the deepest splits in Myriapoda, Edafopoda and Pectinopoda, predating the terrestrial fossil record of myriapods as in other published chronograms, whereas age estimates within Chilopoda and Diplopoda overlap with or do not appreciably predate the calibration fossils. 
The grouping of Chilopoda and Diplopoda is recovered in all our analyses and is formalized as Pectinopoda nov., named for the shared presence of mandibular comb lamellae. New taxonomic proposals for Chilopoda based on uncontradicted clades are Tykhepoda nov. for the three blind families of Scolopendromorpha that share a "sieve-type" gizzard, and Taktikospina nov. for Scolopendromorpha to the exclusion of Mimopidae.
Affiliation(s)
- Ligia R Benavides
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA.
- Gonzalo Giribet
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
|
30
|
Dylus D, Altenhoff A, Majidian S, Sedlazeck FJ, Dessimoz C. Read2Tree: scalable and accurate phylogenetic trees from raw reads. bioRxiv 2022:2022.04.18.488678. [PMID: 36561179 PMCID: PMC9774205 DOI: 10.1101/2022.04.18.488678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
The inference of phylogenetic trees is foundational to biology. However, state-of-the-art phylogenomics requires running complex pipelines, at significant computational and labour costs, with additional constraints in sequencing coverage, assembly and annotation quality. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes. In a benchmark encompassing a broad variety of datasets, our assembly-free approach was 10-100x faster than conventional approaches, and in most cases more accurate, the exception being when sequencing coverage was high and reference species very distant. To illustrate the broad applicability of the tool, we reconstructed a yeast tree of life of 435 species spanning 590 million years of evolution. Applied to Coronaviridae samples, Read2Tree accurately classified highly diverse animal samples and near-identical SARS-CoV-2 sequences on a single tree, thereby exhibiting remarkable breadth and depth. The speed, accuracy, and versatility of Read2Tree enable comparative genomics at scale.
Affiliation(s)
- David Dylus
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
- present address: F. Hoffmann-La Roche Ltd, Immunology, Infectious Disease, and Ophthalmology (I2O), Roche Pharmaceutical Research and Early Development (pRED), Basel, 4070, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Adrian Altenhoff
- SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Department of Computer Science, ETH, 8092 Zurich, Switzerland
- Sina Majidian
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Christophe Dessimoz
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Department of Computer Science, University College London, London WC1E 6BT, UK
- Centre for Life’s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London WC1E, UK
|
31
|
Multilayered Networks of SalmoNet2 Enable Strain Comparisons of the Salmonella Genus on a Molecular Level. mSystems 2022; 7:e0149321. [PMID: 35913188 PMCID: PMC9426430 DOI: 10.1128/msystems.01493-21] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022] Open
Abstract
Serovars of the genus Salmonella primarily evolved as gastrointestinal pathogens in a wide range of hosts. Some serotypes later evolved further, adopting a more invasive lifestyle in a narrower host range associated with systemic infections. A system-level knowledge of these pathogens could identify the complex adaptations associated with the evolution of serovars with distinct pathogenicity, host range, and risk to human health. This promises to aid the design of interventions and serve as a knowledge base in the Salmonella research community. Here, we present SalmoNet2, a major update to SalmoNet1, the first multilayered interaction resource for Salmonella strains, containing protein-protein, transcriptional regulatory, and enzyme-enzyme interactions. The new version extends the number of Salmonella networks from 11 to 20. We now include a strain from the second species in the Salmonella genus, a strain from the Salmonella enterica subspecies arizonae and additional strains of importance from the subspecies enterica, including S. Typhimurium strain D23580, an epidemic multidrug-resistant strain associated with invasive nontyphoidal salmonellosis (iNTS). The database now uses strain specific metabolic models instead of a generalized model to highlight differences between strains. The update has increased the coverage of high-quality protein-protein interactions, and enhanced interoperability with other computational resources by adopting standardized formats. The resource website has been updated with tutorials to help researchers analyze their Salmonella data using molecular interaction networks from SalmoNet2. SalmoNet2 is accessible at http://salmonet.org/. IMPORTANCE Multilayered network databases collate interaction information from multiple sources, and are powerful both as a knowledge base and subject of analysis. 
Here, we present SalmoNet2, an integrated network resource containing protein-protein, transcriptional regulatory, and metabolic interactions for 20 Salmonella strains. Key improvements to the update include expanding the number of strains, strain-specific metabolic networks, an increase in high-quality protein-protein interactions, community standard computational formats to help interoperability, and online tutorials to help users analyze their data using SalmoNet2.
|
32
|
A thorough annotation of the krill transcriptome offers new insights for the study of physiological processes. Sci Rep 2022; 12:11415. [PMID: 35794144 PMCID: PMC9259678 DOI: 10.1038/s41598-022-15320-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/10/2022] [Accepted: 06/22/2022] [Indexed: 11/09/2022] Open
Abstract
The krill species Euphausia superba plays a critical role in the food chain of the Antarctic ecosystem. Significant changes in climate conditions observed in the Antarctic Peninsula region in the last decades have already altered the distribution of krill and its reproductive dynamics. A deeper understanding of the adaptation capabilities of this species is urgently needed. The availability of a large body of RNA-seq assays allowed us to extend the current knowledge of the krill transcriptome. Our study covered the entire developmental process providing information of central relevance for ecological studies. Here we identified a series of genes involved in different steps of the krill moulting cycle, in the reproductive process and in sexual maturation in accordance with what was already described in previous works. Furthermore, the new transcriptome highlighted the presence of differentially expressed genes previously unknown, playing important roles in cuticle development as well as in energy storage during the krill life cycle. The discovery of new opsin sequences, specifically rhabdomeric opsins, one onychopsin, and one non-visual arthropsin, expands our knowledge of the krill opsin repertoire. We have collected all these results into the KrillDB2 database, a resource combining the latest annotation of the krill transcriptome with a series of analyses targeting genes relevant to krill physiology. KrillDB2 provides in a single resource a comprehensive catalog of krill genes; an atlas of their expression profiles over all RNA-seq datasets publicly available; a study of differential expression across multiple conditions. Finally, it provides initial indications about the expression of microRNA precursors, whose contribution to krill physiology has never been reported before.
|
33
|
Uribe JE, González VL, Irisarri I, Kano Y, Herbert DG, Strong EE, Harasewych MG. A phylogenomic backbone for gastropod molluscs. Syst Biol 2022; 71:1271-1280. [PMID: 35766870 DOI: 10.1093/sysbio/syac045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2021] [Revised: 06/16/2022] [Accepted: 06/24/2022] [Indexed: 11/13/2022] Open
Abstract
Gastropods have survived several mass extinctions during their evolutionary history resulting in extraordinary diversity in morphology, ecology, and developmental modes, which complicate the reconstruction of a robust phylogeny. Currently, gastropods are divided into six subclasses: Caenogastropoda, Heterobranchia, Neomphaliones, Neritimorpha, Patellogastropoda, and Vetigastropoda. Phylogenetic relationships among these taxa historically lack consensus, despite numerous efforts using morphological and molecular information. We generated sequence data for transcriptomes derived from twelve taxa belonging to clades with little or no prior representation in previous studies in order to infer the deeper cladogenetic events within Gastropoda and, for the first time, infer the position of the deep-sea Neomphaliones using a phylogenomic approach. We explored the impact of missing data, homoplasy, and compositional heterogeneity on the inferred phylogenetic hypotheses. We recovered a highly supported backbone for gastropod relationships that is congruent with morphological and mitogenomic evidence, in which Patellogastropoda, true limpets, are the sister lineage to all other gastropods (Orthogastropoda) which are divided into two main clades (i) Vetigastropoda s.l. (including Pleurotomariida + Neomphaliones) and (ii) Neritimorpha + (Caenogastropoda + Heterobranchia). As such, our results support the recognition of five subclasses (or infraclasses) in Gastropoda: Patellogastropoda, Vetigastropoda, Neritimorpha, Caenogastropoda and Heterobranchia.
Affiliation(s)
- Juan E Uribe
- Department of Invertebrate Zoology, MRC 163, National Museum of Natural History, Smithsonian Institution, P O Box 37012 Washington, DC 20013-7012, USA
| | - Vanessa L González
- Global Genome Initiative, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013, USA
| | - Iker Irisarri
- Department of Applied Bioinformatics, Institute for Microbiology and Genetics, University of Göttingen, and Campus Institute Data Science (CIDAS), Göttingen, Germany.,Leibniz Institute for the Analysis of Biodiversity Change (LIB), Zoological Museum Hamburg, Martin-Luther-King-Platz 3, 20146 Hamburg, Germany
| | - Yasunori Kano
- Department of Marine Ecosystems Dynamics, Atmosphere and Ocean Research Institute, The University of Tokyo, Kashiwa, Chiba, Japan
| | - David G Herbert
- Department of Natural Sciences, National Museum Wales, Cathays Park, Cardiff, CF10 3NP, UK
| | - Ellen E Strong
- Department of Invertebrate Zoology, MRC 163, National Museum of Natural History, Smithsonian Institution, P O Box 37012 Washington, DC 20013-7012, USA
| | - M G Harasewych
- Department of Invertebrate Zoology, MRC 163, National Museum of Natural History, Smithsonian Institution, P O Box 37012 Washington, DC 20013-7012, USA
|
34
|
InvL, an Invasin-Like Adhesin, Is a Type II Secretion System Substrate Required for Acinetobacter baumannii Uropathogenesis. mBio 2022; 13:e0025822. [PMID: 35638734 PMCID: PMC9245377 DOI: 10.1128/mbio.00258-22] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 02/08/2023] Open
Abstract
Acinetobacter baumannii is an opportunistic pathogen of growing concern, as isolates are commonly multidrug resistant. While A. baumannii is most frequently associated with pulmonary infections, a significant proportion of clinical isolates come from urinary sources, highlighting its uropathogenic potential. The type II secretion system (T2SS) of commonly used model Acinetobacter strains is important for virulence in various animal models, but the potential role of the T2SS in urinary tract infection (UTI) remains unknown. Here, we used a catheter-associated UTI (CAUTI) model to demonstrate that a modern urinary isolate, UPAB1, requires the T2SS for full virulence. A proteomic screen to identify putative UPAB1 T2SS effectors revealed an uncharacterized lipoprotein with structural similarity to the intimin-invasin family, which serve as type V secretion system (T5SS) adhesins required for the pathogenesis of several bacteria. This protein, designated InvL, lacked the β-barrel domain associated with T5SSs but was confirmed to require the T2SS for both surface localization and secretion. This makes InvL the first identified T2SS effector belonging to the intimin-invasin family. InvL was confirmed to be an adhesin, as the protein bound to extracellular matrix components and mediated adhesion to urinary tract cell lines in vitro. Additionally, the invL mutant was attenuated in the CAUTI model, indicating a role in Acinetobacter uropathogenesis. Finally, bioinformatic analyses revealed that InvL is present in nearly all clinical isolates belonging to international clone 2, a lineage of significant clinical importance. In all, we conclude that the T2SS substrate InvL is an adhesin required for A. baumannii uropathogenesis.
IMPORTANCE While pathogenic Acinetobacter can cause various infections, we recently found that 20% of clinical isolates come from urinary sources. Despite the clinical relevance of Acinetobacter as a uropathogen, few virulence factors involved in urinary tract colonization have been defined. Here, we identify a novel type II secretion system effector, InvL, which is required for full uropathogenesis by a modern urinary isolate. Although InvL has predicted structural similarity to the intimin-invasin family of autotransporter adhesins, InvL is predicted to be anchored to the membrane as a lipoprotein. Similar to other invasin homologs, however, we demonstrate that InvL is a bona fide adhesin capable of binding extracellular matrix components and mediating adhesion to urinary tract cell lines. In all, this work establishes InvL as an adhesin important for Acinetobacter's urinary tract virulence and represents the first report of a type II secretion system effector belonging to the intimin-invasin family.
|
35
|
Djahanschiri B, Di Venanzio G, Distel JS, Breisch J, Dieckmann MA, Goesmann A, Averhoff B, Göttig S, Wilharm G, Feldman MF, Ebersberger I. Evolutionarily stable gene clusters shed light on the common grounds of pathogenicity in the Acinetobacter calcoaceticus-baumannii complex. PLoS Genet 2022; 18:e1010020. [PMID: 35653398 PMCID: PMC9162365 DOI: 10.1371/journal.pgen.1010020] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/14/2022] [Accepted: 04/04/2022] [Indexed: 11/19/2022] Open
Abstract
Nosocomial pathogens of the Acinetobacter calcoaceticus-baumannii (ACB) complex are a cautionary example for the world-wide spread of multi- and pan-drug resistant bacteria. Aiding the urgent demand for novel therapeutic targets, comparative genomics studies between pathogens and their apathogenic relatives shed light on the genetic basis of human-pathogen interaction. Yet, existing studies are limited in taxonomic scope, sensing of the phylogenetic signal, and resolution by largely analyzing genes independent of their organization in functional gene clusters. Here, we explored more than 3,000 Acinetobacter genomes in a phylogenomic framework integrating orthology-based phylogenetic profiling and microsynteny conservation analyses. We delineate gene clusters in the type strain A. baumannii ATCC 19606 whose evolutionary conservation indicates a functional integration of the subsumed genes. These evolutionarily stable gene clusters (ESGCs) reveal metabolic pathways, transcriptional regulators residing next to their targets but also tie together sub-clusters with distinct functions to form higher-order functional modules. We shortlisted 150 ESGCs that either co-emerged with the pathogenic ACB clade or are preferentially found therein. They provide a high-resolution picture of genetic and functional changes that coincide with the manifestation of the pathogenic phenotype in the ACB clade. Key innovations are the remodeling of the regulatory-effector cascade connecting LuxR/LuxI quorum sensing via an intermediate messenger to biofilm formation, the extension of micronutrient scavenging systems, and the increase of metabolic flexibility by exploiting carbon sources that are provided by the human host. We could show experimentally that only members of the ACB clade use kynurenine as a sole carbon and energy source, a substance produced by humans to fine-tune the antimicrobial innate immune response. 
In summary, this study provides a rich and unbiased set of novel testable hypotheses on how pathogenic Acinetobacter interact with and ultimately infect their human host. It is a comprehensive resource for future research into novel therapeutic strategies.
The spread of multi- and pan-drug resistant bacterial pathogens is a worldwide threat to human health. Understanding the genetics of host colonization and infection can substantially help in devising novel ways of treatment. Acinetobacter baumannii, a nosocomial pathogen ranked top by the World Health Organization in the list of bacteria for which novel therapeutic approaches are needed, is a prime example. Here, we have carved out the genetic make-up that distinguishes A. baumannii and its pathogenic next relatives from other and mostly apathogenic Acinetobacter species. We found a rich spectrum of pathways and regulatory modules that reveal how the pathogens have modified biofilm formation, iron scavenging, and their carbohydrate metabolism to adapt to their human host. Among these, the capability to metabolize kynurenine is particularly intriguing. Humans produce this substance to contain bacterial invaders and to fine-tune the innate immune response. But A. baumannii and closely related pathogens found a way to feed on kynurenine. This suggests that the pathogens might be able to dysregulate the human immune response. In summary, our study substantially deepens the understanding of how a highly critical pathogen interacts with its host, which substantially eases the identification of novel targets for innovative therapeutic strategies.
Affiliation(s)
- Bardya Djahanschiri
- Applied Bioinformatics Group, Inst. of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt am Main, Germany
| | - Gisela Di Venanzio
- Department of Molecular Microbiology, Washington University School of Medicine, St Louis, Missouri, United States of America
| | - Jesus S. Distel
- Department of Molecular Microbiology, Washington University School of Medicine, St Louis, Missouri, United States of America
| | - Jennifer Breisch
- Inst. of Molecular Biosciences, Department of Molecular Microbiology and Bioenergetics, Goethe University Frankfurt, Frankfurt am Main, Germany
| | | | - Alexander Goesmann
- Bioinformatics and Systems Biology, Justus Liebig University Gießen, Gießen, Germany
| | - Beate Averhoff
- Inst. of Molecular Biosciences, Department of Molecular Microbiology and Bioenergetics, Goethe University Frankfurt, Frankfurt am Main, Germany
| | - Stephan Göttig
- Institute for Medical Microbiology and Infection Control, University Hospital, Goethe University, Frankfurt, Germany
| | | | - Mario F. Feldman
- Department of Molecular Microbiology, Washington University School of Medicine, St Louis, Missouri, United States of America
| | - Ingo Ebersberger
- Applied Bioinformatics Group, Inst. of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt am Main, Germany
- Senckenberg Biodiversity and Climate Research Centre (S-BIKF), Frankfurt am Main, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
|
36
|
Nesterenko M, Miroliubov A. From head to rootlet: comparative transcriptomic analysis of a rhizocephalan barnacle Peltogaster reticulata (Crustacea: Rhizocephala). F1000Res 2022; 11:583. [PMID: 36447930 PMCID: PMC9664023 DOI: 10.12688/f1000research.110492.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 01/04/2023] [Indexed: 09/16/2023] Open
Abstract
Background: Rhizocephalan barnacles stand out in the diverse world of metazoan parasites. The body of a rhizocephalan female is modified beyond revealing any recognizable morphological features, consisting of the interna, a system of rootlets, and the externa, a sac-like reproductive body. Moreover, rhizocephalans have an outstanding ability to control their hosts, literally turning them into "zombies". Despite all these amazing traits, there are no genomic or transcriptomic data about any Rhizocephala. Methods: We collected transcriptomes from four body parts of an adult female rhizocephalan Peltogaster reticulata: the externa, and the main, growing, and thoracic parts of the interna. We used all prepared data for the de novo assembly of the reference transcriptome. Next, a set of encoded proteins was determined, the expression levels of protein-coding genes in different parts of the parasite's body were calculated and lists of enriched bioprocesses were identified. We also in silico identified and analyzed sets of potential excretory / secretory proteins. Finally, we applied phylostratigraphy and evolutionary transcriptomics approaches to our data. Results: The assembled reference transcriptome included transcripts of 12,620 protein-coding genes and was the first for any rhizocephalan. Based on the results obtained, the spatial heterogeneity of protein-coding gene expression in different regions of the adult female body of P. reticulata was established. The results of both transcriptomic analysis and histological studies indicated the presence of germ-like cells in the lumen of the interna. The potential molecular basis of the interaction between the nervous system of the host and the parasite's interna was also determined. Given the prolonged expression of development-associated genes, we suggest that rhizocephalans "got stuck in their metamorphosis", even at the reproductive stage. 
Conclusions: The results of the first comparative transcriptomic analysis for Rhizocephala not only clarified but also expanded the existing ideas about the biology of these extraordinary parasites.
Affiliation(s)
- Maksim Nesterenko
- Department of Invertebrate Zoology, St Petersburg State University, St Petersburg, 199034, Russian Federation
- Laboratory of parasitic worms and protists, Zoological Institute of Russian Academy of Sciences, St Petersburg, 199034, Russian Federation
| | - Aleksei Miroliubov
- Laboratory of parasitic worms and protists, Zoological Institute of Russian Academy of Sciences, St Petersburg, 199034, Russian Federation
|
37
|
Integrated analysis of microbe-host interactions in Crohn’s disease reveals potential mechanisms of microbial proteins on host gene expression. iScience 2022; 25:103963. [PMID: 35479407 PMCID: PMC9035720 DOI: 10.1016/j.isci.2022.103963] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/03/2021] [Revised: 12/11/2021] [Accepted: 02/18/2022] [Indexed: 12/15/2022] Open
|
38
|
Nath O, Fletcher SJ, Hayward A, Shaw LM, Masouleh AK, Furtado A, Henry RJ, Mitter N. A haplotype resolved chromosomal level avocado genome allows analysis of novel avocado genes. Horticulture Research 2022; 9:uhac157. [PMID: 36204209 PMCID: PMC9531333 DOI: 10.1093/hr/uhac157] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/30/2022] [Revised: 08/01/2022] [Accepted: 07/04/2022] [Indexed: 06/16/2023]
Abstract
Avocado (Persea americana) is a member of the magnoliids, an early branching lineage of angiosperms that has high value globally with the fruit being highly nutritious. Here, we report a chromosome-level genome assembly for the commercial avocado cultivar Hass, which represents 80% of the world's avocado consumption. The DNA contigs produced from Pacific Biosciences HiFi reads were further assembled using a previously published version of the genome supported by a genetic map. The total assembly was 913 Mb with a contig N50 of 84 Mb. Contigs assigned to the 12 chromosomes represented 874 Mb and covered 98.8% of benchmarked single-copy genes from embryophytes. Annotation of protein coding sequences identified 48 915 avocado genes of which 39 207 could be ascribed functions. The genome contained 62.6% repeat elements. Specific biosynthetic pathways of interest in the genome were investigated. The analysis suggested that the predominant pathway of heptose biosynthesis in avocado may be through sedoheptulose 1,7 bisphosphate rather than via alternative routes. Endoglucanase genes were high in number, consistent with avocado using cellulase for fruit ripening. The avocado genome appeared to have a limited number of translocations between homeologous chromosomes, despite having undergone multiple genome duplication events. Proteome clustering with related species permitted identification of genes unique to avocado and other members of the Lauraceae family, as well as genes unique to species diverged near or prior to the divergence of monocots and eudicots. This genome provides a tool to support future advances in the development of elite avocado varieties with higher yields and fruit quality.
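For readers unfamiliar with the contig N50 statistic quoted in this abstract, the following is a minimal Python sketch of how it is commonly computed: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. The function name and the contig lengths are hypothetical illustrations, not data or code from the avocado study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
example_contigs = [80, 70, 50, 40, 30, 20]
print(n50(example_contigs))
```

With these made-up lengths the total is 290, so the running sum passes the halfway mark (145) at the second contig, and the sketch reports 70.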
Affiliation(s)
- Onkar Nath
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane 4072 Australia
| | - Stephen J Fletcher
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane 4072 Australia
| | - Alice Hayward
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane 4072 Australia
| | - Lindsay M Shaw
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane 4072 Australia
| | - Ardashir Kharabian Masouleh
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane 4072 Australia
| | - Agnelo Furtado
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane 4072 Australia
|
39
|
Dylus D, Nevers Y, Altenhoff AM, Gürtler A, Dessimoz C, Glover NM. How to build phylogenetic species trees with OMA. F1000Res 2022; 9:511. [PMID: 35722083 PMCID: PMC9194518 DOI: 10.12688/f1000research.23790.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 02/11/2022] [Indexed: 12/21/2022] Open
Abstract
Knowledge of species phylogeny is critical to many fields of biology. In an era of genome data availability, the most common way to make a phylogenetic species tree is by using multiple protein-coding genes, conserved in multiple species. This methodology is composed of several steps: orthology inference, multiple sequence alignment and inference of the phylogeny with dedicated tools. This can be a difficult task, and orthology inference, in particular, is usually computationally intensive and error prone if done ad hoc. This tutorial provides protocols to make use of OMA Orthologous Groups, a set of genes all orthologous to each other, to infer a phylogenetic species tree. It is designed to be user-friendly and computationally inexpensive, by providing two options: (1) Using only precomputed groups with species available on the OMA Browser, or (2) Computing orthologs using OMA Standalone for additional species, with the option of using precomputed orthology relations for those present in OMA. A protocol for downstream analyses is provided as well, including creating a supermatrix, tree inference, and visualization. All protocols use publicly available software, and we provide scripts and code snippets to facilitate data handling. The protocols are accompanied by practical examples.
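The supermatrix step mentioned in this abstract can be illustrated with a short Python sketch. It assumes each orthologous group has already been aligned and is held as a dictionary mapping species name to aligned sequence; species absent from a group are padded with gap characters. The function name `build_supermatrix` and the toy alignments are hypothetical and are not taken from the OMA tutorial's own scripts.

```python
def build_supermatrix(alignments, gap="-"):
    """Concatenate per-group alignments into one sequence per species.

    alignments: list of dicts {species: aligned_sequence}, one dict per
    orthologous group. Species missing from a group are filled with gaps.
    """
    species = sorted({sp for aln in alignments for sp in aln})
    parts = {sp: [] for sp in species}
    for aln in alignments:
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = widths.pop()
        for sp in species:
            # Pad with gaps when this species has no member in the group.
            parts[sp].append(aln.get(sp, gap * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy alignments for three hypothetical species, for illustration only.
groups = [
    {"A": "MK-L", "B": "MKAL", "C": "MRAL"},
    {"A": "GT", "C": "GS"},
]
for name, seq in build_supermatrix(groups).items():
    print(name, seq)
```

The resulting concatenated alignment would then be passed to a tree inference program; that step is outside the scope of this sketch.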
Affiliation(s)
- David Dylus
- Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, 1015, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, 1015, Switzerland
| | - Yannis Nevers
- Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, 1015, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, 1015, Switzerland
| | - Adrian M Altenhoff
- Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Department of Computer Science, ETH Zurich, Zurich, 8092, Switzerland
| | - Antoine Gürtler
- Department of Computational Biology, University of Lausanne, Lausanne, 1015, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, 1015, Switzerland
| | - Christophe Dessimoz
- Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, 1015, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, 1015, Switzerland
- Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, UK
- Department of Computer Science, University College London, London, WC1E 6BT, UK
| | - Natasha M Glover
- Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, 1015, Switzerland
- Center for Integrative Genomics, University of Lausanne, Lausanne, 1015, Switzerland
|
40
|
Raghavan V, Kraft L, Mesny F, Rigerte L. A simple guide to de novo transcriptome assembly and annotation. Brief Bioinform 2022; 23:6514404. [PMID: 35076693 PMCID: PMC8921630 DOI: 10.1093/bib/bbab563] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/09/2021] [Revised: 12/03/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022] Open
Abstract
A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
Affiliation(s)
- Venket Raghavan
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
| | - Louis Kraft
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
|
41
|
Librado P, Rozas J. Reconstructing Gene Gains and Losses with BadiRate. Methods Mol Biol 2022; 2569:213-232. [PMID: 36083450 DOI: 10.1007/978-1-0716-2691-7_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Estimating gene gain and losses is paramount to understand the molecular mechanisms underlying adaptive evolution. Despite the advent of high-throughput sequencing, such analyses have been so far hampered by the poor contiguity of genome assemblies. The increasing affordability of long-read sequencing technologies will however revolutionize our capacity to identify gene gains and losses at an unprecedented resolution, even in non-model organisms. To thoroughly exploit all such multigene family variation, the software BadiRate implements a collection of birth-and-death stochastic models, aiming at estimating by maximum likelihood the gene turnover rates along the internal and external branches of a given phylogenetic species tree. Its statistical framework also provides versatility for inferring the gene family content at the internal phylogenetic nodes (and to estimate the minimum number of gene gains and losses in each branch), for statistically contrasting competing hypotheses (e.g., accelerations of the gene turnover rates at pre-defined clades), and for pinpointing gene family expansions or contractions likely driven by natural selection. In this chapter we review the theoretical models implemented in BadiRate and illustrate their applicability by analyzing a hypothetical data set of 14 microbial species.
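To make the idea of a birth-and-death model of gene family size concrete, here is a minimal Python sketch of a linear birth-death process along a single branch, in which each gene copy duplicates at rate `birth` and is lost at rate `death`. It is a generic textbook illustration under those assumptions, not BadiRate's code, interface, or exact model set; all function names and parameter values are hypothetical.

```python
import math
import random


def expected_family_size(n0, birth, death, t):
    """Expected copy number after time t under a linear birth-death model."""
    return n0 * math.exp((birth - death) * t)


def simulate_family_size(n0, birth, death, t, rng):
    """Simulate copy number along one branch of length t (Gillespie algorithm)."""
    n, elapsed = n0, 0.0
    per_copy_rate = birth + death
    while n > 0 and per_copy_rate > 0:
        # Waiting time to the next gain or loss among the n current copies.
        elapsed += rng.expovariate(n * per_copy_rate)
        if elapsed > t:
            break
        if rng.random() < birth / per_copy_rate:
            n += 1  # gene gain (duplication)
        else:
            n -= 1  # gene loss
    return n


# Hypothetical parameter values, for illustration only.
rng = random.Random(42)
sizes = [simulate_family_size(5, 0.2, 0.1, 3.0, rng) for _ in range(2000)]
print(sum(sizes) / len(sizes), expected_family_size(5, 0.2, 0.1, 3.0))
```

The simulated mean should approach the analytical expectation as the number of replicates grows. Estimating the rates from observed family sizes by maximum likelihood, as the chapter describes, is a separate step not shown here.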
Affiliation(s)
- Pablo Librado
- Centre for Anthropobiology & Genomics of Toulouse, Université Paul Sabatier, Toulouse, France
| | - Julio Rozas
- Departament de Genètica, Microbiologia I Estadística, and Institut de Recerca de la Biodiversitat, Universitat de Barcelona, Barcelona, Spain.
|
42
|
Wafula EK, Zhang H, Von Kuster G, Leebens-Mack JH, Honaas LA, dePamphilis CW. PlantTribes2: Tools for comparative gene family analysis in plant genomics. Frontiers in Plant Science 2022; 13:1011199. [PMID: 36798801 PMCID: PMC9928214 DOI: 10.3389/fpls.2022.1011199] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/03/2022] [Accepted: 12/02/2022] [Indexed: 05/12/2023]
Abstract
Plant genome-scale resources are being generated at an increasing rate as sequencing technologies continue to improve and raw data costs continue to fall; however, the cost of downstream analyses remains large. This has resulted in a considerable range of genome assembly and annotation qualities across plant genomes due to their varying sizes, complexity, and the technology used for the assembly and annotation. To effectively work across genomes, researchers increasingly rely on comparative genomic approaches that integrate across plant community resources and data types. Such efforts have aided the genome annotation process and yielded novel insights into the evolutionary history of genomes and gene families, including complex non-model organisms. The essential tools to achieve these insights rely on gene family analysis at a genome-scale, but they are not well integrated for rapid analysis of new data, and the learning curve can be steep. Here we present PlantTribes2, a scalable, easily accessible, highly customizable, and broadly applicable gene family analysis framework with multiple entry points including user provided data. It uses objective classifications of annotated protein sequences from existing, high-quality plant genomes for comparative and evolutionary studies. PlantTribes2 can improve transcript models and then sort them, either genome-scale annotations or individual gene coding sequences, into pre-computed orthologous gene family clusters with rich functional annotation information. Then, for gene families of interest, PlantTribes2 performs downstream analyses and customizable visualizations including, (1) multiple sequence alignment, (2) gene family phylogeny, (3) estimation of synonymous and non-synonymous substitution rates among homologous sequences, and (4) inference of large-scale duplication events. 
We give examples of PlantTribes2 applications in functional genomic studies of economically important plant families, namely transcriptomics in the weedy Orobanchaceae and a core orthogroup analysis (CROG) in Rosaceae. PlantTribes2 is freely available for use within the main public Galaxy instance and can be downloaded from GitHub or Bioconda. Importantly, PlantTribes2 can be readily adapted for use with genomic and transcriptomic data from any kind of organism.
Affiliation(s)
- Eric K Wafula
- Department of Biology, The Pennsylvania State University, University Park, PA, United States
| | - Huiting Zhang
- Tree Fruit Research Laboratory, United States Department of Agriculture (USDA), Agricultural Research Service (ARS), Wenatchee, WA, United States
- Department of Horticulture, Washington State University, Pullman, WA, United States
| | - Gregory Von Kuster
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, United States
| | | | - Loren A Honaas
- Tree Fruit Research Laboratory, United States Department of Agriculture (USDA), Agricultural Research Service (ARS), Wenatchee, WA, United States
| | - Claude W dePamphilis
- Department of Biology, The Pennsylvania State University, University Park, PA, United States
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, United States
|
43
|
Tejada-Martinez D, Avelar RA, Lopes I, Zhang B, Novoa G, de Magalhães JP, Trizzino M. Positive selection and enhancer evolution shaped lifespan and body mass in great apes. Mol Biol Evol 2021; 39:6491260. [PMID: 34971383 PMCID: PMC8837823 DOI: 10.1093/molbev/msab369] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
Within primates, the great apes are outliers both in terms of body size and lifespan, since they include the largest and longest-lived species in the order. Yet, the molecular bases underlying such features are poorly understood. Here, we leveraged an integrated approach to investigate multiple sources of molecular variation across primates, focusing on over ten thousand genes, including ∼1,500 previously associated with lifespan, and additional ∼9,000 for which an association with longevity has never been suggested. We analyzed dN/dS rates, positive selection, gene expression (RNA-seq) and gene regulation (ChIP-seq). By analyzing the correlation between dN/dS, maximum lifespan and body mass we identified 276 genes whose rate of evolution positively correlates with maximum lifespan in primates. Further, we identified 5 genes, important for tumor suppression, adaptive immunity, metastasis and inflammation, under positive selection exclusively in the great ape lineage. RNA-seq data, generated from the liver of six species representing all the primate lineages, revealed that 8% of ∼1,500 genes previously associated with longevity are differentially expressed in apes relative to other primates. Importantly, by integrating RNA-seq with ChIP-seq for H3K27ac (which marks active enhancers), we show that the differentially expressed longevity genes are significantly more likely than expected to be located near a novel "ape-specific" enhancer. Moreover, these particular ape-specific enhancers are enriched for young transposable elements, and specifically SINE-Vntr-Alus (SVAs). In summary, we demonstrate that multiple evolutionary forces have contributed to the evolution of lifespan and body size in primates.
Affiliation(s)
- Daniela Tejada-Martinez
  - Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA, USA
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, UK
- Roberto A Avelar
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, UK
- Inês Lopes
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, UK
- Bruce Zhang
  - Institute of Healthy Ageing, and Research Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, UK
- Guy Novoa
  - Department of Structure of Macromolecules, Centro Nacional de Biotecnología-CSIC, Madrid, Spain
- João Pedro de Magalhães
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, UK
- Marco Trizzino
  - Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA, USA
|
44
|
Smith ML, Hahn MW. The Frequency and Topology of Pseudoorthologs. Syst Biol 2021; 71:649-659. [PMID: 34951639 DOI: 10.1093/sysbio/syab097] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/10/2021] [Revised: 12/15/2021] [Accepted: 12/17/2021] [Indexed: 11/12/2022]
Abstract
Phylogenetics has long relied on the use of orthologs, or genes related through speciation events, to infer species relationships. However, identifying orthologs is difficult because gene duplication can obscure relationships among genes. Researchers have been particularly concerned with the insidious effects of pseudoorthologs, duplicated genes that are mistaken for orthologs because they are present in a single copy in each sampled species. Because gene tree topologies of pseudoorthologs may differ from the species tree topology, they have often been invoked as the cause of counterintuitive results in phylogenetics. Despite these perceived problems, no previous work has calculated the probabilities of pseudoortholog topologies, or has been able to circumscribe the regions of parameter space in which pseudoorthologs are most likely to occur. Here, we introduce a model for calculating the probabilities and branch lengths of orthologs and pseudoorthologs, including concordant and discordant pseudoortholog topologies, on a rooted three-taxon species tree. We show that the probability of orthologs is high relative to the probability of pseudoorthologs across reasonable regions of parameter space. Furthermore, the probabilities of the two discordant topologies are equal and never exceed that of the concordant topology, generally being much lower. We describe the species tree topologies most prone to generating pseudoorthologs, finding that they are likely to present problems to phylogenetic inference irrespective of the presence of pseudoorthologs. Overall, our results suggest that pseudoorthologs are unlikely to mislead inferences of species relationships under the biological scenarios considered here.
Affiliation(s)
- Megan L Smith
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
45
|
Chari T, Weissbourd B, Gehring J, Ferraioli A, Leclère L, Herl M, Gao F, Chevalier S, Copley RR, Houliston E, Anderson DJ, Pachter L. Whole-animal multiplexed single-cell RNA-seq reveals transcriptional shifts across Clytia medusa cell types. Sci Adv 2021; 7:eabh1683. [PMID: 34826233 PMCID: PMC8626072 DOI: 10.1126/sciadv.abh1683] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 02/20/2021] [Accepted: 10/06/2021] [Indexed: 05/12/2023]
Abstract
We present an organism-wide, transcriptomic cell atlas of the hydrozoan medusa Clytia hemisphaerica and describe how its component cell types respond to perturbation. Using multiplexed single-cell RNA sequencing, in which individual animals were indexed and pooled from control and perturbation conditions into a single sequencing run, we avoid artifacts from batch effects and are able to discern shifts in cell state in response to organismal perturbations. This work serves as a foundation for future studies of development, function, and regeneration in a genetically tractable jellyfish species. Moreover, we introduce a powerful workflow for high-resolution, whole-animal, multiplexed single-cell genomics that is readily adaptable to other traditional or nontraditional model organisms.
Affiliation(s)
- Tara Chari
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Brandon Weissbourd
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Tianqiao and Chrissy Chen Institute for Neuroscience, Pasadena, CA 91125, USA
  - Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA
- Jase Gehring
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Anna Ferraioli
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- Lucas Leclère
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- Makenna Herl
  - University of New Hampshire School of Law, Concord, NH 03301, USA
- Fan Gao
  - Caltech Bioinformatics Resource Center, California Institute of Technology, Pasadena, CA 91125, USA
- Sandra Chevalier
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- Richard R. Copley
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- Evelyn Houliston
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- David J. Anderson
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Tianqiao and Chrissy Chen Institute for Neuroscience, Pasadena, CA 91125, USA
  - Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA
|
46
|
Benavides LR, Daniels SR, Giribet G. Understanding the real magnitude of the arachnid order Ricinulei through deep Sanger sequencing across its distribution range and phylogenomics, with the formalization of the first species from the Lesser Antilles. J Zool Syst Evol Res 2021. [DOI: 10.1111/jzs.12546] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/31/2022]
Affiliation(s)
- Ligia R. Benavides
  - Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Savel R. Daniels
  - Department of Botany and Zoology, Stellenbosch University, Matieland, South Africa
- Gonzalo Giribet
  - Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
|
47
|
Rubert DP, Doerr D, Braga MDV. The potential of family-free rearrangements towards gene orthology inference. J Bioinform Comput Biol 2021; 19:2140014. [PMID: 34775922 DOI: 10.1142/s021972002140014x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/21/2022]
Abstract
Recently, we proposed an efficient ILP formulation [Rubert DP, Martinez FV, Braga MDV, Natural family-free genomic distance, Algorithms Mol Biol 16:4, 2021] for exactly computing the rearrangement distance of two genomes in a family-free setting. In such a setting, neither prior classification of genes into families, nor further restrictions on the genomes are imposed. Given two genomes, the mentioned ILP computes an optimal matching of the genes taking into account simultaneously local mutations, given by gene similarities, and large-scale genome rearrangements. Here, we explore the potential of using this ILP for inferring groups of orthologs across several species. More precisely, given a set of genomes, our method first computes all pairwise optimal gene matchings, which are then integrated into gene families in the second step. Our approach is implemented into a pipeline incorporating the pre-computation of gene similarities. It can be downloaded from gitlab.ub.uni-bielefeld.de/gi/FFGC. We obtained promising results with experiments on both simulated and real data.
Affiliation(s)
- Diego P Rubert
  - Faculdade de Computação, Universidade Federal de Mato Grosso do Sul, Campo Grande, Brazil
- Daniel Doerr
  - Faculty of Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Marília D V Braga
  - Faculty of Technology and CeBiTec, Bielefeld University, Bielefeld, Germany
|
48
|
Waterworth SC, Parker-Nance S, Kwan JC, Dorrington RA. Comparative Genomics Provides Insight into the Function of Broad-Host Range Sponge Symbionts. mBio 2021; 12:e0157721. [PMID: 34519538 PMCID: PMC8546597 DOI: 10.1128/mbio.01577-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/07/2021] [Accepted: 08/13/2021] [Indexed: 11/20/2022]
Abstract
The fossil record indicates that the earliest evidence of extant marine sponges (phylum Porifera) existed during the Cambrian explosion and that their symbiosis with microbes may have begun in their extinct ancestors during the Precambrian period. Many symbionts have adapted to their sponge host, where they perform specific, specialized functions. There are also widely distributed bacterial taxa such as Poribacteria, SAUL, and Tethybacterales that are found in a broad range of invertebrate hosts. Here, we added 11 new genomes to the Tethybacterales order, identified a novel family, and show that functional potential differs between the three Tethybacterales families. We compare the Tethybacterales with the well-characterized Entoporibacteria and show that these symbionts appear to preferentially associate with low-microbial abundance (LMA) and high-microbial abundance (HMA) sponges, respectively. Within these sponges, we show that these symbionts likely perform distinct functions and may have undergone multiple association events, rather than a single association event followed by coevolution. IMPORTANCE: Marine sponges often form symbiotic relationships with bacteria that fulfil a specific need within the sponge holobiont, and these symbionts are often conserved within a narrow range of related taxa. To date, there exist only three known bacterial taxa (Entoporibacteria, SAUL, and Tethybacterales) that are globally distributed and found in a broad range of sponge hosts, and little is known about the latter two. We show that the functional potential of broad-host range symbionts is conserved at a family level and that these symbionts have been acquired several times over evolutionary history. Finally, it appears that the Entoporibacteria are associated primarily with high-microbial abundance sponges, while the Tethybacterales associate with low-microbial abundance sponges.
Affiliation(s)
- Samantha C. Waterworth
  - Division of Pharmaceutical Sciences, University of Wisconsin, Madison, Wisconsin, USA
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
- Shirley Parker-Nance
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
  - South African Environmental Observation Network, Elwandle Coastal Node, Gqeberha (Port Elizabeth), South Africa
- Jason C. Kwan
  - Division of Pharmaceutical Sciences, University of Wisconsin, Madison, Wisconsin, USA
- Rosemary A. Dorrington
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
  - South African Institute for Aquatic Biodiversity, Makhanda, South Africa
|
49
|
Palmieri N, de Jesus Ramires M, Hess M, Bilic I. Complete genomes of the eukaryotic poultry parasite Histomonas meleagridis: linking sequence analysis with virulence / attenuation. BMC Genomics 2021; 22:753. [PMID: 34674644 PMCID: PMC8529796 DOI: 10.1186/s12864-021-08059-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/11/2020] [Accepted: 10/06/2021] [Indexed: 12/27/2022]
Abstract
Background: Histomonas meleagridis is a protozoan parasite and the causative agent of histomonosis, an important poultry disease whose significance is underlined by the absence of any treatment and prophylaxis. The recent successful in vitro attenuation of the parasite urges questions about the underlying mechanisms. Results: Whole genome sequence data from a virulent and an attenuated strain originating from the same parental lineage of H. meleagridis were recruited using Oxford Nanopore Technology (ONT) and Illumina platforms, which were combined to generate megabase-sized contigs with high base-level accuracy. Inspecting the genomes for differences identified two substantial deletions within a coding sequence of the attenuated strain. Additionally, one single nucleotide polymorphism (SNP) and indel targeting coding sequences caused the formation of premature stop codons, which resulted in the truncation of two genes in the attenuated strain. Furthermore, the genome of H. meleagridis was used for characterizing protein classes of clinical relevance for parasitic protists. The comparative analysis with the genomes of Trichomonas vaginalis, Tritrichomonas foetus and Entamoeba histolytica identified ~2700 lineage-specific gene losses and 9 gene family expansions in the H. meleagridis lineage. Conclusions: Taken as a whole, the obtained data provide the first hints to understand the molecular basis of attenuation in H. meleagridis and constitute a genomics platform for future research on this important poultry pathogen. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08059-2.
Affiliation(s)
- Nicola Palmieri
  - Clinic for Poultry and Fish Medicine, Department for Farm Animals and Veterinary Public Health, University of Veterinary Medicine, Vienna, Austria
- Marcelo de Jesus Ramires
  - Clinic for Poultry and Fish Medicine, Department for Farm Animals and Veterinary Public Health, University of Veterinary Medicine, Vienna, Austria
- Michael Hess
  - Clinic for Poultry and Fish Medicine, Department for Farm Animals and Veterinary Public Health, University of Veterinary Medicine, Vienna, Austria
  - Christian Doppler Laboratory for Innovative Poultry Vaccines (IPOV), University of Veterinary Medicine Vienna, Vienna, Austria
- Ivana Bilic
  - Clinic for Poultry and Fish Medicine, Department for Farm Animals and Veterinary Public Health, University of Veterinary Medicine, Vienna, Austria
|
50
|
Inoue J. ORTHOSCOPE*: a phylogenetic pipeline to infer gene histories from genome-wide data. Mol Biol Evol 2021; 39:6400256. [PMID: 34662403 PMCID: PMC8763121 DOI: 10.1093/molbev/msab301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Comparative genome-scale analyses of protein-coding gene sequences are employed to examine evidence for whole-genome duplication and horizontal gene transfer. For this purpose, an orthogroup should be delineated to infer evolutionary history regarding each gene, and results of all orthogroup analyses need to be integrated to infer a genome-scale history. An orthogroup is a set of genes descended from a single gene in the last common ancestor of all species under consideration. However, such analyses confront several problems: (1) analytical pipelines to infer all gene histories with methods comparing species and gene trees are not fully developed, and (2) without detailed analyses within orthogroups, evolutionary events of paralogous genes in the same orthogroup cannot be distinguished for genome-wide integration of results derived from multiple orthogroup analyses. Here I present an analytical pipeline, ORTHOSCOPE* (star), to infer evolutionary histories of animal/plant genes from genome-scale data. ORTHOSCOPE* estimates a tree for a specified gene, detects speciation/gene duplication events that occurred at nodes belonging to only one lineage leading to a species of interest, and then integrates results derived from gene trees estimated for all query genes in genome-wide data. Thus, ORTHOSCOPE* can be used to detect species nodes just after whole genome duplications as a first step of comparative genomic analyses. Moreover, by examining the presence or absence of genes belonging to species lineages with dense taxon sampling available from the ORTHOSCOPE web version, ORTHOSCOPE* can detect genes lost in specific lineages and horizontal gene transfers. This pipeline is available at https://github.com/jun-inoue/ORTHOSCOPE_STAR.
Affiliation(s)
- Jun Inoue
  - Center for Earth Surface System Dynamics, Atmosphere and Ocean Research Institute, University of Tokyo, Kashiwa, Japan
|