1
Ruperti F, Papadopoulos N, Musser JM, Mirdita M, Steinegger M, Arendt D. Cross-phyla protein annotation by structural prediction and alignment. Genome Biol 2023; 24:113. [PMID: 37173746 PMCID: PMC10176882 DOI: 10.1186/s13059-023-02942-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/02/2022] [Accepted: 04/18/2023] [Indexed: 05/15/2023]
Abstract
BACKGROUND: Protein annotation is a major goal in molecular biology, yet experimentally determined knowledge is typically limited to a few model organisms. In non-model species, the sequence-based prediction of gene orthology can be used to infer protein identity; however, this approach loses predictive power at longer evolutionary distances. Here we propose a workflow for protein annotation using structural similarity, exploiting the fact that similar protein structures often reflect homology and are more conserved than protein sequences.
RESULTS: We propose a workflow of openly available tools for the functional annotation of proteins via structural similarity (MorF: MorphologFinder) and use it to annotate the complete proteome of a sponge. Sponges are highly relevant for inferring the early history of animals, yet their proteomes remain sparsely annotated. MorF accurately predicts the functions of proteins with known homology in [Formula: see text] cases and annotates an additional [Formula: see text] of the proteome beyond standard sequence-based methods. We uncover new functions for sponge cell types, including extensive FGF, TGF, and Ephrin signaling in sponge epithelia, and redox metabolism and control in myopeptidocytes. Notably, we also annotate genes specific to the enigmatic sponge mesocytes, proposing they function to digest cell walls.
CONCLUSIONS: Our work demonstrates that structural similarity is a powerful approach that complements and extends sequence similarity searches to identify homologous proteins over long evolutionary distances. We anticipate this will be a powerful approach that boosts discovery in numerous -omics datasets, especially for non-model organisms.
Affiliation(s)
- Fabian Ruperti
  - Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - Faculty of Biosciences, Collaboration for joint Ph.D. degree between EMBL and Heidelberg University, Heidelberg, Germany
- Nikolaos Papadopoulos
  - Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - Department for Evolutionary Biology, University of Vienna, Vienna, Austria
- Jacob M Musser
  - Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Milot Mirdita
  - School of Biological Sciences, Seoul National University, Seoul, South Korea
- Martin Steinegger
  - School of Biological Sciences, Seoul National University, Seoul, South Korea
  - Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
- Detlev Arendt
  - Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - Centre for Organismal Studies, University of Heidelberg, Heidelberg, Germany
2
Ortiz J, Bobkov YV, DeBiasse MB, Mitchell DG, Edgar A, Martindale MQ, Moss AG, Babonis LS, Ryan JF. Independent Innexin Radiation Shaped Signaling in Ctenophores. Mol Biol Evol 2023; 40:7026321. [PMID: 36740225 PMCID: PMC9949713 DOI: 10.1093/molbev/msad025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 12/30/2022] [Accepted: 01/25/2023] [Indexed: 02/07/2023]
Abstract
Innexins facilitate cell-cell communication by forming gap junctions or nonjunctional hemichannels, which play important roles in metabolic, chemical, ionic, and electrical coupling. The lack of knowledge regarding the evolution and role of these channels in ctenophores (comb jellies), the likely sister group to the rest of animals, represents a substantial gap in our understanding of the evolution of intercellular communication in animals. Here, we identify and phylogenetically characterize the complete set of innexins of four ctenophores: Mnemiopsis leidyi, Hormiphora californensis, Pleurobrachia bachei, and Beroe ovata. Our phylogenetic analyses suggest that ctenophore innexins diversified independently from those of other animals and were established early in the emergence of ctenophores. We identified a four-innexin genomic cluster, which was present in the last common ancestor of these four species and has been largely maintained in these lineages. Evidence from correlated spatial and temporal gene expression of the M. leidyi innexin cluster suggests that this cluster has been maintained due to constraints related to gene regulation. We describe the basic electrophysiological properties of putative ctenophore hemichannels from muscle cells using intracellular recording techniques, showing substantial overlap with the properties of bilaterian innexin channels. Together, our results suggest that the last common ancestor of animals had gap junctional channels also capable of forming functional innexin hemichannels, and that innexin genes have independently evolved in major lineages throughout Metazoa.
Affiliation(s)
- Melissa B DeBiasse
  - Whitney Laboratory for Marine Bioscience, University of Florida, St Augustine, FL, USA; School of Natural Sciences, University of California Merced, Merced, CA, USA
- Dorothy G Mitchell
  - Whitney Laboratory for Marine Bioscience, University of Florida, St Augustine, FL, USA; Department of Biology, University of Florida, Gainesville, FL, USA
- Allison Edgar
  - Whitney Laboratory for Marine Bioscience, University of Florida, St Augustine, FL, USA
- Mark Q Martindale
  - Whitney Laboratory for Marine Bioscience, University of Florida, St Augustine, FL, USA; Department of Biology, University of Florida, Gainesville, FL, USA
- Anthony G Moss
  - Biological Sciences Department, Auburn University, Auburn, AL, USA
- Leslie S Babonis
  - Whitney Laboratory for Marine Bioscience, University of Florida, St Augustine, FL, USA; Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA
3
The Roles of Protein Structure, Taxon Sampling, and Model Complexity in Phylogenomics: A Case Study Focused on Early Animal Divergences. Biophysica 2021. [DOI: 10.3390/biophysica1020008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Despite the long history of using protein sequences to infer the tree of life, the potential for different parts of protein structures to retain historical signal remains unclear. We propose that it might be possible to improve analyses of phylogenomic datasets by incorporating information about protein structure. We test this idea using the position of the root of Metazoa (animals) as a model system. We examined the distribution of “strongly decisive” sites (alignment positions that support a specific tree topology) in a dataset comprising >1500 proteins and almost 100 taxa. The proportion of each class of strongly decisive sites in different structural environments was very sensitive to the model used to analyze the data when a limited number of taxa were used, but it was stable when taxa were added. As long as enough taxa were analyzed, sites in all structural environments supported the same topology regardless of whether standard tree searches or decisive sites were used to select the optimal tree. However, the use of decisive sites revealed a difference between the support for minority topologies for sites in different structural environments: buried sites and sites in sheet and coil environments exhibited equal support for the minority topologies, whereas solvent-exposed and helix sites had unequal numbers of sites supporting the minority topologies. This suggests that the relatively slowly evolving buried, sheet, and coil sites are giving an accurate picture of the true species tree and the amount of conflict among gene trees. Taken as a whole, this study indicates that phylogenetic analyses using sites in different structural environments can yield different topologies for the deepest branches in the animal tree of life and that analyzing larger numbers of taxa eliminates this conflict. More broadly, our results highlight the desirability of incorporating information about protein structure into phylogenomic analyses.
4
Lateral Gene Transfer Mechanisms and Pan-genomes in Eukaryotes. Trends Parasitol 2020; 36:927-941. [DOI: 10.1016/j.pt.2020.07.014] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/24/2020] [Revised: 07/20/2020] [Accepted: 07/20/2020] [Indexed: 02/06/2023]
5
Moreland RT, Nguyen AD, Ryan JF, Baxevanis AD. The Mnemiopsis Genome Project Portal: integrating new gene expression resources and improving data visualization. Database: The Journal of Biological Databases and Curation 2020; 2020:5834871. [PMID: 32386298 DOI: 10.1093/database/baaa029] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/08/2020] [Accepted: 03/22/2020] [Indexed: 11/13/2022]
Abstract
Following the completion of the genome sequencing and gene prediction of Mnemiopsis leidyi, a lobate ctenophore that is native to the coastal waters of the western Atlantic Ocean, we developed and implemented the Mnemiopsis Genome Project Portal (MGP Portal), a comprehensive Web-based data portal for navigating the genome sequence and gene annotations. In the years following the first release of the MGP Portal, it has become evident that the inclusion of data from significant published studies on Mnemiopsis has been critical to its adoption as the centralized resource for this emerging model organism. With this most recent update, the Portal has significantly expanded to include in situ images, temporal developmental expression profiles and single-cell expression data. Recent enhancements also include implementations of an updated BLAST interface, new graphical visualization tools and updates to gene pages that integrate all new data types. Database URL: https://research.nhgri.nih.gov/mnemiopsis/.
Affiliation(s)
- R Travis Moreland
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Anh-Dao Nguyen
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Joseph F Ryan
  - Whitney Laboratory for Marine Bioscience, University of Florida, St. Augustine, FL 32080, USA; Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Andreas D Baxevanis
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA