1. Rossier V, Train C, Nevers Y, Robinson-Rechavi M, Dessimoz C. Matreex: Compact and Interactive Visualization for Scalable Studies of Large Gene Families. Genome Biol Evol 2024; 16:evae100. [PMID: 38742690 PMCID: PMC11149776 DOI: 10.1093/gbe/evae100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 04/17/2024] [Accepted: 05/03/2024] [Indexed: 05/16/2024] [Open access]
Abstract
Studying gene family evolution strongly benefits from insightful visualizations. However, the ever-growing number of sequenced genomes is leading to increasingly larger gene families, which challenges existing gene tree visualizations. Indeed, most of them present users with a dilemma: display complete but intractable gene trees, or collapse subtrees, thereby hiding their children's information. Here, we introduce Matreex, a new dynamic tool to scale up the visualization of gene families. Matreex's key idea is to use "phylogenetic" profiles, which are dense representations of gene repertoires, to minimize the information loss when collapsing subtrees. We illustrate Matreex's usefulness with three biological applications. First, we demonstrate on the MutS family the power of combining gene trees and phylogenetic profiles to delve into precise evolutionary analyses of large multicopy gene families. Second, by displaying 22 intraflagellar transport gene families across 622 species cumulating 5,500 representatives, we show how Matreex can be used to automate large-scale analyses of gene presence-absence. Notably, we report for the first time the complete loss of intraflagellar transport in the myxozoan Thelohanellus kitauei. Finally, using the textbook example of visual opsins, we show Matreex's potential to create easily interpretable figures for teaching and outreach. Matreex is available from the Python Package Index (pip install Matreex) with the source code and documentation available at https://github.com/DessimozLab/matreex.
Affiliation(s)
- Victor Rossier
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Comparative Genomics, Lausanne, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Clement Train
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Yannis Nevers
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Comparative Genomics, Lausanne, Switzerland
- Marc Robinson-Rechavi
  - SIB Swiss Institute of Bioinformatics, Comparative Genomics, Lausanne, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Comparative Genomics, Lausanne, Switzerland
2. Cui X, Xue Y, McCormack C, Garces A, Rachman TW, Yi Y, Stolzer M, Durand D. Simulating domain architecture evolution. Bioinformatics 2022; 38:i134-i142. [PMID: 35758772 PMCID: PMC9236583 DOI: 10.1093/bioinformatics/btac242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] [Open access]
Abstract
Motivation: Simulation is an essential technique for generating biomolecular data with a ‘known’ history for use in validating phylogenetic inference and other evolutionary methods. On longer time scales, simulation supports investigations of equilibrium behavior and provides a formal framework for testing competing evolutionary hypotheses. Twenty years of molecular evolution research have produced a rich repertoire of simulation methods. However, current models do not capture the stringent constraints acting on the domain insertions, duplications, and deletions by which multidomain architectures evolve. Although these processes have the potential to generate any combination of domains, only a tiny fraction of possible domain combinations are observed in nature. Modeling these stringent constraints on domain order and co-occurrence is a fundamental challenge in domain architecture simulation that does not arise with sequence and gene family simulation. Results: Here, we introduce a stochastic model of domain architecture evolution to simulate evolutionary trajectories that reflect the constraints on domain order and co-occurrence observed in nature. This framework is implemented in a novel domain architecture simulator, DomArchov, using the Metropolis–Hastings algorithm with data-driven transition probabilities. The use of a data-driven event module enables quick and easy redeployment of the simulator for use in different taxonomic and protein function contexts. Using empirical evaluation with metazoan datasets, we demonstrate that domain architectures simulated by DomArchov recapitulate properties of genuine domain architectures that reflect the constraints on domain order and adjacency seen in nature. This work expands the realm of evolutionary processes that are amenable to simulation. Availability and implementation: DomArchov is written in Python 3 and is available at http://www.cs.cmu.edu/~durand/DomArchov. The data underlying this article are available via the same link. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiaoyue Cui
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Yifan Xue
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Collin McCormack
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Alejandro Garces
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Thomas W Rachman
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Yang Yi
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Maureen Stolzer
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Dannie Durand
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
3. Pasternak Z, Chapnik N, Yosef R, Kopelman NM, Jurkevitch E, Segev E. Identifying protein function and functional links based on large-scale co-occurrence patterns. PLoS One 2022; 17:e0264765. [PMID: 35239724 PMCID: PMC8893610 DOI: 10.1371/journal.pone.0264765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Accepted: 02/16/2022] [Indexed: 11/23/2022] [Open access]
Abstract
Objective: The vast majority of known proteins have not been experimentally tested even at the level of measuring their expression, and the function of many proteins remains unknown. In order to decipher protein function and examine functional associations, we developed "Cliquely", a software tool based on the exploration of co-occurrence patterns. Computational model: Using a set of more than 23 million proteins divided into 404,947 orthologous clusters, we explored the co-occurrence graph of 4,742 fully sequenced genomes from the three domains of life. Edge weights in this graph represent co-occurrence probabilities. We use the Bron–Kerbosch algorithm to detect maximal cliques in this graph, that is, fully connected subgraphs that represent meaningful biological networks from different functional categories. Main results: We demonstrate that Cliquely can successfully identify known networks from various pathways, including nitrogen fixation, glycolysis, methanogenesis, mevalonate and ribosome proteins. In identifying the virulence-associated type III secretion system (T3SS) network, Cliquely also added 13 previously uncharacterized proteins to that network, demonstrating the strength of this approach. Cliquely is freely available and open source. Users can employ the tool to explore co-occurrence networks using a protein of interest and a customizable level of stringency, either for the entire dataset or for one of the three domains: Archaea, Bacteria, or Eukarya.
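The clique-detection step named in this abstract can be sketched as follows. This is a minimal illustration, not Cliquely's code: it assumes that the "level of stringency" acts as a probability cutoff on edge weights, and the function names (`threshold_graph`, `bron_kerbosch`), the cluster labels, and the weights are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def threshold_graph(
    weights: Dict[Tuple[str, str], float], min_prob: float
) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map, keeping edges with weight >= min_prob.

    Assumption: stringency is modelled as a simple cutoff on the
    co-occurrence probability; the abstract does not specify this.
    """
    adj: Dict[str, Set[str]] = {}
    for (u, v), prob in weights.items():
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v and prob >= min_prob:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def bron_kerbosch(adj: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """Enumerate all maximal cliques (Bron-Kerbosch with pivoting)."""
    cliques: List[FrozenSet[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda n: len(adj[n] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v is fully explored
            x = x | {v}  # exclude v from later cliques at this level

    expand(set(), set(adj), set())
    return cliques


# Hypothetical orthologous clusters and co-occurrence probabilities.
example_weights = {
    ("nifH", "nifD"): 0.90,
    ("nifH", "nifK"): 0.85,
    ("nifD", "nifK"): 0.80,
    ("nifK", "rpsA"): 0.30,
    ("rpsA", "rpsB"): 0.95,
}
example_cliques = bron_kerbosch(threshold_graph(example_weights, min_prob=0.7))
```

Raising `min_prob` removes weak edges before enumeration, so cliques become smaller and fewer; this is one plausible reading of a "customizable level of stringency".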
Affiliation(s)
- Zohar Pasternak
  - Division of Identification and Forensic Science, Israel Police, Jerusalem, Israel
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Noam Chapnik
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Roy Yosef
  - Faculty of Management of Technology, Holon Institute of Technology, Holon, Israel
- Naama M. Kopelman
  - Faculty of Science, Holon Institute of Technology, Holon, Israel
- Edouard Jurkevitch
  - Department of Plant Pathology and Microbiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Elad Segev
  - Faculty of Science, Holon Institute of Technology, Holon, Israel
4. Ilnitskiy IS, Zharikova AA, Mironov AA. OUP accepted manuscript. Nucleic Acids Res 2022; 50:W534-W540. [PMID: 35610035 PMCID: PMC9252792 DOI: 10.1093/nar/gkac385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 04/19/2022] [Accepted: 04/29/2022] [Indexed: 11/27/2022] [Open access]
Abstract
Extensive amounts of data from next-generation sequencing and omics studies have led to the accumulation of information that provides insight into the evolutionary landscape of related proteins. Here, we present OrthoQuantum, a web server that allows for time-efficient analysis and visualization of phylogenetic profiles of any set of eukaryotic proteins. It is a simple-to-use tool capable of searching large input sets of proteins. Using data from open source databases of orthologous sequences in a wide range of taxonomic groups, it enables users to assess coupled evolutionary patterns and helps define lineage-specific innovations. The web interface allows users to perform queries with gene names and UniProt identifiers in different phylogenetic clades and to supplement presence data with an additional BLAST search. The conservation patterns of proteins are coded as binary vectors, i.e., strings that encode the presence or absence of orthologous proteins in other genomes. These strings are used to calculate the top-scoring correlation pairs needed for finding co-inherited proteins, which are simultaneously present or simultaneously absent in specific lineages. Profiles are visualized in combination with phylogenetic trees in a JavaScript-based interface. The OrthoQuantum v1.0 web server is freely available at http://orthoq.bioinf.fbb.msu.ru along with documentation and a tutorial.
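The binary-vector encoding and pairwise correlation described in this abstract can be illustrated with a short sketch. It is not OrthoQuantum's implementation: the abstract does not name the correlation measure, so Pearson correlation on 0/1 vectors is assumed here, and the function names, protein labels, and profile strings are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def parse_profile(bits: str) -> List[int]:
    """Turn a presence/absence string such as '1101' into a 0/1 vector."""
    return [int(ch) for ch in bits]


def pearson(a: List[int], b: List[int]) -> float:
    """Pearson correlation of two equal-length vectors.

    Returns 0.0 when either vector is constant (correlation undefined),
    which is a convention chosen for this sketch.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def top_pairs(profiles: Dict[str, str], k: int) -> List[Tuple[float, str, str]]:
    """Return the k protein pairs with the highest profile correlation."""
    vectors = {name: parse_profile(bits) for name, bits in profiles.items()}
    scored = [
        (pearson(vectors[p], vectors[q]), p, q)
        for p, q in combinations(sorted(vectors), 2)
    ]
    scored.sort(key=lambda item: -item[0])  # highest correlation first
    return scored[:k]


# Hypothetical profiles: one character per genome, 1 = ortholog present.
example_profiles = {
    "protA": "11011010",
    "protB": "11011010",  # same pattern as protA: candidate co-inherited pair
    "protC": "00100101",  # complementary pattern to protA
    "protD": "11111111",  # present everywhere: carries no signal
}
best = top_pairs(example_profiles, k=1)
```

A protein present in every genome, like `protD`, has zero variance and therefore cannot be correlated with anything; returning 0.0 for it keeps such profiles out of the top-scoring pairs.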
Affiliation(s)
- Anastasia A Zharikova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Lomonosovsky Prospect 27, Building 10, 119991 Moscow, Russia
  - Kharkevich Institute of Information Transmission Problems, Russian Academy of Sciences, Big Karetny Lane 19, Building 1, 127051 Moscow, Russia
- Andrey A Mironov
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Lomonosovsky Prospect 27, Building 10, 119991 Moscow, Russia
  - Kharkevich Institute of Information Transmission Problems, Russian Academy of Sciences, Big Karetny Lane 19, Building 1, 127051 Moscow, Russia
5. Defosset A, Kress A, Nevers Y, Ripp R, Thompson JD, Poch O, Lecompte O. Proteome-Scale Detection of Differential Conservation Patterns at Protein and Subprotein Levels with BLUR. Genome Biol Evol 2020; 13:5991441. [PMID: 33211099 PMCID: PMC7851591 DOI: 10.1093/gbe/evaa248] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 11/18/2020] [Indexed: 11/23/2022] [Open access]
Abstract
In the multiomics era, comparative genomics studies based on gene repertoire comparison are increasingly used to investigate evolutionary histories of species, to study genotype–phenotype relations, species adaptation to various environments, or to predict gene function using phylogenetic profiling. However, comparisons of orthologs have highlighted the prevalence of sequence plasticity among species, showing the benefits of combining protein and subprotein levels of analysis to allow for a more comprehensive study of genotype/phenotype correlations. In this article, we introduce a new approach called BLUR (BLAST Unexpected Ranking), capable of detecting genotype divergence or specialization between two related clades at different levels: gain/loss of proteins but also of subprotein regions. These regions can correspond to known domains, uncharacterized regions, or even small motifs. Our method was created to allow two types of research strategies: 1) the comparison of two groups of species with no previous knowledge, with the aim of predicting phenotype differences or specializations between close species or 2) the study of specific phenotypes by comparing species that present the phenotype of interest with species that do not. We designed a website to facilitate the use of BLUR with a possibility of in-depth analysis of the results with various tools, such as functional enrichments, protein–protein interaction networks, and multiple sequence alignments. We applied our method to the study of two different biological pathways and to the comparison of several groups of close species, all with very promising results. BLUR is freely available at http://lbgi.fr/blur/.
Affiliation(s)
- Audrey Defosset
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Arnaud Kress
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Yannis Nevers
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Switzerland
- Raymond Ripp
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Julie D Thompson
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Olivier Poch
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Odile Lecompte
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
6. Pourhaghighi R, Ash PEA, Phanse S, Goebels F, Hu LZM, Chen S, Zhang Y, Wierbowski SD, Boudeau S, Moutaoufik MT, Malty RH, Malolepsza E, Tsafou K, Nathan A, Cromar G, Guo H, Abdullatif AA, Apicco DJ, Becker LA, Gitler AD, Pulst SM, Youssef A, Hekman R, Havugimana PC, White CA, Blum BC, Ratti A, Bryant CD, Parkinson J, Lage K, Babu M, Yu H, Bader GD, Wolozin B, Emili A. BraInMap Elucidates the Macromolecular Connectivity Landscape of Mammalian Brain. Cell Syst 2020; 10:333-350.e14. [PMID: 32325033 PMCID: PMC7938770 DOI: 10.1016/j.cels.2020.03.003] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 11/13/2018] [Revised: 11/25/2019] [Accepted: 03/20/2020] [Indexed: 12/12/2022]
Abstract
Connectivity webs mediate the unique biology of the mammalian brain. Yet, while cell circuit maps are increasingly available, knowledge of their underlying molecular networks remains limited. Here, we applied multi-dimensional biochemical fractionation with mass spectrometry and machine learning to survey endogenous macromolecules across the adult mouse brain. We defined a global "interactome" comprising over one thousand multi-protein complexes. These include hundreds of brain-selective assemblies that have distinct physical and functional attributes, show regional and cell-type specificity, and have links to core neurological processes and disorders. Using reciprocal pull-downs and a transgenic model, we validated a putative 28-member RNA-binding protein complex associated with amyotrophic lateral sclerosis, suggesting a coordinated function in alternative splicing in disease progression. This brain interaction map (BraInMap) resource facilitates mechanistic exploration of the unique molecular machinery driving core cellular processes of the central nervous system. It is publicly available and can be explored here https://www.bu.edu/dbin/cnsb/mousebrain/.
Affiliation(s)
- Reza Pourhaghighi
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Peter E A Ash
  - Department of Pharmacology and Experimental Therapeutics, Boston University School of Medicine, Boston, MA, USA
- Sadhna Phanse
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
  - Department of Biochemistry, University of Regina, Regina, SK, Canada
  - Center for Network Systems Biology, Boston University, Boston, MA, USA
- Florian Goebels
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Lucas Z M Hu
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Siwei Chen
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
- Yingying Zhang
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
- Shayne D Wierbowski
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
- Samantha Boudeau
  - Department of Pharmacology and Experimental Therapeutics, Boston University School of Medicine, Boston, MA, USA
- Ramy H Malty
  - Department of Biochemistry, University of Regina, Regina, SK, Canada
- Edyta Malolepsza
  - Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard University, Boston, MA, USA
- Kalliopi Tsafou
  - Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard University, Boston, MA, USA
- Aparna Nathan
  - Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard University, Boston, MA, USA
- Graham Cromar
  - Program in Molecular Medicine, Hospital for Sick Children and University of Toronto, Toronto, ON, Canada
- Hongbo Guo
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Ali Al Abdullatif
  - Department of Pharmacology and Experimental Therapeutics, Boston University School of Medicine, Boston, MA, USA
- Daniel J Apicco
  - Department of Pharmacology and Experimental Therapeutics, Boston University School of Medicine, Boston, MA, USA
- Lindsay A Becker
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Aaron D Gitler
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Stefan M Pulst
  - Department of Neurology, University of Utah, Salt Lake City, UT, USA
- Ahmed Youssef
  - Program in Bioinformatics, Boston University, Boston, MA, USA
  - Center for Network Systems Biology, Boston University, Boston, MA, USA
  - Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, MA, USA
- Ryan Hekman
  - Center for Network Systems Biology, Boston University, Boston, MA, USA
  - Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, MA, USA
- Pierre C Havugimana
  - Center for Network Systems Biology, Boston University, Boston, MA, USA
  - Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, MA, USA
  - Departments of Biochemistry and Biology, Boston University, Boston, MA, USA
- Carl A White
  - Center for Network Systems Biology, Boston University, Boston, MA, USA
  - Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, MA, USA
- Benjamin C Blum
  - Center for Network Systems Biology, Boston University, Boston, MA, USA
  - Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, MA, USA
- Antonia Ratti
  - Department of Neurology and Laboratory of Neuroscience, IRCCS, Milan, Italy
- Camron D Bryant
  - Department of Pharmacology and Experimental Therapeutics, Boston University School of Medicine, Boston, MA, USA
- John Parkinson
  - Program in Molecular Medicine, Hospital for Sick Children and University of Toronto, Toronto, ON, Canada
- Kasper Lage
  - Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard University, Boston, MA, USA
- Mohan Babu
  - Department of Biochemistry, University of Regina, Regina, SK, Canada
- Haiyuan Yu
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
- Gary D Bader
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Benjamin Wolozin
  - Department of Pharmacology and Experimental Therapeutics, Boston University School of Medicine, Boston, MA, USA
  - Department of Neurology, Boston University School of Medicine, Boston, MA, USA
  - Program in Neuroscience, Boston University, Boston, MA, USA
- Andrew Emili
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
  - Program in Bioinformatics, Boston University, Boston, MA, USA
  - Center for Network Systems Biology, Boston University, Boston, MA, USA
  - Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, MA, USA
  - Departments of Biochemistry and Biology, Boston University, Boston, MA, USA
7. Wei Y, Xiong ZJ, Li J, Zou C, Cairo CW, Klassen JS, Privé GG. Crystal structures of human lysosomal EPDR1 reveal homology with the superfamily of bacterial lipoprotein transporters. Commun Biol 2019; 2:52. [PMID: 30729188 PMCID: PMC6363788 DOI: 10.1038/s42003-018-0262-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/13/2018] [Accepted: 12/11/2018] [Indexed: 01/01/2023] [Open access]
Abstract
EPDR1, a member of the ependymin-related protein family, is a relatively uncharacterized protein found in the lysosomes and secretomes of most vertebrates. Despite having roles in human disease and health, the molecular functions of EPDR1 remain unknown. Here, we present crystal structures of human EPDR1 and reveal that the protein adopts a fold previously seen only in bacterial proteins related to the LolA lipoprotein transporter. EPDR1 forms a homodimer with an overall shape resembling a half-shell with two non-overlapping hydrophobic grooves on the flat side of the hemisphere. EPDR1 can interact with membranes that contain negatively charged lipids, including BMP and GM1, and we suggest that EPDR1 may function as a lysosomal activator protein or a lipid transporter. A phylogenetic analysis reveals that the fold is more widely distributed than previously suspected, with representatives identified in all branches of cellular life.
Affiliation(s)
- Yong Wei
  - Princess Margaret Cancer Centre, Toronto, M5G 1L7 ON Canada
- Zi Jian Xiong
  - Department of Biochemistry, University of Toronto, Toronto, M5S 1A8 ON Canada
- Jun Li
  - Alberta Glycomics Centre and Department of Chemistry, University of Alberta, Edmonton, T6G 2G2 AB Canada
- Chunxia Zou
  - Alberta Glycomics Centre and Department of Chemistry, University of Alberta, Edmonton, T6G 2G2 AB Canada
- Christopher W. Cairo
  - Alberta Glycomics Centre and Department of Chemistry, University of Alberta, Edmonton, T6G 2G2 AB Canada
- John S. Klassen
  - Alberta Glycomics Centre and Department of Chemistry, University of Alberta, Edmonton, T6G 2G2 AB Canada
- Gilbert G. Privé
  - Princess Margaret Cancer Centre, Toronto, M5G 1L7 ON Canada
  - Department of Biochemistry, University of Toronto, Toronto, M5S 1A8 ON Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, M5G 1L7 ON Canada
8. Swapna LS, Molinaro AM, Lindsay-Mosher N, Pearson BJ, Parkinson J. Comparative transcriptomic analyses and single-cell RNA sequencing of the freshwater planarian Schmidtea mediterranea identify major cell types and pathway conservation. Genome Biol 2018; 19:124. [PMID: 30143032 PMCID: PMC6109357 DOI: 10.1186/s13059-018-1498-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 01/15/2018] [Accepted: 08/01/2018] [Indexed: 12/15/2022] [Open access]
Abstract
BACKGROUND: In the Lophotrochozoa/Spiralia superphylum, few organisms have as high a capacity for rapid testing of gene function and single-cell transcriptomics as the freshwater planaria. The species Schmidtea mediterranea in particular has become a powerful model to use in studying adult stem cell biology and mechanisms of regeneration. Despite this, systematic attempts to define gene complements and their annotations are lacking, restricting comparative analyses that detail the conservation of biochemical pathways and identify lineage-specific innovations. RESULTS: In this study we compare several transcriptomes and define a robust set of 35,232 transcripts. From this, we perform systematic functional annotations and undertake a genome-scale metabolic reconstruction for S. mediterranea. Cross-species comparisons of gene content identify conserved, lineage-specific, and expanded gene families, which may contribute to the regenerative properties of planarians. In particular, we find that the TRAF gene family has been greatly expanded in planarians. We further provide a single-cell RNA sequencing analysis of 2000 cells, revealing both known and novel cell types defined by unique signatures of gene expression. Among these are a novel mesenchymal cell population as well as a cell type involved in eye regeneration. Integration of our metabolic reconstruction further reveals the extent to which given cell types have adapted energy and nucleotide biosynthetic pathways to support their specialized roles. CONCLUSIONS: In general, S. mediterranea displays a high level of gene and pathway conservation compared with other model systems, rendering it a viable model to study the roles of these pathways in stem cell biology and regeneration.
Affiliation(s)
- Alyssa M Molinaro
  - Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Nicole Lindsay-Mosher
  - Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Bret J Pearson
  - Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
- John Parkinson
  - Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Biochemistry, University of Toronto, Toronto, ON, Canada
9. Niu Y, Liu C, Moghimyfiroozabad S, Yang Y, Alavian KN. PrePhyloPro: phylogenetic profile-based prediction of whole proteome linkages. PeerJ 2017; 5:e3712. [PMID: 28875072 PMCID: PMC5578374 DOI: 10.7717/peerj.3712] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/13/2017] [Accepted: 07/28/2017] [Indexed: 02/05/2023] [Open access]
Abstract
Direct and indirect functional links between proteins as well as their interactions as part of larger protein complexes or common signaling pathways may be predicted by analyzing the correlation of their evolutionary patterns. Based on phylogenetic profiling, here we present a highly scalable and time-efficient computational framework for predicting linkages within the whole human proteome. We have validated this method through analysis of 3,697 human pathways and molecular complexes and a comparison of our results with the prediction outcomes of previously published co-occurrency model-based and normalization methods. Here we also introduce PrePhyloPro, a web-based software that uses our method for accurately predicting proteome-wide linkages. We present data on interactions of human mitochondrial proteins, verifying the performance of this software. PrePhyloPro is freely available at http://prephylopro.org/phyloprofile/.
Affiliation(s)
- Yulong Niu
  - Department of Medicine, Division of Brain Sciences, Imperial College London, London, United Kingdom
  - Key Lab of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, Sichuan, China
  - School of Medicine, Department of Internal Medicine, Endocrinology, Yale University, New Haven, CT, United States of America
- Chengcheng Liu
  - Department of Periodontics, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Yi Yang
  - Key Lab of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, Sichuan, China
- Kambiz N Alavian
  - Department of Medicine, Division of Brain Sciences, Imperial College London, London, United Kingdom
  - School of Medicine, Department of Internal Medicine, Endocrinology, Yale University, New Haven, CT, United States of America
  - Department of Biology, The Bahá'í Institute for Higher Education (BIHE), Tehran, Iran