51
Mallam AL, Marcotte EM. Systems-wide Studies Uncover Commander, a Multiprotein Complex Essential to Human Development. Cell Syst 2019; 4:483-494. [PMID: 28544880 DOI: 10.1016/j.cels.2017.04.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 11/18/2016] [Revised: 01/25/2017] [Accepted: 03/23/2017] [Indexed: 11/27/2022]
Abstract
Recent mass spectrometry maps of the human interactome independently support the existence of a large multiprotein complex, dubbed "Commander." Broadly conserved across animals and ubiquitously expressed in nearly every human cell type examined thus far, Commander likely plays a fundamental cellular function, akin to other ubiquitous machines involved in expression, proteostasis, and trafficking. Experiments on individual subunits support roles in endosomal protein sorting, including the trafficking of Notch proteins, copper transporters, and lipoprotein receptors. Commander is critical for vertebrate embryogenesis, and defects in the complex and its interaction partners disrupt craniofacial, brain, and heart development. Here, we review the synergy between large-scale proteomic efforts and focused studies in the discovery of Commander, describe its composition, structure, and function, and discuss how it illustrates the power of systems biology. Based on 3D modeling and biochemical data, we draw strong parallels between Commander and the retromer cargo-recognition complex, laying a foundation for future research into Commander's role in human developmental disorders.
Affiliation(s)
- Anna L Mallam
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA
- Edward M Marcotte
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA
52
Hu Z, Sackton TB, Edwards SV, Liu JS. Bayesian Detection of Convergent Rate Changes of Conserved Noncoding Elements on Phylogenetic Trees. Mol Biol Evol 2019; 36:1086-1100. [PMID: 30851112 PMCID: PMC6501877 DOI: 10.1093/molbev/msz049] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 12/19/2022] Open
Abstract
Conservation of DNA sequence over evolutionary time is a strong indicator of function, and gain or loss of sequence conservation can be used to infer changes in function across a phylogeny. Changes in evolutionary rates on particular lineages in a phylogeny can indicate shared functional shifts, and thus can be used to detect genomic correlates of phenotypic convergence. However, existing methods do not allow easy detection of patterns of rate variation, which causes challenges for detecting convergent rate shifts or other complex evolutionary scenarios. Here we introduce PhyloAcc, a new Bayesian method to model substitution rate changes in conserved elements across a phylogeny. The method assumes several categories of substitution rate for each branch on the phylogenetic tree, estimates substitution rates per category, and detects changes of substitution rate as the posterior probability of a category switch. Simulations show that PhyloAcc can detect genomic regions with rate shifts in multiple target species better than previous methods and has a higher accuracy of reconstructing complex patterns of substitution rate changes than prevalent Bayesian relaxed clock models. We demonstrate the utility of PhyloAcc in two classic examples of convergent phenotypes: loss of flight in birds and the transition to marine life in mammals. In each case, our approach reveals numerous examples of conserved nonexonic elements with accelerations specific to the phenotypically convergent lineages. Our method is widely applicable to any set of conserved elements where multiple rate changes are expected on a phylogeny.
Affiliation(s)
- Zhirui Hu
  - Department of Statistics, Harvard University, Cambridge, MA
- Scott V Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA; Museum of Comparative Zoology, Harvard University, Cambridge, MA
- Jun S Liu
  - Department of Statistics, Harvard University, Cambridge, MA
53
Abstract
Cilia have evolved to function as essential sensory organelles in animals. To understand why cilia are intimately associated with cell signaling, Sigg et al. (2017) develop and apply a comparative proteomics approach, reported in this issue of Developmental Cell, to analyze the evolutionary relationship between cilia and various signaling pathways.
Affiliation(s)
- Avital S Shulman
  - Cell Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY 10065, USA
- Meng-Fu Bryan Tsou
  - Cell Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY 10065, USA
54
Schweiger R, Erlich Y, Carmi S. FactorialHMM: fast and exact inference in factorial hidden Markov models. Bioinformatics 2019; 35:2162-2164. [PMID: 30445428 DOI: 10.1093/bioinformatics/bty944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2018] [Revised: 11/07/2018] [Accepted: 11/13/2018] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Hidden Markov models (HMMs) are powerful tools for modeling processes along the genome. In a standard genomic HMM, observations are drawn, at each genomic position, from a distribution whose parameters depend on a hidden state, and the hidden states evolve along the genome as a Markov chain. Often, the hidden state is the Cartesian product of multiple processes, each evolving independently along the genome. Inference in these so-called Factorial HMMs has a naïve running time that scales as the square of the number of possible states, which by itself increases exponentially with the number of sub-chains; such a running time scaling is impractical for many applications. While faster algorithms exist, there is no available implementation suitable for developing bioinformatics applications. RESULTS: We developed FactorialHMM, a Python package for fast exact inference in Factorial HMMs. Our package allows simulating either directly from the model or from the posterior distribution of states given the observations. Additionally, we allow the inference of all key quantities related to HMMs: (i) the (Viterbi) sequence of states with the highest posterior probability; (ii) the likelihood of the data and (iii) the posterior probability (given all observations) of the marginal and pairwise state probabilities. The running time and space requirement of all procedures is linearithmic in the number of possible states. Our package is highly modular, providing the user with maximal flexibility for developing downstream applications. AVAILABILITY AND IMPLEMENTATION: https://github.com/regevs/factorial_hmm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
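The scaling argument in this abstract can be made concrete with a small sketch. It does not use the FactorialHMM package or its API; the function names (`kron`, `joint_transition`, `naive_forward_cost`) and the two-state transition matrix are hypothetical, chosen only to show how the joint state space of independent sub-chains grows and why a naive forward pass costs the square of that size per position.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def joint_transition(chains):
    """Joint transition matrix of independent sub-chains (assumed illustration).

    Because the sub-chains evolve independently, the joint matrix is the
    Kronecker product of the per-chain transition matrices.
    """
    joint = [[1.0]]
    for transition in chains:
        joint = kron(joint, transition)
    return joint


def naive_forward_cost(num_chains, states_per_chain, length):
    """Multiply-adds for a naive forward pass: length * (number of joint states)^2."""
    num_states = states_per_chain ** num_chains
    return length * num_states * num_states


# Hypothetical two-state sub-chain; each row sums to 1.
two_state = [[0.9, 0.1], [0.2, 0.8]]

# Three independent two-state sub-chains give 2**3 = 8 joint states.
joint = joint_transition([two_state] * 3)
print(len(joint), naive_forward_cost(3, 2, 100))
```

With ten two-state sub-chains the joint space already has 1024 states, so the naive pass needs about a million operations per genomic position, which is the cost the abstract describes as impractical.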
Affiliation(s)
- Regev Schweiger
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel; MyHeritage, Or Yehuda, Israel
- Yaniv Erlich
  - MyHeritage, Or Yehuda, Israel; Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY, USA; Department of Systems Biology, Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA; New York Genome Center, New York, NY, USA
- Shai Carmi
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
55
Tripathi B, Parthasarathy S, Sinha H, Raman K, Ravindran B. Adapting Community Detection Algorithms for Disease Module Identification in Heterogeneous Biological Networks. Front Genet 2019; 10:164. [PMID: 30918511 PMCID: PMC6424898 DOI: 10.3389/fgene.2019.00164] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 11/01/2018] [Accepted: 02/14/2019] [Indexed: 11/13/2022] Open
Abstract
Biological networks catalog the complex web of interactions between different molecules, typically proteins, within a cell. These networks are known to be highly modular, with groups of proteins associated with specific biological functions. Human diseases often arise from the dysfunction of one or more proteins in such a functional group. The ability to identify and automatically extract these modules has implications for understanding the etiology of different diseases as well as the functional roles of different protein modules in disease. The recent DREAM challenge posed the problem of identifying disease modules from six heterogeneous networks of proteins/genes. Many community detection algorithms exist, but not all of them are adaptable to the biological context, as these networks are densely connected and the size of biologically relevant modules is quite small. The contribution of this study is 3-fold: first, we present a comprehensive assessment of many classic community detection algorithms for biological networks to identify non-overlapping communities, and propose heuristics to identify small and structurally well-defined communities (core modules). We evaluated our performance over 180 GWAS datasets. In comparison to traditional approaches, our proposed approach identified 50% more disease-relevant modules. Thus, we show that it is important to identify more compact modules for better performance. Next, we sought to understand the peculiar characteristics of disease-enriched modules and what causes standard community detection algorithms to detect so few of them. We performed a comprehensive analysis of the interaction patterns of known disease genes to understand the structure of disease modules and show that merely considering the set of known disease genes as a module does not give good-quality clusters, as measured by typical metrics such as modularity and conductance.
We go on to present a methodology that leverages these known disease genes and also includes their neighboring nodes in a module, forming good-quality clusters from which we extract a "gold-standard set" of disease modules. Lastly, we demonstrate, with justification, that "overlapping" community detection algorithms should be the preferred choice for disease module identification, since several genes participate in multiple biological functions.
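The two cluster-quality metrics named in this abstract, modularity and conductance, can be sketched for a small undirected, unweighted graph. This is an illustration under stated assumptions, not the authors' pipeline: the edge list and node names are invented, the functions `modularity` and `conductance` are hypothetical helpers, and the formulas are the standard textbook definitions (Newman modularity for a partition; cut size divided by the smaller side's volume for conductance).

```python
def modularity(edges, communities):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {node: i for i, members in enumerate(communities) for node in members}
    internal = [0] * len(communities)   # edges with both ends in community c
    degree = [0] * len(communities)     # summed degree of nodes in community c
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(
        internal[c] / m - (degree[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )


def conductance(edges, module):
    """Cut edges divided by the smaller of the two sides' total degree."""
    module = set(module)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in module, v in module
        cut += u_in != v_in
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0


# Invented toy network: two triangles joined by a single bridge edge.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
toy_partition = [{"a", "b", "c"}, {"d", "e", "f"}]

q = modularity(toy_edges, toy_partition)
phi = conductance(toy_edges, {"a", "b", "c"})
print(round(q, 4), round(phi, 4))
```

Higher modularity and lower conductance both indicate a module that is densely connected internally and sparsely connected to the rest of the network, which is the sense in which the abstract calls a cluster "good quality".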
Affiliation(s)
- Beethika Tripathi
  - Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India; Initiative for Biological Systems Engineering, Indian Institute of Technology Madras, Chennai, India; Robert Bosch Centre for Data Science and AI, Indian Institute of Technology Madras, Chennai, India
- Srinivasan Parthasarathy
  - Department of Computer Science and Engineering, Ohio State University, Columbus, OH, United States; Department of Biomedical Informatics, Ohio State University, Columbus, OH, United States
- Himanshu Sinha
  - Initiative for Biological Systems Engineering, Indian Institute of Technology Madras, Chennai, India; Robert Bosch Centre for Data Science and AI, Indian Institute of Technology Madras, Chennai, India; Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Karthik Raman
  - Initiative for Biological Systems Engineering, Indian Institute of Technology Madras, Chennai, India; Robert Bosch Centre for Data Science and AI, Indian Institute of Technology Madras, Chennai, India; Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Balaraman Ravindran
  - Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India; Initiative for Biological Systems Engineering, Indian Institute of Technology Madras, Chennai, India; Robert Bosch Centre for Data Science and AI, Indian Institute of Technology Madras, Chennai, India
56
Li Y, Ning S, Calvo SE, Mootha VK, Liu JS. Bayesian hidden Markov tree models for clustering genes with shared evolutionary history. Ann Appl Stat 2019. [DOI: 10.1214/18-aoas1208] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
57
Comparative expression profiling reveals widespread coordinated evolution of gene expression across eukaryotes. Nat Commun 2018; 9:4963. [PMID: 30470754 PMCID: PMC6251915 DOI: 10.1038/s41467-018-07436-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/16/2018] [Accepted: 10/24/2018] [Indexed: 12/17/2022] Open
Abstract
Comparative studies of gene expression across species have revealed many important insights, but have also been limited by the number of species represented. Here we develop an approach to identify orthologs between highly diverged transcriptome assemblies, and apply this to 657 RNA-seq gene expression profiles from 309 diverse unicellular eukaryotes. We analyze the resulting data for coevolutionary patterns, and identify several hundred protein complexes and pathways whose expression levels have evolved in a coordinated fashion across the trillions of generations separating these species, including many gene sets with little or no within-species co-expression across environmental or genetic perturbations. We also detect examples of adaptive evolution, for example of tRNA ligase levels to match genome-wide codon usage. In sum, we find that comparative studies from extremely diverse organisms can reveal new insights into the evolution of gene expression, including coordinated evolution of some of the most conserved protein complexes in eukaryotes. Gene pairs that are coexpressed across various environmental conditions in multiple species suggest functional similarity. Here the authors analyze patterns of gene expression co-evolution across diverse eukaryotes, and identify hundreds of protein complexes and pathways whose gene expression levels have co-evolved since their ancient divergence.
58
Skinnider MA, Stacey RG, Foster LJ. Genomic data integration systematically biases interactome mapping. PLoS Comput Biol 2018; 14:e1006474. [PMID: 30332399 PMCID: PMC6192561 DOI: 10.1371/journal.pcbi.1006474] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 01/27/2018] [Accepted: 08/30/2018] [Indexed: 12/15/2022] Open
Abstract
Elucidating the complete network of protein-protein interactions, or interactome, is a fundamental goal of the post-genomic era, yet existing interactome maps are far from complete. To increase the throughput and resolution of interactome mapping, methods for protein-protein interaction discovery by co-migration have been introduced. However, accurate identification of interacting protein pairs within the resulting large-scale proteomic datasets is challenging. Consequently, most computational pipelines for co-migration data analysis incorporate external genomic datasets to distinguish interacting from non-interacting protein pairs. The effect of this procedure on interactome mapping is poorly understood. Here, we conduct a rigorous analysis of genomic data integration for interactome recovery across a large number of co-migration datasets, spanning diverse experimental and computational methods. We find that genomic data integration leads to an increase in the functional coherence of the resulting interactome maps, but this comes at the expense of a decrease in power to discover novel interactions. Importantly, putative novel interactions predicted by genomic data integration are no more likely to later be experimentally discovered than those predicted from co-migration data alone. Our results reveal a widespread and unappreciated limitation in a methodology that has been widely used to map the interactome of humans and model organisms.
Affiliation(s)
- R. Greg Stacey
  - Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
- Leonard J. Foster
  - Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
  - Department of Biochemistry, University of British Columbia, Vancouver, Canada
59
Endosomal Retrieval of Cargo: Retromer Is Not Alone. Trends Cell Biol 2018; 28:807-822. [PMID: 30072228 DOI: 10.1016/j.tcb.2018.06.005] [Citation(s) in RCA: 92] [Impact Index Per Article: 15.3] [Received: 04/25/2018] [Revised: 06/15/2018] [Accepted: 06/22/2018] [Indexed: 11/20/2022]
Abstract
Endosomes are major protein sorting stations in cells. Endosomally localised multi-protein complexes sort integral proteins, including signaling receptors, nutrient transporters, adhesion molecules, and lysosomal hydrolase receptors, for lysosomal degradation or conversely for retrieval and subsequent recycling to various membrane compartments. Correct endosomal sorting of these proteins is essential for maintaining cellular homeostasis, with defects in endosomal sorting implicated in various human pathologies including neurodegenerative disorders. Retromer, an ancient multi-protein complex, is essential for the retrieval and recycling of hundreds of transmembrane proteins. While retromer is a major player in endosomal retrieval and recycling, several studies have recently identified retrieval mechanisms that are independent of retromer. Here, we review endosomal retrieval complexes, with a focus on recently discovered retromer-independent mechanisms.
60
Vidulin V, Šmuc T, Džeroski S, Supek F. The evolutionary signal in metagenome phyletic profiles predicts many gene functions. Microbiome 2018; 6:129. [PMID: 29991352 PMCID: PMC6040064 DOI: 10.1186/s40168-018-0506-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/07/2017] [Accepted: 06/19/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND: The function of many genes is still not known even in model organisms. An increasing availability of microbiome DNA sequencing data provides an opportunity to infer gene function in a systematic manner. RESULTS: We evaluated if the evolutionary signal contained in metagenome phyletic profiles (MPP) is predictive of a broad array of gene functions. The MPPs are an encoding of environmental DNA sequencing data that consists of relative abundances of gene families across metagenomes. We find that such MPPs can accurately predict 826 Gene Ontology functional categories, while drawing on human gut microbiomes, ocean metagenomes, and DNA sequences from various other engineered and natural environments. Overall, in this task, the MPPs are highly accurate, and moreover they provide coverage for a set of Gene Ontology terms largely complementary to standard phylogenetic profiles, derived from fully sequenced genomes. We also find that metagenomes approximated from taxon relative abundance obtained via 16S rRNA gene sequencing may provide surprisingly useful predictive models. Crucially, the MPPs derived from different types of environments can infer distinct, non-overlapping sets of gene functions and therefore complement each other. Consistently, simulations on > 5000 metagenomes indicate that the amount of data is not in itself critical for maximizing predictive accuracy, while the diversity of sampled environments appears to be the critical factor for obtaining robust models. CONCLUSIONS: In past work, metagenomics has provided invaluable insight into ecology of various habitats, into diversity of microbial life and also into human health and disease mechanisms. We propose that environmental DNA sequencing additionally constitutes a useful tool to predict biological roles of genes, yielding inferences out of reach for existing comparative genomics approaches.
Affiliation(s)
- Vedrana Vidulin
  - Faculty of Information Studies, 8000 Novo Mesto, Slovenia
  - Division of Electronics, Rudjer Boskovic Institute, 10000 Zagreb, Croatia
  - Department of Knowledge Technologies, Jozef Stefan Institute, 1000 Ljubljana, Slovenia
- Tomislav Šmuc
  - Division of Electronics, Rudjer Boskovic Institute, 10000 Zagreb, Croatia
- Sašo Džeroski
  - Department of Knowledge Technologies, Jozef Stefan Institute, 1000 Ljubljana, Slovenia
- Fran Supek
  - Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
61
Li T, Kim A, Rosenbluh J, Horn H, Greenfeld L, An D, Zimmer A, Liberzon A, Bistline J, Natoli T, Li Y, Tsherniak A, Narayan R, Subramanian A, Liefeld T, Wong B, Thompson D, Calvo S, Carr S, Boehm J, Jaffe J, Mesirov J, Hacohen N, Regev A, Lage K. GeNets: a unified web platform for network-based genomic analyses. Nat Methods 2018; 15:543-546. [PMID: 29915188 PMCID: PMC6450090 DOI: 10.1038/s41592-018-0039-6] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 09/12/2017] [Accepted: 05/02/2018] [Indexed: 11/25/2022]
Abstract
Functional genomics networks are widely used to identify unexpected pathway relationships in large genomic datasets. However, it is challenging to quantitatively compare the signal-to-noise ratio of different networks, the biology they describe, and to identify the optimal network to interpret a particular genetic dataset. Via GeNets users can train a machine-learning model (Quack) to make such comparisons; and they can execute, store, and share analyses of genetic and RNA sequencing datasets.
Affiliation(s)
- Taibo Li
  - Department of Surgery, Massachusetts General Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Electrical Engineering & Computer Science, MIT, Cambridge, MA, USA; Johns Hopkins School of Medicine, Baltimore, MD, USA
- April Kim
  - Department of Surgery, Massachusetts General Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Joseph Rosenbluh
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Heiko Horn
  - Department of Surgery, Massachusetts General Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Liraz Greenfeld
  - Department of Surgery, Massachusetts General Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- David An
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Andrew Zimmer
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jon Bistline
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ted Natoli
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Yang Li
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Howard Hughes Medical Institute and Department of Molecular Biology, Massachusetts General Hospital, Boston, MA, USA; Department of Statistics, Harvard University, Cambridge, MA, USA
- Rajiv Narayan
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ted Liefeld
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bang Wong
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Dawn Thompson
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Sarah Calvo
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Howard Hughes Medical Institute and Department of Molecular Biology, Massachusetts General Hospital, Boston, MA, USA
- Steve Carr
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jesse Boehm
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jake Jaffe
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jill Mesirov
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Medicine, University of California, San Diego, San Diego, CA, USA
- Nir Hacohen
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Immunology and Inflammatory Diseases, Massachusetts General Hospital, Boston, MA, USA; Department of Surgery, Harvard Medical School, Boston, MA, USA
- Aviv Regev
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA; Howard Hughes Medical Institute, Department of Biology, MIT, Cambridge, MA, USA
- Kasper Lage
  - Department of Surgery, Massachusetts General Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Surgery, Harvard Medical School, Boston, MA, USA; Institute for Biological Psychiatry, Mental Health Center Sct. Hans, University of Copenhagen, Roskilde, Denmark
62
Nevers Y, Prasad MK, Poidevin L, Chennen K, Allot A, Kress A, Ripp R, Thompson JD, Dollfus H, Poch O, Lecompte O. Insights into Ciliary Genes and Evolution from Multi-Level Phylogenetic Profiling. Mol Biol Evol 2018; 34:2016-2034. [PMID: 28460059 PMCID: PMC5850483 DOI: 10.1093/molbev/msx146] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 12/16/2022] Open
Abstract
Cilia (flagella) are important eukaryotic organelles, present in the Last Eukaryotic Common Ancestor, and are involved in cell motility and integration of extracellular signals. Ciliary dysfunction causes a class of genetic diseases known as ciliopathies; however, current knowledge of the underlying mechanisms is still limited, and a better characterization of genes is needed. As cilia have been lost independently several times during evolution and are subject to important functional variation between species, ciliary genes can be investigated through comparative genomics. We performed phylogenetic profiling by predicting orthologs of human protein-coding genes in 100 eukaryotic species. The analysis integrated three independent methods to predict a consensus set of 274 ciliary genes, including 87 new promising candidates. A fine-grained analysis of the phylogenetic profiles allowed a partitioning of ciliary genes into modules with distinct evolutionary histories and ciliary functions (assembly, movement, centriole, etc.) and thus propagation of potential annotations to previously undocumented genes. The cilia/basal body localization was experimentally confirmed for five of these previously unannotated proteins (LRRC23, LRRC34, TEX9, WDR27, and BIVM), validating the relevance of our approach. Furthermore, our multi-level analysis sheds light on the core gene sets retained in, for instance, gamete-only flagellates or Ecdysozoa. By combining gene-centric and species-oriented analyses, this work reveals new ciliary and ciliopathy gene candidates and provides clues about the evolution of ciliary processes in the eukaryotic domain. Additionally, the positive and negative reference gene sets and the phylogenetic profile of human genes constructed during this study can be exploited in future work.
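The idea behind phylogenetic profiling can be shown with a toy calculation: each gene gets a presence/absence vector across species, and genes whose vectors match a known ciliary gene become candidates. This is a simplified sketch, not the three-method consensus procedure the abstract describes. The gene names, the five-species profiles, and the choice of Jaccard similarity are all assumptions made for illustration.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles (1 = ortholog found)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def rank_by_similarity(query, profiles):
    """Rank every other gene by profile similarity to the query gene."""
    scores = [
        (name, jaccard(profiles[query], profile))
        for name, profile in profiles.items()
        if name != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Invented profiles over five hypothetical species; the third and fifth
# columns stand for species assumed to have lost cilia.
toy_profiles = {
    "known_ciliary_gene": [1, 1, 0, 1, 0],
    "candidate_1": [1, 1, 0, 1, 0],   # absent exactly where cilia are absent
    "candidate_2": [1, 1, 1, 1, 1],   # present everywhere, less informative
    "candidate_3": [0, 0, 1, 0, 1],   # opposite pattern
}

ranking = rank_by_similarity("known_ciliary_gene", toy_profiles)
for name, score in ranking:
    print(name, round(score, 2))
```

Repeated independent loss of cilia is what makes the approach informative: a gene that disappears in the same lineages as the organelle shares its profile, while a universally present gene does not.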
Affiliation(s)
- Yannis Nevers
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
- Megana K Prasad
  - Laboratoire de Génétique Médicale, Institut de Génétique Médicale d'Alsace, INSERM U1112, Université de Strasbourg, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Strasbourg, France
- Laetitia Poidevin
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
- Kirsley Chennen
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
- Alexis Allot
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
- Arnaud Kress
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
- Raymond Ripp
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
- Julie D Thompson
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
- Hélène Dollfus
  - Laboratoire de Génétique Médicale, Institut de Génétique Médicale d'Alsace, INSERM U1112, Université de Strasbourg, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Strasbourg, France; Centre de Référence pour les Affections Rares en Génétique Ophtalmologique, Service de Génétique Médicale, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
- Olivier Poch
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
- Odile Lecompte
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, Fédération de Médecine Translationnelle, Strasbourg, France
63
Psomopoulos FE, Vitsios DM, Baichoo S, Ouzounis CA. BioPAXViz: a cytoscape application for the visual exploration of metabolic pathway evolution. Bioinformatics 2018; 33:1418-1420. [PMID: 28453679 DOI: 10.1093/bioinformatics/btw813] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/19/2016] [Accepted: 01/06/2017] [Indexed: 11/12/2022] Open
Abstract
Summary: BioPAXViz is a Cytoscape (version 3) application, providing a comprehensive framework for metabolic pathway visualization. Beyond the basic parsing, viewing and browsing roles, the main novel function that BioPAXViz provides is a visual comparative analysis of metabolic pathway topologies across pre-computed pathway phylogenomic profiles given a species phylogeny. Furthermore, BioPAXViz supports the display of hierarchical trees that allow efficient navigation through sets of variants of a single reference pathway. Thus, BioPAXViz can significantly facilitate, and contribute to, the study of metabolic pathway evolution and engineering. Availability and Implementation: BioPAXViz has been developed as a Cytoscape app and is available at: https://github.com/CGU-CERTH/BioPAX.Viz. The software is distributed under the MIT License and is accompanied by example files and data. Additional documentation is available at the aforementioned GitHub repository. Contact: ouzounis@certh.gr.
Collapse
Affiliation(s)
- Fotis E Psomopoulos
- Computational Genomics Unit, Institute of Applied Biosciences, Center for Research & Technology Hellas (CERTH), GR-57001 Thessalonica, Greece
| | - Dimitrios M Vitsios
- Computational Genomics Unit, Institute of Applied Biosciences, Center for Research & Technology Hellas (CERTH), GR-57001 Thessalonica, Greece; The European Bioinformatics Institute, EMBL Cambridge Outstation, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Shakuntala Baichoo
- Department of Computer Science & Engineering, Faculty of Engineering, University of Mauritius, Reduit 80837, Mauritius
| | - Christos A Ouzounis
- Computational Genomics Unit, Institute of Applied Biosciences, Center for Research & Technology Hellas (CERTH), GR-57001 Thessalonica, Greece; Donnelly Centre for Cellular & Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| |
|
64
|
Sigg MA, Menchen T, Lee C, Johnson J, Jungnickel MK, Choksi SP, Garcia G, Busengdal H, Dougherty GW, Pennekamp P, Werner C, Rentzsch F, Florman HM, Krogan N, Wallingford JB, Omran H, Reiter JF. Evolutionary Proteomics Uncovers Ancient Associations of Cilia with Signaling Pathways. Dev Cell 2018; 43:744-762.e11. [PMID: 29257953 DOI: 10.1016/j.devcel.2017.11.014] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 12/13/2016] [Revised: 09/18/2017] [Accepted: 11/17/2017] [Indexed: 12/19/2022]
Abstract
Cilia are organelles specialized for movement and signaling. To infer when during evolution signaling pathways became associated with cilia, we characterized the proteomes of cilia from sea urchins, sea anemones, and choanoflagellates. We identified 437 high-confidence ciliary candidate proteins conserved in mammals and discovered that Hedgehog and G-protein-coupled receptor pathways were linked to cilia before the origin of bilateria and transient receptor potential (TRP) channels before the origin of animals. We demonstrated that candidates not previously implicated in ciliary biology localized to cilia and further investigated ENKUR, a TRP channel-interacting protein identified in the cilia of all three organisms. ENKUR localizes to motile cilia and is required for patterning the left-right axis in vertebrates. Moreover, mutation of ENKUR causes situs inversus in humans. Thus, proteomic profiling of cilia from diverse eukaryotes defines a conserved ciliary proteome, reveals ancient connections to signaling, and uncovers a ciliary protein that underlies development and human disease.
Affiliation(s)
- Monika Abedin Sigg
- Department of Biochemistry and Biophysics, Cardiovascular Research Institute, University of California, San Francisco, CA 94158, USA
| | - Tabea Menchen
- Department of General Pediatrics, University Children's Hospital Muenster, Muenster 48149, Germany
| | - Chanjae Lee
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology and Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Jeffery Johnson
- Gladstone Institute of Cardiovascular Disease and Gladstone Institute of Virology and Immunology, San Francisco, CA 94158, USA
| | - Melissa K Jungnickel
- Department of Cell and Developmental Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA
| | - Semil P Choksi
- Department of Biochemistry and Biophysics, Cardiovascular Research Institute, University of California, San Francisco, CA 94158, USA
| | - Galo Garcia
- Department of Biochemistry and Biophysics, Cardiovascular Research Institute, University of California, San Francisco, CA 94158, USA
| | - Henriette Busengdal
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5008, Norway
| | - Gerard W Dougherty
- Department of General Pediatrics, University Children's Hospital Muenster, Muenster 48149, Germany
| | - Petra Pennekamp
- Department of General Pediatrics, University Children's Hospital Muenster, Muenster 48149, Germany
| | - Claudius Werner
- Department of General Pediatrics, University Children's Hospital Muenster, Muenster 48149, Germany
| | - Fabian Rentzsch
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5008, Norway
| | - Harvey M Florman
- Department of Cell and Developmental Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA
| | - Nevan Krogan
- Gladstone Institute of Cardiovascular Disease and Gladstone Institute of Virology and Immunology, San Francisco, CA 94158, USA; Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA
| | - John B Wallingford
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology and Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Heymut Omran
- Department of General Pediatrics, University Children's Hospital Muenster, Muenster 48149, Germany
| | - Jeremy F Reiter
- Department of Biochemistry and Biophysics, Cardiovascular Research Institute, University of California, San Francisco, CA 94158, USA.
| |
|
65
|
Niu Y, Moghimyfiroozabad S, Safaie S, Yang Y, Jonas EA, Alavian KN. Phylogenetic Profiling of Mitochondrial Proteins and Integration Analysis of Bacterial Transcription Units Suggest Evolution of F1Fo ATP Synthase from Multiple Modules. J Mol Evol 2017; 85:219-233. [PMID: 29177973 PMCID: PMC5709465 DOI: 10.1007/s00239-017-9819-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 08/11/2017] [Accepted: 11/11/2017] [Indexed: 11/26/2022]
Abstract
ATP synthase is a complex universal enzyme responsible for ATP synthesis across all kingdoms of life. The F-type ATP synthase has been suggested to have evolved from two functionally independent, catalytic (F1) and membrane bound (Fo), ancestral modules. While the modular evolution of the synthase is supported by studies indicating independent assembly of the two subunits, the presence of intermediate assembly products suggests a more complex evolutionary process. We analyzed the phylogenetic profiles of the human mitochondrial proteins and bacterial transcription units to gain additional insight into the evolution of the F-type ATP synthase complex. In this study, we report the presence of intermediary modules based on the phylogenetic profiles of the human mitochondrial proteins. The two main intermediary modules comprise the α3β3 hexamer in the F1 and the c-subunit ring in the Fo. A comprehensive analysis of bacterial transcription units of F1Fo ATP synthase revealed that while a long and constant order of F1Fo ATP synthase genes exists in a majority of bacterial genomes, highly conserved combinations of separate transcription units are present among certain bacterial classes and phyla. Based on our findings, we propose a model that includes the involvement of multiple modules in the evolution of F1Fo ATP synthase. The central and peripheral stalk subunits provide a link for the integration of the F1/Fo modules.
Affiliation(s)
- Yulong Niu
- Division of Brain Sciences, Department of Medicine, Imperial College London, E508, Burlington Danes Hammersmith Hospital, DuCane Road, London, W12 0NN, UK
- Key Lab of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, People's Republic of China
- Department of Internal Medicine, Endocrinology, Yale University, New Haven, CT, USA
| | | | - Sepehr Safaie
- Department of Mathematics and Computer Science, The Bahá'í Institute for Higher Education (BIHE), Tehran, Iran
| | - Yi Yang
- Key Lab of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, People's Republic of China
| | - Elizabeth A Jonas
- Department of Internal Medicine, Endocrinology, Yale University, New Haven, CT, USA
| | - Kambiz N Alavian
- Division of Brain Sciences, Department of Medicine, Imperial College London, E508, Burlington Danes Hammersmith Hospital, DuCane Road, London, W12 0NN, UK.
- Department of Biology, The Bahá'í Institute for Higher Education (BIHE), Tehran, Iran.
- Department of Internal Medicine, Endocrinology, Yale University, New Haven, CT, USA.
| |
|
66
|
Fukasawa Y, Oda T, Tomii K, Imai K. Origin and Evolutionary Alteration of the Mitochondrial Import System in Eukaryotic Lineages. Mol Biol Evol 2017; 34:1574-1586. [PMID: 28369657 PMCID: PMC5455965 DOI: 10.1093/molbev/msx096] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 12/11/2022] Open
Abstract
Protein transport systems are fundamentally important for maintaining mitochondrial function. Nevertheless, mitochondrial protein translocases such as the kinetoplastid ATOM complex have recently been shown to vary in eukaryotic lineages. Various evolutionary hypotheses have been formulated to explain this diversity. To resolve any contradiction, estimating the primitive state and clarifying changes from that state are necessary. Here, we present more likely primitive models of mitochondrial translocases, specifically the translocase of the outer membrane (TOM) and translocase of the inner membrane (TIM) complexes, using scrutinized phylogenetic profiles. We then analyzed the translocases’ evolution in eukaryotic lineages. Based on those results, we propose a novel evolutionary scenario for diversification of the mitochondrial transport system. Our results indicate that presequence transport machinery was mostly established in the last eukaryotic common ancestor, and that primitive translocases already had a pathway for transporting presequence-containing proteins. Moreover, secondary changes including convergent and migrational gains of a presequence receptor in TOM and TIM complexes, respectively, likely resulted from constrained evolution. The nature of a targeting signal can constrain alteration to the protein transport complex.
Affiliation(s)
- Yoshinori Fukasawa
- Artificial Intelligence Research Center, National Institute of Advanced Science and Technology (AIST), Tokyo, Japan
| | - Toshiyuki Oda
- Artificial Intelligence Research Center, National Institute of Advanced Science and Technology (AIST), Tokyo, Japan
| | - Kentaro Tomii
- Artificial Intelligence Research Center, National Institute of Advanced Science and Technology (AIST), Tokyo, Japan; Biotechnology Research Institute for Drug Discovery, National Institute of Advanced Science and Technology (AIST), Tokyo, Japan
| | - Kenichiro Imai
- Artificial Intelligence Research Center, National Institute of Advanced Science and Technology (AIST), Tokyo, Japan; Biotechnology Research Institute for Drug Discovery, National Institute of Advanced Science and Technology (AIST), Tokyo, Japan
| |
|
67
|
Xu C, Liu R, Zhang Q, Chen X, Qian Y, Fang W. The Diversification of Evolutionarily Conserved MAPK Cascades Correlates with the Evolution of Fungal Species and Development of Lifestyles. Genome Biol Evol 2017; 9:311-322. [PMID: 26957028 PMCID: PMC5381651 DOI: 10.1093/gbe/evw051] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Accepted: 03/04/2016] [Indexed: 11/14/2022] Open
Abstract
The fungal kingdom displays an extraordinary diversity of lifestyles, developmental processes, and ecological niches. The MAPK (mitogen-activated protein kinase) cascade consists of interlinked MAPKKK, MAPKK, and MAPK, and collectively such cascades play pivotal roles in cellular regulation in fungi. However, the mechanism by which evolutionarily conserved MAPK cascades regulate diverse output responses in fungi remains unknown. Here we identified the full complement of MAPK cascade components from 231 fungal species encompassing 9 fungal phyla. Using the largest data set to date, we found that MAPK family members could have two ancestors, while MAPKK and MAPKKK family members could have only one ancestor. The current MAPK, MAPKK, and MAPKKK subfamilies resulted from duplications and subsequent subfunctionalization during the emergence of the fungal kingdom. However, the gene structure diversification and gene expansion and loss have resulted in significant diversity in fungal MAPK cascades, correlating with the evolution of fungal species and lifestyles. In particular, a distinct evolutionary trajectory of MAPK cascades was identified in single-celled fungi in the Saccharomycetes. All MAPK, MAPKK, and MAPKKK subfamilies expanded in the Saccharomycetes; genes encoding MAPK cascade components have a similar exon–intron structure in this class that differs from those in other fungi.
Affiliation(s)
- Chuan Xu
- Institute of Microbiology, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Ran Liu
- Institute of Microbiology, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Qiangqiang Zhang
- Institute of Microbiology, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Xiaoxuan Chen
- Institute of Microbiology, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Ying Qian
- Institute of Microbiology, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | | |
|
68
|
McNally KE, Faulkner R, Steinberg F, Gallon M, Ghai R, Pim D, Langton P, Pearson N, Danson CM, Nägele H, Morris LL, Singla A, Overlee BL, Heesom KJ, Sessions R, Banks L, Collins BM, Berger I, Billadeau DD, Burstein E, Cullen PJ. Retriever is a multiprotein complex for retromer-independent endosomal cargo recycling. Nat Cell Biol 2017; 19:1214-1225. [PMID: 28892079 PMCID: PMC5790113 DOI: 10.1038/ncb3610] [Citation(s) in RCA: 220] [Impact Index Per Article: 31.4] [Received: 01/04/2017] [Accepted: 08/10/2017] [Indexed: 02/08/2023]
Abstract
Following endocytosis into the endosomal network, integral membrane proteins undergo sorting for lysosomal degradation or are retrieved and recycled back to the cell surface. Here we describe the discovery of an ancient and conserved multiprotein complex that orchestrates cargo retrieval and recycling and, importantly, is biochemically and functionally distinct from the established retromer pathway. We have called this complex 'retriever'; it is a heterotrimer composed of DSCR3, C16orf62 and VPS29, and bears striking similarity to retromer. We establish that retriever associates with the cargo adaptor sorting nexin 17 (SNX17) and couples to CCC (CCDC93, CCDC22, COMMD) and WASH complexes to prevent lysosomal degradation and promote cell surface recycling of α5β1 integrin. Through quantitative proteomic analysis, we identify over 120 cell surface proteins, including numerous integrins, signalling receptors and solute transporters, that require SNX17-retriever to maintain their surface levels. Our identification of retriever establishes a major endosomal retrieval and recycling pathway.
Affiliation(s)
- Kerrie E McNally
- School of Biochemistry, Biomedical Sciences Building, University of Bristol, Bristol BS8 1TD, UK
| | - Rebecca Faulkner
- Department of Internal Medicine and Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
| | - Florian Steinberg
- Center for Biological Systems Analysis, Albert Ludwigs Universitaet Freiburg, 79104 Freiburg, Germany
| | - Matthew Gallon
- School of Biochemistry, Biomedical Sciences Building, University of Bristol, Bristol BS8 1TD, UK
| | - Rajesh Ghai
- Institute for Molecular Bioscience, the University of Queensland, St. Lucia, Queensland 4072, Australia
| | - David Pim
- International Centre for Genetic Engineering and Biotechnology, Padriciano 99, I-34149 Trieste, Italy
| | - Paul Langton
- School of Biochemistry, Biomedical Sciences Building, University of Bristol, Bristol BS8 1TD, UK
| | - Neil Pearson
- School of Biochemistry, Biomedical Sciences Building, University of Bristol, Bristol BS8 1TD, UK
| | - Chris M Danson
- School of Biochemistry, Biomedical Sciences Building, University of Bristol, Bristol BS8 1TD, UK
| | - Heike Nägele
- Center for Biological Systems Analysis, Albert Ludwigs Universitaet Freiburg, 79104 Freiburg, Germany
| | - Lindsey L Morris
- Department of Internal Medicine and Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
| | - Amika Singla
- Department of Internal Medicine and Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
| | - Brittany L Overlee
- Department of Biochemistry and Molecular Biology, and Department of Immunology, Mayo Clinic, Rochester, Minnesota 55905, USA
| | - Kate J Heesom
- Proteomics Facility, School of Biochemistry, University of Bristol, Bristol BS8 1TD, UK
| | - Richard Sessions
- School of Biochemistry, Biomedical Sciences Building, University of Bristol, Bristol BS8 1TD, UK
| | - Lawrence Banks
- International Centre for Genetic Engineering and Biotechnology, Padriciano 99, I-34149 Trieste, Italy
| | - Brett M Collins
- Institute for Molecular Bioscience, the University of Queensland, St. Lucia, Queensland 4072, Australia
| | - Imre Berger
- School of Biochemistry, Biomedical Sciences Building, University of Bristol, Bristol BS8 1TD, UK
| | - Daniel D Billadeau
- Department of Biochemistry and Molecular Biology, and Department of Immunology, Mayo Clinic, Rochester, Minnesota 55905, USA
| | - Ezra Burstein
- Department of Internal Medicine and Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
| | - Peter J Cullen
- School of Biochemistry, Biomedical Sciences Building, University of Bristol, Bristol BS8 1TD, UK
| |
|
69
|
Sferra G, Fratini F, Ponzi M, Pizzi E. Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling. BMC Bioinformatics 2017; 18:396. [PMID: 28870256 PMCID: PMC5584357 DOI: 10.1186/s12859-017-1815-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/18/2017] [Accepted: 08/29/2017] [Indexed: 12/20/2022] Open
Abstract
Background: Elaboration of powerful methods to predict functional and/or physical protein-protein interactions from genome sequence is one of the main tasks in the post-genomic era. Phylogenetic profiling allows the prediction of protein-protein interactions at a whole genome level in both Prokaryotes and Eukaryotes. For this reason it is considered one of the most promising methods. Results: Here, we propose an improvement of phylogenetic profiling that enables handling of large genomic datasets and inference of global protein-protein interactions. This method uses the distance correlation as a new measure of phylogenetic profile similarity. We constructed robust reference sets and developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation that makes it applicable to large genomic data. Using Saccharomyces cerevisiae and Escherichia coli genome datasets, we showed that Phylo-dCor outperforms previously described phylogenetic profiling methods based on the mutual information and Pearson’s correlation as measures of profile similarity. Conclusions: In this work, we constructed and assessed robust reference sets and propose the distance correlation as a measure for comparing phylogenetic profiles. To make it applicable to large genomic data, we developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation. Two R scripts that can be run on a wide range of machines are available upon request.
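The distance correlation named in the abstract above can be illustrated with a minimal sketch. It is not the authors' Phylo-dCor implementation (which the abstract describes as parallelized R scripts). It is a plain, unparallelized Python version of the standard sample distance correlation, applied to two hypothetical profiles in which each position is a score for one genome. The function names and the example values are assumptions made for illustration.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    dist = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in dist]  # equals column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length numeric profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same genomes")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # a constant profile carries no co-occurrence signal
    return math.sqrt(max(dcov2, 0.0) / denom)


# Hypothetical profiles: one score per genome (values are illustrative only).
profile_a = [1.0, 0.8, 0.0, 0.1, 0.9, 0.0]
profile_b = [0.9, 0.7, 0.1, 0.0, 1.0, 0.2]
print(distance_correlation(profile_a, profile_b))
```

The pairwise distance matrices make this version quadratic in the number of genomes, which is presumably why a parallelized implementation matters for large genomic datasets.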
Affiliation(s)
- Gabriella Sferra
- Dipartimento di Malattie Infettive, Parassitarie e Immunomediate, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Rome, Italy
| | - Federica Fratini
- Dipartimento di Malattie Infettive, Parassitarie e Immunomediate, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Rome, Italy
| | - Marta Ponzi
- Dipartimento di Malattie Infettive, Parassitarie e Immunomediate, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Rome, Italy
| | - Elisabetta Pizzi
- Dipartimento di Malattie Infettive, Parassitarie e Immunomediate, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Rome, Italy.
| |
|
70
|
Niu Y, Liu C, Moghimyfiroozabad S, Yang Y, Alavian KN. PrePhyloPro: phylogenetic profile-based prediction of whole proteome linkages. PeerJ 2017; 5:e3712. [PMID: 28875072 PMCID: PMC5578374 DOI: 10.7717/peerj.3712] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/13/2017] [Accepted: 07/28/2017] [Indexed: 02/05/2023] Open
Abstract
Direct and indirect functional links between proteins as well as their interactions as part of larger protein complexes or common signaling pathways may be predicted by analyzing the correlation of their evolutionary patterns. Based on phylogenetic profiling, here we present a highly scalable and time-efficient computational framework for predicting linkages within the whole human proteome. We have validated this method through analysis of 3,697 human pathways and molecular complexes and a comparison of our results with the prediction outcomes of previously published co-occurrency model-based and normalization methods. Here we also introduce PrePhyloPro, a web-based software that uses our method for accurately predicting proteome-wide linkages. We present data on interactions of human mitochondrial proteins, verifying the performance of this software. PrePhyloPro is freely available at http://prephylopro.org/phyloprofile/.
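As a rough illustration of predicting linkages from correlated evolutionary patterns, the sketch below ranks protein pairs by the Pearson correlation of their presence/absence profiles. This is a generic textbook baseline and is not the PrePhyloPro method, whose scoring the abstract does not specify. The protein names, the profiles, and the function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(u, v):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if sd_u == 0.0 or sd_v == 0.0:
        return 0.0
    return cov / (sd_u * sd_v)


def rank_linkages(profiles, top=3):
    """Score every protein pair and return the `top` most correlated pairs."""
    scored = [
        (pearson(profiles[p], profiles[q]), p, q)
        for p, q in combinations(sorted(profiles), 2)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top]


# Hypothetical profiles: 1 = ortholog present in a genome, 0 = absent.
profiles = {
    "protA": [1, 1, 0, 0, 1, 0, 1, 1],
    "protB": [1, 1, 0, 0, 1, 0, 1, 0],
    "protC": [0, 0, 1, 1, 0, 1, 0, 0],
    "protD": [1, 0, 1, 0, 1, 0, 1, 0],
}

for score, p, q in rank_linkages(profiles):
    print(f"{p} - {q}: {score:.3f}")
```

Scoring all pairs grows quadratically with proteome size, so a whole-proteome tool would need a more scalable design than this exhaustive loop.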
Affiliation(s)
- Yulong Niu
- Department of Medicine, Division of Brain Sciences, Imperial College London, London, United Kingdom; Key Lab of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, Sichuan, China; School of Medicine, Department of Internal Medicine, Endocrinology, Yale University, New Haven, CT, United States of America
| | - Chengcheng Liu
- Department of Periodontics, West China Hospital of Stomatology, Sichuan University, Chengdu, China
| | | | - Yi Yang
- Key Lab of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, Sichuan, China
| | - Kambiz N Alavian
- Department of Medicine, Division of Brain Sciences, Imperial College London, London, United Kingdom; School of Medicine, Department of Internal Medicine, Endocrinology, Yale University, New Haven, CT, United States of America; Department of Biology, The Bahá'í Institute for Higher Education (BIHE), Tehran, Iran
| |
|
71
|
Li Y, Jourdain AA, Calvo SE, Liu JS, Mootha VK. CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets. PLoS Comput Biol 2017; 13:e1005653. [PMID: 28719601 PMCID: PMC5546725 DOI: 10.1371/journal.pcbi.1005653] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 10/05/2016] [Revised: 08/07/2017] [Accepted: 06/21/2017] [Indexed: 12/31/2022] Open
Abstract
In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression to well-studied pathways. Such analyses can be very challenging, however, since biological pathways are modular and may exhibit co-expression only in specific contexts. To overcome these challenges we introduce CLIC, CLustering by Inferred Co-expression. CLIC accepts as input a pathway consisting of two or more genes. It then uses a Bayesian partition model to simultaneously partition the input gene set into coherent co-expressed modules (CEMs), while assigning the posterior probability for each dataset in support of each CEM. CLIC then expands each CEM by scanning the transcriptome for additional co-expressed genes, quantified by an integrated log-likelihood ratio (LLR) score weighted for each dataset. As a byproduct, CLIC automatically learns the conditions (datasets) within which a CEM is operative. We implemented CLIC using a compendium of 1774 mouse microarray datasets (28628 microarrays) or 1887 human microarray datasets (45158 microarrays). CLIC analysis reveals that of 910 canonical biological pathways, 30% consist of strongly co-expressed gene modules for which new members are predicted. For example, CLIC predicts a functional connection between protein C7orf55 (FMC1) and the mitochondrial ATP synthase complex that we have experimentally validated. CLIC is freely available at www.gene-clic.org. We anticipate that CLIC will be valuable both for revealing new components of biological pathways as well as the conditions in which they are active.
Affiliation(s)
- Yang Li
- Howard Hughes Medical Institute and Department of Molecular Biology and the Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America and Department of Systems Biology, Harvard Medical School, Boston, MA United States of America
- Department of Statistics, Harvard University, Cambridge, MA, United States of America
| | - Alexis A. Jourdain
- Howard Hughes Medical Institute and Department of Molecular Biology and the Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America and Department of Systems Biology, Harvard Medical School, Boston, MA United States of America
- Broad Institute, Cambridge, MA, United States of America
| | - Sarah E. Calvo
- Howard Hughes Medical Institute and Department of Molecular Biology and the Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America and Department of Systems Biology, Harvard Medical School, Boston, MA United States of America
- Broad Institute, Cambridge, MA, United States of America
- * E-mail: (SEC); (JSL); (VKM)
| | - Jun S. Liu
- Department of Statistics, Harvard University, Cambridge, MA, United States of America
- * E-mail: (SEC); (JSL); (VKM)
| | - Vamsi K. Mootha
- Howard Hughes Medical Institute and Department of Molecular Biology and the Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States of America and Department of Systems Biology, Harvard Medical School, Boston, MA United States of America
- Broad Institute, Cambridge, MA, United States of America
- * E-mail: (SEC); (JSL); (VKM)
| |
|
72
|
Cahill MA, Medlock AE. Thoughts on interactions between PGRMC1 and diverse attested and potential hydrophobic ligands. J Steroid Biochem Mol Biol 2017; 171:11-33. [PMID: 28104494 DOI: 10.1016/j.jsbmb.2016.12.020] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 11/09/2016] [Revised: 12/21/2016] [Accepted: 12/26/2016] [Indexed: 01/05/2023]
Abstract
Progesterone Receptor Membrane Component 1 (PGRMC1) is located in many different subcellular locations with many different attested and probably location-specific functions. PGRMC1 was recently identified in the mitochondrial outer membrane where it interacts with ferrochelatase, the last enzyme in the heme synthetic pathway. It has been proposed that PGRMC1 may act as a chaperone to shuttle newly synthesized heme from the mitochondrion to cytochrome P450 (cyP450) enzymes. Here we consider potential roles that PGRMC1 may play in transferring heme, and other small hydrophobic ligands such as cholesterol and steroids, between the hydrophobic compartment of the membrane lipid bilayer interior to aqueous proteins, and perhaps to the membranes of other organelles. We review the synthesis and roles of especially PGRMC1- and cyP450-bound heme, the sources and transport of cholesterol, the involvement of PGRMC1 in cholesterol regulation, and the production of the first progestogen pregnenolone from cholesterol. We also show by clustering by inferred models of evolution (CLIME) analysis that PGRMC1 and related proteins exhibit co-evolution with a series of cyP450 enzymes, as well as a group of mitochondrial proteins lacking in several parasitic protist groups. Altogether, PGRMC1 is implicated with important roles in sterol synthesis and energy regulation that are dispensable in certain parasites. Some novel hypothetical models for PGRMC1 function are proposed to direct future investigative research.
Affiliation(s)
- Michael A Cahill
- School of Biomedical Sciences, Charles Sturt University, Wagga Wagga, NSW, 2678, Australia.
| | - Amy E Medlock
- Department of Biochemistry and Molecular Biology, Augusta University/University of Georgia Medical Partnership, University of Georgia, Athens, GA, 30602-1111, USA
| |
|
73
|
Lemire BD. Evolution, structure and membrane association of NDUFAF6, an assembly factor for NADH:ubiquinone oxidoreductase (Complex I). Mitochondrion 2017; 35:13-22. [PMID: 28476317 DOI: 10.1016/j.mito.2017.04.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 01/23/2017] [Revised: 03/28/2017] [Accepted: 04/28/2017] [Indexed: 01/31/2023]
Abstract
The NADH:ubiquinone oxidoreductase (complex I) is the largest member of the mitochondrial respiratory chain. Its FMN cofactor accepts two electrons from NADH and transfers them to ubiquinone via a chain of iron-sulphur centers. A central core of 14 highly conserved subunits can couple electron transfer to proton translocation. The mammalian enzyme has an additional ~30 accessory subunits. Complex I has important bioenergetic and metabolic functions and is a known source of reactive oxygen species; these functions link it to a number of hereditary and degenerative diseases. For many complex I deficiencies, the primary defect is not in a subunit-encoding gene, but rather in an assembly factor or chaperone that participates in the biogenesis of newly synthesized complex I from individual subunits and cofactors. NDUFAF6 encodes a complex I assembly factor and mutations result in complex I deficiency, Leigh syndrome or Acadian variant Fanconi syndrome. Human NDUFAF6 is a mitochondria-targeted 333-amino acid protein belonging to the family of squalene and phytoene synthases. Sequence and structural information suggests that NDUFAF6 likely has enzymatic activity, but one that has evolved considerable differences from canonical squalene and phytoene synthases. Most but not all metazoans have an NDUFAF6 ortholog, indicating that in some organisms, complex I biogenesis does not require this protein. NDUFAF6 is a peripheral membrane protein and predictions identify a conserved C-terminal attachment site that have implications for substrate access.
Affiliation(s)
- Bernard D Lemire
- Department of Biochemistry, University of Alberta, Edmonton, Alberta T6G2H7, Canada.
|
74
|
Fristedt R. Chloroplast function revealed through analysis of GreenCut2 genes. J Exp Bot 2017; 68:2111-2120. [PMID: 28369575 DOI: 10.1093/jxb/erx082] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/07/2023]
Abstract
Chloroplasts are the green plastids responsible for light-powered photosynthetic reactions and carbon assimilation in the plant cell. Our knowledge of chloroplast functions is constantly increasing and we now know this plastid is predicted to house around 3000 proteins. However, even with generous estimates, we do not know the function of more than 10-15% of these proteins. The next frontier in chloroplast research is to identify and characterize the function of the whole chloroplast proteome, a challenging task due to the inherent complexity a proteome possesses. A logical starting point is to identify and study proteins that have been determined experimentally to be localized in the chloroplast, conserved only among the photosynthetic lineage. These are the proteins with the most probable and important roles in chloroplast function. This review gives an introduction to the GreenCut2, a collection of proteins present only in photosynthetic organisms. By using recent large scale proteomics data, this cut was narrowed to include only those proteins experimentally verified to be localized in the chloroplast, and more specifically to the photosynthetic thylakoid membrane. By using highly informative bioinformatic approaches, the theoretical functional prediction for several of these uncharacterized GreenCut2 proteins is discussed.
Affiliation(s)
- Rikard Fristedt
- Biophysics of Photosynthesis, Faculty of Sciences, VU University Amsterdam, Amsterdam, the Netherlands
|
75
|
Medema MH, Osbourn A. Computational genomic identification and functional reconstitution of plant natural product biosynthetic pathways. Nat Prod Rep 2016; 33:951-62. [PMID: 27321668 PMCID: PMC4987707 DOI: 10.1039/c6np00035e] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 03/15/2016] [Indexed: 01/09/2023]
Abstract
Covering: 2003 to 2016. The last decade has seen the first major discoveries regarding the genomic basis of plant natural product biosynthetic pathways. Four key computationally driven strategies have been developed to identify such pathways, which make use of physical clustering, co-expression, evolutionary co-occurrence and epigenomic co-regulation of the genes involved in producing a plant natural product. Here, we discuss how these approaches can be used for the discovery of plant biosynthetic pathways encoded by both chromosomally clustered and non-clustered genes. Additionally, we will discuss opportunities to prioritize plant gene clusters for experimental characterization, and end with a forward-looking perspective on how synthetic biology technologies will allow effective functional reconstitution of candidate pathways using a variety of genetic systems.
Affiliation(s)
- Marnix H. Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
- Anne Osbourn
- Department of Metabolic Biology, John Innes Centre, Norwich Research Park, Norwich, UK.
|
76
|
Amick J, Roczniak-Ferguson A, Ferguson SM. C9orf72 binds SMCR8, localizes to lysosomes, and regulates mTORC1 signaling. Mol Biol Cell 2016; 27:3040-3051. [PMID: 27559131 PMCID: PMC5063613 DOI: 10.1091/mbc.e16-01-0003] [Citation(s) in RCA: 135] [Impact Index Per Article: 16.9] [Received: 01/04/2016] [Accepted: 08/19/2016] [Indexed: 12/18/2022]
Abstract
C9orf72 interacts strongly with SMCR8 and depends on this interaction for its stability. Lysosomes are major sites of C9orf72 subcellular localization, and abnormal lysosome morphology is seen in its absence. Defects are found in the regulation of the lysosome-localized mTORC1 signaling pathway in C9orf72 KO cells. Hexanucleotide expansion in an intron of the C9orf72 gene causes amyotrophic lateral sclerosis and frontotemporal dementia. However, beyond bioinformatics predictions that suggested structural similarity to folliculin, the Birt-Hogg-Dubé syndrome tumor suppressor, little is known about the normal functions of the C9orf72 protein. To address this problem, we used genome-editing strategies to investigate C9orf72 interactions, subcellular localization, and knockout (KO) phenotypes. We found that C9orf72 robustly interacts with SMCR8 (a protein of previously unknown function). We also observed that C9orf72 localizes to lysosomes and that such localization is negatively regulated by amino acid availability. Analysis of C9orf72 KO, SMCR8 KO, and double-KO cell lines revealed phenotypes that are consistent with a function for C9orf72 at lysosomes. These include abnormally swollen lysosomes in the absence of C9orf72 and impaired responses of mTORC1 signaling to changes in amino acid availability (a lysosome-dependent process) after depletion of either C9orf72 or SMCR8. Collectively these results identify strong physical and functional interactions between C9orf72 and SMCR8 and support a lysosomal site of action for this protein complex.
Affiliation(s)
- Joseph Amick
- Department of Cell Biology and Program in Cellular Neuroscience, Neurodegeneration and Repair, Yale University School of Medicine, New Haven, CT 06510
- Agnes Roczniak-Ferguson
- Department of Cell Biology and Program in Cellular Neuroscience, Neurodegeneration and Repair, Yale University School of Medicine, New Haven, CT 06510
- Shawn M Ferguson
- Department of Cell Biology and Program in Cellular Neuroscience, Neurodegeneration and Repair, Yale University School of Medicine, New Haven, CT 06510
|
77
|
Pers TH. Gene set analysis for interpreting genetic studies. Hum Mol Genet 2016; 25:R133-R140. [PMID: 27511725 DOI: 10.1093/hmg/ddw249] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 05/31/2016] [Accepted: 07/18/2016] [Indexed: 02/03/2023]
Abstract
Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways and functional annotations and may hence point towards novel biological insights. However, despite the growing availability of GSA tools, the sizeable number of variants identified for a vast number of complex traits, and many irrefutably trait-associated gene sets, the gap between discovery and interpretation remains. More efficient interpretation requires more complete and consistent gene set representations of biological pathways, phenotypes and functional annotations. In this review, I examine different types of gene sets, discuss how inconsistencies in gene set definitions impact GSA, describe how GSA has helped to elucidate biology and outline potential future directions.
Affiliation(s)
- Tune H Pers
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark; Novo Nordisk Foundation Centre for Basic Metabolic Research, Section of Metabolic Genetics, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
|
78
|
Mitochondrial DNA Replication Defects Disturb Cellular dNTP Pools and Remodel One-Carbon Metabolism. Cell Metab 2016; 23:635-48. [PMID: 26924217 DOI: 10.1016/j.cmet.2016.01.019] [Citation(s) in RCA: 200] [Impact Index Per Article: 25.0] [Received: 09/10/2015] [Revised: 12/07/2015] [Accepted: 01/28/2016] [Indexed: 01/12/2023]
Abstract
Mitochondrial dysfunction affects cellular energy metabolism, but less is known about the consequences for cytoplasmic biosynthetic reactions. We report that mtDNA replication disorders caused by TWINKLE mutations, namely mitochondrial myopathy (MM) and infantile onset spinocerebellar ataxia (IOSCA), remodel cellular dNTP pools in mice. MM muscle shows tissue-specific induction of the mitochondrial folate cycle, purine metabolism, and imbalanced and increased dNTP pools, consistent with progressive mtDNA mutagenesis. IOSCA-TWINKLE is predicted to hydrolyze dNTPs, consistent with low dNTP pools and mtDNA depletion in the disease. MM muscle also modifies the cytoplasmic one-carbon cycle, transsulfuration, and methylation, as well as increases glucose uptake and its utilization for de novo serine and glutathione biosynthesis. Our evidence indicates that the mitochondrial replication machinery communicates with cytoplasmic dNTP pools and that upregulation of glutathione synthesis through glucose-driven de novo serine biosynthesis contributes to the metabolic stress response. These results are important for disorders with primary or secondary mtDNA instability and offer targets for metabolic therapy.
|
79
|
Gasser RB, Schwarz EM, Korhonen PK, Young ND. Understanding Haemonchus contortus Better Through Genomics and Transcriptomics. Adv Parasitol 2016; 93:519-67. [PMID: 27238012 DOI: 10.1016/bs.apar.2016.02.015] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]
Abstract
Parasitic roundworms (nematodes) cause substantial mortality and morbidity in animals globally. The barber's pole worm, Haemonchus contortus, is one of the most economically significant parasitic nematodes of small ruminants worldwide. Although this and related nematodes can be controlled relatively well using anthelmintics, resistance against most drugs in common use has become a major problem. Until recently, almost nothing was known about the molecular biology of H. contortus on a global scale. This chapter gives a brief background on H. contortus and haemonchosis, immune responses, vaccine research, chemotherapeutics and current problems associated with drug resistance. It also describes progress in transcriptomics before the availability of H. contortus genomes and the challenges associated with such work. It then reviews major progress on the two draft genomes and developmental transcriptomes of H. contortus, and summarizes their implications for the molecular biology of this worm in both the free-living and the parasitic stages of its life cycle. The chapter concludes by considering how genomics and transcriptomics can accelerate research on Haemonchus and related parasites, and can enable the development of new interventions against haemonchosis.
Affiliation(s)
- R B Gasser
- The University of Melbourne, Parkville, VIC, Australia
- E M Schwarz
- The University of Melbourne, Parkville, VIC, Australia; Cornell University, Ithaca, NY, United States
- P K Korhonen
- The University of Melbourne, Parkville, VIC, Australia
- N D Young
- The University of Melbourne, Parkville, VIC, Australia
|
80
|
van der Lee TAJ, Medema MH. Computational strategies for genome-based natural product discovery and engineering in fungi. Fungal Genet Biol 2016; 89:29-36. [PMID: 26775250 DOI: 10.1016/j.fgb.2016.01.006] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 09/19/2015] [Revised: 01/08/2016] [Accepted: 01/11/2016] [Indexed: 12/20/2022]
Abstract
Fungal natural products possess biological activities that are of great value to medicine, agriculture and manufacturing. Recent metagenomic studies accentuate the vastness of fungal taxonomic diversity, and the accompanying specialized metabolic diversity offers a great and still largely untapped resource for natural product discovery. Although fungal natural products show an impressive variation in chemical structures and biological activities, their biosynthetic pathways share a number of key characteristics. First, genes encoding successive steps of a biosynthetic pathway tend to be located adjacently on the chromosome in biosynthetic gene clusters (BGCs). Second, these BGCs are often located on specific regions of the genome and show a discontinuous distribution among evolutionarily related species and isolates. Third, the same enzyme (super)families are often involved in the production of widely different compounds. Fourth, genes that function in the same pathway are often co-regulated, and therefore co-expressed across various growth conditions. In this mini-review, we describe how these partly interlinked characteristics can be exploited to computationally identify BGCs in fungal genomes and to connect them to their products. Particular attention will be given to novel algorithms to identify unusual classes of BGCs, as well as integrative pan-genomic approaches that use a combination of genomic and metabolomic data for parallelized natural product discovery across multiple strains. Such novel technologies will not only expedite the natural product discovery process, but will also allow the assembly of a high-quality toolbox for the re-design or even de novo design of biosynthetic pathways using synthetic biology approaches.
Affiliation(s)
- Theo A J van der Lee
- Biointeractions & Plant Health, Plant Research International, Wageningen UR, Wageningen, The Netherlands.
- Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
|
81
|
Franceschini A, Lin J, von Mering C, Jensen LJ. SVD-phy: improved prediction of protein functional associations through singular value decomposition of phylogenetic profiles. Bioinformatics 2015; 32:1085-7. [PMID: 26614125 PMCID: PMC4896368 DOI: 10.1093/bioinformatics/btv696] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 06/05/2015] [Accepted: 11/24/2015] [Indexed: 11/15/2022]
Abstract
Summary: A successful approach for predicting functional associations between non-homologous genes is to compare their phylogenetic distributions. We have devised a phylogenetic profiling algorithm, SVD-Phy, which uses truncated singular value decomposition to address the problem of uninformative profiles giving rise to false positive predictions. Benchmarking the algorithm against the KEGG pathway database, we found that it has substantially improved performance over existing phylogenetic profiling methods. Availability and implementation: The software is available under the open-source BSD license at https://bitbucket.org/andrea/svd-phy Contact: lars.juhl.jensen@cpr.ku.dk Supplementary information: Supplementary data are available at Bioinformatics online.
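As a rough illustration of the idea in this abstract, the sketch below applies a truncated singular value decomposition to a toy gene-by-species profile matrix and compares genes in the reduced space. It is not the published SVD-Phy pipeline: the toy matrix, the function names `truncated_profiles` and `profile_distances`, the choice of `k`, and the use of unit-normalised Euclidean distance are all assumptions made for this example.

```python
import numpy as np


def truncated_profiles(profiles, k):
    """Keep the top-k left singular vectors of a gene-by-species matrix.

    Each row of the result is a gene's profile in the reduced space,
    rescaled to unit length (rows of zero length are left as zeros).
    """
    u, _, _ = np.linalg.svd(profiles, full_matrices=False)
    reduced = u[:, :k]
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero
    return reduced / norms


def profile_distances(reduced):
    """Pairwise Euclidean distances between rows of the reduced matrix."""
    diff = reduced[:, None, :] - reduced[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


# Hypothetical presence/absence profiles: 4 genes (rows) across 5 species.
toy = np.array([
    [1, 1, 0, 0, 1],  # gene 0
    [1, 1, 0, 0, 1],  # gene 1: same distribution as gene 0
    [0, 0, 1, 1, 0],  # gene 2: complementary distribution
    [1, 1, 1, 1, 1],  # gene 3: present everywhere
], dtype=float)

dist = profile_distances(truncated_profiles(toy, k=2))
```

In this toy case, genes 0 and 1 end up at distance zero because their profiles are identical, while gene 2 sits away from both. Small distances are the candidates for a functional association.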
Affiliation(s)
- Andrea Franceschini
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057, Switzerland, Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, Lausanne, 1015, Switzerland
- Jianyi Lin
- Department of Computer Science, University of Milan, via Comelico 39, Milan, 20135, Italy
- Christian von Mering
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057, Switzerland, Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, Lausanne, 1015, Switzerland
- Lars Juhl Jensen
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen N, 2200, Denmark
|
82
|
Dey G, Meyer T. Phylogenetic Profiling for Probing the Modular Architecture of the Human Genome. Cell Syst 2015; 1:106-15. [PMID: 27135799 DOI: 10.1016/j.cels.2015.08.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/06/2015] [Revised: 08/03/2015] [Accepted: 08/10/2015] [Indexed: 12/22/2022]
Abstract
Information about functional connections between genes can be derived from patterns of coupled loss of their homologs across multiple species. This comparative approach, termed phylogenetic profiling, has been successfully used to infer genetic interactions in bacteria and eukaryotes. Rapid progress in sequencing eukaryotic species has enabled the recent phylogenetic profiling of the human genome, resulting in systematic functional predictions for uncharacterized human genes. Importantly, groups of co-evolving genes reveal widespread modularity in the underlying genetic network, facilitating experimental analyses in human cells as well as comparative studies of conserved functional modules across species. This strategy is particularly successful in identifying novel metabolic proteins and components of multi-protein complexes. The targeted sequencing of additional key eukaryotes and the incorporation of improved methods to generate and compare phylogenetic profiles will further boost the predictive power and utility of this evolutionary approach to the functional analysis of gene interaction networks.
Affiliation(s)
- Gautam Dey
- Chemical and Systems Biology, Stanford University, Stanford CA 94305, USA.
- Tobias Meyer
- Chemical and Systems Biology, Stanford University, Stanford CA 94305, USA.
|
83
|
Dey G, Jaimovich A, Collins SR, Seki A, Meyer T. Systematic Discovery of Human Gene Function and Principles of Modular Organization through Phylogenetic Profiling. Cell Rep 2015; 10:993-1006. [PMID: 25683721 DOI: 10.1016/j.celrep.2015.01.025] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 09/05/2014] [Revised: 12/17/2014] [Accepted: 01/09/2015] [Indexed: 01/17/2023]
Abstract
Functional links between genes can be predicted using phylogenetic profiling, by correlating the appearance and loss of homologs in subsets of species. However, effective genome-wide phylogenetic profiling has been hindered by the large fraction of human genes related to each other through historical duplication events. Here, we overcame this challenge by automatically profiling over 30,000 groups of homologous human genes (orthogroups) representing the entire protein-coding genome across 177 eukaryotic species (hOP profiles). By generating a full pairwise orthogroup phylogenetic co-occurrence matrix, we derive unbiased genome-wide predictions of functional modules (hOP modules). Our approach predicts functions for hundreds of poorly characterized genes. The results suggest evolutionary constraints that lead components of protein complexes and metabolic pathways to co-evolve while genes in signaling and transcriptional networks do not. As a proof of principle, we validated two subsets of candidates experimentally for their predicted link to the actin-nucleating WASH complex and cilia/basal body function.
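The pairwise co-occurrence matrix mentioned in this abstract can be pictured with a minimal sketch. The abstract does not state which similarity score the authors use, so the Pearson correlation of binary presence/absence vectors below is an assumption, as are the orthogroup names, the toy profiles, and the helper names `pearson` and `cooccurrence_matrix`.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors.

    Returns 0.0 when either vector is constant (an uninformative profile).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def cooccurrence_matrix(profiles):
    """Score every pair of orthogroups by the correlation of their profiles."""
    names = sorted(profiles)
    return {(a, b): pearson(profiles[a], profiles[b])
            for a, b in combinations(names, 2)}


# Hypothetical profiles: 1 = homolog present, 0 = absent, across 6 species.
toy_profiles = {
    "orthogroup_A": [1, 1, 0, 0, 1, 1],
    "orthogroup_B": [1, 1, 0, 0, 1, 1],  # gained and lost together with A
    "orthogroup_C": [0, 0, 1, 1, 0, 0],  # present only where A is absent
    "orthogroup_D": [1, 1, 1, 1, 1, 1],  # present everywhere
}

scores = cooccurrence_matrix(toy_profiles)
```

High-scoring pairs such as `orthogroup_A` and `orthogroup_B` are the kind of signal that would be grouped into a predicted module. A profile that never varies, like `orthogroup_D`, carries no co-occurrence information and is scored as zero here.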
Affiliation(s)
- Gautam Dey
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA
- Ariel Jaimovich
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA
- Sean R Collins
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA
- Akiko Seki
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA
- Tobias Meyer
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA.
|
84
|
Phillips-Krawczak CA, Singla A, Starokadomskyy P, Deng Z, Osborne DG, Li H, Dick CJ, Gomez TS, Koenecke M, Zhang JS, Dai H, Sifuentes-Dominguez LF, Geng LN, Kaufmann SH, Hein MY, Wallis M, McGaughran J, Gecz J, Sluis BVD, Billadeau DD, Burstein E. COMMD1 is linked to the WASH complex and regulates endosomal trafficking of the copper transporter ATP7A. Mol Biol Cell 2015; 26:91-103. [PMID: 25355947 PMCID: PMC4279232 DOI: 10.1091/mbc.e14-06-1073] [Citation(s) in RCA: 162] [Impact Index Per Article: 18.0] [Received: 06/11/2014] [Revised: 10/21/2014] [Accepted: 10/24/2014] [Indexed: 11/11/2022]
Abstract
COMMD1 deficiency results in defective copper homeostasis, but the mechanism for this has remained elusive. Here we report that COMMD1 is directly linked to early endosomes through its interaction with a protein complex containing CCDC22, CCDC93, and C16orf62. This COMMD/CCDC22/CCDC93 (CCC) complex interacts with the multisubunit WASH complex, an evolutionarily conserved system, which is required for endosomal deposition of F-actin and cargo trafficking in conjunction with the retromer. Interactions between the WASH complex subunit FAM21, and the carboxyl-terminal ends of CCDC22 and CCDC93 are responsible for CCC complex recruitment to endosomes. We show that depletion of CCC complex components leads to lack of copper-dependent movement of the copper transporter ATP7A from endosomes, resulting in intracellular copper accumulation and modest alterations in copper homeostasis in humans with CCDC22 mutations. This work provides a mechanistic explanation for the role of COMMD1 in copper homeostasis and uncovers additional genes involved in the regulation of copper transporter recycling.
Affiliation(s)
- Zhihui Deng
- Department of Immunology, Department of Pathophysiology, Qiqihar Medical University, Qiqihar, Heilongjiang 161006, China
- Jin-San Zhang
- Department of Immunology, School of Pharmaceutical Sciences and Key Laboratory of Biotechnology and Pharmaceutical Engineering, Wenzhou Medical University, Wenzhou, Zhejiang 325035, China
- Haiming Dai
- Department of Molecular Pharmacology and Experimental Therapeutics, and
- Scott H Kaufmann
- Department of Molecular Pharmacology and Experimental Therapeutics, and
- Marco Y Hein
- Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Mathew Wallis
- Genetic Health Queensland at the Royal Brisbane and Women's Hospital, Herston, Queensland 4029, Australia
- Julie McGaughran
- Genetic Health Queensland at the Royal Brisbane and Women's Hospital, Herston, Queensland 4029, Australia School of Medicine, University of Queensland, Brisbane, Queensland 4072, Australia
- Jozef Gecz
- Robinson Institute and Department of Paediatrics, University of Adelaide, Adelaide, South Australia 5005, Australia
- Bart van de Sluis
- Section of Molecular Genetics at the Department of Pediatrics, University Medical Center Groningen, University of Groningen, 9713 Groningen, Netherlands
- Daniel D Billadeau
- Department of Immunology, Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine, Mayo Clinic, Rochester, MN 55905
- Ezra Burstein
- Department of Internal Medicine and Department of Molecular Biology, UT Southwestern Medical Center, Dallas, TX 75390-9151
|
85
|
Reynolds KA. Finding a common path: predicting gene function using inferred evolutionary trees. Dev Cell 2014; 30:4-5. [PMID: 25026031 DOI: 10.1016/j.devcel.2014.06.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
Reporting in Cell, Li and colleagues (2014) describe an innovative method to functionally classify genes using evolutionary information. This approach demonstrates broad utility for eukaryotic gene annotation and suggests an intriguing new decomposition of pathways and complexes into evolutionarily conserved modules.
Affiliation(s)
- Kimberly A Reynolds
- Green Center for Systems Biology, University of Texas Southwestern Medical Center, 6001 Forest Park Road, Dallas, TX 75390-8597, USA.
|