1
Volzhenin K, Bittner L, Carbone A. SENSE-PPI reconstructs interactomes within, across, and between species at the genome scale. iScience 2024; 27:110371. [PMID: 39055916 PMCID: PMC11269938 DOI: 10.1016/j.isci.2024.110371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 05/04/2024] [Accepted: 06/21/2024] [Indexed: 07/28/2024] Open
Abstract
Ab initio computational reconstructions of protein-protein interaction (PPI) networks will provide invaluable insights into cellular systems, enabling the discovery of novel molecular interactions and elucidating biological mechanisms within and between organisms. Leveraging the latest generation protein language models and recurrent neural networks, we present SENSE-PPI, a sequence-based deep learning model that efficiently reconstructs ab initio PPIs, distinguishing partners among tens of thousands of proteins and identifying specific interactions within functionally similar proteins. SENSE-PPI demonstrates high accuracy, limited training requirements, and versatility in cross-species predictions, even with non-model organisms and human-virus interactions. Its performance decreases for phylogenetically more distant model and non-model organisms, but signal alteration is very slow. In this regard, it demonstrates the important role of parameters in protein language models. SENSE-PPI is very fast and can test 10,000 proteins against themselves in a matter of hours, enabling the reconstruction of genome-wide proteomes.
Affiliation(s)
- Konstantin Volzhenin
- Sorbonne Université, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Lucie Bittner
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
- Institut Universitaire de France, Paris, France
- Alessandra Carbone
- Sorbonne Université, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Institut Universitaire de France, Paris, France

2
Pollet L, Xia Y. Structure-guided Evolutionary Analysis of Interactome Network Rewiring at Single Residue Resolution in Yeasts. J Mol Biol 2024; 436:168641. [PMID: 38844045 DOI: 10.1016/j.jmb.2024.168641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 04/30/2024] [Accepted: 06/01/2024] [Indexed: 06/16/2024]
Abstract
Protein-protein interactions (PPIs) are known to rewire extensively during evolution leading to lineage-specific and species-specific changes in molecular processes. However, the detailed molecular evolutionary mechanisms underlying interactome network rewiring are not well-understood. Here, we combine high-confidence PPI data, high-resolution three-dimensional structures of protein complexes, and homology-based structural annotation transfer to construct structurally-resolved interactome networks for the two yeasts S. cerevisiae and S. pombe. We then classify PPIs according to whether they are preserved or different between the two yeast species and compare site-specific evolutionary rates of interfacial versus non-interfacial residues for these different categories of PPIs. We find that residues in PPI interfaces evolve significantly more slowly than non-interfacial residues when using lineage-specific measures of evolutionary rate, but not when using non-lineage-specific measures. Furthermore, both lineage-specific and non-lineage-specific evolutionary rate measures can distinguish interfacial residues from non-interfacial residues for preserved PPIs between the two yeasts, but only the lineage-specific measure is appropriate for rewired PPIs. Finally, both lineage-specific and non-lineage-specific evolutionary rate measures are appropriate for elucidating structural determinants of protein evolution for residues outside of PPI interfaces. Overall, our results demonstrate that unlike tertiary structures of single proteins, PPIs and PPI interfaces can be highly volatile in their evolution, thus requiring the use of lineage-specific measures when studying their evolution. These results yield insight into the evolutionary design principles of PPIs and the mechanisms by which interactions are preserved or rewired between species, improving our understanding of the molecular evolution of PPIs and PPI interfaces at the residue level.
Affiliation(s)
- Léah Pollet
- Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada
- Yu Xia
- Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada.

3
Santos TG, Silva KS, Lima RM, Silva LC, Pereira M. State of the art in protein-protein interactions within the fungi kingdom. Future Microbiol 2023; 18:1119-1131. [PMID: 37540069 DOI: 10.2217/fmb-2022-0274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/05/2023] Open
Abstract
Proteins rarely exert their function by themselves. Protein-protein interactions (PPIs) regulate virtually every biological process that takes place in a cell. Such interactions are targets for new therapeutic agents against all sorts of diseases, through the screening and design of a variety of inhibitors. Here we discuss several aspects of PPIs that contribute to prediction of protein function and drug discovery. As the high-throughput techniques continue to release biological data, targets for fungal therapeutics that rely on PPIs are being proposed worldwide. Computational approaches have reduced the time taken to develop new therapeutic approaches. The near future brings the possibility of developing new PPI and interaction network inhibitors and a revolution in the way we treat fungal diseases.
Affiliation(s)
- Thaynara G Santos
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
- Kleber Sf Silva
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
- Raisa M Lima
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
- Lívia C Silva
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil
- Maristela Pereira
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Goiás, 74 000, Brazil

4
Mohseni Behbahani Y, Saighi P, Corsi F, Laine E, Carbone A. LEVELNET to visualize, explore, and compare protein-protein interaction networks. Proteomics 2023; 23:e2200159. [PMID: 37403279 DOI: 10.1002/pmic.202200159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Revised: 04/27/2023] [Accepted: 04/28/2023] [Indexed: 07/06/2023]
Abstract
Physical interactions between proteins are central to all biological processes. Yet, the current knowledge of who interacts with whom in the cell and in what manner relies on partial, noisy, and highly heterogeneous data. Thus, there is a need for methods comprehensively describing and organizing such data. LEVELNET is a versatile and interactive tool for visualizing, exploring, and comparing protein-protein interaction (PPI) networks inferred from different types of evidence. LEVELNET helps to break down the complexity of PPI networks by representing them as multi-layered graphs and by facilitating the direct comparison of their subnetworks toward biological interpretation. It focuses primarily on the protein chains whose 3D structures are available in the Protein Data Bank. We showcase some potential applications, such as investigating the structural evidence supporting PPIs associated to specific biological processes, assessing the co-localization of interaction partners, comparing the PPI networks obtained through computational experiments versus homology transfer, and creating PPI benchmarks with desired properties.
Affiliation(s)
- Yasser Mohseni Behbahani
- Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Paul Saighi
- Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Flavia Corsi
- Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Elodie Laine
- Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Alessandra Carbone
- Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France

5
Ozger ZB. A robust protein language model for SARS-CoV-2 protein-protein interaction network prediction. Artif Intell Med 2023; 142:102574. [PMID: 37316102 DOI: 10.1016/j.artmed.2023.102574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 04/17/2023] [Accepted: 04/27/2023] [Indexed: 06/16/2023]
Abstract
Protein-protein interaction is one of the ways viruses interact with their hosts. Therefore, identifying protein interactions between viruses and hosts helps explain how virus proteins work, how they replicate, and how they cause disease. SARS-CoV-2 is a new type of virus that emerged from the coronavirus family in 2019 and caused a worldwide pandemic. Detection of human proteins interacting with this novel virus strain plays an important role in monitoring the cellular process of virus-associated infection. Within the scope of the study, a natural language processing-based collective learning method is proposed for the prediction of potential SARS-CoV-2-human PPIs. Protein language models were obtained with the prediction-based word2Vec and doc2Vec embedding methods and the frequency-based tf-idf method. Known interactions were represented by proposed language models and traditional feature extraction methods (conjoint triad and repeat pattern), and their performances were compared. The interaction data were trained with support vector machine, artificial neural network (ANN), k-nearest neighbor (KNN), naive Bayes (NB), decision tree (DT), and ensemble algorithms. Experimental results show that protein language models are a promising protein representation method for protein-protein interaction prediction. The term frequency-inverse document frequency-based language model performed the SARS-CoV-2 protein-protein interaction estimation with an error of 1.4%. Additionally, the decisions of high-performing learning models for different feature extraction methods were combined with a collective voting approach to make new interaction predictions. For 10,000 human proteins, 285 new potential interactions were predicted, with models combining decisions.
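As a rough illustration of the frequency-based representation mentioned in this abstract, the sketch below computes tf-idf weights over protein "words". The abstract does not say how sequences are tokenized, so splitting into overlapping 3-mers is an assumption made here, and the function names and toy sequences are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Split a protein sequence into overlapping k-mers ("words")."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf_vectors(sequences, k=3):
    """Return one {k-mer: tf-idf weight} dict per input sequence."""
    docs = [Counter(kmers(s, k)) for s in sequences]
    n_docs = len(docs)
    # Document frequency: number of sequences that contain each k-mer.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        # Term frequency times inverse document frequency (plain log form).
        vectors.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in doc.items()
        })
    return vectors


# Toy sequences for illustration only, not real proteins.
toy = ["MKTAYIAK", "MKTLLVAA", "GGSAYIAK"]
vecs = tfidf_vectors(toy)
```

A k-mer found in every sequence receives weight zero, while one unique to a single sequence receives the largest weight; vectors of this kind could then be passed to a classifier such as those listed in the abstract.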
Affiliation(s)
- Zeynep Banu Ozger
- Department of Computer Engineering, Sutcu Imam University, 46040, Kahramanmaras, Turkey.

6
Hao B, Kovács IA. A positive statistical benchmark to assess network agreement. Nat Commun 2023; 14:2988. [PMID: 37225699 DOI: 10.1038/s41467-023-38625-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 05/09/2023] [Indexed: 05/26/2023] Open
Abstract
Current computational methods for validating experimental network datasets compare overlap, i.e., shared links, with a reference network using a negative benchmark. However, this fails to quantify the level of agreement between the two networks. To address this, we propose a positive statistical benchmark to determine the maximum possible overlap between networks. Our approach can efficiently generate this benchmark in a maximum entropy framework and provides a way to assess whether the observed overlap is significantly different from the best-case scenario. We introduce a normalized overlap score, Normlap, to enhance comparisons between experimental networks. As an application, we compare molecular and functional networks, resulting in an agreement network of human as well as yeast network datasets. The Normlap score can improve the comparison between experimental networks by providing a computational alternative to network thresholding and validation.
Affiliation(s)
- Bingjie Hao
- Department of Physics and Astronomy, Northwestern University, Evanston, IL, 60208, USA
- István A Kovács
- Department of Physics and Astronomy, Northwestern University, Evanston, IL, 60208, USA.
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, 60208, USA.

7
Wang XW, Madeddu L, Spirohn K, Martini L, Fazzone A, Becchetti L, Wytock TP, Kovács IA, Balogh OM, Benczik B, Pétervári M, Ágg B, Ferdinandy P, Vulliard L, Menche J, Colonnese S, Petti M, Scarano G, Cuomo F, Hao T, Laval F, Willems L, Twizere JC, Vidal M, Calderwood MA, Petrillo E, Barabási AL, Silverman EK, Loscalzo J, Velardi P, Liu YY. Assessment of community efforts to advance network-based prediction of protein-protein interactions. Nat Commun 2023; 14:1582. [PMID: 36949045 PMCID: PMC10033937 DOI: 10.1038/s41467-023-37079-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/24/2022] [Accepted: 03/02/2023] [Indexed: 03/24/2023] Open
Abstract
Comprehensive understanding of the human protein-protein interaction (PPI) network, aka the human interactome, can provide important insights into the molecular mechanisms of complex biological processes and diseases. Despite the remarkable experimental efforts undertaken to date to determine the structure of the human interactome, many PPIs remain unmapped. Computational approaches, especially network-based methods, can facilitate the identification of previously uncharacterized PPIs. Many such methods have been proposed. Yet, a systematic evaluation of existing network-based methods in predicting PPIs is still lacking. Here, we report community efforts initiated by the International Network Medicine Consortium to benchmark the ability of 26 representative network-based methods to predict PPIs across six different interactomes of four different organisms: A. thaliana, C. elegans, S. cerevisiae, and H. sapiens. Through extensive computational and experimental validations, we found that advanced similarity-based methods, which leverage the underlying network characteristics of PPIs, show superior performance over other general link prediction methods in the interactomes we considered.
Affiliation(s)
- Xu-Wen Wang
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Lorenzo Madeddu
- Translational and Precision Medicine Department Sapienza University of Rome, Rome, Italy
- Kerstin Spirohn
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Leonardo Martini
- Department of Computer, Control, and Management Engineering "Antonio Rubert", Sapienza University of Rome, Rome, Italy
- Luca Becchetti
- Department of Computer, Control, and Management Engineering "Antonio Rubert", Sapienza University of Rome, Rome, Italy
- Thomas P Wytock
- Department of Physics and Astronomy, Northwestern University, Evanston, IL, 60208, USA
- István A Kovács
- Department of Physics and Astronomy, Northwestern University, Evanston, IL, 60208, USA
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, 60208, USA
- Olivér M Balogh
- Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Bettina Benczik
- Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Pharmahungary Group, 6722, Szeged, Hungary
- Mátyás Pétervári
- Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Bence Ágg
- Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Pharmahungary Group, 6722, Szeged, Hungary
- Péter Ferdinandy
- Cardiometabolic and MTA-SE System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary
- Pharmahungary Group, 6722, Szeged, Hungary
- Loan Vulliard
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Vienna, Austria
- Jörg Menche
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Vienna, Austria
- Faculty of Mathematics, University of Vienna, Vienna, Austria
- Stefania Colonnese
- Department of Information Engineering, Electronics, and Telecommunications (DIET), University of Rome "Sapienza", Rome, Italy
- Manuela Petti
- Department of Computer, Control, and Management Engineering "Antonio Rubert", Sapienza University of Rome, Rome, Italy
- Gaetano Scarano
- Department of Information Engineering, Electronics, and Telecommunications (DIET), University of Rome "Sapienza", Rome, Italy
- Francesca Cuomo
- Department of Information Engineering, Electronics, and Telecommunications (DIET), University of Rome "Sapienza", Rome, Italy
- Tong Hao
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Florent Laval
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Laboratory of Molecular and Cellular Epigenetic, GIGA Institute, University of Liège, Liège, Belgium
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Luc Willems
- Laboratory of Molecular and Cellular Epigenetic, GIGA Institute, University of Liège, Liège, Belgium
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Jean-Claude Twizere
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Michael A Calderwood
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Enrico Petrillo
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Department of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Albert-László Barabási
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Network Science Institute and Department of Physics, Northeastern University, Boston, MA, 02115, USA
- Department of Network and Data Science, Central European University, Budapest, H-1051, Hungary
- Edwin K Silverman
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Joseph Loscalzo
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Paola Velardi
- Translational and Precision Medicine Department Sapienza University of Rome, Rome, Italy.
- Yang-Yu Liu
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA.
- Center for Artificial Intelligence and Modeling, The Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, 61801, USA.

8
Ozdemir ES, Nussinov R. Pathogen-driven cancers from a structural perspective: Targeting host-pathogen protein-protein interactions. Front Oncol 2023; 13:1061595. [PMID: 36910650 PMCID: PMC9997845 DOI: 10.3389/fonc.2023.1061595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 02/06/2023] [Indexed: 02/25/2023] Open
Abstract
Host-pathogen interactions (HPIs) affect and involve multiple mechanisms in both the pathogen and the host. Pathogen interactions disrupt homeostasis in host cells, with their toxins interfering with host mechanisms, resulting in infections, diseases, and disorders, extending from AIDS and COVID-19, to cancer. Studies of the three-dimensional (3D) structures of host-pathogen complexes aim to understand how pathogens interact with their hosts. They also aim to contribute to the development of rational therapeutics, as well as preventive measures. However, structural studies are fraught with challenges toward these aims. This review describes the state-of-the-art in protein-protein interactions (PPIs) between the host and pathogens from the structural standpoint. It discusses computational aspects of predicting these PPIs, including machine learning (ML) and artificial intelligence (AI)-driven, and overviews available computational methods and their challenges. It concludes with examples of how theoretical computational approaches can result in a therapeutic agent with a potential of being used in the clinics, as well as future directions.
Affiliation(s)
- Emine Sila Ozdemir
- Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health & Science University, Portland, OR, United States
- Ruth Nussinov
- Cancer Innovation Laboratory, Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, MD, United States
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel

9
Duque P, Vieira CP, Bastos B, Vieira J. The evolution of vitamin C biosynthesis and transport in animals. BMC Ecol Evol 2022; 22:84. [PMID: 35752765 PMCID: PMC9233358 DOI: 10.1186/s12862-022-02040-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/08/2021] [Accepted: 06/17/2022] [Indexed: 12/25/2022] Open
Abstract
Background: Vitamin C (VC) is an indispensable antioxidant and co-factor for optimal function and development of eukaryotic cells. In animals, VC can be synthesized by the organism, acquired through the diet, or both. In the single VC synthesis pathway described in animals, the penultimate step is catalysed by Regucalcin, and the last step by L-gulonolactone oxidase (GULO). The GULO gene has been implicated in VC synthesis only, while Regucalcin has been shown to have multiple functions in mammals.
Results: Both GULO and Regucalcin can be found in non-bilaterian, protostome and deuterostome species. Regucalcin, as here shown, is involved in multiple functions such as VC synthesis, calcium homeostasis, and the oxidative stress response in both Deuterostomes and Protostomes, and in insects in receptor-mediated uptake of hexamerin storage proteins from haemolymph. In Insecta and Nematoda, however, there is no GULO gene, and in the latter no Regucalcin gene, but species from these lineages are still able to synthesize VC, implying at least one novel synthesis pathway. In vertebrates, SVCT1, a gene that belongs to a family with up to five members, as here shown, is the only gene involved in the uptake of VC in the gut. This specificity is likely the result of a subfunctionalization event that happened at the base of the Craniata subphylum. SVCT-like genes present in non-Vertebrate animals are likely involved in both VC and nucleobase transport. It is also shown that in lineages where GULO has been lost, SVCT1 is now an essential gene, while in lineages where SVCT1 gene has been lost, GULO is now an essential gene.
Conclusions: The simultaneous study, for the first time, of GULO, Regucalcin and SVCTs evolution provides a clear picture of VC synthesis/acquisition and reveals very different selective pressures in different animal taxonomic groups.

10
Mayol GF, Defelipe LA, Arcon JP, Turjanski AG, Marti MA. Solvent Sites Improve Docking Performance of Protein–Protein Complexes and Protein–Protein Interface-Targeted Drugs. J Chem Inf Model 2022; 62:3577-3588. [DOI: 10.1021/acs.jcim.2c00264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Gonzalo F. Mayol
- Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (FCEyN-UBA) e Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN) CONICET, Pabellòn 2 de Ciudad Universitaria, Ciudad de Buenos Aires C1428EHA, Argentina
- Lucas A. Defelipe
- Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (FCEyN-UBA) e Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN) CONICET, Pabellòn 2 de Ciudad Universitaria, Ciudad de Buenos Aires C1428EHA, Argentina
- European Molecular Biology Laboratory - Hamburg Unit, Notkestrasse 85, Hamburg 22607, Germany
- Juan Pablo Arcon
- Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (FCEyN-UBA) e Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN) CONICET, Pabellòn 2 de Ciudad Universitaria, Ciudad de Buenos Aires C1428EHA, Argentina
- Institute for Research in Biomedicine (IRB), 08028 Barcelona, Spain
- The Barcelona Institute of Science and Technology, 08036 Barcelona, Spain
- Adrian G. Turjanski
- Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (FCEyN-UBA) e Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN) CONICET, Pabellòn 2 de Ciudad Universitaria, Ciudad de Buenos Aires C1428EHA, Argentina
- Marcelo A. Marti
- Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (FCEyN-UBA) e Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN) CONICET, Pabellòn 2 de Ciudad Universitaria, Ciudad de Buenos Aires C1428EHA, Argentina

11
Pollet L, Lambourne L, Xia Y. Structural Determinants of Yeast Protein-Protein Interaction Interface Evolution at the Residue Level. J Mol Biol 2022; 434:167750. [PMID: 35850298 DOI: 10.1016/j.jmb.2022.167750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2021] [Revised: 06/09/2022] [Accepted: 07/12/2022] [Indexed: 12/01/2022]
Abstract
Interfaces of contact between proteins play important roles in determining the proper structure and function of protein-protein interactions (PPIs). Therefore, to fully understand PPIs, we need to better understand the evolutionary design principles of PPI interfaces. Previous studies have uncovered that interfacial sites are more evolutionarily conserved than other surface protein sites. Yet, little is known about the nature and relative importance of evolutionary constraints in PPI interfaces. Here, we explore constraints imposed by the structure of the microenvironment surrounding interfacial residues on residue evolutionary rate using a large dataset of over 700 structural models of baker's yeast PPIs. We find that interfacial residues are, on average, systematically more conserved than all other residues with a similar degree of total burial as measured by relative solvent accessibility (RSA). Besides, we find that RSA of the residue when the PPI is formed is a better predictor of interfacial residue evolutionary rate than RSA in the monomer state. Furthermore, we investigate four structure-based measures of residue interfacial involvement, including change in RSA upon binding (ΔRSA), number of residue-residue contacts across the interface, and distance from the center or the periphery of the interface. Integrated modeling for evolutionary rate prediction in interfaces shows that ΔRSA plays a dominant role among the four measures of interfacial involvement, with minor, but independent contributions from other measures. These results yield insight into the evolutionary design of interfaces, improving our understanding of the role that structure plays in the molecular evolution of PPIs at the residue level.
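The abstract names ΔRSA as the change in relative solvent accessibility upon binding but does not spell out a formula. The sketch below assumes the convention ΔRSA = RSA(monomer) − RSA(complex), so that a residue buried by the partner gets a positive value; the function name, residue labels, and numbers are hypothetical and serve only to make the quantity concrete.

```python
def delta_rsa(rsa_monomer, rsa_complex):
    """Per-residue change in relative solvent accessibility upon binding.

    Assumed convention: monomer RSA minus complex RSA, so a residue that
    becomes buried when the partner binds gets a positive value.
    """
    return {res: rsa_monomer[res] - rsa_complex[res] for res in rsa_monomer}


# Toy RSA values (fractions between 0 and 1), not taken from any structure.
monomer = {"A12": 0.50, "K45": 0.75, "L88": 0.25}
complex_ = {"A12": 0.25, "K45": 0.75, "L88": 0.0}

changes = delta_rsa(monomer, complex_)
# Residues whose accessibility drops on binding are candidate interface residues.
interface = sorted(res for res, d in changes.items() if d > 0)
```

Under this convention a residue with unchanged accessibility (here `K45`) has ΔRSA of zero and is treated as non-interfacial; any real analysis would need the threshold and RSA normalization used by the authors.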
Affiliation(s)
- Léah Pollet
- Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada
- Luke Lambourne
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
- Yu Xia
- Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada.

12
Dunham B, Ganapathiraju MK. Benchmark Evaluation of Protein-Protein Interaction Prediction Algorithms. Molecules 2021; 27:41. [PMID: 35011283 PMCID: PMC8746451 DOI: 10.3390/molecules27010041] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/05/2021] [Accepted: 11/23/2021] [Indexed: 11/16/2022] Open
Abstract
Protein-protein interactions (PPIs) perform various functions and regulate processes throughout cells. Knowledge of the full network of PPIs is vital to biomedical research, but most of the PPIs are still unknown. As it is infeasible to discover all of them experimentally due to technical and resource limitations, computational prediction of PPIs is essential, and accurately assessing the performance of algorithms is required before further application or translation. However, many published methods compose their evaluation datasets incorrectly, using a higher proportion of positive class data than occurs naturally, leading to exaggerated performance. We re-implemented various published algorithms and evaluated them on datasets with realistic data compositions and found that their performance is overstated in the original publications, with several methods outperformed by our control models built on 'illogical' and random number features. We conclude that these methods are influenced by an over-characterization of some proteins in the literature and by the scale-free nature of PPI networks, and that they fail when tested on all possible protein pairs. Additionally, we found that sequence-only-based algorithms performed worse than those that employ functional and expression features. We present a benchmark evaluation of many published algorithms for PPI prediction. The source code of our implementations and the benchmark datasets created here are made available in open source.
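The effect of dataset composition described in this abstract follows from Bayes' rule. The sketch below holds a classifier's true-positive and false-positive rates fixed and recomputes precision as positives become rarer. The rates and prevalence values are hypothetical illustrations, not figures from the benchmark.

```python
# Minimal sketch: precision of a fixed classifier as a function of class balance.
# precision = TPR * p / (TPR * p + FPR * (1 - p)), where p is the fraction of
# true interacting pairs in the evaluation set.

def precision_at_prevalence(tpr, fpr, prevalence):
    """Expected precision given true/false positive rates and class prevalence."""
    true_pos = tpr * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical classifier: 90% sensitivity, 10% false-positive rate.
TPR, FPR = 0.90, 0.10

# A balanced test set versus increasingly realistic (sparser) ones.
for prevalence in (0.5, 0.01, 0.001):
    p = precision_at_prevalence(TPR, FPR, prevalence)
    print(f"positives = {prevalence:.1%} of pairs -> precision = {p:.3f}")
```

The same classifier that looks strong on a balanced set yields mostly false positives once interacting pairs are a small fraction of all candidate pairs, which is the kind of overstatement the benchmark reports.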
|
13
|
Kotlyar M, Pastrello C, Ahmed Z, Chee J, Varyova Z, Jurisica I. IID 2021: towards context-specific protein interaction analyses by increased coverage, enhanced annotation and enrichment analysis. Nucleic Acids Res 2021; 50:D640-D647. [PMID: 34755877 PMCID: PMC8728267 DOI: 10.1093/nar/gkab1034] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 09/15/2021] [Revised: 10/13/2021] [Accepted: 11/03/2021] [Indexed: 01/02/2023] Open
Abstract
Improved bioassays have significantly increased the rate of identifying new protein-protein interactions (PPIs), and the number of detected human PPIs has greatly exceeded early estimates of human interactome size. These new PPIs provide a more complete view of disease mechanisms but precise understanding of how PPIs affect phenotype remains a challenge. It requires knowledge of PPI context (e.g. tissues, subcellular localizations), and functional roles, especially within pathways and protein complexes. The previous IID release focused on PPI context, providing networks with comprehensive tissue, disease, cellular localization, and druggability annotations. The current update adds developmental stages to the available contexts, and provides a way of assigning context to PPIs that could not be previously annotated due to insufficient data or incompatibility with available context categories (e.g. interactions between membrane and cytoplasmic proteins). This update also annotates PPIs with conservation across species, directionality in pathways, membership in large complexes, interaction stability (i.e. stable or transient), and mutation effects. Enrichment analysis is now available for all annotations, and includes multiple options; for example, context annotations can be analyzed with respect to PPIs or network proteins. In addition to tabular view or download, IID provides online network visualization. This update is available at http://ophid.utoronto.ca/iid.
Affiliation(s)
- Max Kotlyar
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Chiara Pastrello
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Zuhaib Ahmed
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Justin Chee
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Zofia Varyova
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Igor Jurisica
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada; Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, ON M5S 1A4, Canada; Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
|
14
|
Sefik E, Purcell RH, Walker EF, Bassell GJ, Mulle JG. Convergent and distributed effects of the 3q29 deletion on the human neural transcriptome. Transl Psychiatry 2021; 11:357. [PMID: 34131099 PMCID: PMC8206125 DOI: 10.1038/s41398-021-01435-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/07/2020] [Revised: 04/29/2021] [Accepted: 05/07/2021] [Indexed: 12/13/2022] Open
Abstract
The 3q29 deletion (3q29Del) confers high risk for schizophrenia and other neurodevelopmental and psychiatric disorders. However, no single gene in this interval is definitively associated with disease, prompting the hypothesis that neuropsychiatric sequelae emerge upon loss of multiple functionally-connected genes. 3q29 genes are unevenly annotated and the impact of 3q29Del on the human neural transcriptome is unknown. To systematically formulate unbiased hypotheses about molecular mechanisms linking 3q29Del to neuropsychiatric illness, we conducted a systems-level network analysis of the non-pathological adult human cortical transcriptome and generated evidence-based predictions that relate 3q29 genes to novel functions and disease associations. The 21 protein-coding genes located in the interval segregated into seven clusters of highly co-expressed genes, demonstrating both convergent and distributed effects of 3q29Del across the interrogated transcriptomic landscape. Pathway analysis of these clusters indicated involvement in nervous-system functions, including synaptic signaling and organization, as well as core cellular functions, including transcriptional regulation, posttranslational modifications, chromatin remodeling, and mitochondrial metabolism. Top network-neighbors of 3q29 genes showed significant overlap with known schizophrenia, autism, and intellectual disability-risk genes, suggesting that 3q29Del biology is relevant to idiopathic disease. Leveraging "guilt by association", we propose nine 3q29 genes, including one hub gene, as prioritized drivers of neuropsychiatric risk. 
These results provide testable hypotheses for experimental analysis on causal drivers and mechanisms of the largest known genetic risk factor for schizophrenia and highlight the study of normal function in non-pathological postmortem tissue to further our understanding of psychiatric genetics, especially for rare syndromes like 3q29Del, where access to neural tissue from carriers is unavailable or limited.
Affiliation(s)
- Esra Sefik
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA; Department of Psychology, Emory University, Atlanta, GA, USA
- Ryan H. Purcell
- Department of Cell Biology, Emory University School of Medicine, Atlanta, GA, USA; Laboratory of Translational Cell Biology, Emory University School of Medicine, Atlanta, GA, USA
- Elaine F. Walker
- Department of Psychology, Emory University, Atlanta, GA, USA
- Gary J. Bassell
- Department of Cell Biology, Emory University School of Medicine, Atlanta, GA, USA; Laboratory of Translational Cell Biology, Emory University School of Medicine, Atlanta, GA, USA
- Jennifer G. Mulle
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA; Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
|
15
|
Target identification for small-molecule discovery in the FOXO3a tumor-suppressor pathway using a biodiverse peptide library. Cell Chem Biol 2021; 28:1602-1615.e9. [PMID: 34111400 PMCID: PMC8610377 DOI: 10.1016/j.chembiol.2021.05.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/29/2020] [Revised: 03/03/2021] [Accepted: 05/14/2021] [Indexed: 12/12/2022]
Abstract
Genetic screening technologies to identify and validate macromolecular interactions (MMIs) essential for complex pathways remain an important unmet need for systems biology and therapeutics development. Here, we use a library of peptides from diverse prokaryal genomes to screen MMIs promoting the nuclear relocalization of Forkhead Box O3 (FOXO3a), a tumor suppressor more frequently inactivated by post-translational modification than mutation. A hit peptide engages the 14-3-3 family of signal regulators through a phosphorylation-dependent interaction, modulates FOXO3a-mediated transcription, and suppresses cancer cell growth. In a crystal structure, the hit peptide occupies the phosphopeptide-binding groove of 14-3-3ε in a conformation distinct from its natural peptide substrates. A biophysical screen identifies drug-like small molecules that displace the hit peptide from 14-3-3ε, providing starting points for structure-guided development. Our findings exemplify “protein interference,” an approach using evolutionarily diverse, natural peptides to rapidly identify, validate, and develop chemical probes against MMIs essential for complex cellular phenotypes. We describe protein interference, an approach to identify and validate new drug targets. A genetic screen identifies a protein interference probe inducing FOXO3a reactivation. The probe defines a druggable binding site in the 14-3-3 signal regulator family. We illustrate a workflow to parse complex cellular pathways for new drug targets.
|
16
|
Drew K, Wallingford JB, Marcotte EM. hu.MAP 2.0: integration of over 15,000 proteomic experiments builds a global compendium of human multiprotein assemblies. Mol Syst Biol 2021; 17:e10016. [PMID: 33973408 PMCID: PMC8111494 DOI: 10.15252/msb.202010016] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Received: 09/23/2020] [Revised: 04/08/2021] [Accepted: 04/09/2021] [Indexed: 12/30/2022] Open
Abstract
A general principle of biology is the self‐assembly of proteins into functional complexes. Characterizing their composition is, therefore, required for our understanding of cellular functions. Unfortunately, we lack knowledge of the comprehensive set of identities of protein complexes in human cells. To address this gap, we developed a machine learning framework to identify protein complexes in over 15,000 mass spectrometry experiments which resulted in the identification of nearly 7,000 physical assemblies. We show our resource, hu.MAP 2.0, is more accurate and comprehensive than previous state of the art high‐throughput protein complex resources and gives rise to many new hypotheses, including for 274 completely uncharacterized proteins. Further, we identify 253 promiscuous proteins that participate in multiple complexes pointing to possible moonlighting roles. We have made hu.MAP 2.0 easily searchable in a web interface (http://humap2.proteincomplexes.org/), which will be a valuable resource for researchers across a broad range of interests including systems biology, structural biology, and molecular explanations of disease.
Affiliation(s)
- Kevin Drew
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, USA
- John B Wallingford
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, USA
- Edward M Marcotte
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, USA
|
17
|
Kavran AJ, Clauset A. Denoising large-scale biological data using network filters. BMC Bioinformatics 2021; 22:157. [PMID: 33765911 PMCID: PMC7992843 DOI: 10.1186/s12859-021-04075-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/25/2020] [Accepted: 03/15/2021] [Indexed: 11/29/2022] Open
Abstract
Background: Large-scale biological data sets are often contaminated by noise, which can impede accurate inferences about underlying processes. Such measurement noise can arise from endogenous biological factors like cell cycle and life history variation, and from exogenous technical factors like sample preparation and instrument variation. Results: We describe a general method for automatically reducing noise in large-scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “filtered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network filter may be applied to an entire system, or the system may be first decomposed into distinct modules and a different filter applied to each. Applied to synthetic data with known network structure and signal, network filters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancerous tissues, network filtering prior to training increases accuracy up to 43% compared to using unfiltered data. Conclusions: Network filters are a general way to denoise biological data and can account for both correlation and anti-correlation between different measurements. Furthermore, we find that partitioning a network prior to filtering can significantly reduce errors in networks with heterogeneous data and correlation patterns, and this approach outperforms existing diffusion-based methods. Our results on proteomics data indicate the broad potential utility of network filters to applications in systems biology. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04075-x.
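The idea of combining a measurement with those of its network neighbors can be sketched as a simple mean filter. This is only one plausible smoothing filter for positively correlated neighbors. The function name, the toy network, and the values are hypothetical, and the paper's actual filters (including those for anti-correlated measurements and per-module filtering) are not reproduced here.

```python
# Minimal sketch of a smoothing "network filter": each node's value is replaced
# by the mean of its own value and its neighbors' values.
# Assumption: neighbors in the network carry positively correlated signals.

def mean_filter(values, neighbors):
    """Return a new dict of filtered values; the input is left unchanged."""
    filtered = {}
    for node, x in values.items():
        group = [x] + [values[n] for n in neighbors.get(node, [])]
        filtered[node] = sum(group) / len(group)
    return filtered

# Hypothetical noisy measurements on a three-node path a - b - c.
values = {"a": 1.0, "b": 3.0, "c": 5.0}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

print(mean_filter(values, neighbors))
```

Because every filtered value is computed from the original measurements, the result does not depend on the order in which nodes are visited.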
Affiliation(s)
- Andrew J Kavran
- Department of Biochemistry, University of Colorado, Boulder, CO, USA; BioFrontiers Institute, University of Colorado, Boulder, CO, USA
- Aaron Clauset
- BioFrontiers Institute, University of Colorado, Boulder, CO, USA; Department of Computer Science, University of Colorado, Boulder, CO, USA; Santa Fe Institute, Santa Fe, NM, USA
|
18
|
Floyd BM, Drew K, Marcotte EM. Systematic Identification of Protein Phosphorylation-Mediated Interactions. J Proteome Res 2021; 20:1359-1370. [PMID: 33476154 DOI: 10.1021/acs.jproteome.0c00750] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/30/2022]
Abstract
Protein phosphorylation is a key regulatory mechanism involved in nearly every eukaryotic cellular process. Increasingly sensitive mass spectrometry approaches have identified hundreds of thousands of phosphorylation sites, but the functions of a vast majority of these sites remain unknown, with fewer than 5% of sites currently assigned a function. To increase our understanding of functional protein phosphorylation we developed an approach (phospho-DIFFRAC) for identifying the phosphorylation-dependence of protein assemblies in a systematic manner. A combination of nonspecific protein phosphatase treatment, size-exclusion chromatography, and mass spectrometry allowed us to identify changes in protein interactions after the removal of phosphate modifications. With this approach we were able to identify 316 proteins involved in phosphorylation-sensitive interactions. We recovered known phosphorylation-dependent interactors such as the FACT complex and spliceosome, as well as identified novel interactions such as the tripeptidyl peptidase TPP2 and the supraspliceosome component ZRANB2. More generally, we find phosphorylation-dependent interactors to be strongly enriched for RNA-binding proteins, providing new insight into the role of phosphorylation in RNA binding. By searching directly for phosphorylated amino acid residues in mass spectrometry data, we identified the likely regulatory phosphosites on ZRANB2 and FACT complex subunit SSRP1. This study provides both a method and resource for obtaining a better understanding of the role of phosphorylation in native macromolecular assemblies. All mass spectrometry data are available through PRIDE (accession #PXD021422).
Affiliation(s)
- Brendan M Floyd
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
- Kevin Drew
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
- Edward M Marcotte
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
|
19
|
An Integrative Computational Approach for the Prediction of Human-Plasmodium Protein-Protein Interactions. BIOMED RESEARCH INTERNATIONAL 2021; 2020:2082540. [PMID: 33426052 PMCID: PMC7771252 DOI: 10.1155/2020/2082540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2020] [Revised: 11/08/2020] [Accepted: 12/04/2020] [Indexed: 12/27/2022]
Abstract
Host-pathogen molecular cross-talks are critical in determining the pathophysiology of a specific infection. Most of these cross-talks are mediated via protein-protein interactions between the host and the pathogen (HP-PPI). Thus, it is essential to know how pathogens interact with their hosts to understand the mechanism of infections. Malaria is a life-threatening disease caused by an obligate intracellular parasite belonging to the Plasmodium genus, of which P. falciparum is the most prevalent. Several previous studies that predicted human-Plasmodium protein-protein interactions using computational methods have demonstrated the utility, accuracy, and efficiency of these methods in identifying interacting partners, thereby complementing experimental efforts to characterize host-pathogen interaction networks. To predict putative HP-PPIs, we use an integrative computational approach based on the combination of multiple OMICS-based methods, including proteins expressed in human red blood cells (RBC) and in the Plasmodium falciparum 3D7 strain, domain-domain based PPI, similarity of gene ontology terms, homology identification by structure similarity, and machine learning prediction. We report a set of 716 protein interactions involving 302 human proteins and 130 Plasmodium proteins. This work provides a list of potential human-Plasmodium interacting proteins. These findings will contribute to a better understanding of the mechanisms underlying the molecular determinism of malaria and potentially to the identification of candidate pharmacological targets.
|
20
|
Barel G, Herwig R. NetCore: a network propagation approach using node coreness. Nucleic Acids Res 2020; 48:e98. [PMID: 32735660 PMCID: PMC7515737 DOI: 10.1093/nar/gkaa639] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/16/2020] [Revised: 06/22/2020] [Accepted: 07/21/2020] [Indexed: 02/07/2023] Open
Abstract
We present NetCore, a novel network propagation approach based on node coreness, for phenotype–genotype associations and module identification. NetCore addresses the node degree bias in PPI networks by using node coreness in the random walk with restart procedure, and achieves improved re-ranking of genes after propagation. Furthermore, NetCore implements a semi-supervised approach to identify phenotype-associated network modules, which anchors the identification of novel candidate genes at known genes associated with the phenotype. We evaluated NetCore on gene sets from 11 different GWAS traits and showed improved performance compared to the standard degree-based network propagation using cross-validation. Furthermore, we applied NetCore to identify disease genes and modules for Schizophrenia GWAS data and pan-cancer mutation data. We compared the novel approach to existing network propagation approaches and showed the benefits of using NetCore in comparison to those. We provide an easy-to-use implementation, together with a high confidence PPI network extracted from ConsensusPathDB, which can be applied to various types of genomics data in order to obtain a re-ranking of genes and functionally relevant network modules.
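A random walk with restart that weights steps by node coreness instead of node degree can be sketched as follows. This is an illustration of the general idea only: the rule that a walker moves to a neighbor with probability proportional to that neighbor's core number is an assumption made for this sketch, not NetCore's published normalization, and the function names and toy network are hypothetical.

```python
# Minimal sketch: k-core decomposition plus a coreness-weighted random walk
# with restart (RWR). Assumption: step probability is proportional to the core
# number of the destination node.

def core_numbers(adj):
    """Core number of each node, by repeatedly peeling the lowest-degree node."""
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=lambda u: degree[u])
        k = max(k, degree[v])          # core numbers never decrease while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core

def rwr(adj, seeds, weight, restart=0.5, iters=100):
    """Iterate p <- (1 - r) * W p + r * p0 with weight-proportional transitions."""
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in adj}
    p = dict(p0)
    for _ in range(iters):
        nxt = {v: restart * p0[v] for v in adj}
        for j in adj:
            total = sum(weight[i] for i in adj[j])
            if total == 0:
                continue               # isolated node: nothing to propagate
            for i in adj[j]:
                nxt[i] += (1.0 - restart) * p[j] * weight[i] / total
        p = nxt
    return p

# Hypothetical network: a triangle (a, b, c) with a pendant node d attached to c.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
core = core_numbers(adj)
scores = rwr(adj, seeds={"d"}, weight=core)
print(core)
print(sorted(scores, key=scores.get, reverse=True))
```

Replacing `weight=core` with a dictionary of node degrees would give a degree-weighted walk for comparison with the degree-based propagation the abstract contrasts against.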
Affiliation(s)
- Gal Barel
- Department of Computational Molecular Biology, Max-Planck-Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
- Ralf Herwig
- Department of Computational Molecular Biology, Max-Planck-Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
|
21
|
Liu Z, Miller D, Li F, Liu X, Levy SF. A large accessory protein interactome is rewired across environments. eLife 2020; 9:e62365. [PMID: 32924934 PMCID: PMC7577743 DOI: 10.7554/elife.62365] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/21/2020] [Accepted: 09/04/2020] [Indexed: 12/30/2022] Open
Abstract
To characterize how protein-protein interaction (PPI) networks change, we quantified the relative PPI abundance of 1.6 million protein pairs in the yeast Saccharomyces cerevisiae across nine growth conditions, with replication, for a total of 44 million measurements. Our multi-condition screen identified 13,764 pairwise PPIs, a threefold increase over PPIs identified in one condition. A few 'immutable' PPIs are present across all conditions, while most 'mutable' PPIs are rarely observed. Immutable PPIs aggregate into highly connected 'core' network modules, with most network remodeling occurring within a loosely connected 'accessory' module. Mutable PPIs are less likely to co-express, co-localize, and be explained by simple mass action kinetics, and more likely to contain proteins with intrinsically disordered regions, implying that environment-dependent association and binding is critical to cellular adaptation. Our results show that protein interactomes are larger than previously thought and contain highly dynamic regions that reorganize to drive or respond to cellular changes.
Affiliation(s)
- Zhimin Liu
- Department of Biochemistry, Stony Brook University, Stony Brook, United States
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, United States
- Darach Miller
- Joint Initiative for Metrology in Biology, Stanford, United States
- Department of Genetics, Stanford University, Stanford, United States
- Fangfei Li
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, United States
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, United States
- Xianan Liu
- Department of Biochemistry, Stony Brook University, Stony Brook, United States
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, United States
- Sasha F Levy
- Department of Biochemistry, Stony Brook University, Stony Brook, United States
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, United States
- Joint Initiative for Metrology in Biology, Stanford, United States
- Department of Genetics, Stanford University, Stanford, United States
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, United States
- SLAC National Accelerator Laboratory, Menlo Park, United States
|
22
|
Athira K, Gopakumar G. An integrated method for identifying essential proteins from multiplex network model of protein-protein interactions. J Bioinform Comput Biol 2020; 18:2050020. [PMID: 32795133 DOI: 10.1142/s0219720020500201] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Cell survival requires the presence of essential proteins. Detection of essential proteins is relevant not only because of the critical biological functions they perform but also because of their role as drug targets against pathogens. Several computational techniques are in place to identify essential proteins based on protein-protein interaction (PPI) networks. Essential protein detection using only physical interaction data of proteins is challenging due to its inherent uncertainty. Hence, in this work, we propose a multiplex network-based framework that incorporates multiple types of protein interaction data from their physical, coexpression and phylogenetic profiles. An extended version of eigenvector centrality, termed multiplex eigenvector centrality (MEC), is used to identify essential proteins from this network. The methodology integrates the score obtained from the multiplex analysis with subcellular localization and Gene Ontology information and is implemented using Saccharomyces cerevisiae datasets. The proposed method outperformed many recent essential protein prediction techniques in the literature.
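Eigenvector centrality over several interaction layers can be sketched with a simple simplification: sum the layers into one weighted network and run power iteration. This aggregation is an assumption made for illustration and is not the MEC formulation of the paper, whose details the abstract does not give. The function name, layer contents, and protein labels are hypothetical.

```python
# Minimal sketch: eigenvector centrality on an aggregated multiplex network.
# Assumption: layers are combined by summing edge weights (one unit per layer).

def aggregated_eigenvector_centrality(layers, nodes, iters=200):
    """Power iteration on (A + I), where A sums the layers' adjacency matrices."""
    w = {u: {} for u in nodes}
    for layer in layers:
        for u, v in layer:                      # undirected edges
            w[u][v] = w[u].get(v, 0.0) + 1.0
            w[v][u] = w[v].get(u, 0.0) + 1.0
    x = {u: 1.0 for u in nodes}
    for _ in range(iters):
        # Adding x[u] (the identity shift) keeps the same leading eigenvector
        # while avoiding oscillation on bipartite networks.
        nxt = {u: x[u] + sum(wt * x[v] for v, wt in w[u].items()) for u in nodes}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {u: val / norm for u, val in nxt.items()}
    return x

# Hypothetical layers: physical, coexpression, and phylogenetic-profile edges.
nodes = ["P1", "P2", "P3", "P4"]
physical = [("P1", "P2"), ("P1", "P3")]
coexpression = [("P1", "P2"), ("P2", "P3")]
phylogenetic = [("P1", "P4")]

centrality = aggregated_eigenvector_centrality(
    [physical, coexpression, phylogenetic], nodes
)
for protein in sorted(centrality, key=centrality.get, reverse=True):
    print(protein, round(centrality[protein], 3))
```

An edge supported by more than one layer, such as P1-P2 here, receives a larger weight, so proteins with consistent evidence across layers rank higher than those supported by a single data type.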
Affiliation(s)
- K Athira
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Kozhikkode, Kerala 673601, India
- G Gopakumar
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Kozhikkode, Kerala 673601, India
|
23
|
Affiliation(s)
- Jinyuan Chang
- School of Statistics, Southwestern University of Finance and Economics, Chengdu, China
- Eric D. Kolaczyk
- Department of Mathematics and Statistics, Boston University, Boston, MA
- Qiwei Yao
- Department of Statistics, London School of Economics and Political Science, London, UK
|
24
|
Mellors T, Withers JB, Ameli A, Jones A, Wang M, Zhang L, Sanchez HN, Santolini M, Do Valle I, Sebek M, Cheng F, Pappas DA, Kremer JM, Curtis JR, Johnson KJ, Saleh A, Ghiassian SD, Akmaev VR. Clinical Validation of a Blood-Based Predictive Test for Stratification of Response to Tumor Necrosis Factor Inhibitor Therapies in Rheumatoid Arthritis Patients. NETWORK AND SYSTEMS MEDICINE 2020. [DOI: 10.1089/nsm.2020.0007] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 12/23/2022] Open
Affiliation(s)
- Asher Ameli
- Scipher Medicine, Waltham, Massachusetts, USA
- Alex Jones
- Scipher Medicine, Waltham, Massachusetts, USA
- Lixia Zhang
- Scipher Medicine, Waltham, Massachusetts, USA
- Marc Santolini
- Center for Research and Interdisciplinarity (CRI), University Paris Descartes, Paris, France
- Italo Do Valle
- Center for Complex Network Research, Department of Physics, Northeastern University, Boston, Massachusetts, USA
- Michael Sebek
- Center for Complex Network Research, Department of Physics, Northeastern University, Boston, Massachusetts, USA
- Feixiong Cheng
- Center for Complex Network Research, Department of Physics, Northeastern University, Boston, Massachusetts, USA
- Dimitrios A. Pappas
- Division of Rheumatology, College of Physicians and Surgeons, Columbia University, New York, New York, USA
- CORRONA, LCC, Waltham, Massachusetts, USA
- Joel M. Kremer
- CORRONA, LCC, Waltham, Massachusetts, USA
- Albany Medical College, The Center for Rheumatology, Albany, New York, USA
- Jeffery R. Curtis
- Department of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Alif Saleh
- Scipher Medicine, Waltham, Massachusetts, USA
|
25
|
Poverennaya EV, Kiseleva OI, Ivanov AS, Ponomarenko EA. Methods of Computational Interactomics for Investigating Interactions of Human Proteoforms. BIOCHEMISTRY (MOSCOW) 2020; 85:68-79. [PMID: 32079518 DOI: 10.1134/s000629792001006x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023]
Abstract
The human genome contains ca. 20,000 protein-coding genes that could be translated into millions of unique protein species (proteoforms). Proteoforms coded by a single gene often have different functions, which implies different protein partners. By interacting with each other, proteoforms create a network reflecting the dynamics of cellular processes in an organism. Perturbations of protein-protein interactions change the network topology, which often triggers pathological processes. Studying proteoforms is a relatively new research area in proteomics, which is why there are comparatively few experimental studies on the interaction of proteoforms. Bioinformatics tools can facilitate such studies by providing valuable complementary information to the experimental data and, in particular, by expanding the possibilities for studying proteoform interactions.
Affiliation(s)
- O I Kiseleva
- Institute of Biomedical Chemistry, Moscow, 119121, Russia
- A S Ivanov
- Institute of Biomedical Chemistry, Moscow, 119121, Russia
|
26
|
Informed Use of Protein-Protein Interaction Data: A Focus on the Integrated Interactions Database (IID). Methods Mol Biol 2020; 2074:125-134. [PMID: 31583635 DOI: 10.1007/978-1-4939-9873-9_10] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/11/2023]
Abstract
Protein-protein interaction (PPI) data is fundamental in molecular biology, and numerous online databases provide access to this data. However, the huge quantity, complexity, and variety of PPI data can be overwhelming, and rather than helping to address research problems, the data may add to their complexity and reduce interpretability. This protocol focuses on solutions for some of the main challenges of using PPI data, including accessing data, ensuring relevance by integrating useful annotations, and improving interpretability. While the issues are generic, we highlight how to perform such operations using the Integrated Interactions Database (IID; http://ophid.utoronto.ca/iid).
|
27
|
Alanis-Lobato G, Schaefer MH. Generation and Interpretation of Context-Specific Human Protein-Protein Interaction Networks with HIPPIE. Methods Mol Biol 2020; 2074:135-144. [PMID: 31583636 DOI: 10.1007/978-1-4939-9873-9_11] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
High-throughput techniques for the detection of protein-protein interactions (PPIs) have enabled a systems approach for the study of the living cell. However, the increasing amount of protein interaction data, the varying quality of these measurements, and the lack of context information make it difficult to construct meaningful and reliable protein networks. The Human Integrated Protein-Protein Interaction rEference (HIPPIE) is a web tool that integrates and annotates experimentally supported human PPIs from a heterogeneous set of data sources. In HIPPIE, one can query for the interactors of one or more proteins and generate high-quality and context-specific networks. This chapter highlights HIPPIE's most important features and exemplifies its functionality through a proposed use case.
Affiliation(s)
- Martin H Schaefer
- Department of Experimental Oncology, European Institute of Oncology, Milan, Italy
|
28
|
Li Q, Yang Z, Zhao Z, Luo L, Li Z, Wang L, Zhang Y, Lin H, Wang J, Zhang Y. HMNPPID-human malignant neoplasm protein-protein interaction database. Hum Genomics 2019; 13:44. [PMID: 31639057 PMCID: PMC6805303 DOI: 10.1186/s40246-019-0223-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Abstract
BACKGROUND Protein-protein interaction (PPI) information extraction from biomedical literature helps unveil the molecular mechanisms of biological processes. In particular, the PPIs associated with human malignant neoplasms can unveil the biology behind these neoplasms. However, no such PPI database is currently available. RESULTS In this work, we construct HMNPPID, a database of protein-protein interactions associated with 171 kinds of human malignant neoplasms. In addition, a visualization program, named VisualPPI, is provided to facilitate the analysis of the PPI network for a specific neoplasm. CONCLUSIONS HMNPPID can hopefully become an important resource for research on PPIs of human malignant neoplasms, since it provides readily available data for healthcare professionals. They no longer need to dig through a large amount of biomedical literature, which may accelerate research on the PPIs of malignant neoplasms.
Affiliation(s)
- Qingqing Li
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Zhihao Yang
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Zhehuan Zhao
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Ling Luo
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Zhiheng Li
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Lei Wang
- Beijing Institute of Health Administration and Medical Information, Beijing, 100850, China
- Yin Zhang
- Beijing Institute of Health Administration and Medical Information, Beijing, 100850, China
- Hongfei Lin
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Jian Wang
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Yijia Zhang
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
|
29
|
Wu Z, Liao Q, Liu B. A comprehensive review and evaluation of computational methods for identifying protein complexes from protein–protein interaction networks. Brief Bioinform 2019; 21:1531-1548. [DOI: 10.1093/bib/bbz085] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 04/08/2019] [Revised: 06/17/2019] [Accepted: 06/17/2019] [Indexed: 02/04/2023]
Abstract
Protein complexes are the fundamental units for many cellular processes. Identifying protein complexes accurately is critical for understanding the functions and organization of cells. With the growth of genome-scale protein–protein interaction (PPI) data for different species, various computational methods focus on identifying protein complexes from PPI networks. In this article, we give a comprehensive and updated review of the state-of-the-art computational methods in the field of protein complex identification, focusing especially on newly developed approaches. The computational methods are organized into three categories: cluster-quality-based methods, node-affinity-based methods and ensemble clustering methods. Furthermore, the advantages and disadvantages of different methods are discussed, and the performance of 17 state-of-the-art methods is evaluated on two widely used benchmark data sets. Finally, the bottleneck problems and their potential solutions in this important field are discussed.
Affiliation(s)
- Zhourun Wu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Qing Liao
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
|
30
|
Steiner PJ, Bedewitz MA, Medina‐Cucurella AV, Cutler SR, Whitehead TA. A yeast surface display platform for plant hormone receptors: Toward directed evolution of new biosensors. AIChE J 2019. [DOI: 10.1002/aic.16767] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Affiliation(s)
- Paul J. Steiner
- Department of Chemical and Biological Engineering, University of Colorado, Boulder, Colorado
- Matthew A. Bedewitz
- Department of Chemical and Biological Engineering, University of Colorado, Boulder, Colorado
- Sean R. Cutler
- Department of Botany and Plant Sciences, University of California, Riverside, California
- Timothy A. Whitehead
- Department of Chemical and Biological Engineering, University of Colorado, Boulder, Colorado
- Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, Michigan
|
31
|
Lugo-Martinez J, Bar-Joseph Z, Dengjel J, Murphy RF. Integration of Heterogeneous Experimental Data Improves Global Map of Human Protein Complexes. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2019; 2019:144-153. [PMID: 32457940 DOI: 10.1145/3307339.3342150] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/12/2022]
Abstract
Protein complexes play a significant role in the core functionality of cells. These complexes are typically identified by detecting densely connected subgraphs in protein-protein interaction (PPI) networks. Recently, multiple large-scale mass spectrometry-based experiments have significantly increased the availability of PPI data in order to further expand the set of known complexes. However, high-throughput experimental data generally are incomplete, show limited agreement between experiments, and show frequent false positive interactions. There is a need for computational approaches that can address these limitations in order to improve the coverage and accuracy of human protein complexes. Here, we present a new method that integrates data from multiple heterogeneous experiments and sources in order to increase the reliability and coverage of predicted protein complexes. We first fused the heterogeneous data into a feature matrix and trained classifiers to score pairwise protein interactions. We next used graph based methods to combine pairwise interactions into predicted protein complexes. Our approach improves the accuracy and coverage of protein pairwise interactions, accurately identifies known complexes, and suggests both novel additions to known complexes and entirely new complexes. Our results suggest that integration of heterogeneous experimental data helps improve the reliability and coverage of diverse high-throughput mass-spectrometry experiments, leading to an improved global map of human protein complexes.
Affiliation(s)
- Jose Lugo-Martinez
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA
- Ziv Bar-Joseph
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA
- Jörn Dengjel
- Department of Biology, Université de Fribourg, 1700 Fribourg, Switzerland
- Robert F Murphy
- Computational Biology Department, Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA
|
32
|
Bozhilova LV, Whitmore AV, Wray J, Reinert G, Deane CM. Measuring rank robustness in scored protein interaction networks. BMC Bioinformatics 2019; 20:446. [PMID: 31462221 PMCID: PMC6714100 DOI: 10.1186/s12859-019-3036-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 01/22/2019] [Accepted: 08/19/2019] [Indexed: 12/21/2022]
Abstract
BACKGROUND Protein interaction databases often provide confidence scores for each recorded interaction based on the available experimental evidence. Protein interaction networks (PINs) are then built by thresholding on these scores, so that only interactions of sufficiently high quality are included. These networks are used to identify biologically relevant motifs or nodes using metrics such as degree or betweenness centrality. This type of analysis can be sensitive to the choice of threshold. If a node metric is to be useful for extracting biological signal, it should induce similar node rankings across PINs obtained at different reasonable confidence score thresholds. RESULTS We propose three measures (rank continuity, identifiability, and instability) to evaluate how robust a node metric is to changes in the score threshold. We apply our measures to twenty-five metrics and identify four as the most robust: the number of edges in the step-1 ego network, as well as the leave-one-out differences in average redundancy, average number of edges in the step-1 ego network, and natural connectivity. Our measures show good agreement across PINs from different species and data sources. Analysis of synthetically generated scored networks shows that robustness results are context-specific, and depend both on network topology and on how scores are placed across network edges. CONCLUSION Due to the uncertainty associated with protein interaction detection, and therefore network structure, for PIN analysis to be reproducible, it should yield similar results across different confidence score thresholds. We demonstrate that while certain node metrics are robust with respect to threshold choice, this is not always the case. Promisingly, our results suggest that there are some metrics that are robust across networks constructed from different databases, and different scoring procedures.
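The thresholding idea in this abstract can be illustrated with a minimal sketch. It builds a network from scored edges at two thresholds, computes node degree in each, and compares the two degree assignments with a plain Kendall tau-a rank correlation. This is not one of the paper's three measures; the correlation is a simpler stand-in, and the function names and toy scores are hypothetical.

```python
from itertools import combinations


def degree_at_threshold(scored_edges, threshold):
    """Degree of every node when only edges with score >= threshold are kept.

    Nodes that lose all their edges are still reported, with degree 0,
    so that rankings at different thresholds cover the same node set.
    """
    degree = {}
    for (u, v), score in scored_edges.items():
        degree.setdefault(u, 0)
        degree.setdefault(v, 0)
        if score >= threshold:
            degree[u] += 1
            degree[v] += 1
    return degree


def kendall_tau_a(x, y):
    """Kendall tau-a between two metric dicts defined on the same nodes.

    Tied pairs count as neither concordant nor discordant (assumption:
    tau-a is used only for simplicity, not because the paper uses it).
    """
    nodes = sorted(x)
    concordant = discordant = 0
    for a, b in combinations(nodes, 2):
        sign = (x[a] - x[b]) * (y[a] - y[b])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(nodes) * (len(nodes) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy scored network (hypothetical confidence scores).
scored_edges = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("A", "D"): 0.5,
    ("B", "C"): 0.7,
    ("C", "D"): 0.4,
}
low = degree_at_threshold(scored_edges, 0.4)
high = degree_at_threshold(scored_edges, 0.7)
tau = kendall_tau_a(low, high)
```

A value of `tau` near 1 would suggest that degree ranks nodes similarly at both thresholds; lower values indicate threshold sensitivity of the kind the abstract describes.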
Affiliation(s)
- Lyuba V Bozhilova
- Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK
- Alan V Whitmore
- e-Therapeutics Plc, 17 Fenlock Rd, Long Hanborough, OX29 8LN, UK
- Jonny Wray
- e-Therapeutics Plc, 17 Fenlock Rd, Long Hanborough, OX29 8LN, UK
- Gesine Reinert
- Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK
- Charlotte M Deane
- Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK
|
33
|
Schoeters F, Van Dijck P. Protein-Protein Interactions in Candida albicans. Front Microbiol 2019; 10:1792. [PMID: 31440220 PMCID: PMC6693483 DOI: 10.3389/fmicb.2019.01792] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/24/2019] [Accepted: 07/19/2019] [Indexed: 12/27/2022]
Abstract
Despite being one of the most important human fungal pathogens, Candida albicans has not been studied extensively at the level of protein-protein interactions (PPIs) and data on PPIs are not readily available in online databases. In January 2018, the database called "Biological General Repository for Interaction Datasets (BioGRID)" that contains the most PPIs for C. albicans, only documented 188 physical or direct PPIs (release 3.4.156) while several more can be found in the literature. Other databases such as the String database, the Molecular INTeraction Database (MINT), and the Database for Interacting Proteins (DIP) database contain even fewer interactions or do not even include C. albicans as a searchable term. Because of the non-canonical codon usage of C. albicans where CUG is translated as serine rather than leucine, it is often problematic to use the yeast two-hybrid system in Saccharomyces cerevisiae to study C. albicans PPIs. However, studying PPIs is crucial to gain a thorough understanding of the function of proteins, biological processes and pathways. PPIs can also be potential drug targets. To aid in creating PPI networks and updating the BioGRID, we performed an exhaustive literature search in order to provide, in an accessible format, a more extensive list of known PPIs in C. albicans.
Affiliation(s)
- Floris Schoeters
- VIB-KU Leuven Center for Microbiology, Leuven, Belgium
- Laboratory of Molecular Cell Biology, Institute of Botany and Microbiology, KU Leuven, Leuven, Belgium
- Patrick Van Dijck
- VIB-KU Leuven Center for Microbiology, Leuven, Belgium
- Laboratory of Molecular Cell Biology, Institute of Botany and Microbiology, KU Leuven, Leuven, Belgium
|
34
|
Wang R, Wang C, Sun L, Liu G. A seed-extended algorithm for detecting protein complexes based on density and modularity with topological structure and GO annotations. BMC Genomics 2019; 20:637. [PMID: 31390979 PMCID: PMC6686515 DOI: 10.1186/s12864-019-5956-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/27/2019] [Accepted: 07/04/2019] [Indexed: 12/28/2022]
Abstract
Background The detection of protein complexes is of great significance for researching mechanisms underlying complex diseases and developing new drugs. Thus, various computational algorithms have been proposed for protein complex detection. However, most of these methods are based on only topological information and are sensitive to the reliability of interactions. As a result, their performance is affected by false-positive interactions in PPINs. Moreover, these methods consider only density and modularity and ignore protein complexes with various densities and modularities. Results To address these challenges, we propose an algorithm to exploit protein complexes in PPINs by a Seed-Extended algorithm based on Density and Modularity with Topological structure and GO annotations, named SE-DMTG to improve the accuracy of protein complex detection. First, we use common neighbors and GO annotations to construct a weighted PPIN. Second, we define a new seed selection strategy to select seed nodes. Third, we design a new fitness function to detect protein complexes with various densities and modularities. We compare the performance of SE-DMTG with that of thirteen state-of-the-art algorithms on several real datasets. Conclusion The experimental results show that SE-DMTG not only outperforms some classical algorithms in yeast PPINs in terms of the F-measure and Jaccard but also achieves an ideal performance in terms of functional enrichment. Furthermore, we apply SE-DMTG to PPINs of several other species and demonstrate the outstanding accuracy and matching ratio in detecting protein complexes compared with other algorithms.
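The first step named in this abstract, weighting a PPIN with common neighbors and GO annotations, might look roughly like the sketch below. The abstract does not give SE-DMTG's actual weighting formula, so the choices here are assumptions: topological similarity is the Jaccard index of the two proteins' closed neighborhoods, functional similarity is the Jaccard index of their annotation sets, and the two are averaged. All names and the toy annotation labels are hypothetical.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) interactions."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_edges(edges, annotations):
    """Assign each interaction a weight in [0, 1].

    Assumed scheme (not taken from the paper): the mean of
    (i) Jaccard similarity of closed neighborhoods, which rewards
        shared neighbors, and
    (ii) Jaccard similarity of the proteins' annotation sets.
    """
    adj = build_adjacency(edges)
    weights = {}
    for u, v in edges:
        topo = jaccard(adj[u] | {u}, adj[v] | {v})
        func = jaccard(annotations.get(u, set()), annotations.get(v, set()))
        weights[(u, v)] = 0.5 * (topo + func)
    return weights


# Toy network with placeholder annotation labels.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
annotations = {
    "A": {"term1", "term2"},
    "B": {"term1"},
    "C": {"term3"},
    "D": {"term3"},
}
weights = weight_edges(edges, annotations)
```

Edges supported by both shared neighbors and shared annotations receive higher weights, which is one way to down-weight likely false-positive interactions before seed selection.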
Affiliation(s)
- Rongquan Wang
- College of Computer Science and Technology, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
- Caixia Wang
- School of International Economics, China Foreign Affairs University, 24 Zhanlanguan Road, Xicheng District, Beijing, 100037, China
- Liyan Sun
- College of Computer Science and Technology, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
- Guixia Liu
- College of Computer Science and Technology, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
|
35
|
Bag AK, Mandloi S, Jarmalavicius S, Mondal S, Kumar K, Mandal C, Walden P, Chakrabarti S, Mandal C. Connecting signaling and metabolic pathways in EGF receptor-mediated oncogenesis of glioblastoma. PLoS Comput Biol 2019; 15:e1007090. [PMID: 31386654 PMCID: PMC6684045 DOI: 10.1371/journal.pcbi.1007090] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/27/2018] [Accepted: 05/13/2019] [Indexed: 12/21/2022]
Abstract
As malignant transformation requires synchronization of growth-driving signaling (S) and metabolic (M) pathways, defining cancer-specific S-M interconnected networks (SMINs) could lead to better understanding of oncogenic processes. In a systems-biology approach, we developed a mathematical model for SMINs in mutated EGF receptor (EGFRvIII) compared to wild-type EGF receptor (EGFRwt) expressing glioblastoma multiforme (GBM). Starting with experimentally validated human protein-protein interactome data for S-M pathways, and incorporating proteomic data for EGFRvIII and EGFRwt GBM cells and patient transcriptomic data, we designed a dynamic model for EGFR-driven GBM-specific information flow. Key nodes and paths identified by in silico perturbation were validated experimentally when inhibition of signaling pathway proteins altered expression of metabolic proteins as predicted by the model. This demonstrated capacity of the model to identify unknown connections between signaling and metabolic pathways, explain the robustness of oncogenic SMINs, predict drug escape, and assist identification of drug targets and the development of combination therapies. Complex and highly dynamic interconnected networks allow cancer to take different routes and circumvent chemotherapy. Therefore, understanding these context-specific networks and their dynamics of molecular interactions driven by different oncogenic signaling and metabolic pathways is very much needed to predict drug targets and the effect of therapeutics. We incorporated high-throughput transcriptome and proteome data into mathematical models to deduce properties of cancer cells through systems biology approach. Here we report the development, testing and validation of an integrated systems biology model of information flow between signaling and metabolic pathways to understand the regulation of the interconnection between them in cancer. 
Our model efficiently identified unique connections and key nodes important in signaling-metabolic information flow. We predicted some potential novel targets before performing actual drug tests. We have successfully applied this model to identify the interconnections altered in the constitutive signaling of the mutated EGFR by comparing EGF-dependent and wild-type EGFR signaling in glioblastoma multiforme.
Affiliation(s)
- Arup K. Bag
- Cancer Biology and Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata, India
- Sapan Mandloi
- Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, Kolkata, India
- Saulius Jarmalavicius
- Department of Dermatology, Venerology and Allergology, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Susmita Mondal
- Cancer Biology and Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata, India
- Krishna Kumar
- Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, Kolkata, India
- Chhabinath Mandal
- National Institute of Pharmaceutical Education and Research, Kolkata, India
- Peter Walden
- Department of Dermatology, Venerology and Allergology, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, Kolkata, India
- Chitra Mandal
- Cancer Biology and Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata, India
|
36
|
Sonawane AR, Weiss ST, Glass K, Sharma A. Network Medicine in the Age of Biomedical Big Data. Front Genet 2019; 10:294. [PMID: 31031797 PMCID: PMC6470635 DOI: 10.3389/fgene.2019.00294] [Citation(s) in RCA: 110] [Impact Index Per Article: 22.0] [Received: 12/26/2018] [Accepted: 03/19/2019] [Indexed: 12/13/2022]
Abstract
Network medicine is an emerging area of research dealing with molecular and genetic interactions, network biomarkers of disease, and therapeutic target discovery. Large-scale biomedical data generation offers a unique opportunity to assess the effect and impact of cellular heterogeneity and environmental perturbations on the observed phenotype. Marrying the two, network medicine with biomedical data provides a framework to build meaningful models and extract impactful results at a network level. In this review, we survey existing network types and biomedical data sources. More importantly, we delve into ways in which the network medicine approach, aided by phenotype-specific biomedical data, can be gainfully applied. We provide three paradigms, mainly dealing with three major biological network archetypes: protein-protein interaction, expression-based, and gene regulatory networks. For each of these paradigms, we discuss a broad overview of philosophies under which various network methods work. We also provide a few examples in each paradigm as a test case of its successful application. Finally, we delineate several opportunities and challenges in the field of network medicine. We hope this review provides a lexicon for researchers from biological sciences and network theory to come on the same page to work on research areas that require interdisciplinary expertise. Taken together, the understanding gained from combining biomedical data with networks can be useful for characterizing disease etiologies and identifying therapeutic targets, which, in turn, will lead to better preventive medicine with translational impact on personalized healthcare.
Affiliation(s)
- Abhijeet R. Sonawane
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
- Scott T. Weiss
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
- Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
- Amitabh Sharma
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
- Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women’s Hospital, Boston, MA, United States
|
37
|
Kovács IA, Luck K, Spirohn K, Wang Y, Pollis C, Schlabach S, Bian W, Kim DK, Kishore N, Hao T, Calderwood MA, Vidal M, Barabási AL. Network-based prediction of protein interactions. Nat Commun 2019; 10:1240. [PMID: 30886144 PMCID: PMC6423278 DOI: 10.1038/s41467-019-09177-y] [Citation(s) in RCA: 161] [Impact Index Per Article: 32.2] [Received: 09/27/2018] [Accepted: 02/22/2019] [Indexed: 12/15/2022]
Abstract
Despite exceptional experimental efforts to map out the human interactome, the continued data incompleteness limits our ability to understand the molecular roots of human disease. Computational tools offer a promising alternative, helping identify biologically significant, yet unmapped protein-protein interactions (PPIs). While link prediction methods connect proteins on the basis of biological or network-based similarity, interacting proteins are not necessarily similar and similar proteins do not necessarily interact. Here, we offer structural and evolutionary evidence that proteins interact not if they are similar to each other, but if one of them is similar to the other's partners. This approach, which mathematically relies on network paths of length three (L3), significantly outperforms all existing link prediction methods. Given its high accuracy, we show that L3 can offer mechanistic insights into disease mechanisms and can complement future experimental efforts to complete the human interactome.
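The idea of scoring candidate interactions by paths of length three can be sketched as follows. For every pair of proteins that is not already connected, the sketch sums over all paths X-U-V-Y and divides each path's contribution by the square root of the product of the degrees of the two intermediate nodes. The abstract only states that the method relies on length-three paths; the degree normalization used here is an assumption intended to damp hub effects and should be checked against the paper. The function name and toy network are hypothetical.

```python
from math import sqrt


def l3_scores(edges):
    """Score unconnected protein pairs by degree-normalized paths of length 3.

    score(X, Y) = sum over paths X-U-V-Y of 1 / sqrt(k_U * k_V),
    where k is node degree (normalization is an assumption, see lead-in).
    Returns a dict keyed by alphabetically sorted pairs.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    scores = {}
    for x in adj:
        for u in adj[x]:
            for v in adj[u]:
                for y in adj[v]:
                    # Only score pairs that are distinct and not yet linked.
                    if y == x or y in adj[x]:
                        continue
                    pair = tuple(sorted((x, y)))
                    contribution = 1.0 / sqrt(len(adj[u]) * len(adj[v]))
                    scores[pair] = scores.get(pair, 0.0) + contribution

    # Every path is visited once from each end, so halve the totals.
    return {pair: total / 2 for pair, total in scores.items()}


# Toy chain A-B-C-D: the only length-3 path joins A and D.
predictions = l3_scores([("A", "B"), ("B", "C"), ("C", "D")])
```

Sorting `predictions` by score in descending order gives a ranked list of candidate interactions; note that A and D share no neighbors, so a common-neighbor score would miss this pair entirely.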
Affiliation(s)
- István A Kovács
- Network Science Institute and Department of Physics, Northeastern University, Boston, MA, 02115, USA
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Wigner Research Centre for Physics, Institute for Solid State Physics and Optics, H-1525, Budapest, P.O.Box 49, Hungary
- Katja Luck
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Kerstin Spirohn
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Yang Wang
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Carl Pollis
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Sadie Schlabach
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Wenting Bian
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Dae-Kyum Kim
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Donnelly Centre, Toronto, Ontario, Canada, Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada, Department of Computer Science, University of Toronto, Toronto, Ontario, Canada, Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Nishka Kishore
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Donnelly Centre, Toronto, Ontario, Canada, Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada, Department of Computer Science, University of Toronto, Toronto, Ontario, Canada, Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Tong Hao
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Michael A Calderwood
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, 02115, USA
- Albert-László Barabási
- Network Science Institute and Department of Physics, Northeastern University, Boston, MA, 02115, USA
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Division of Network Medicine and Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Network and Data Science, Central European University, Budapest, H-1051, Hungary
|
38
|
Abstract
Phenotype robustness to environmental fluctuations is a common biological phenomenon. Although most phenotypes involve multiple proteins that interact with each other, the basic principles of how such interactome networks respond to environmental unpredictability and change during evolution are largely unknown. Here we study interactomes of 1,840 species across the tree of life involving a total of 8,762,166 protein-protein interactions. Our study focuses on the resilience of interactomes to network failures and finds that interactomes become more resilient during evolution, meaning that interactomes become more robust to network failures over time. In bacteria, we find that a more resilient interactome is in turn associated with the greater ability of the organism to survive in a more complex, variable, and competitive environment. We find that at the protein family level proteins exhibit a coordinated rewiring of interactions over time and that a resilient interactome arises through gradual change of the network topology. Our findings have implications for understanding molecular network structure in the context of both evolution and environment.
|
39
|
Ashtiani M, Nickchi P, Jahangiri-Tazehkand S, Safari A, Mirzaie M, Jafari M. IMMAN: an R/Bioconductor package for Interolog protein network reconstruction, mapping and mining analysis. BMC Bioinformatics 2019; 20:73. [PMID: 30755155 PMCID: PMC6373071 DOI: 10.1186/s12859-019-2659-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/06/2018] [Accepted: 01/28/2019] [Indexed: 12/15/2022]
Abstract
Background: Reconstruction of protein-protein interaction networks (PPINs) has been riddled with controversy for decades. In particular, false-negative and false-positive interactions complicate this process further. In addition, the lack of a standard PPIN limits comparison studies and leads to incompatible outcomes. An evolution-based concept, the interolog, which refers to interacting orthologous protein sets, paves the way toward an optimal benchmark. Results: Here, we provide an R package, IMMAN, as a tool for reconstructing an Interolog Protein Network (IPN) by integrating several protein-protein interaction networks (PPINs). Users can unify different PPINs to mine conserved common networks among species. IMMAN is designed to retrieve IPNs with different degrees of conservation to support prediction of protein functions according to their networks. Conclusions: An IPN consists of evolutionarily conserved nodes and their related edges with low false-positive rates, and can be considered a gold standard network for biological network analysis relative to the PPINs from which it is derived. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2659-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Minoo Ashtiani
- School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
| | - Payman Nickchi
- School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
| | - Soheil Jahangiri-Tazehkand
- School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Department of Computer Science, Shahid Beheshti University, Tehran, Iran
| | - Abdollah Safari
- Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada.
| | - Mehdi Mirzaie
- Department of Applied Mathematics, Faculty of Mathematical Sciences, Tarbiat Modares University, Tehran, Iran.
| | - Mohieddin Jafari
- School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
| |
|
40
|
Yu L, Yao S, Gao L, Zha Y. Conserved Disease Modules Extracted From Multilayer Heterogeneous Disease and Gene Networks for Understanding Disease Mechanisms and Predicting Disease Treatments. Front Genet 2019; 9:745. [PMID: 30713550 PMCID: PMC6346701 DOI: 10.3389/fgene.2018.00745] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 11/07/2018] [Accepted: 12/27/2018] [Indexed: 12/29/2022] Open
Abstract
Disease relationship studies are important for understanding the pathogenesis of complex diseases and for diagnosis, prognosis, and drug development. Traditional approaches consider one type of disease data or aggregate multiple types of disease data into a single network, which loses important temporal or context-related information and may distort the actual organization. Therefore, it is necessary to apply a multilayer network model that considers multiple types of relationships between diseases and the important interplay between different relationships. Further, modules extracted from multilayer networks are smaller and overlap more, which better captures the actual organization. Here, we constructed a weighted four-layer disease-disease similarity network to characterize the associations between diseases at different levels. Then, a tensor-based computational framework was used to extract Conserved Disease Modules (CDMs) from the four-layer disease network. After filtering, nine significant CDMs were retained. A statistical significance test confirmed the significance of the nine CDMs. Compared with modules obtained from the four single-layer networks, CDMs are smaller, better represent the actual relationships, and contain potential disease-disease relationships. KEGG pathway enrichment analysis and literature mining further confirmed that these CDMs are highly reliable. Furthermore, the CDMs can be applied to predict potential drugs for diseases. Molecular docking techniques were used to provide direct evidence for drugs to treat related diseases. Taking Rheumatoid Arthritis (RA) as a case, we found three potential drugs: Carvedilol, Metoprolol, and Ramipril. Many studies have pointed out that Carvedilol and Ramipril have an effect on RA.
Overall, the CDMs extracted from multilayer networks provide an understanding of disease mechanisms from the perspective of multilayer networks and also provide an effective way to predict potential drugs for a disease based on its neighbors in the same CDM.
Affiliation(s)
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Shunyu Yao
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Yunhong Zha
- Department of Neurology, Institute of Neural Regeneration and Repair, Three Gorges University College of Medicine, The First Hospital of Yichang, Yichang, China
| |
|
41
|
Kotlyar M, Pastrello C, Malik Z, Jurisica I. IID 2018 update: context-specific physical protein-protein interactions in human, model organisms and domesticated species. Nucleic Acids Res 2019; 47:D581-D589. [PMID: 30407591 PMCID: PMC6323934 DOI: 10.1093/nar/gky1037] [Citation(s) in RCA: 131] [Impact Index Per Article: 26.2] [Received: 09/15/2018] [Revised: 10/15/2018] [Accepted: 10/28/2018] [Indexed: 12/11/2022] Open
Abstract
Knowing the set of physical protein-protein interactions (PPIs) that occur in a particular context (a tissue, disease, or other condition) can provide valuable insights into key research questions. However, while the number of identified human PPIs is expanding rapidly, context information remains limited, and for most non-human species context-specific networks are completely unavailable. The Integrated Interactions Database (IID) provides one of the most comprehensive sets of context-specific human PPI networks, including networks for 133 tissues, 91 disease conditions, and many other contexts. Importantly, it also provides context-specific networks for 17 non-human species including model organisms and domesticated animals. These species are vitally important for drug discovery and agriculture. IID integrates interactions from multiple databases and datasets. It comprises over 4.8 million PPIs annotated with several types of context: tissues, subcellular localizations, diseases, and druggability information (the latter three are new annotations not available in the previous version). This update increases the number of species from 6 to 18, the number of PPIs from ∼1.5 million to ∼4.8 million, and the number of tissues from 30 to 133. IID also now supports topology and enrichment analyses of returned networks. IID is available at http://ophid.utoronto.ca/iid.
Affiliation(s)
- Max Kotlyar
- Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
| | - Chiara Pastrello
- Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
| | - Zara Malik
- Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
| | - Igor Jurisica
- Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, ON M5S 1A4, Canada
- Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
| |
|
42
|
Luecken MD, Page MJT, Crosby AJ, Mason S, Reinert G, Deane CM. CommWalker: correctly evaluating modules in molecular networks in light of annotation bias. Bioinformatics 2019; 34:994-1000. [PMID: 29112702 PMCID: PMC5860269 DOI: 10.1093/bioinformatics/btx706] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/14/2016] [Accepted: 11/02/2017] [Indexed: 11/24/2022] Open
Abstract
Motivation: Detecting novel functional modules in molecular networks is an important step in biological research. In the absence of gold standard functional modules, functional annotations are often used to verify whether detected modules/communities have biological meaning. However, as we show, the uneven distribution of functional annotations means that such evaluation methods favor communities of well-studied proteins. Results: We propose a novel framework for the evaluation of communities as functional modules. Our proposed framework, CommWalker, takes communities as inputs and evaluates them in their local network environment by performing short random walks. We test CommWalker’s ability to overcome annotation bias using input communities from four community detection methods on two protein interaction networks. We find that modules accepted by CommWalker are similarly co-expressed as those accepted by current methods. Crucially, CommWalker performs well not only in well-annotated regions, but also in regions otherwise obscured by poor annotation. CommWalker community prioritization both faithfully captures well-validated communities and identifies functional modules that may correspond to more novel biology. Availability and implementation: The CommWalker algorithm is freely available at opig.stats.ox.ac.uk/resources or as a docker image on the Docker Hub at hub.docker.com/r/lueckenmd/commwalker/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- M D Luecken
- Department of Statistics, University of Oxford, Oxford, UK
- Doctoral Training Centre, University of Oxford, Oxford, UK
| | - M J T Page
- Department of Informatics, UCB Pharma, Slough, UK
| | - A J Crosby
- Immunology Therapeutic Area, UCB Pharma, Slough, UK
| | - S Mason
- Immunology Therapeutic Area, UCB Pharma, Slough, UK
| | - G Reinert
- Department of Statistics, University of Oxford, Oxford, UK
| | - C M Deane
- Department of Statistics, University of Oxford, Oxford, UK
- Doctoral Training Centre, University of Oxford, Oxford, UK
- To whom correspondence should be addressed.
| |
|
43
|
Wang ZT, Tan CC, Tan L, Yu JT. Systems biology and gene networks in Alzheimer’s disease. Neurosci Biobehav Rev 2019; 96:31-44. [DOI: 10.1016/j.neubiorev.2018.11.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/05/2017] [Revised: 11/18/2018] [Accepted: 11/18/2018] [Indexed: 12/25/2022]
|
44
|
MacGilvray ME, Shishkova E, Chasman D, Place M, Gitter A, Coon JJ, Gasch AP. Network inference reveals novel connections in pathways regulating growth and defense in the yeast salt response. PLoS Comput Biol 2018; 13:e1006088. [PMID: 29738528 PMCID: PMC5940180 DOI: 10.1371/journal.pcbi.1006088] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 08/17/2017] [Accepted: 03/13/2018] [Indexed: 11/18/2022] Open
Abstract
Cells respond to stressful conditions by coordinating a complex, multi-faceted response that spans many levels of physiology. Much of the response is coordinated by changes in protein phosphorylation. Although the regulators of transcriptome changes during stress are well characterized in Saccharomyces cerevisiae, the upstream regulatory network controlling protein phosphorylation is less well dissected. Here, we developed a computational approach to infer the signaling network that regulates phosphorylation changes in response to salt stress. We developed an approach to link predicted regulators to groups of likely co-regulated phospho-peptides responding to stress, thereby creating new edges in a background protein interaction network. We then use integer linear programming (ILP) to integrate wild type and mutant phospho-proteomic data and predict the network controlling stress-activated phospho-proteomic changes. The network we inferred predicted new regulatory connections between stress-activated and growth-regulating pathways and suggested mechanisms coordinating metabolism, cell-cycle progression, and growth during stress. We confirmed several network predictions with co-immunoprecipitations coupled with mass-spectrometry protein identification and mutant phospho-proteomic analysis. Results show that the cAMP-phosphodiesterase Pde2 physically interacts with many stress-regulated transcription factors targeted by PKA, and that reduced phosphorylation of those factors during stress requires the Rck2 kinase that we show physically interacts with Pde2. Together, our work shows how a high-quality computational network model can facilitate discovery of new pathway interactions during osmotic stress.
Cells sense and respond to stressful environments by utilizing complex signaling networks that integrate diverse signals to coordinate a multi-faceted physiological response. Much of this response is controlled by post-translational protein phosphorylation. Although many regulators that mediate changes in protein phosphorylation are known, how these regulators inter-connect in a single regulatory network that can transmit cellular signals is not known. It is also unclear how regulators that promote growth and regulators that activate the stress response interconnect to reorganize resource allocation during stress. Here, we developed an integrated experimental and computational workflow to infer the signaling network that regulates phosphorylation changes during osmotic stress in the budding yeast Saccharomyces cerevisiae. The workflow integrates data measuring protein phosphorylation changes in response to osmotic stress with known physical interactions between yeast proteins from large-scale datasets, along with other information about how regulators recognize their targets. The resulting network suggested new signaling connections between regulators and pathways, including those involved in regulating growth and defense, and predicted new regulators involved in stress defense. Our work highlights the power of using network inference to deliver new insight on how cells coordinate a diverse adaptive strategy to stress.
Affiliation(s)
- Matthew E. MacGilvray
- Laboratory of Genetics, University of Wisconsin—Madison, Madison, WI, United States of America
| | - Evgenia Shishkova
- Department of Biomolecular Chemistry, University of Wisconsin—Madison, Madison, WI, United States of America
| | - Deborah Chasman
- Wisconsin Institute for Discovery, University of Wisconsin–Madison, Madison, WI, United States of America
| | - Michael Place
- Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin -Madison, Madison, WI, United States of America
- Morgridge Institute for Research, Madison, WI, United States of America
| | - Joshua J. Coon
- Department of Biomolecular Chemistry, University of Wisconsin—Madison, Madison, WI, United States of America
- Morgridge Institute for Research, Madison, WI, United States of America
- Department of Chemistry, University of Wisconsin -Madison, Madison, WI, United States of America
- Genome Center of Wisconsin, Madison, WI, United States of America
| | - Audrey P. Gasch
- Laboratory of Genetics, University of Wisconsin—Madison, Madison, WI, United States of America
- Department of Chemistry, University of Wisconsin -Madison, Madison, WI, United States of America
| |
|
45
|
Ignatius Pang CN, Goel A, Wilkins MR. Investigating the Network Basis of Negative Genetic Interactions in Saccharomyces cerevisiae with Integrated Biological Networks and Triplet Motif Analysis. J Proteome Res 2018; 17:1014-1030. [DOI: 10.1021/acs.jproteome.7b00649] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022]
Affiliation(s)
- Chi Nam Ignatius Pang
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Apurv Goel
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Marc R. Wilkins
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
| |
|
46
|
Xu WM, Yang K, Jiang LJ, Hu JQ, Zhou XZ. Integrated Modules Analysis to Explore the Molecular Mechanisms of Phlegm-Stasis Cementation Syndrome with Ischemic Heart Disease. Front Physiol 2018; 9:7. [PMID: 29403392 PMCID: PMC5786858 DOI: 10.3389/fphys.2018.00007] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/25/2017] [Accepted: 01/04/2018] [Indexed: 12/15/2022] Open
Abstract
Background: Ischemic heart disease (IHD) has been the leading cause of death globally for several decades, and IHD patients often present with symptoms of phlegm-stasis cementation syndrome (PSCS) as significant complications. However, the underlying molecular mechanisms of PSCS complicated with IHD have not yet been fully elucidated. Materials and Methods: Network medicine methods were used to elucidate the underlying molecular mechanisms of IHD phenotypes. First, high-quality IHD-associated genes from both a human curated disease-gene association database and the biomedical literature were integrated. Second, the IHD disease modules were obtained by dissecting the protein-protein interaction (PPI) topological modules in the String V9.1 database and mapping the IHD-associated genes to the PPI topological modules. After that, molecular functional analyses (e.g., Gene Ontology and pathway enrichment analyses) were conducted for these IHD disease modules. Finally, the PSCS syndrome modules were identified by mapping the PSCS-related symptom-genes to the IHD disease modules, which were further validated by both pharmacological and physiological evidence derived from the published literature. Results: A total of 1,056 high-quality IHD-associated genes were integrated and evaluated. In addition, eight IHD disease modules (the PPI sub-networks significantly relevant to IHD) were identified, of which two disease modules were relevant to PSCS syndrome (i.e., two PSCS syndrome modules). These two modules were enriched in the Toll-like receptor signaling pathway (hsa04620) and the Renin-angiotensin system (hsa04614), with the molecular functions of angiotensin maturation (GO:0002003) and response to bacterium (GO:0009617), which were validated by classical Chinese herbal formula-related targets, IHD-related drug targets, and phenotype features derived from the human phenotype ontology (HPO) and the published biomedical literature.
Conclusion: A network medicine-based approach was proposed to identify the underlying molecular modules of PSCS complicated with IHD, which could be used for interpreting the pharmacological mechanisms of well-established Chinese herbal formulas (e.g., Tao Hong Si Wu Tang, Dan Shen Yin, Huang Lian Wen Dan Tang and Gua Lou Xie Bai Ban Xia Tang). In addition, these results delivered novel understandings of the molecular network mechanisms of IHD phenotype subtypes with PSCS complications, which would be insightful for both IHD precision medicine and the integration of disease and TCM syndrome diagnoses.
Affiliation(s)
- Wei-Ming Xu
- Research Centre for Disease and Syndrome, Institute of Basic Theory for Traditional Chinese Medicine, China Academy of Chinese Medicine Sciences, Beijing, China
| | - Kuo Yang
- School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
| | - Li-Jie Jiang
- Research Centre for Disease and Syndrome, Institute of Basic Theory for Traditional Chinese Medicine, China Academy of Chinese Medicine Sciences, Beijing, China
| | - Jing-Qing Hu
- Research Centre for Disease and Syndrome, Institute of Basic Theory for Traditional Chinese Medicine, China Academy of Chinese Medicine Sciences, Beijing, China
| | - Xue-Zhong Zhou
- School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
- Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, China
| |
|
47
|
Kotlyar M, Rossos AEM, Jurisica I. Prediction of Protein-Protein Interactions. Curr Protoc Bioinformatics 2017; 60:8.2.1-8.2.14. [PMID: 29220074 DOI: 10.1002/cpbi.38] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022]
Abstract
The authors provide an overview of physical protein-protein interaction prediction, covering the main strategies for predicting interactions, approaches for assessing predictions, and online resources for accessing predictions. This unit focuses on the main advancements in each of these areas over the last decade. The methods and resources that are presented here are not an exhaustive set, but characterize the current state of the field, highlighting key challenges and achievements. © 2017 by John Wiley & Sons, Inc.
Affiliation(s)
- Max Kotlyar
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Andrea E M Rossos
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Igor Jurisica
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Departments of Medical Biophysics and Computer Science, University of Toronto, Ontario, Canada
- Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
| |
|
48
|
Drew K, Müller CL, Bonneau R, Marcotte EM. Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets. PLoS Comput Biol 2017; 13:e1005625. [PMID: 29023445 PMCID: PMC5638211 DOI: 10.1371/journal.pcbi.1005625] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 03/03/2017] [Accepted: 06/06/2017] [Indexed: 12/21/2022] Open
Abstract
Determining the three dimensional arrangement of proteins in a complex is highly beneficial for uncovering mechanistic function and interpreting genetic variation in coding genes comprising protein complexes. There are several methods for determining co-complex interactions between proteins, among them co-fractionation / mass spectrometry (CF-MS), but it remains difficult to identify directly contacting subunits within a multi-protein complex. Correlation analysis of CF-MS profiles shows promise in detecting protein complexes as a whole but is limited in its ability to infer direct physical contacts among proteins in sub-complexes. To identify direct protein-protein contacts within human protein complexes we learn a sparse conditional dependency graph from approximately 3,000 CF-MS experiments on human cell lines. We show substantial performance gains in estimating direct interactions compared to correlation analysis on a benchmark of large protein complexes with solved three-dimensional structures. We demonstrate the method’s value in determining the three dimensional arrangement of proteins by making predictions for complexes without known structure (the exocyst and tRNA multi-synthetase complex) and by establishing evidence for the structural position of a recently discovered component of the core human EKC/KEOPS complex, GON7/C14ORF142, providing a more complete 3D model of the complex. Direct contact prediction provides easily calculable additional structural information for large-scale protein complex mapping studies and should be broadly applicable across organisms as more CF-MS datasets become available.
Proteins physically associate into complexes in order to carry out the essential functions of life. Knowing how proteins are physically arranged three dimensionally in these complexes provides clues towards how they work. In principle, the associations between proteins in large-scale proteomics datasets should often reflect direct physical contacts between proteins in each complex. Here, we describe a statistical method to discover which subunits within complexes directly contact each other based on their co-purification behavior in published co-fractionation mass spectrometry datasets. Within our predictions, we recover many known protein-protein contacts, serving to validate our method, as well as unknown contacts that can inform future studies of these complexes. Specifically, we observe confident contacts between subunits within the exocyst and tRNA multi-synthetase complexes, two complexes that have incomplete structural information. Using our method, we further provide structural information for a previously missing subunit of the EKC/KEOPS complex. We anticipate that this method and the associated predictions will help to better inform our understanding of the functions and structures of diverse protein complexes.
Affiliation(s)
- Kevin Drew
- Center for Systems and Synthetic Biology, Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, United States of America
| | - Christian L. Müller
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, United States of America
| | - Richard Bonneau
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, United States of America
- New York University Center for Genomics and Systems Biology, New York University, New York, NY, United States of America
| | - Edward M. Marcotte
- Center for Systems and Synthetic Biology, Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, United States of America
| |
|
49
|
Studying protein-protein interactions: progress, pitfalls and solutions. Biochem Soc Trans 2017; 44:994-1004. [PMID: 27528744 DOI: 10.1042/bst20160092] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 04/04/2016] [Indexed: 12/27/2022]
Abstract
Signalling proteins are intrinsic to all biological processes and interact with each other in tightly regulated and orchestrated signalling complexes and pathways. Characterization of protein binding can help to elucidate protein function within signalling pathways. This information is vital for researchers to gain a more comprehensive knowledge of cellular networks which can then be used to develop new therapeutic strategies for disease. However, studying protein-protein interactions (PPIs) can be challenging as the interactions can be extremely transient downstream of specific environmental cues. There are many powerful techniques currently available to identify and confirm PPIs. Choosing the most appropriate range of techniques merits serious consideration. The aim of this review is to provide a starting point for researchers embarking on a PPI study. We provide an overview and point of reference for some of the many methods available to identify interactions from in silico analysis and large scale screening tools through to the methods used to validate potential PPIs. We discuss the advantages and disadvantages of each method and we also provide a workflow chart to highlight the main experimental questions to consider when planning cell lysis to maximize experimental success.
|
50
|
Drew K, Lee C, Huizar RL, Tu F, Borgeson B, McWhite CD, Ma Y, Wallingford JB, Marcotte EM. Integration of over 9,000 mass spectrometry experiments builds a global map of human protein complexes. Mol Syst Biol 2017; 13:932. [PMID: 28596423 PMCID: PMC5488662 DOI: 10.15252/msb.20167490] [Citation(s) in RCA: 143] [Impact Index Per Article: 20.4] [Indexed: 12/13/2022] Open
Abstract
Macromolecular protein complexes carry out many of the essential functions of cells, and many genetic diseases arise from disrupting the functions of such complexes. Currently, there is great interest in defining the complete set of human protein complexes, but recent published maps lack comprehensive coverage. Here, through the synthesis of over 9,000 published mass spectrometry experiments, we present hu.MAP, the most comprehensive and accurate human protein complex map to date, containing > 4,600 total complexes, > 7,700 proteins, and > 56,000 unique interactions, including thousands of confident protein interactions not identified by the original publications. hu.MAP accurately recapitulates known complexes withheld from the learning procedure, which was optimized with the aid of a new quantitative metric (k‐cliques) for comparing sets of sets. The vast majority of complexes in our map are significantly enriched with literature annotations, and the map overall shows improved coverage of many disease‐associated proteins, as we describe in detail for ciliopathies. Using hu.MAP, we predicted and experimentally validated candidate ciliopathy disease genes in vivo in a model vertebrate, discovering CCDC138, WDR90, and KIAA1328 to be new cilia basal body/centriolar satellite proteins, and identifying ANKRD55 as a novel member of the intraflagellar transport machinery. By offering significant improvements to the accuracy and coverage of human protein complexes, hu.MAP (http://proteincomplexes.org) serves as a valuable resource for better understanding the core cellular functions of human proteins and helping to determine mechanistic foundations of human disease.
Affiliation(s)
- Kevin Drew
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
| | - Chanjae Lee
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| | - Ryan L Huizar
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| | - Fan Tu
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| | - Blake Borgeson
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| | - Claire D McWhite
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| | - Yun Ma
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
- The Otolaryngology Hospital, The First Affiliated Hospital of Sun Yat-sen University, Sun Yat-sen University, Guangzhou, China
| | - John B Wallingford
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| | - Edward M Marcotte
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| |
|