1
Network assisted analysis of de novo variants using protein-protein interaction information identified 46 candidate genes for congenital heart disease. PLoS Genet 2022; 18:e1010252. [PMID: 35671298 PMCID: PMC9205499 DOI: 10.1371/journal.pgen.1010252] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Revised: 06/17/2022] [Accepted: 05/12/2022] [Indexed: 11/19/2022] Open
Abstract
De novo variants (DNVs) with deleterious effects have proved informative in identifying risk genes for early-onset diseases such as congenital heart disease (CHD). A number of statistical methods have been proposed for family-based studies or case/control studies to identify risk genes by screening genes with more DNVs than expected by chance in Whole Exome Sequencing (WES) studies. However, the statistical power is still limited for cohorts with thousands of subjects. Under the hypothesis that connected genes in protein-protein interaction (PPI) networks are more likely to share similar disease association status, we developed a Markov Random Field model that can leverage information from publicly available PPI databases to increase power in identifying risk genes. We identified 46 candidate genes with at least 1 DNV in the CHD study cohort, including 18 known human CHD genes and 35 highly expressed genes in mouse developing heart. Our results may shed new insight on the shared protein functionality among risk genes for CHD. The topologic information in a pathway may be informative to identify functionally interrelated genes and help improve statistical power in DNV studies. Under the hypothesis that connected genes in PPI networks are more likely to share similar disease association status, we developed a novel statistical model that can leverage information from publicly available PPI databases. Through simulation studies under multiple settings, we proved our method can increase statistical power in identifying additional risk genes compared to methods without using the PPI network information. We then applied our method to a real example for CHD DNV data, and then visualized the subnetwork of candidate genes to find potential functional gene clusters for CHD.

2
Erdogan F, Radu TB, Orlova A, Qadree AK, de Araujo ED, Israelian J, Valent P, Mustjoki SM, Herling M, Moriggl R, Gunning PT. JAK-STAT core cancer pathway: An integrative cancer interactome analysis. J Cell Mol Med 2022; 26:2049-2062. [PMID: 35229974 PMCID: PMC8980946 DOI: 10.1111/jcmm.17228] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 10/25/2021] [Revised: 12/14/2021] [Accepted: 12/22/2021] [Indexed: 12/25/2022] Open
Abstract
Through a comprehensive review and in silico analysis of reported data on STAT-linked diseases, we analysed the communication pathways and interactome of the seven STATs in major cancer categories and proposed rational targeting approaches for therapeutic intervention to disrupt critical pathways and addictions to hyperactive JAK/STAT in neoplastic states. Although all STATs follow a similar molecular activation pathway, STAT1, STAT2, STAT4 and STAT6 exert specific biological profiles associated with a more restricted pattern of activation by cytokines. STAT3 and STAT5A as well as STAT5B have pleiotropic roles in the body and can act as critical oncogenes that promote many processes involved in cancer development. STAT1, STAT3 and STAT5 also possess tumour suppressive action in certain mutational and cancer type contexts. Here, we demonstrated member-specific STAT activity in major cancer types. Through systems biology approaches, we found surprising roles for EGFR family members, sex steroid hormone receptor ESR1 interplay with oncogenic STAT function and proposed new drug targeting approaches of oncogenic STAT pathway addiction.
Affiliation(s)
- Fettah Erdogan
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada
  - Department of Chemistry, University of Toronto, Toronto, Ontario, Canada
- Tudor Bogdan Radu
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada
  - Department of Chemistry, University of Toronto, Toronto, Ontario, Canada
- Anna Orlova
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine, Vienna, Austria
- Abdul Khawazak Qadree
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada
  - Department of Chemistry, University of Toronto, Toronto, Ontario, Canada
- Elvin Dominic de Araujo
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Johan Israelian
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada
  - Department of Chemistry, University of Toronto, Toronto, Ontario, Canada
- Peter Valent
  - Division of Hematology and Hemostaseology, Department of Internal Medicine I, Medical University of Vienna, Vienna, Austria
  - Ludwig Boltzmann Institute for Hematology and Oncology, Medical University of Vienna, Vienna, Austria
- Satu M. Mustjoki
  - Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland
  - Hematology Research Unit, Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
- Marco Herling
  - Department of Hematology, Cellular Therapy, and Hemostaseology, University of Leipzig, Leipzig, Germany
- Richard Moriggl
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine, Vienna, Austria
- Patrick Thomas Gunning
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada
  - Department of Chemistry, University of Toronto, Toronto, Ontario, Canada

3
Erdogan F, Qadree AK, Radu TB, Orlova A, de Araujo ED, Israelian J, Valent P, Mustjoki SM, Herling M, Moriggl R, Gunning PT. Structural and mutational analysis of member-specific STAT functions. Biochim Biophys Acta Gen Subj 2022; 1866:130058. [PMID: 34774983 DOI: 10.1016/j.bbagen.2021.130058] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/16/2021] [Revised: 10/29/2021] [Accepted: 11/05/2021] [Indexed: 12/21/2022]
Abstract
BACKGROUND: The STAT family of transcription factors controls gene expression in response to signals from various stimuli. They display functions in diseases ranging from autoimmunity and chronic inflammatory disease to cancer and infectious disease. SCOPE OF REVIEW: This work uses an approach informed by structural data to explore how domain-specific structural variations, post-translational modifications, and the cancer genome mutational landscape dictate STAT member-specific activities. MAJOR CONCLUSIONS: We illustrated the structure-function relationship of STAT proteins and highlighted their effect on member-specific activity. We correlated disease-linked STAT mutations to the structure and cancer genome mutational landscape and proposed rational drug targeting approaches of oncogenic STAT pathway addiction. GENERAL SIGNIFICANCE: Hyper-activated STATs and their variants are associated with multiple diseases and are considered high value oncology targets. A full understanding of the molecular basis of member-specific STAT-mediated signaling and the strategies to selectively target them requires examination of the difference in their structures and sequences.
Affiliation(s)
- Fettah Erdogan
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, 3359 Mississauga Rd N., Mississauga, Canada; Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, Canada
- Abdul K Qadree
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, 3359 Mississauga Rd N., Mississauga, Canada; Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, Canada
- Tudor B Radu
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, 3359 Mississauga Rd N., Mississauga, Canada; Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, Canada
- Anna Orlova
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine, A-1210 Vienna, Austria
- Elvin D de Araujo
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, 3359 Mississauga Rd N., Mississauga, Canada
- Johan Israelian
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, 3359 Mississauga Rd N., Mississauga, Canada; Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, Canada
- Peter Valent
  - Department of Internal Medicine I, Division of Hematology and Hemostaseology, Medical University of Vienna, Vienna, Austria; Ludwig Boltzmann Institute for Hematology and Oncology, Medical University of Vienna, Vienna, Austria
- Satu M Mustjoki
  - Hematology Research Unit, Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland; Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland; iCAN Digital Precision Cancer Medicine Flagship, Helsinki, Finland
- Marco Herling
  - Department of Hematology, Cellular Therapy, and Hemostaseology, University of Leipzig, Leipzig, Germany
- Richard Moriggl
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine, A-1210 Vienna, Austria
- Patrick T Gunning
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, 3359 Mississauga Rd N., Mississauga, Canada; Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, Canada

4
Kotlyar M, Pastrello C, Ahmed Z, Chee J, Varyova Z, Jurisica I. IID 2021: towards context-specific protein interaction analyses by increased coverage, enhanced annotation and enrichment analysis. Nucleic Acids Res 2021; 50:D640-D647. [PMID: 34755877 PMCID: PMC8728267 DOI: 10.1093/nar/gkab1034] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 09/15/2021] [Revised: 10/13/2021] [Accepted: 11/03/2021] [Indexed: 01/02/2023] Open
Abstract
Improved bioassays have significantly increased the rate of identifying new protein-protein interactions (PPIs), and the number of detected human PPIs has greatly exceeded early estimates of human interactome size. These new PPIs provide a more complete view of disease mechanisms but precise understanding of how PPIs affect phenotype remains a challenge. It requires knowledge of PPI context (e.g. tissues, subcellular localizations), and functional roles, especially within pathways and protein complexes. The previous IID release focused on PPI context, providing networks with comprehensive tissue, disease, cellular localization, and druggability annotations. The current update adds developmental stages to the available contexts, and provides a way of assigning context to PPIs that could not be previously annotated due to insufficient data or incompatibility with available context categories (e.g. interactions between membrane and cytoplasmic proteins). This update also annotates PPIs with conservation across species, directionality in pathways, membership in large complexes, interaction stability (i.e. stable or transient), and mutation effects. Enrichment analysis is now available for all annotations, and includes multiple options; for example, context annotations can be analyzed with respect to PPIs or network proteins. In addition to tabular view or download, IID provides online network visualization. This update is available at http://ophid.utoronto.ca/iid.
Affiliation(s)
- Max Kotlyar
  - Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Chiara Pastrello
  - Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Zuhaib Ahmed
  - Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Justin Chee
  - Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Zofia Varyova
  - Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Igor Jurisica
  - Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada; Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, ON M5S 1A4, Canada; Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia

5
Zhong M, Luo Q, Ye T, Zhu X, Chen X, Liu J. Identification of Candidate Genes Associated with Charcot-Marie-Tooth Disease by Network and Pathway Analysis. BioMed Research International 2020; 2020:1353516. [PMID: 33029488 PMCID: PMC7532371 DOI: 10.1155/2020/1353516] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/02/2020] [Revised: 07/21/2020] [Accepted: 08/12/2020] [Indexed: 12/15/2022]
Abstract
Charcot-Marie-Tooth Disease (CMT) is the most common clinical genetic disease of the peripheral nervous system. Although many studies have focused on elucidating the pathogenesis of CMT, few have focused on a systematic biological analysis to decode the underlying pathological molecular mechanisms, and the mechanism of the disease remains to be elucidated. Our study may therefore provide further useful insights into the molecular mechanisms of CMT based on a systematic bioinformatics analysis. In the current study, by reviewing the literature deposited in PubMed, we identified 100 genes genetically related to CMT. Then, the functional features of the CMT-related genes were examined by R software and KOBAS, and the selected biological process crosstalk was visualized with the software Cytoscape. Moreover, CMT-specific molecular network analysis was conducted by the Molecular Complex Detection (MCODE) algorithm. The biological function enrichment analysis suggested that myelin sheath, axon, peripheral nervous system, mitochondrial function, various metabolic processes, and autophagy played important roles in CMT development. Aminoacyl-tRNA biosynthesis, metabolic pathways, and vasopressin-regulated water reabsorption were significantly enriched in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway network, suggesting that these pathways may play key roles in CMT occurrence and development. According to the crosstalk, the biological processes could be roughly divided into a correlative module and two separate modules. MCODE clusters showed that in the top 3 clusters, 13 CMT-related genes were included in the network and 30 candidate genes were discovered which might be potentially related to CMT. The study may help to update the understanding of the pathogenesis of CMT and expand the set of potential CMT genes for further exploration.
Affiliation(s)
- Min Zhong
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, 25 Taiping Street, Luzhou, 646000 Sichuan, China
- Qing Luo
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, 25 Taiping Street, Luzhou, 646000 Sichuan, China
- Ting Ye
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, 25 Taiping Street, Luzhou, 646000 Sichuan, China
- XiDan Zhu
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, 25 Taiping Street, Luzhou, 646000 Sichuan, China
- Xiu Chen
  - Department of Neurology, The Affiliated Hospital of Southwest Medical University, 25 Taiping Street, Luzhou, 646000 Sichuan, China
- JinBo Liu
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, 25 Taiping Street, Luzhou, 646000 Sichuan, China

6
Poot Velez AH, Fontove F, Del Rio G. Protein-Protein Interactions Efficiently Modeled by Residue Cluster Classes. Int J Mol Sci 2020; 21:E4787. [PMID: 32640745 PMCID: PMC7370293 DOI: 10.3390/ijms21134787] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/19/2020] [Revised: 06/20/2020] [Accepted: 06/28/2020] [Indexed: 01/22/2023] Open
Abstract
Predicting protein-protein interactions (PPI) represents an important challenge in structural bioinformatics. Current computational methods display different degrees of accuracy when predicting these interactions. Different factors were proposed to help improve these predictions, including choosing the proper descriptors of proteins to represent these interactions, among others. In the current work, we provide a representative protein structure that is amenable to PPI classification using machine learning approaches, referred to as residue cluster classes. Through sampling and optimization, we identified the best algorithm-parameter pair to classify PPI from more than 360 different training sets. We tested these classifiers against PPI datasets that were not included in the training set but shared sequence similarity with proteins in the training set to reproduce the situation of most proteins sharing sequence similarity with others. We identified a model with almost no PPI error (96-99% of correctly classified instances) and showed that residue cluster classes of protein pairs displayed a distinct pattern between positive and negative protein interactions. Our results indicated that residue cluster classes are structural features relevant to model PPI and provide a novel tool to mathematically model the protein structure/function relationship.
Affiliation(s)
- Albros Hermes Poot Velez
  - Department of biochemistry and structural biology, Instituto de fisiologia celular, UNAM Mexico City 04510, Mexico
- Gabriel Del Rio
  - Department of biochemistry and structural biology, Instituto de fisiologia celular, UNAM Mexico City 04510, Mexico

7
Sinsky J, Majerova P, Kovac A, Kotlyar M, Jurisica I, Hanes J. Physiological Tau Interactome in Brain and Its Link to Tauopathies. J Proteome Res 2020; 19:2429-2442. [PMID: 32357304 DOI: 10.1021/acs.jproteome.0c00137] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 01/07/2023]
Abstract
Alzheimer's disease (AD) and most of the other tauopathies are incurable neurodegenerative diseases with unpleasant symptoms and consequences. The common hallmark of all of these diseases is tau pathology, but its connection with disease progress has not been completely understood so far. Therefore, uncovering novel tau-interacting partners and pathology affected molecular pathways can reveal the causes of diseases as well as potential targets for the development of AD treatment. Despite the large number of known tau-interacting partners, a limited number of studies focused on in vivo tau interactions in disease or healthy conditions are available. Here, we applied an in vivo cross-linking approach, capable of capturing weak and transient protein-protein interactions, to a unique transgenic rat model of progressive tau pathology similar to human AD. We have identified 175 potential novel and known tau-interacting proteins by MALDI-TOF mass spectrometry. Several of the most promising candidates for possible drug development were selected for validation by coimmunoprecipitation and colocalization experiments in animal and cellular models. Three proteins, Baiap2, Gpr37l1, and Nptx1, were confirmed as novel tau-interacting partners, and on the basis of their known functions and implications in neurodegenerative or psychiatric disorders, we proposed their potential role in tau pathology.
Affiliation(s)
- Jakub Sinsky
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Dubravska cesta 9, Bratislava 84510, Slovakia
- Petra Majerova
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Dubravska cesta 9, Bratislava 84510, Slovakia; AXON Neuroscience R&D Services SE, Dvorakovo nabrezie 10, Bratislava 811 02, Slovakia
- Andrej Kovac
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Dubravska cesta 9, Bratislava 84510, Slovakia; AXON Neuroscience R&D Services SE, Dvorakovo nabrezie 10, Bratislava 811 02, Slovakia
- Max Kotlyar
  - Krembil Research Institute, UHN, 60 Leonard Avenue, Toronto, Ontario M5T 0S8, Canada
- Igor Jurisica
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Dubravska cesta 9, Bratislava 84510, Slovakia; Krembil Research Institute, UHN, 60 Leonard Avenue, Toronto, Ontario M5T 0S8, Canada; Departments of Medical Biophysics and Computer Science, University of Toronto, 27 King's College Circle, Toronto, Ontario ON M5S, Canada
- Jozef Hanes
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Dubravska cesta 9, Bratislava 84510, Slovakia; AXON Neuroscience R&D Services SE, Dvorakovo nabrezie 10, Bratislava 811 02, Slovakia

8
A Computational Framework for Predicting Direct Contacts and Substructures within Protein Complexes. Biomolecules 2019; 9:biom9110656. [PMID: 31717703 PMCID: PMC6921016 DOI: 10.3390/biom9110656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2019] [Revised: 10/20/2019] [Accepted: 10/23/2019] [Indexed: 11/17/2022] Open
Abstract
Understanding the physical arrangement of subunits within protein complexes potentially provides valuable clues about how the subunits work together and how the complexes function. The majority of recent research focuses on identifying protein complexes as a whole and seldom studies the inner structures within complexes. In this study, we propose a computational framework to predict direct contacts and substructures within protein complexes. In this framework, we first train a supervised learning model of l2-regularized logistic regression to learn the patterns of direct and indirect interactions within complexes, from where physical subunit interaction networks are predicted. Then, to infer substructures within complexes, we apply a graph clustering method (i.e., maximum modularity clustering (MMC)) and a gene ontology (GO) semantic similarity based functional clustering on partially- and fully-connected networks, respectively. Computational results show that the proposed framework achieves fairly good performance of cross validation and independent test in terms of detecting direct contacts between subunits. Functional analyses further demonstrate the rationality of partitioning the subunits into substructures via the MMC algorithm and functional clustering.

9
Guala D, Ogris C, Müller N, Sonnhammer ELL. Genome-wide functional association networks: background, data & state-of-the-art resources. Brief Bioinform 2019; 21:1224-1237. [PMID: 31281921 PMCID: PMC7373183 DOI: 10.1093/bib/bbz064] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/05/2019] [Revised: 04/29/2019] [Accepted: 05/04/2019] [Indexed: 02/06/2023] Open
Abstract
The vast amount of experimental data from recent advances in the field of high-throughput biology begs for integration into more complex data structures such as genome-wide functional association networks. Such networks have been used for elucidation of the interplay of intra-cellular molecules to make advances ranging from the basic science understanding of evolutionary processes to the more translational field of precision medicine. The allure of the field has resulted in rapid growth of the number of available network resources, each with unique attributes exploitable to answer different biological questions. Unfortunately, the high volume of network resources makes it impossible for the intended user to select an appropriate tool for their particular research question. The aim of this paper is to provide an overview of the underlying data and representative network resources as well as to mention methods of integration, allowing a customized approach to resource selection. Additionally, this report will provide a primer for researchers venturing into the field of network integration.
Affiliation(s)
- Dimitri Guala
  - Science for Life Laboratory, Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, 17121 Solna, Sweden
- Christoph Ogris
  - Computational Cell Maps, Institute of Computational Biology, Helmholtz Center Munich, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Nikola Müller
  - Computational Cell Maps, Institute of Computational Biology, Helmholtz Center Munich, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Erik L L Sonnhammer
  - Science for Life Laboratory, Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, 17121 Solna, Sweden

10
Sumonja N, Gemovic B, Veljkovic N, Perovic V. Automated feature engineering improves prediction of protein-protein interactions. Amino Acids 2019; 51:1187-1200. [PMID: 31278492 DOI: 10.1007/s00726-019-02756-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/05/2019] [Accepted: 06/26/2019] [Indexed: 10/26/2022]
Abstract
Over the last decade, various machine learning (ML) and statistical approaches for protein-protein interaction (PPI) predictions have been developed to help annotate functional interactions among proteins, essential for our system-level understanding of life. Efficient ML approaches require informative and non-redundant features. In this paper, we introduce novel types of expert-crafted sequence, evolutionary and graph features and apply automatic feature engineering to further expand feature space to improve predictive modeling. The two-step automatic feature-engineering process encompasses the hybrid method for feature generation and unsupervised feature selection, followed by supervised feature selection through a genetic algorithm (GA). The optimization of both steps allows the feature-engineering procedure to operate on a large transformed feature space with no considerable computational cost and to efficiently provide newly engineered features. Based on GA and correlation filtering, we developed a stacking algorithm GA-STACK for automatic ensembling of different ML algorithms to improve prediction performance. We introduced a unified method, HP-GAS, for the prediction of human PPIs, which incorporates GA-STACK and rests on both expert-crafted and 40% of newly engineered features. The extensive cross validation and comparison with the state-of-the-art methods showed that HP-GAS currently represents the most efficient method for proteome-wide forecasting of protein interactions, with prediction efficacy of 0.93 AUC and 0.85 accuracy. We implemented the HP-GAS method as a free standalone application which is a time-efficient and easy-to-use tool. HP-GAS software with supplementary data can be downloaded from: http://www.vinca.rs/180/tools/HP-GAS.php.
Affiliation(s)
- Neven Sumonja
  - Laboratory for Bioinformatics and Computational Chemistry, Vinca Institute of Nuclear Sciences, University of Belgrade, Mike Petrovica Alasa 12-14, Vinca, Belgrade, 11351, Serbia
- Branislava Gemovic
  - Laboratory for Bioinformatics and Computational Chemistry, Vinca Institute of Nuclear Sciences, University of Belgrade, Mike Petrovica Alasa 12-14, Vinca, Belgrade, 11351, Serbia
- Nevena Veljkovic
  - Laboratory for Bioinformatics and Computational Chemistry, Vinca Institute of Nuclear Sciences, University of Belgrade, Mike Petrovica Alasa 12-14, Vinca, Belgrade, 11351, Serbia
- Vladimir Perovic
  - Laboratory for Bioinformatics and Computational Chemistry, Vinca Institute of Nuclear Sciences, University of Belgrade, Mike Petrovica Alasa 12-14, Vinca, Belgrade, 11351, Serbia

11
Park OH, Ha H, Lee Y, Boo SH, Kwon DH, Song HK, Kim YK. Endoribonucleolytic Cleavage of m6A-Containing RNAs by RNase P/MRP Complex. Mol Cell 2019; 74:494-507.e8. [DOI: 10.1016/j.molcel.2019.02.034] [Citation(s) in RCA: 189] [Impact Index Per Article: 37.8] [Received: 08/11/2018] [Revised: 01/14/2019] [Accepted: 02/22/2019] [Indexed: 12/21/2022]

12
Kotlyar M, Pastrello C, Malik Z, Jurisica I. IID 2018 update: context-specific physical protein-protein interactions in human, model organisms and domesticated species. Nucleic Acids Res 2019; 47:D581-D589. [PMID: 30407591 PMCID: PMC6323934 DOI: 10.1093/nar/gky1037] [Citation(s) in RCA: 131] [Impact Index Per Article: 26.2] [Received: 09/15/2018] [Revised: 10/15/2018] [Accepted: 10/28/2018] [Indexed: 12/11/2022] Open
Abstract
Knowing the set of physical protein-protein interactions (PPIs) that occur in a particular context-a tissue, disease, or other condition-can provide valuable insights into key research questions. However, while the number of identified human PPIs is expanding rapidly, context information remains limited, and for most non-human species context-specific networks are completely unavailable. The Integrated Interactions Database (IID) provides one of the most comprehensive sets of context-specific human PPI networks, including networks for 133 tissues, 91 disease conditions, and many other contexts. Importantly, it also provides context-specific networks for 17 non-human species including model organisms and domesticated animals. These species are vitally important for drug discovery and agriculture. IID integrates interactions from multiple databases and datasets. It comprises over 4.8 million PPIs annotated with several types of context: tissues, subcellular localizations, diseases, and druggability information (the latter three are new annotations not available in the previous version). This update increases the number of species from 6 to 18, the number of PPIs from ∼1.5 million to ∼4.8 million, and the number of tissues from 30 to 133. IID also now supports topology and enrichment analyses of returned networks. IID is available at http://ophid.utoronto.ca/iid.
Affiliation(s)
- Max Kotlyar
  - Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Chiara Pastrello
  - Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Zara Malik
  - Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Igor Jurisica
  - Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
  - Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, ON M5S 1A4, Canada
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
|
13
|
Combination of novel and public RNA-seq datasets to generate an mRNA expression atlas for the domestic chicken. BMC Genomics 2018; 19:594. [PMID: 30086717 PMCID: PMC6081845 DOI: 10.1186/s12864-018-4972-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/23/2017] [Accepted: 07/31/2018] [Indexed: 12/20/2022] Open
Abstract
Background: The domestic chicken (Gallus gallus) is widely used as a model in developmental biology and is also an important livestock species. We describe a novel approach to data integration to generate an mRNA expression atlas for the chicken spanning major tissue types and developmental stages, using a diverse range of publicly-archived RNA-seq datasets and new data derived from immune cells and tissues. Results: Randomly down-sampling RNA-seq datasets to a common depth and quantifying expression against a reference transcriptome using the mRNA quantitation tool Kallisto ensured that disparate datasets explored comparable transcriptomic space. The network analysis tool Graphia was used to extract clusters of co-expressed genes from the resulting expression atlas, many of which were tissue or cell-type restricted, contained transcription factors that have previously been implicated in their regulation, or were otherwise associated with biological processes, such as the cell cycle. The atlas provides a resource for the functional annotation of genes that currently have only a locus ID. We cross-referenced the RNA-seq atlas to a publicly available embryonic Cap Analysis of Gene Expression (CAGE) dataset to infer the developmental time course of organ systems, and to identify a signature of the expansion of tissue macrophage populations during development. Conclusion: Expression profiles obtained from public RNA-seq datasets, despite being generated by different laboratories using different methodologies, can be made comparable to each other. This meta-analytic approach to RNA-seq can be extended with new datasets from novel tissues, and is applicable to any species.
|
14
|
Mei S, Flemington EK, Zhang K. A computational framework for distinguishing direct versus indirect interactions in human functional protein-protein interaction networks. Integr Biol (Camb) 2018; 9:595-606. [PMID: 28524201 DOI: 10.1039/c7ib00013h] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Recognition of indirect interactions is instrumental to in silico reconstruction of signaling pathways and sheds light on the exploration of unknown physical paths between two indirectly interacting genes. However, very few computational methods have so far explicitly exploited indirect interactions with experimental evidence. In this work, we attempt to distinguish direct versus indirect interactions in human functional protein-protein interaction (PPI) networks via a predictive l2-regularized logistic regression model built on the experimental data. The l2-regularized logistic regression method is adopted to counteract the potential homolog noise and reduce the computational complexity on large training data. Computational results show that the proposed model demonstrates promising performance even though the training data are highly skewed. From the 304 799 PPIs that are curated in several databases, the proposed method detects 23 131 indirect interactions, most of which have been verified with a breadth-first graph search algorithm that finds dozens of physical paths between the interacting partners. Pathway enrichment analysis shows that most of the physical paths can be mapped onto more than one human signaling pathway, indicating that a series of biochemical signals does exist between the two indirectly interacting genes. The interactome-scale computational results promise to provide useful cues to the following applications: (1) exploration of unknown physical PPIs or physical paths between two indirectly interacting genes; (2) amending or extending the existing signaling pathways; (3) recognition of the physical PPIs for druggable target discovery.
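The abstract above mentions a breadth-first graph search for physical paths between indirectly interacting partners. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the physical PPIs are available as an undirected edge list, returns only one shortest path rather than enumerating all paths, and the function name `shortest_physical_path` and the toy network are hypothetical.

```python
from collections import deque


def shortest_physical_path(physical_ppis, source, target):
    """Return one shortest path of physical interactions, or None if unreachable.

    physical_ppis is an iterable of (protein_a, protein_b) pairs, treated as
    undirected edges. All names here are illustrative assumptions.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in physical_ppis:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search, remembering the predecessor of each visited node.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Toy, made-up network: A and D do not interact directly.
toy_ppis = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
print(shortest_physical_path(toy_ppis, "A", "D"))  # ['A', 'B', 'C', 'D']
```

Breadth-first order guarantees that the first path reaching the target has the fewest intermediate proteins; enumerating every path, as the abstract describes, would need a different traversal.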
Affiliation(s)
- Suyu Mei
  - Software College, Shenyang Normal University, Shenyang, 110034, China
|
15
|
Recabarren-Leiva D, Alarcón M. New insights into the gene expression associated to amyotrophic lateral sclerosis. Life Sci 2018; 193:110-123. [DOI: 10.1016/j.lfs.2017.12.016] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 09/16/2017] [Revised: 12/01/2017] [Accepted: 12/10/2017] [Indexed: 12/11/2022]
|
16
|
Kotlyar M, Rossos AEM, Jurisica I. Prediction of Protein-Protein Interactions. Curr Protoc Bioinformatics 2017; 60:8.2.1-8.2.14. [PMID: 29220074 DOI: 10.1002/cpbi.38] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022]
Abstract
The authors provide an overview of physical protein-protein interaction prediction, covering the main strategies for predicting interactions, approaches for assessing predictions, and online resources for accessing predictions. This unit focuses on the main advancements in each of these areas over the last decade. The methods and resources presented here are not an exhaustive set, but they characterize the current state of the field, highlighting key challenges and achievements. © 2017 by John Wiley & Sons, Inc.
Affiliation(s)
- Max Kotlyar
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Andrea E M Rossos
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Igor Jurisica
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Departments of Medical Biophysics and Computer Science, University of Toronto, Ontario, Canada
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
|
17
|
Abstract
Three neurodegenerative diseases [Amyotrophic Lateral Sclerosis (ALS), Parkinson's disease (PD) and Alzheimer's disease (AD)] share many characteristics, such as pathological mechanisms and genes. Accordingly, some researchers postulate that these diseases share the same alterations and that one alteration in a specific protein triggers one of these diseases. Analyses of gene expression may shed more light on how to discover pathways, pathologic mechanisms associated with the disease, biomarkers and potential therapeutic targets. In this review, we analyze four microarrays related to three neurodegenerative diseases. We systematically examine seven genes (CHN1, MDH1, PCP4, RTN1, SLC14A1, SNAP25 and VSNL1) that are altered in the three neurodegenerative diseases. Using Cytoscape software, an interaction network based on the protein interactions of these genes was built and used to identify pathways, miRNAs and drugs associated with ALS, AD and PD. The most important affected pathway is PI3K-Akt signalling. Thirteen microRNAs (miRNA-19B1, miRNA-107, miRNA-124-1, miRNA-124-2, miRNA-9-2, miRNA-29A, miRNA-9-3, miRNA-328, miRNA-19B2, miRNA-29B2, miRNA-124-3, miRNA-15A and miRNA-9-1) and four drugs (Estradiol, Acetaminophen, Resveratrol and Progesterone) were identified as candidates for possible new treatments.
Affiliation(s)
- Marcelo Alarcón
  - Department of Clinical Biochemistry and Immunohematology, Faculty of Health Sciences, Universidad de Talca, Talca 3460000, Chile
  - Interdisciplinary Excellence Research Program on Healthy Aging (PIEI-ES), Universidad de Talca, Talca 3460000, Chile
|
18
|
D'Souza M, Sulakhe D, Wang S, Xie B, Hashemifar S, Taylor A, Dubchak I, Conrad Gilliam T, Maltsev N. Strategic Integration of Multiple Bioinformatics Resources for System Level Analysis of Biological Networks. Methods Mol Biol 2017; 1613:85-99. [PMID: 28849559 DOI: 10.1007/978-1-4939-7027-8_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/07/2023]
Abstract
Recent technological advances in genomics allow the production of biological data at unprecedented tera- and petabyte scales. Efficient mining of these vast and complex datasets for the needs of biomedical research critically depends on a seamless integration of the clinical, genomic, and experimental information with prior knowledge about genotype-phenotype relationships. Such experimental data accumulated in publicly available databases should be accessible to a variety of algorithms and analytical pipelines that drive computational analysis and data mining. We present an integrated computational platform, Lynx (Sulakhe et al., Nucleic Acids Res 44:D882-D887, 2016) (http://lynx.cri.uchicago.edu), a web-based database and knowledge extraction engine. It provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization. It gives public access to the Lynx integrated knowledge base (LynxKB) and its analytical tools via user-friendly web services and interfaces. The Lynx service-oriented architecture supports annotation and analysis of high-throughput experimental data. Lynx tools assist the user in extracting meaningful knowledge from LynxKB and experimental data, and in the generation of weighted hypotheses regarding the genes and molecular mechanisms contributing to human phenotypes or conditions of interest. The goal of this integrated platform is to support the end-to-end analytical needs of various translational projects.
Affiliation(s)
- Mark D'Souza
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, 60637, USA
  - Argonne National Laboratory, Building 221, Room: A142, 9700 South Cass Avenue, Argonne, IL, 60439, USA
- Dinanath Sulakhe
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, 60637, USA
  - Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, 60637, USA
- Sheng Wang
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, 60637, USA
  - Toyota Technological Institute at Chicago, 6045 S. Kenwood Avenue, Chicago, IL, 60637, USA
- Bing Xie
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, 60637, USA
  - Department of Computer Science, Illinois Institute of Technology, Chicago, IL, 60616, USA
- Somaye Hashemifar
  - Toyota Technological Institute at Chicago, 6045 S. Kenwood Avenue, Chicago, IL, 60637, USA
- Andrew Taylor
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, 60637, USA
- Inna Dubchak
  - Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
  - Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- T Conrad Gilliam
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, 60637, USA
  - Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, 60637, USA
- Natalia Maltsev
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, 60637, USA
  - Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, 60637, USA
|
19
|
Computational Approaches for Predicting Binding Partners, Interface Residues, and Binding Affinity of Protein-Protein Complexes. Methods Mol Biol 2017; 1484:237-253. [PMID: 27787830 DOI: 10.1007/978-1-4939-6406-2_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Abstract
Studying protein-protein interactions leads to a better understanding of the underlying principles of several biological pathways. Cost and labor-intensive experimental techniques suggest the need for computational methods to complement them. Several such state-of-the-art methods have been reported for analyzing diverse aspects such as predicting binding partners, interface residues, and binding affinity for protein-protein complexes with reliable performance. However, there are specific drawbacks for different methods that indicate the need for their improvement. This review highlights various available computational algorithms for analyzing diverse aspects of protein-protein interactions and endorses the necessity for developing new robust methods for gaining deep insights about protein-protein interactions.
|
20
|
Roy J, Winter C, Schroeder M. Meta-analysis of Cancer Gene Profiling Data. Methods Mol Biol 2016; 1381:211-22. [PMID: 26667463 DOI: 10.1007/978-1-4939-3204-7_12] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Abstract
The simultaneous measurement of thousands of genes gives the opportunity to personalize and improve cancer therapy. In addition, the integration of meta-data such as protein-protein interaction (PPI) information into the analyses helps in the identification and prioritization of genes from these screens. Here, we describe a computational approach that identifies genes prognostic for outcome by combining gene profiling data from any source with a network of known relationships between genes.
Affiliation(s)
- Janine Roy
  - Biotechnology Center, Technische Universität Dresden, Dresden, Germany
- Christof Winter
  - Faculty of Medicine, Department of Clinical Sciences, Oncology MV, University of Lund, Lund, Sweden
- Michael Schroeder
  - Biotechnology Center, Technische Universität Dresden, Dresden, Germany
|
21
|
Ohta S, Montaño-Gutierrez LF, de Lima Alves F, Ogawa H, Toramoto I, Sato N, Morrison CG, Takeda S, Hudson DF, Rappsilber J, Earnshaw WC. Proteomics Analysis with a Nano Random Forest Approach Reveals Novel Functional Interactions Regulated by SMC Complexes on Mitotic Chromosomes. Mol Cell Proteomics 2016; 15:2802-18. [PMID: 27231315 PMCID: PMC4974353 DOI: 10.1074/mcp.m116.057885] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/08/2016] [Revised: 05/04/2016] [Indexed: 12/31/2022] Open
Abstract
Packaging of DNA into condensed chromosomes during mitosis is essential for the faithful segregation of the genome into daughter nuclei. Although the structure and composition of mitotic chromosomes have been studied for over 30 years, these aspects are yet to be fully elucidated. Here, we used stable isotope labeling with amino acids in cell culture to compare the proteomes of mitotic chromosomes isolated from cell lines harboring conditional knockouts of members of the condensin (SMC2, CAP-H, CAP-D3), cohesin (Scc1/Rad21), and SMC5/6 (SMC5) complexes. Our analysis revealed that these complexes associate with chromosomes independently of each other, with the SMC5/6 complex showing no significant dependence on any other chromosomal proteins during mitosis. To identify subtle relationships between chromosomal proteins, we employed a nano Random Forest (nanoRF) approach to detect protein complexes and the relationships between them. Our nanoRF results suggested that as few as 113 of 5058 detected chromosomal proteins are functionally linked to chromosome structure and segregation. Furthermore, nanoRF data revealed 23 proteins that were not previously suspected to have functional interactions with complexes playing important roles in mitosis. Subsequent small-interfering-RNA-based validation and localization tracking by green fluorescent protein-tagging highlighted novel candidates that might play significant roles in mitotic progression.
Affiliation(s)
- Shinya Ohta
  - Center for Innovative and Translational Medicine, Medical School, Kochi University, Kohasu, Oko-cho, Nankoku, Kochi 783-8505, Japan
  - Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Mayfield Road, Edinburgh EH9 3BF, UK
- Luis F Montaño-Gutierrez
  - Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Mayfield Road, Edinburgh EH9 3BF, UK
- Flavia de Lima Alves
  - Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Mayfield Road, Edinburgh EH9 3BF, UK
- Hiromi Ogawa
  - Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Mayfield Road, Edinburgh EH9 3BF, UK
- Iyo Toramoto
  - Center for Innovative and Translational Medicine, Medical School, Kochi University, Kohasu, Oko-cho, Nankoku, Kochi 783-8505, Japan
- Nobuko Sato
  - Center for Innovative and Translational Medicine, Medical School, Kochi University, Kohasu, Oko-cho, Nankoku, Kochi 783-8505, Japan
- Ciaran G Morrison
  - Centre for Chromosome Biology, School of Natural Sciences, National University of Ireland Galway, Galway, Ireland
- Shunichi Takeda
  - Department of Radiation Genetics, Kyoto University Graduate School of Medicine, Yoshida Konoe, Sakyo-ku, Kyoto 606-8501, Japan
- Damien F Hudson
  - Murdoch Childrens Research Institute, Royal Children's Hospital, Melbourne, Victoria 3052, Australia
- Juri Rappsilber
  - Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Mayfield Road, Edinburgh EH9 3BF, UK
  - Chair of Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, 13355 Berlin, Germany
- William C Earnshaw
  - Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Mayfield Road, Edinburgh EH9 3BF, UK
|
22
|
Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol 2015; 11:848. [PMID: 26681426 PMCID: PMC4704491 DOI: 10.15252/msb.20156351] [Citation(s) in RCA: 180] [Impact Index Per Article: 20.0] [Indexed: 12/13/2022] Open
Abstract
Studying protein interaction networks of all proteins in an organism (“interactomes”) remains one of the major challenges in modern biomedicine. Such information is crucial to understanding cellular pathways and developing effective therapies for the treatment of human diseases. Over the past two decades, diverse biochemical, genetic, and cell biological methods have been developed to map interactomes. In this review, we highlight basic principles of interactome mapping. Specifically, we discuss the strengths and weaknesses of individual assays, how to select a method appropriate for the problem being studied, and provide general guidelines for carrying out the necessary follow‐up analyses. In addition, we discuss computational methods to predict, map, and visualize interactomes, and provide a summary of some of the most important interactome resources. We hope that this review serves as both a useful overview of the field and a guide to help more scientists actively employ these powerful approaches in their research.
Affiliation(s)
- Jamie Snider
  - Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Max Kotlyar
  - Princess Margaret Cancer Center, IBM Life Sciences Discovery Centre, University Health Network, Ontario, Canada
- Punit Saraon
  - Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Zhong Yao
  - Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Igor Jurisica
  - Princess Margaret Cancer Center, IBM Life Sciences Discovery Centre, University Health Network, Ontario, Canada
- Igor Stagljar
  - Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
|
23
|
Zhang X, Kuivenhoven JA, Groen AK. Forward Individualized Medicine from Personal Genomes to Interactomes. Front Physiol 2015; 6:364. [PMID: 26696898 PMCID: PMC4673427 DOI: 10.3389/fphys.2015.00364] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 09/12/2015] [Accepted: 11/16/2015] [Indexed: 12/23/2022] Open
Abstract
When considering the variation in the genome, transcriptome, proteome and metabolome, and their interaction with the environment, every individual can be rightfully considered as a unique biological entity. Individualized medicine promises to take this uniqueness into account to optimize disease treatment and thereby improve health benefits for every patient. The success of individualized medicine relies on a precise understanding of the genotype-phenotype relationship. Although omics technologies advance rapidly, there are several challenges that need to be overcome: Next generation sequencing can efficiently decipher genomic sequences, epigenetic changes, and transcriptomic variation in patients, but it does not automatically indicate how or whether the identified variation will cause pathological changes. This is likely due to the inability to account for (1) the consequences of gene-gene and gene-environment interactions, and (2) (post)transcriptional as well as (post)translational processes that eventually determine the concentration of key metabolites. The technologies to accurately measure changes in these latter layers are still under development, and such measurements in humans are also mainly restricted to blood and circulating cells. Despite these challenges, it is already possible to track dynamic changes in the human interactome in healthy and diseased states by using the integration of multi-omics data. In this review, we evaluate the potential value of current major bioinformatics and systems biology-based approaches, including genome wide association studies, epigenetics, gene regulatory and protein-protein interaction networks, and genome-scale metabolic modeling. Moreover, we address the question whether integrative analysis of personal multi-omics data will help understanding of personal genotype-phenotype relationships.
Affiliation(s)
- Xiang Zhang
  - Department of Pediatrics, Center for Liver Digestive and Metabolic Diseases, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Jan A Kuivenhoven
  - Section Molecular Genetics, Department of Pediatrics, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Albert K Groen
  - Department of Pediatrics, Center for Liver Digestive and Metabolic Diseases, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
  - Department of Laboratory Medicine, Center for Liver Digestive and Metabolic Diseases, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
|
24
|
Kotlyar M, Pastrello C, Sheahan N, Jurisica I. Integrated interactions database: tissue-specific view of the human and model organism interactomes. Nucleic Acids Res 2015; 44:D536-41. [PMID: 26516188 PMCID: PMC4702811 DOI: 10.1093/nar/gkv1115] [Citation(s) in RCA: 167] [Impact Index Per Article: 18.6] [Received: 09/14/2015] [Accepted: 10/13/2015] [Indexed: 01/28/2023] Open
Abstract
IID (Integrated Interactions Database) is the first database providing tissue-specific protein–protein interactions (PPIs) for model organisms and human. IID covers six species (S. cerevisiae (yeast), C. elegans (worm), D. melanogaster (fly), R. norvegicus (rat), M. musculus (mouse) and H. sapiens (human)) and up to 30 tissues per species. Users query IID by providing a set of proteins or PPIs from any of these organisms, and specifying species and tissues where IID should search for interactions. If query proteins are not from the selected species, IID enables searches across species and tissues automatically by using their orthologs; for example, retrieving interactions in a given tissue, conserved in human and mouse. Interaction data in IID comprises three types of PPI networks: experimentally detected PPIs from major databases, orthologous PPIs and high-confidence computationally predicted PPIs. Interactions are assigned to tissues where their protein pairs or encoding genes are expressed. IID is a major replacement of the I2D interaction database, with larger PPI networks (a total of 1,566,043 PPIs among 68,831 proteins), tissue annotations for interactions, and new query, analysis and data visualization capabilities. IID is available at http://ophid.utoronto.ca/iid.
Affiliation(s)
- Max Kotlyar
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, M5G 1L7, Canada
- Chiara Pastrello
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, M5G 1L7, Canada
- Nicholas Sheahan
  - School of Computing, Queen's University, Kingston, ON, K7L 2N8, Canada
- Igor Jurisica
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, M5G 1L7, Canada
  - Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, ON, M5S 1A4, Canada
|
25
|
Liu JL, Peng Y, Fu YS. Efficient prediction of progesterone receptor interactome using a support vector machine model. Int J Mol Sci 2015; 16:4774-85. [PMID: 25741764 PMCID: PMC4394448 DOI: 10.3390/ijms16034774] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/14/2015] [Revised: 02/20/2015] [Accepted: 02/25/2015] [Indexed: 12/20/2022] Open
Abstract
Protein-protein interaction (PPI) is essential for almost all cellular processes and identification of PPI is a crucial task for biomedical researchers. So far, most computational studies of PPI are intended for pair-wise prediction. Theoretically, predicting protein partners for a single protein is likely a simpler problem. Given enough data for a particular protein, the results can be more accurate than general PPI predictors. In the present study, we assessed the potential of using the support vector machine (SVM) model with selected features centered on a particular protein for PPI prediction. As a proof-of-concept study, we applied this method to identify the interactome of progesterone receptor (PR), a protein which is essential for coordinating female reproduction in mammals by mediating the actions of ovarian progesterone. We achieved an accuracy of 91.9%, sensitivity of 92.8% and specificity of 91.2%. Our method is generally applicable to any other proteins and therefore may be of help in guiding biomedical experiments.
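The accuracy, sensitivity and specificity quoted in the abstract above are standard summaries of a binary confusion matrix. As a hedged illustration only, the sketch below shows how the three values are computed from counts. The helper name `classification_metrics` and the counts are made up for the example and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # fraction of all calls that are correct
    sensitivity = tp / (tp + fn)                # true positive rate (recall)
    specificity = tn / (tn + fp)                # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts: 100 true interactors and 100 non-interactors.
acc, sens, spec = classification_metrics(tp=90, fp=20, tn=80, fn=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting all three together matters when classes are imbalanced, because accuracy alone can stay high while one of the two rates is poor.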
Affiliation(s)
- Ji-Long Liu
  - College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642, China
- Ying Peng
  - College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642, China
- Yong-Sheng Fu
  - College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642, China
|
26
|
Schoenrock A, Samanfar B, Pitre S, Hooshyar M, Jin K, Phillips CA, Wang H, Phanse S, Omidi K, Gui Y, Alamgir M, Wong A, Barrenäs F, Babu M, Benson M, Langston MA, Green JR, Dehne F, Golshani A. Efficient prediction of human protein-protein interactions at a global scale. BMC Bioinformatics 2014; 15:383. [PMID: 25492630 PMCID: PMC4272565 DOI: 10.1186/s12859-014-0383-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 03/06/2014] [Accepted: 11/12/2014] [Indexed: 12/13/2022] Open
Abstract
Background: Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods. Results: On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network within 3 months. With a recall of 23% at a precision of 82.1%, we predicted 172,132 putative PPIs. We demonstrate the usefulness of these predictions through a range of experiments. Conclusions: The speed and accuracy associated with MP-PIPE can make this a potential tool to study individual human PPI networks (from genomic sequences alone) for personalized medicine.
Collapse
Affiliation(s)
- Sylvain Pitre
- School of Computer Science, Carleton University, Ottawa, Canada.
- Ke Jin
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada.
- Charles A Phillips
- Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, USA.
- Hui Wang
- Department of Pediatrics, Gothenburg University, Gothenburg, Sweden; The Centre for Individualized Medication, Linköping University, Linköping, Sweden.
- Sadhna Phanse
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada.
- Katayoun Omidi
- Department of Biology, Carleton University, Ottawa, Canada.
- Yuan Gui
- Department of Biology, Carleton University, Ottawa, Canada.
- Md Alamgir
- Department of Biology, Carleton University, Ottawa, Canada.
- Alex Wong
- Department of Biology, Carleton University, Ottawa, Canada.
- Fredrik Barrenäs
- Department of Pediatrics, Gothenburg University, Gothenburg, Sweden; The Centre for Individualized Medication, Linköping University, Linköping, Sweden.
- Mohan Babu
- Department of Biochemistry, Research and Innovation Centre, University of Regina, Regina, Saskatchewan, Canada.
- Mikael Benson
- Department of Pediatrics, Gothenburg University, Gothenburg, Sweden; The Centre for Individualized Medication, Linköping University, Linköping, Sweden.
- Michael A Langston
- Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, USA.
- James R Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada.
- Frank Dehne
- School of Computer Science, Carleton University, Ottawa, Canada.
27
Kotlyar M, Pastrello C, Pivetta F, Lo Sardo A, Cumbaa C, Li H, Naranian T, Niu Y, Ding Z, Vafaee F, Broackes-Carter F, Petschnigg J, Mills GB, Jurisicova A, Stagljar I, Maestro R, Jurisica I. In silico prediction of physical protein interactions and characterization of interactome orphans. Nat Methods 2014; 12:79-84. [PMID: 25402006 DOI: 10.1038/nmeth.3178] [Citation(s) in RCA: 112] [Impact Index Per Article: 11.2] [Received: 12/10/2013] [Accepted: 08/14/2014] [Indexed: 12/12/2022]
Abstract
Protein-protein interactions (PPIs) are useful for understanding signaling cascades, predicting protein function, associating proteins with disease and fathoming drug mechanism of action. Currently, only ∼ 10% of human PPIs may be known, and about one-third of human proteins have no known interactions. We introduce FpClass, a data mining-based method for proteome-wide PPI prediction. At an estimated false discovery rate of 60%, we predicted 250,498 PPIs among 10,531 human proteins; 10,647 PPIs involved 1,089 proteins without known interactions. We experimentally tested 233 high- and medium-confidence predictions and validated 137 interactions, including seven novel putative interactors of the tumor suppressor p53. Compared to previous PPI prediction methods, FpClass achieved better agreement with experimentally detected PPIs. We provide an online database of annotated PPI predictions (http://ophid.utoronto.ca/fpclass/) and the prediction software (http://www.cs.utoronto.ca/~juris/data/fpclass/).
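The figures quoted in this abstract can be put side by side with a few lines of arithmetic. The sketch below is only a reading aid: it assumes the 60% false discovery rate applies uniformly to all 250,498 predictions, and it treats the 233 tested pairs as a separate, hand-picked subset (high- and medium-confidence predictions), so the two rates are not directly comparable. Variable names are illustrative and not taken from FpClass.

```python
# Figures as quoted in the abstract above.
predicted_ppis = 250_498   # predicted PPIs
estimated_fdr = 0.60       # estimated false discovery rate
tested = 233               # high- and medium-confidence predictions tested
validated = 137            # interactions validated experimentally

# Assumption: the FDR applies uniformly across the whole predicted set.
expected_true = round(predicted_ppis * (1 - estimated_fdr))

# Fraction of the experimentally tested subset that was validated.
validation_rate = validated / tested

print(f"expected true PPIs at FDR {estimated_fdr:.0%}: about {expected_true:,}")
print(f"validated fraction of tested predictions: {validation_rate:.1%}")
```

The validated fraction is higher than the complement of the overall FDR, which is what one would expect if the tested pairs were drawn from the more confident end of the prediction list.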
Affiliation(s)
- Max Kotlyar
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada
- Chiara Pastrello
- [1] Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada. [2] Centro Riferimento Oncologico, Istituto Nazionale Tumori, Aviano, Italy
- Flavia Pivetta
- Centro Riferimento Oncologico, Istituto Nazionale Tumori, Aviano, Italy
- Christian Cumbaa
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada
- Han Li
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
- Taline Naranian
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
- Yun Niu
- [1] Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada. [2] Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Zhiyong Ding
- Department of Systems Biology, Division of Cancer Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Fatemeh Vafaee
- [1] Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada. [2] Charles Perkins Centre, The University of Sydney, Sydney, New South Wales, Australia
- Fiona Broackes-Carter
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada
- Julia Petschnigg
- Donnelly Centre, Departments of Molecular Genetics and Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Gordon B Mills
- Department of Systems Biology, Division of Cancer Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Andrea Jurisicova
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
- Igor Stagljar
- Donnelly Centre, Departments of Molecular Genetics and Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Roberta Maestro
- Centro Riferimento Oncologico, Istituto Nazionale Tumori, Aviano, Italy
- Igor Jurisica
- [1] Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada. [2] Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada. [3] Department of Computer Science, University of Toronto, Toronto, Ontario, Canada. [4] TECHNA Institute for the Advancement of Technology for Health, Toronto, Ontario, Canada
28
Zahiri J, Mohammad-Noori M, Ebrahimpour R, Saadat S, Bozorgmehr JH, Goldberg T, Masoudi-Nejad A. LocFuse: human protein-protein interaction prediction via classifier fusion using protein localization information. Genomics 2014; 104:496-503. [PMID: 25458812 DOI: 10.1016/j.ygeno.2014.10.006] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 04/19/2014] [Revised: 09/28/2014] [Accepted: 10/02/2014] [Indexed: 12/20/2022]
Abstract
Protein-protein interaction (PPI) detection is one of the central goals of functional genomics and systems biology. Knowledge about the nature of PPIs can help fill the widening gap between sequence information and functional annotations. Although experimental methods have produced valuable PPI data, they also suffer from significant limitations. Computational PPI prediction methods have attracted tremendous attention. Despite considerable efforts, PPI prediction is still in its infancy in complex multicellular organisms such as humans. Here, we propose a novel ensemble learning method, LocFuse, which is useful in human PPI prediction. This method uses eight different genomic and proteomic features along with four different types of classifiers. The prediction performance of this classifier selection method was found to be considerably better than that of methods employed hitherto. This confirms the complex nature of the PPI prediction problem and also the necessity of using biological information for classifier fusion. LocFuse is available at: http://lbb.ut.ac.ir/Download/LBBsoft/LocFuse. BIOLOGICAL SIGNIFICANCE: The results revealed that if we divide proteome space according to the cellular localization of proteins, then the utility of some classifiers in PPI prediction can be improved. Therefore, to predict the interaction for any given protein pair, we can select the most accurate classifier with regard to the cellular localization information. Based on the results, we can say that the importance of different features for PPI prediction varies between differently localized proteins; however, in general, our novel features, which were extracted from position-specific scoring matrices (PSSMs), are the most important ones, and the Random Forest (RF) classifier performs best in most cases. LocFuse was developed with a user-friendly graphic interface and is freely available for Linux, Mac OSX and MS Windows operating systems.
Affiliation(s)
- Javad Zahiri
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran; Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Morteza Mohammad-Noori
- School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
- Reza Ebrahimpour
- Brain and Intelligent Systems Research Lab, Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran
- Samaneh Saadat
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Joseph H Bozorgmehr
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Tatyana Goldberg
- Department for Bioinformatics and Computational Biology, Faculty of Informatics, TUM, Garching 85748, Germany
- Ali Masoudi-Nejad
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.
29
Abstract
The past decade has seen a dramatic expansion in the number and range of techniques available to obtain genome-wide information and to analyze this information so as to infer both the functions of individual molecules and how they interact to modulate the behavior of biological systems. Here, we review these techniques, focusing on the construction of physical protein-protein interaction networks, and highlighting approaches that incorporate protein structure, which is becoming an increasingly important component of systems-level computational techniques. We also discuss how network analyses are being applied to enhance our basic understanding of biological systems and their dysregulation, as well as how these networks are being used in drug development.
Affiliation(s)
- Donald Petrey
- Center for Computational Biology and Bioinformatics, Department of Systems Biology
30
Hsu WCJ, Nilsson CL, Laezza F. Role of the axonal initial segment in psychiatric disorders: function, dysfunction, and intervention. Front Psychiatry 2014; 5:109. [PMID: 25191280 PMCID: PMC4139700 DOI: 10.3389/fpsyt.2014.00109] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 04/21/2014] [Accepted: 08/06/2014] [Indexed: 12/22/2022] Open
Abstract
The progress of developing effective interventions against psychiatric disorders has been limited due to a lack of understanding of the underlying cellular and functional mechanisms. Recent research findings focused on exploring novel causes of psychiatric disorders have highlighted the importance of the axonal initial segment (AIS), a highly specialized neuronal structure critical for spike initiation of the action potential. In particular, the role of voltage-gated sodium channels, and their interactions with other protein partners in a tightly regulated macromolecular complex has been emphasized as a key component in the regulation of neuronal excitability. Deficits and excesses of excitability have been linked to the pathogenesis of brain disorders. Identification of the factors and regulatory pathways involved in proper AIS function, or its disruption, can lead to the development of novel interventions that target these mechanistic interactions, increasing treatment efficacy while reducing deleterious off-target effects for psychiatric disorders.
Affiliation(s)
- Wei-Chun Jim Hsu
- Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
- Graduate Program in Biochemistry and Molecular Biology, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
- M.D.–Ph.D. Combined Degree Program, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
- Carol Lynn Nilsson
- Department of Pharmacology and Toxicology, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
- Sealy Center for Molecular Medicine, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
- Fernanda Laezza
- Department of Pharmacology and Toxicology, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
- Center for Addiction Research, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
- Center for Biomedical Engineering, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
- Mitchell Center for Neurodegenerative Diseases, The University of Texas Medical Branch at Galveston, Galveston, TX, USA
31
Stunnenberg HG, Hubner NC. Genomics meets proteomics: identifying the culprits in disease. Hum Genet 2014; 133:689-700. [PMID: 24135908 PMCID: PMC4021166 DOI: 10.1007/s00439-013-1376-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 06/03/2013] [Accepted: 10/01/2013] [Indexed: 12/20/2022]
Abstract
Genome-wide association studies (GWAS) have revealed genomic risk loci that potentially have an impact on disease and phenotypic traits. This extensive resource holds great promise in providing novel directions for personalized medicine, including disease risk prediction, prevention and targeted medication. One of the major challenges that researchers face on the path between the initial identification of an association and precision treatment of patients is the comprehension of the biological mechanisms that underlie these associations. Currently, efforts to answer these questions focus on the integrative analysis of system-wide data on global genome variation, gene expression, transcription factor binding, epigenetic profiles and chromatin conformation. The generation of these data mainly relies on next-generation sequencing. However, due to multiple recent developments, mass spectrometry-based proteomics now offers additional possibilities, so far hardly recognized by the GWAS field, for the identification of functional genome variants and, in particular, for the identification and characterization of (differentially) bound protein complexes as well as physiological target genes. In this review, we introduce these proteomics advances and suggest how they might be integrated in post-GWAS workflows. We argue that the combination of highly complementary techniques is powerful and can provide an unbiased, detailed picture of GWAS loci and their mechanistic involvement in disease.
Affiliation(s)
- Hendrik G. Stunnenberg
- Department of Molecular Biology, Faculty of Science, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen, 6525 GA Nijmegen, The Netherlands
- Nina C. Hubner
- Department of Molecular Biology, Faculty of Science, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen, 6525 GA Nijmegen, The Netherlands
32
Schmitt T, Ogris C, Sonnhammer ELL. FunCoup 3.0: database of genome-wide functional coupling networks. Nucleic Acids Res 2013; 42:D380-8. [PMID: 24185702 PMCID: PMC3965084 DOI: 10.1093/nar/gkt984] [Citation(s) in RCA: 90] [Impact Index Per Article: 8.2] [Indexed: 12/30/2022] Open
Abstract
We present an update of the FunCoup database (http://FunCoup.sbc.su.se) of functional couplings, or functional associations, between genes and gene products. Identifying these functional couplings is an important step in the understanding of higher level mechanisms performed by complex cellular processes. FunCoup distinguishes between four classes of couplings: participation in the same signaling cascade, participation in the same metabolic process, co-membership in a protein complex and physical interaction. For each of these four classes, several types of experimental and statistical evidence are combined by Bayesian integration to predict genome-wide functional coupling networks. The FunCoup framework has been completely re-implemented to allow for more frequent future updates. It contains many improvements, such as a regularization procedure to automatically downweight redundant evidences and a novel method to incorporate phylogenetic profile similarity. Several datasets have been updated and new data have been added in FunCoup 3.0. Furthermore, we have developed a new Web site, which provides powerful tools to explore the predicted networks and to retrieve detailed information about the data underlying each prediction.
Affiliation(s)
- Thomas Schmitt
- Stockholm Bioinformatics Centre, Science for Life Laboratory, Box 1031, Solna SE-17121, Sweden, Department of Biochemistry and Biophysics, Stockholm University and Swedish eScience Research Center
33
Stelzl U. Molecular interaction networks in the analyses of sequence variation and proteomics data. Proteomics Clin Appl 2013; 7:727-32. [PMID: 24039079 DOI: 10.1002/prca.201300039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/03/2013] [Revised: 08/02/2013] [Accepted: 08/04/2013] [Indexed: 01/05/2023]
Abstract
Protein-protein interaction networks are typically generated in standard cell lines or model organisms, as it is prohibitively difficult to record large interaction datasets from specific tissues or disease models at a reasonable pace. Although the interaction data are of high confidence, they therefore do not directly reflect in vivo relationships. A wealth of physiologically relevant protein information, obtained under different conditions and from different systems, is available, including information on genetic variation, protein levels, and PTMs. However, these data are difficult to assess comprehensively because the relationships between the entities remain elusive from the measurements. Here, we highlight examples of recent studies that gained deeper insight from genetic variation, protein, and PTM measurements using interaction information, pointing toward the importance and potential of interaction networks for the interpretation of sequencing and proteomics data.
Affiliation(s)
- Ulrich Stelzl
- Max Planck Institute for Molecular Genetics (MPIMG), Otto-Warburg Laboratory, Berlin, Germany
34
PPIevo: Protein–protein interaction prediction from PSSM based evolutionary information. Genomics 2013; 102:237-42. [DOI: 10.1016/j.ygeno.2013.05.006] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Received: 01/26/2013] [Revised: 05/20/2013] [Accepted: 05/28/2013] [Indexed: 01/23/2023]
35
Protein-Protein Interactions: Gene Acronym Redundancies and Current Limitations Precluding Automated Data Integration. Proteomes 2013; 1:3-24. [PMID: 28250396 PMCID: PMC5314489 DOI: 10.3390/proteomes1010003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/18/2013] [Revised: 05/16/2013] [Accepted: 05/21/2013] [Indexed: 12/31/2022] Open
Abstract
Understanding protein interaction networks and their dynamic changes is a major challenge in modern biology. Currently, several experimental and in silico approaches allow large-scale screening of protein interactors. As a result, the bulk of information on protein interactions deposited in databases and peer-reviewed published literature is constantly growing. Multiple databases interfaced through user-friendly web tools have recently emerged to facilitate the task of protein interaction data retrieval and data integration. Nevertheless, as we show in this report, despite the current efforts towards data integration, the information on protein interactions retrieved by in silico approaches is frequently incomplete and may even list false interactions. Here we point to some obstacles precluding confident data integration, with special emphasis on protein interactions, which include gene acronym redundancies and protein synonyms. Three human proteins (choline kinase, PPIase and uromodulin) and three different web-based data search engines focused on protein interaction data retrieval (PSICQUIC, DASMI and BIPS) were used to explain the potential occurrence of undesired errors that should be considered by researchers in the field. We demonstrate that, despite the recent initiatives towards data standardization, manual curation of protein interaction networks based on literature searches is still required to remove potential false positives. A three-step workflow consisting of (i) data retrieval from multiple databases, (ii) peer-reviewed literature searches, and (iii) data curation and integration is proposed as the best strategy to gather updated information on protein interactions. Finally, this strategy was applied to compile bona fide information on the human DREAM protein interactome, which constitutes a reliable training dataset that can be used to improve computational predictions.
36
Kuzmanov U, Emili A. Protein-protein interaction networks: probing disease mechanisms using model systems. Genome Med 2013; 5:37. [PMID: 23635424 PMCID: PMC3706760 DOI: 10.1186/gm441] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Indexed: 12/12/2022] Open
Abstract
Protein-protein interactions (PPIs) and multi-protein complexes perform central roles in the cellular systems of all living organisms. In humans, disruptions of the normal patterns of PPIs and protein complexes can be causative or indicative of a disease state. Recent developments in the biological applications of mass spectrometry (MS)-based proteomics have expanded the horizon for the application of systematic large-scale mapping of physical interactions to probe disease mechanisms. In this review, we examine the application of MS-based approaches for the experimental analysis of PPI networks and protein complexes, focusing on the different model systems (including human cells) used to study the molecular basis of common diseases such as cancer, cardiomyopathies, diabetes, microbial infections, and genetic and neurodegenerative disorders.
Affiliation(s)
- Uros Kuzmanov
- Banting and Best Department of Medical Research and Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S 3E1, Canada
- Andrew Emili
- Banting and Best Department of Medical Research and Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S 3E1, Canada
37
Franceschini A, Szklarczyk D, Frankild S, Kuhn M, Simonovic M, Roth A, Lin J, Minguez P, Bork P, von Mering C, Jensen LJ. STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res 2013; 41:D808-15. [PMID: 23203871 PMCID: PMC3531103 DOI: 10.1093/nar/gks1094] [Citation(s) in RCA: 3246] [Impact Index Per Article: 295.1] [Received: 09/15/2012] [Revised: 10/15/2012] [Accepted: 10/18/2012] [Indexed: 12/12/2022] Open
Abstract
Complete knowledge of all direct and indirect interactions between proteins in a given cell would represent an important milestone towards a comprehensive description of cellular mechanisms and functions. Although this goal is still elusive, considerable progress has been made, particularly for certain model organisms and functional systems. Currently, protein interactions and associations are annotated at various levels of detail in online resources, ranging from raw data repositories to highly formalized pathway databases. For many applications, a global view of all the available interaction data is desirable, including lower-quality data and/or computational predictions. The STRING database (http://string-db.org/) aims to provide such a global perspective for as many organisms as feasible. Known and predicted associations are scored and integrated, resulting in comprehensive protein networks covering >1100 organisms. Here, we describe the update to version 9.1 of STRING, introducing several improvements: (i) we extend the automated mining of scientific texts for interaction information, to now also include full-text articles; (ii) we entirely re-designed the algorithm for transferring interactions from one model organism to the other; and (iii) we provide users with statistical information on any functional enrichment observed in their networks.
Affiliation(s)
- Andrea Franceschini
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Damian Szklarczyk
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Sune Frankild
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Michael Kuhn
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Milan Simonovic
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Alexander Roth
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Jianyi Lin
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Pablo Minguez
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Peer Bork
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Lars J. Jensen
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
38
Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiol Mol Biol Rev 2012; 76:331-82. [PMID: 22688816 DOI: 10.1128/mmbr.05021-11] [Citation(s) in RCA: 134] [Impact Index Per Article: 11.2] [Indexed: 12/14/2022] Open
Abstract
The yeast two-hybrid system pioneered the field of in vivo protein-protein interaction methods and undisputedly gave rise to a palette of ingenious techniques that are constantly pushing further the limits of the original method. Sensitivity and selectivity have improved because of various technical tricks and experimental designs. Here we present an exhaustive overview of the genetic approaches available to study in vivo binary protein interactions, based on two-hybrid and protein fragment complementation assays. These methods have been engineered and employed successfully in microorganisms such as Saccharomyces cerevisiae and Escherichia coli, but also in higher eukaryotes. From single binary pairwise interactions to whole-genome interactome mapping, the self-reassembly concept has been employed widely. Innovative studies report the use of proteins such as ubiquitin, dihydrofolate reductase, and adenylate cyclase as reconstituted reporters. Protein fragment complementation assays have extended the possibilities in protein-protein interaction studies, with technologies that enable spatial and temporal analyses of protein complexes. In addition, one-hybrid and three-hybrid systems have broadened the types of interactions that can be studied and the findings that can be obtained. Applications of these technologies are discussed, together with the advantages and limitations of the available assays.
39
A computational framework for boosting confidence in high-throughput protein-protein interaction datasets. Genome Biol 2012; 13:R76. [PMID: 22937800 PMCID: PMC4053744 DOI: 10.1186/gb-2012-13-8-r76] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 05/03/2012] [Accepted: 08/31/2012] [Indexed: 12/28/2022] Open
Abstract
Improving the quality and coverage of the protein interactome is of paramount importance for biomedical research, particularly given the various sources of uncertainty in high-throughput techniques. We introduce a structure-based framework, Coev2Net, for computing a single confidence score that addresses both false-positive and false-negative rates. Coev2Net is easily applied to thousands of binary protein interactions and has superior predictive performance over existing methods. We experimentally validate selected high-confidence predictions in the human MAPK network and show that predicted interfaces are enriched for cancer-related or damaging SNPs. Coev2Net can be downloaded at http://struct2net.csail.mit.edu.
|
40
|
Saraç OS, Pancaldi V, Bähler J, Beyer A. Topology of functional networks predicts physical binding of proteins. Bioinformatics 2012; 28:2137-45. [PMID: 22718785 DOI: 10.1093/bioinformatics/bts351] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/11/2023]
Abstract
MOTIVATION: It has been recognized that the topology of molecular networks provides information about the certainty and nature of individual interactions. Thus, network motifs have been used for predicting missing links in biological networks and for removing false positives. However, various measures can be inferred from the structure of a given network, and their predictive power varies depending on the task at hand. RESULTS: Herein, we present a systematic assessment of seven different network features extracted from the topology of functional genetic networks, and we quantify their ability to classify interactions into different types of physical protein associations. Using machine learning, we combine features based on network topology with non-network features and compare their importance for the classification of interactions. We demonstrate the utility of network features based on human and budding yeast networks; we show that network features can distinguish different sub-types of physical protein associations, and we apply the framework to fission yeast, which has a much sparser known physical interactome than the other two species. Our analysis shows that network features are at least as predictive for the tasks we tested as non-network features. However, feature importance varies between species owing to different topological characteristics of the networks. The application to fission yeast shows that small maps of physical interactomes can be extended based on functional networks, which are often more readily available. AVAILABILITY AND IMPLEMENTATION: The R-code for computing the network features is available from www.cellularnetworks.org.
Affiliation(s)
- Omer Sinan Saraç
- Biotechnology Center, Technische Universität Dresden, D-01062 Dresden, Germany
|