1
Zhang G, Kuang X, Zhang Y, Liu Y, Su Z, Zhang T, Wu Y. Machine-learning-based structural analysis of interactions between antibodies and antigens. Biosystems 2024; 243:105264. [PMID: 38964652 DOI: 10.1016/j.biosystems.2024.105264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 06/21/2024] [Accepted: 07/01/2024] [Indexed: 07/06/2024]
Abstract
Computational analysis of paratope-epitope interactions between antibodies and their corresponding antigens can facilitate our understanding of the molecular mechanism underlying humoral immunity and boost the design of new therapeutics for many diseases. The recent breakthrough in artificial intelligence has made it possible to predict protein-protein interactions and model their structures. Unfortunately, detecting antigen-binding sites associated with a specific antibody is still a challenging problem. To tackle this challenge, we implemented a deep learning model to characterize interaction patterns between antibodies and their corresponding antigens. With high accuracy, our model can distinguish between antibody-antigen complexes and other types of protein-protein complexes. More intriguingly, we can identify antigens from other common protein binding regions with an accuracy of higher than 70% even if we only have the epitope information. This indicates that antigens have distinct features on their surface that antibodies can recognize. Additionally, our model was unable to predict the partnerships between antibodies and their particular antigens. This result suggests that one antigen may be targeted by more than one antibody and that antibodies may bind to previously unidentified proteins. Taken together, our results support the precision of antibody-antigen interactions while also suggesting positive future progress in the prediction of specific pairing.
Affiliation(s)
- Grace Zhang
  - Staples High School, 70 North Avenue, Westport, CT, 06880, USA
- Xiaohan Kuang
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN, 37212, USA
- Yuhao Zhang
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN, 37212, USA
- Yunchao Liu
  - Department of Computer Science, Vanderbilt University, 1400 18th Ave S, Nashville, TN, 37212, USA
- Zhaoqian Su
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN, 37212, USA
- Tom Zhang
  - California Institute of Technology, 1200 East California Boulevard, Pasadena, CA, 91125, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461, USA
2
Geist JL, Lee CY, Strom JM, de Jesús Naveja J, Luck K. Generation of a high confidence set of domain-domain interface types to guide protein complex structure predictions by AlphaFold. Bioinformatics 2024; 40:btae482. [PMID: 39171834 PMCID: PMC11361816 DOI: 10.1093/bioinformatics/btae482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 07/10/2024] [Accepted: 08/20/2024] [Indexed: 08/23/2024] Open
Abstract
MOTIVATION: While the release of AlphaFold (AF) represented a breakthrough for the prediction of protein complex structures, its sensitivity, especially when using full length protein sequences, still remains limited. Modeling success rates might increase if AF predictions were guided by likely interacting protein fragments. This approach requires available sets of highly confident protein-protein interface types. Computational resources, such as 3did, infer interacting globular domain types from observed contacts in protein structures. Assessing the accuracy of these predicted interface types is difficult because we lack hand-curated reference sets of verified domain-domain interface (DDI) types. RESULTS: To improve protein complex modeling of DDIs by AF, we manually inspected 80 randomly selected DDI types from the 3did resource to generate a first reference set of DDI types. Identified cases of DDI type nonapproval (40%) primarily resulted from inaccurate Pfam domain matches, crystal contacts, and synthetic protein constructs. Using logistic regression, we predicted a subset of 2411 out of 5724 considered DDI types in 3did to be of high confidence, which we subsequently applied to 53 000 human-protein interactions to predict DDIs followed by AF modeling. We obtained highly confident AF models for 604 out of 1129 predicted DDIs. Of note, for 47% of them no confident AF structural model could be obtained using full length protein sequences. AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/KatjaLuckLab/DDI_manuscript.
Affiliation(s)
- Chop Yan Lee
  - Institute of Molecular Biology (IMB) gGmbH, Mainz 55128, Germany
- José de Jesús Naveja
  - Institute of Molecular Biology (IMB) gGmbH, Mainz 55128, Germany
  - 3rd Medical Department, University Medical Center, Johannes Gutenberg University Mainz, Mainz 55131, Germany
  - University Cancer Center, University Medical Center, Johannes Gutenberg University Mainz, Mainz 55131, Germany
- Katja Luck
  - Institute of Molecular Biology (IMB) gGmbH, Mainz 55128, Germany
3
Zhang G, Su Z, Zhang T, Wu Y. Machine-learning-based Structural Analysis of Interactions between Antibodies and Antigens. bioRxiv 2023:2023.12.06.570397. [PMID: 38106177 PMCID: PMC10723427 DOI: 10.1101/2023.12.06.570397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Computational analysis of paratope-epitope interactions between antibodies and their corresponding antigens can facilitate our understanding of the molecular mechanism underlying humoral immunity and boost the design of new therapeutics for many diseases. The recent breakthrough in artificial intelligence has made it possible to predict protein-protein interactions and model their structures. Unfortunately, detecting antigen-binding sites associated with a specific antibody is still a challenging problem. To tackle this challenge, we implemented a deep learning model to characterize interaction patterns between antibodies and their corresponding antigens. With high accuracy, our model can distinguish between antibody-antigen complexes and other types of protein-protein complexes. More intriguingly, we can identify antigens from other common protein binding regions with an accuracy of higher than 70% even if we only have the epitope information. This indicates that antigens have distinct features on their surface that antibodies can recognize. Additionally, our model was unable to predict the partnerships between antibodies and their particular antigens. This result suggests that one antigen may be targeted by more than one antibody and that antibodies may bind to previously unidentified proteins. Taken together, our results support the precision of antibody-antigen interactions while also suggesting positive future progress in the prediction of specific pairing.
Affiliation(s)
- Grace Zhang
  - Staples High School, 70 North Avenue, Westport, CT 06880
- Zhaoqian Su
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN, 37212
- Tom Zhang
  - California Institute of Technology, 1200 East California Boulevard, Pasadena, CA 91125
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461
4
Sen N, Madhusudhan MS. A structural database of chain–chain and domain–domain interfaces of proteins. Protein Sci 2022. [DOI: 10.1002/pro.4406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Neeladri Sen
  - Indian Institute of Science Education and Research, Pune, India
  - Institute of Structural and Molecular Biology, University College London, London, UK
5
Panditrao G, Ganguli P, Sarkar RR. Delineating infection strategies of Leishmania donovani secretory proteins in Human through host-pathogen protein Interactome prediction. Pathog Dis 2021; 79:6408463. [PMID: 34677584 DOI: 10.1093/femspd/ftab051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/05/2021] [Accepted: 10/20/2021] [Indexed: 12/11/2022] Open
Abstract
Interactions of Leishmania donovani secretory virulence factors with the host proteins and their interplay during the infection process in humans is poorly studied in Visceral Leishmaniasis. Lack of a holistic study of pathway level de-regulations caused due to these virulence factors leads to a poor understanding of the parasite strategies to subvert the host immune responses, secure its survival inside the host and further the spread of infection to the visceral organs. In this study, we propose a computational workflow to predict host-pathogen protein interactome of L.donovani secretory virulence factors with human proteins combining sequence-based Interolog mapping and structure-based Domain Interaction mapping techniques. We further employ graph theoretical approaches and shortest path methods to analyze the interactome. Our study deciphers the infection paths involving some unique and understudied disease-associated signaling pathways influencing the cellular phenotypic responses in the host. Our statistical analysis based in silico knockout study unveils for the first time UBC, 1433Z and HS90A mediator proteins as potential immunomodulatory candidates through which the virulence factors employ the infection paths. These identified pathways and novel mediator proteins can be effectively used as possible targets to control and modulate the infection process further aiding in the treatment of Visceral Leishmaniasis.
Affiliation(s)
- Gauri Panditrao
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune 411008, Maharashtra, India
- Piyali Ganguli
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune 411008, Maharashtra, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, Uttar Pradesh, India
- Ram Rup Sarkar
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune 411008, Maharashtra, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, Uttar Pradesh, India
6
Alborzi SZ, Ahmed Nacer A, Najjar H, Ritchie DW, Devignes MD. PPIDomainMiner: Inferring domain-domain interactions from multiple sources of protein-protein interactions. PLoS Comput Biol 2021; 17:e1008844. [PMID: 34370723 PMCID: PMC8376228 DOI: 10.1371/journal.pcbi.1008844] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/25/2021] [Revised: 08/19/2021] [Accepted: 07/12/2021] [Indexed: 12/26/2022] Open
Abstract
Many biological processes are mediated by protein-protein interactions (PPIs). Because protein domains are the building blocks of proteins, PPIs likely rely on domain-domain interactions (DDIs). Several attempts exist to infer DDIs from PPI networks but the produced datasets are heterogeneous and sometimes not accessible, while the PPI interactome data keeps growing. We describe a new computational approach called “PPIDM” (Protein-Protein Interactions Domain Miner) for inferring DDIs using multiple sources of PPIs. The approach is an extension of our previously described “CODAC” (Computational Discovery of Direct Associations using Common neighbors) method for inferring new edges in a tripartite graph. The PPIDM method has been applied to seven widely used PPI resources, using as “Gold-Standard” a set of DDIs extracted from 3D structural databases. Overall, PPIDM has produced a dataset of 84,552 non-redundant DDIs. Statistical significance (p-value) is calculated for each source of PPI and used to classify the PPIDM DDIs in Gold (9,175 DDIs), Silver (24,934 DDIs) and Bronze (50,443 DDIs) categories. Dataset comparison reveals that PPIDM has inferred from the 2017 releases of PPI sources about 46% of the DDIs present in the 2020 release of the 3did database, not counting the DDIs present in the Gold-Standard. The PPIDM dataset contains 10,229 DDIs that are consistent with more than 13,300 PPIs extracted from the IMEx database, and nearly 23,300 DDIs (27.5%) that are consistent with more than 214,000 human PPIs extracted from the STRING database. Examples of newly inferred DDIs covering more than 10 PPIs in the IMEx database are provided. Further exploitation of the PPIDM DDI reservoir includes the inventory of possible partners of a protein of interest and characterization of protein interactions at the domain level in combination with other methods. The result is publicly available at http://ppidm.loria.fr/. 
We revisit at a large scale the question of inferring DDIs from PPIs. Compared to previous studies, we take a unified approach across multiple sources of PPIs. This approach is a method for inferring new edges in a tripartite graph setting and can be compared to link prediction approaches in knowledge graphs. Aggregation of several sources is performed using an optimized weighted average of the individual scores calculated in each source. A huge dataset of over 84K DDIs is produced which far exceeds the previous datasets. We show that a significant portion of the PPIDM dataset covers a large number of PPIs from curated (IMEx) or non curated (STRING) databases. Such a reservoir of DDIs deserves further exploration and can be combined with high-throughput methods such as cross-linking mass spectrometry to identify plausible protein partners of proteins of interest.
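As a rough illustration of the aggregation step described in this summary, the sketch below computes a weighted average of per-source scores for one candidate DDI and checks that the reported Gold, Silver and Bronze counts add up to the stated total. The source names, weights, the `aggregate_score` function and the handling of missing scores are all assumptions made for illustration; the abstract does not give the actual weights or the optimization procedure.

```python
from typing import Mapping

# Category sizes as reported in the abstract above.
CATEGORY_COUNTS = {"Gold": 9_175, "Silver": 24_934, "Bronze": 50_443}


def aggregate_score(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of per-source scores for one candidate DDI.

    Assumption (not from the paper): a source that reports no score for
    this DDI contributes 0.0 but still counts toward the weight total.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * scores.get(source, 0.0) for source, w in weights.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical sources and weights, for illustration only.
    weights = {"source_a": 3.0, "source_b": 1.0}
    print(aggregate_score({"source_a": 1.0, "source_b": 0.5}, weights))
    # The three categories should add up to the reported number of DDIs.
    print(sum(CATEGORY_COUNTS.values()))
```

In this sketch the weights are fixed inputs; in the published method they are described as optimized, which would require a reference set such as the Gold-Standard DDIs mentioned in the abstract.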
7
Diwan GD, Carlos Gonzalez-Sanchez J, Apic G, Russell RB. Next generation protein structure predictions and genetic variant interpretation. J Mol Biol 2021; 433:167180. [PMID: 34358547 DOI: 10.1016/j.jmb.2021.167180] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/21/2021] [Revised: 07/24/2021] [Accepted: 07/26/2021] [Indexed: 10/20/2022]
Abstract
The need to make sense of the thousands of genetic variants uncovered every day in terms of pathology or biological mechanism is acute. Many insights into how genetic changes impact protein function can be gleaned if three-dimensional structures of the associated proteins are available. The availability of a highly accurate method of predicting structures from amino acid sequences is thus potentially a great boost to those wanting to understand genetic changes. In this paper we discuss the current state of protein structures known for the human and other proteomes and how better structure predictions might impact on variant interpretation efforts. For the human proteome in particular, the state of the available structural data suggests that the impact on variant interpretation might be less than anticipated. We also discuss additional efforts in structure prediction that could further aid the understanding of genetic variants.
Affiliation(s)
- Gaurav D Diwan
  - BioQuant, Heidelberg University, Im Neuenheimer Feld 267, Heidelberg, Germany; Heidelberg University Biochemistry Center (BZH), Im Neuenheimer Feld
- Juan Carlos Gonzalez-Sanchez
  - BioQuant, Heidelberg University, Im Neuenheimer Feld 267, Heidelberg, Germany; Heidelberg University Biochemistry Center (BZH), Im Neuenheimer Feld
- Gordana Apic
  - BioQuant, Heidelberg University, Im Neuenheimer Feld 267, Heidelberg, Germany; Heidelberg University Biochemistry Center (BZH), Im Neuenheimer Feld
- Robert B Russell
  - BioQuant, Heidelberg University, Im Neuenheimer Feld 267, Heidelberg, Germany; Heidelberg University Biochemistry Center (BZH), Im Neuenheimer Feld
8
Lian X, Yang X, Yang S, Zhang Z. Current status and future perspectives of computational studies on human-virus protein-protein interactions. Brief Bioinform 2021; 22:6161422. [PMID: 33693490 DOI: 10.1093/bib/bbab029] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/10/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 12/19/2022] Open
Abstract
The protein-protein interactions (PPIs) between human and viruses mediate viral infection and host immunity processes. Therefore, the study of human-virus PPIs can help us understand the principles of human-virus relationships and can thus guide the development of highly effective drugs to break the transmission of viral infectious diseases. Recent years have witnessed the rapid accumulation of experimentally identified human-virus PPI data, which provides an unprecedented opportunity for bioinformatics studies revolving around human-virus PPIs. In this article, we provide a comprehensive overview of computational studies on human-virus PPIs, especially focusing on the method development for human-virus PPI predictions. We briefly introduce the experimental detection methods and existing database resources of human-virus PPIs, and then discuss the research progress in the development of computational prediction methods. In particular, we elaborate the machine learning-based prediction methods and highlight the need to embrace state-of-the-art deep-learning algorithms and new feature engineering techniques (e.g. the protein embedding technique derived from natural language processing). To further advance the understanding in this research topic, we also outline the practical applications of the human-virus interactome in fundamental biological discovery and new antiviral therapy development.
Affiliation(s)
- Xianyi Lian
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xiaodi Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
9
Lian X, Yang X, Shao J, Hou F, Yang S, Pan D, Zhang Z. Prediction and analysis of human-herpes simplex virus type 1 protein-protein interactions by integrating multiple methods. Quantitative Biology 2020. [DOI: 10.1007/s40484-020-0222-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
10
Galaxy InteractoMIX: An Integrated Computational Platform for the Study of Protein-Protein Interaction Data. J Mol Biol 2020; 433:166656. [PMID: 32976910 DOI: 10.1016/j.jmb.2020.09.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/29/2020] [Revised: 08/30/2020] [Accepted: 09/16/2020] [Indexed: 12/19/2022]
Abstract
Protein interactions play a crucial role among the different functions of a cell and are central to our understanding of cellular processes both in health and disease. Here we present Galaxy InteractoMIX (http://galaxy.interactomix.com), a platform composed of 13 different computational tools each addressing specific aspects of the study of protein-protein interactions, ranging from large-scale cross-species protein-wide interactomes to atomic resolution level of protein complexes. Galaxy InteractoMIX provides an intuitive interface where users can retrieve consolidated interactomics data distributed across several databases or uncover links between diseases and genes by analyzing the interactomes underlying these diseases. The platform makes possible large-scale prediction and curation of protein interactions using the conservation of motifs, interology, or presence or absence of key sequence signatures. The range of structure-based tools includes modeling and analysis of protein complexes, delineation of interfaces and the modeling of peptides acting as inhibitors of protein-protein interactions. Galaxy InteractoMIX includes a range of ready-to-use workflows to run complex analyses requiring minimal intervention by users. The potential range of applications of the platform covers different aspects of life science, biomedicine, biotechnology and drug discovery where protein associations are studied.
11
Abstract
All proteins end with a carboxyl terminus that has unique biophysical properties and is often disordered. Although there are examples of important C-termini functions, a more global role for the C-terminus is not yet established. In this review, we summarize research on C-termini, a unique region in proteins that cells exploit. Alternative splicing and proteolysis increase the diversity of proteins and peptides in cells with unique C-termini. The C-termini of proteins contain minimotifs, short peptides with an encoded function generally characterized as binding, posttranslational modifications, and trafficking. Many of these activities are specific to minimotifs on the C-terminus. Approximately 13% of C-termini in the human proteome have a known minimotif, and the majority, if not all of the remaining termini have conserved motifs inferring a function that remains to be discovered. C-termini, their predictions, and their functions are collated in the C-terminome, Proteus, and Terminus Oriented Protein Function INferred Database (TopFIND) database/web systems. Many C-termini are well conserved, and some have a known role in health and disease. We envision that this summary of C-termini will guide future investigation of their biochemical and physiological significance.
Affiliation(s)
- Surbhi Sharma
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada, Las Vegas, NV, USA
- Martin R Schiller
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada, Las Vegas, NV, USA
12
Lian X, Yang S, Li H, Fu C, Zhang Z. Machine-Learning-Based Predictor of Human–Bacteria Protein–Protein Interactions by Incorporating Comprehensive Host-Network Properties. J Proteome Res 2019; 18:2195-2205. [DOI: 10.1021/acs.jproteome.9b00074] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 02/08/2023]
Affiliation(s)
- Xianyi Lian
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Hong Li
  - Key Laboratory of Tropical Biological Resources of Ministry of Education, Hainan University, Haikou, 570228, China
- Chen Fu
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
13
Wong AKC, Sze-To HY, Johanning GL. Pattern to Knowledge: Deep Knowledge-Directed Machine Learning for Residue-Residue Interaction Prediction. Sci Rep 2018; 8:14841. [PMID: 30287904 PMCID: PMC6172270 DOI: 10.1038/s41598-018-32834-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/31/2018] [Accepted: 09/17/2018] [Indexed: 11/21/2022] Open
Abstract
Residue-residue close contact (R2R-C) data procured from three-dimensional protein-protein interaction (PPI) experiments is currently used for predicting residue-residue interaction (R2R-I) in PPI. However, due to complex physiochemical environments, R2R-I incidences, facilitated by multiple factors, are usually entangled in the source environment and masked in the acquired data. Here we present a novel method, P2K (Pattern to Knowledge), to disentangle R2R-I patterns and render much succinct discriminative information expressed in different specific R2R-I statistical/functional spaces. Since such knowledge is not visible in the data acquired, we refer to it as deep knowledge. Leveraging the deep knowledge discovered to construct machine learning models for sequence-based R2R-I prediction, without trial-and-error combination of the features over external knowledge of sequences, our R2R-I predictor was validated for its effectiveness under stringent leave-one-complex-out-alone cross-validation in a benchmark dataset, and was surprisingly demonstrated to perform better than an existing sequence-based R2R-I predictor by 28% (p: 1.9E-08). P2K is accessible via our web server on https://p2k.uwaterloo.ca.
Affiliation(s)
- Andrew K C Wong
  - Department of Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada
- Ho Yin Sze-To
  - Department of Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada
- Gary L Johanning
  - Biosciences Division, SRI International, 333 Ravenswood Ave, Menlo Park, CA, USA
14
Dey S, Levy ED. Inferring and Using Protein Quaternary Structure Information from Crystallographic Data. Methods Mol Biol 2018; 1764:357-375. [PMID: 29605927 DOI: 10.1007/978-1-4939-7759-8_23] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 06/08/2023]
Abstract
A precise knowledge of the quaternary structure of proteins is essential to illuminate both their function and their evolution. The major part of our knowledge on quaternary structure is inferred from X-ray crystallography data, but this inference process is hard and error-prone. The difficulty lies in discriminating fortuitous protein contacts, which make up the lattice of protein crystals, from biological protein contacts that exist in the native cellular environment. Here, we review methods devised to discriminate between both types of contacts and describe resources for downloading protein quaternary structure information and identifying high-confidence quaternary structures. The use of high-confidence datasets of quaternary structures will be critical for the analysis of structural, functional, and evolutionary properties of proteins.
Affiliation(s)
- Sucharita Dey
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Emmanuel D Levy
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
15
Abstract
Comparing and classifying protein domain interactions according to their three-dimensional (3D) structures can help to understand protein structure-function and evolutionary relationships. Additionally, structural knowledge of existing domain-domain interactions can provide a useful way to find structural templates with which to model the 3D structures of unsolved protein complexes. Here we present a straightforward guide to using the "Kbdock" protein domain structure database and its associated web site for exploring and comparing protein domain-domain interactions (DDIs) and domain-peptide interactions (DPIs) at the Pfam domain family level. We also briefly explain how the Kbdock web site works, and we provide some notes and suggestions which should help to avoid some common pitfalls when working with 3D protein domain structures.
16
Comprehensive Analysis of the Human SH3 Domain Family Reveals a Wide Variety of Non-canonical Specificities. Structure 2017; 25:1598-1610.e3. [DOI: 10.1016/j.str.2017.07.017] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 04/24/2017] [Revised: 06/20/2017] [Accepted: 07/28/2017] [Indexed: 01/31/2023]
17
Zanzoni A, Spinelli L, Braham S, Brun C. Perturbed human sub-networks by Fusobacterium nucleatum candidate virulence proteins. Microbiome 2017; 5:89. [PMID: 28793925 PMCID: PMC5551000 DOI: 10.1186/s40168-017-0307-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 10/28/2016] [Accepted: 07/13/2017] [Indexed: 05/10/2023]
Abstract
BACKGROUND: Fusobacterium nucleatum is a gram-negative anaerobic species residing in the oral cavity and implicated in several inflammatory processes in the human body. Although F. nucleatum abundance is increased in inflammatory bowel disease subjects and is prevalent in colorectal cancer patients, the causal role of the bacterium in gastrointestinal disorders and the mechanistic details of host cell functions subversion are not fully understood. RESULTS: We devised a computational strategy to identify putative secreted F. nucleatum proteins (FusoSecretome) and to infer their interactions with human proteins based on the presence of host molecular mimicry elements. FusoSecretome proteins share similar features with known bacterial virulence factors thereby highlighting their pathogenic potential. We show that they interact with human proteins that participate in infection-related cellular processes and localize in established cellular districts of the host-pathogen interface. Our network-based analysis identified 31 functional modules in the human interactome preferentially targeted by 138 FusoSecretome proteins, among which we selected 26 as main candidate virulence proteins, representing both putative and known virulence proteins. Finally, six of the preferentially targeted functional modules are implicated in the onset and progression of inflammatory bowel diseases and colorectal cancer. CONCLUSIONS: Overall, our computational analysis identified candidate virulence proteins potentially involved in the F. nucleatum-human cross-talk in the context of gastrointestinal diseases.
Affiliation(s)
- Andreas Zanzoni
  - Aix-Marseille Université, Inserm, TAGC UMR_S1090, Marseille, France
- Lionel Spinelli
  - Aix-Marseille Université, Inserm, TAGC UMR_S1090, Marseille, France
- Shérazade Braham
  - Aix-Marseille Université, Inserm, TAGC UMR_S1090, Marseille, France
- Christine Brun
  - Aix-Marseille Université, Inserm, TAGC UMR_S1090, Marseille, France
  - CNRS, Marseille, France
18
Korkin D. Effect-specific analysis of pathogenic SNVs in human interactome: Leveraging edge-based network robustness. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2017; 2016:2929-2932. [PMID: 28268927 DOI: 10.1109/embc.2016.7591343] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
The study of genetic variants in the context of molecular networks has recently gained much attention. However, many of these studies lack functional information about the network rewiring effect of genetic variants. After large-scale homology modeling and extraction of native structures from the PDB, we performed structure-based prediction of the rewiring effect using our SNP-IN tool, which covers significantly more variants than experimental characterization. The analysis confirms widespread perturbations in the human interactome and reveals network rewiring behavior based on the concept of edge-based network robustness.
|
19
|
Du T, Liao L, Wu CH. Enhancing interacting residue prediction with integrated contact matrix prediction in protein-protein interaction. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2016; 2016:17. [PMID: 27818677 PMCID: PMC5075339 DOI: 10.1186/s13637-016-0051-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/02/2016] [Accepted: 09/25/2016] [Indexed: 11/10/2022]
Abstract
Identifying the residues in a protein that are involved in protein-protein interaction and identifying the contact matrix for a pair of interacting proteins are two computational tasks at different levels of an in-depth analysis of protein-protein interaction. Various methods for solving these two problems have been reported in the literature. However, the interacting residue prediction and contact matrix prediction were handled by and large independently in those existing methods, though intuitively good prediction of interacting residues will help with predicting the contact matrix. In this work, we developed a novel protein interacting residue prediction system, contact matrix-interaction profile hidden Markov model (CM-ipHMM), with the integration of contact matrix prediction and the ipHMM interaction residue prediction. We propose to leverage what is learned from the contact matrix prediction and utilize the predicted contact matrix as "feedback" to enhance the interaction residue prediction. The CM-ipHMM model showed significant improvement over the previous method that uses the ipHMM for predicting interaction residues only. It indicates that the downstream contact matrix prediction could help the interaction site prediction.
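The link between the two tasks named in this abstract can be illustrated with a minimal sketch. It is not the CM-ipHMM model; it only shows, under the assumption of a binary contact matrix, that the interacting residues of each protein are the rows and columns of the matrix that contain at least one contact. The function name `interface_residues` and the toy matrix are hypothetical.

```python
def interface_residues(contact):
    """Derive interacting residues from a binary contact matrix.

    contact[i][j] is truthy when residue i of protein A is in contact
    with residue j of protein B (toy representation, assumed here).
    """
    # A residue of protein A is interacting if its row has any contact.
    a_sites = [i for i, row in enumerate(contact) if any(row)]
    # A residue of protein B is interacting if its column has any contact.
    n_cols = len(contact[0]) if contact else 0
    b_sites = [j for j in range(n_cols) if any(row[j] for row in contact)]
    return a_sites, b_sites


# Hypothetical 3 x 3 contact matrix for illustration only.
contact = [
    [0, 0, 1],
    [0, 0, 0],
    [1, 0, 1],
]

print(interface_residues(contact))  # ([0, 2], [0, 2])
```

Read in the other direction, a predicted contact matrix constrains which residues can be labelled as interacting, which is the kind of "feedback" the abstract describes.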
Affiliation(s)
- Tianchuan Du
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA
| | - Li Liao
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA
| | - Cathy H Wu
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA
| |
|
20
|
Du T, Liao L, Wu CH, Sun B. Prediction of residue-residue contact matrix for protein-protein interaction with Fisher score features and deep learning. Methods 2016; 110:97-105. [DOI: 10.1016/j.ymeth.2016.06.001] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2016] [Accepted: 06/03/2016] [Indexed: 11/28/2022] Open
|
21
|
Prediction of human protein–protein interaction by a domain-based approach. J Theor Biol 2016; 396:144-53. [DOI: 10.1016/j.jtbi.2016.02.026] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2015] [Revised: 01/29/2016] [Accepted: 02/20/2016] [Indexed: 02/04/2023]
|
22
|
Kuang X, Dhroso A, Han JG, Shyu CR, Korkin D. DOMMINO 2.0: integrating structurally resolved protein-, RNA-, and DNA-mediated macromolecular interactions. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016; 2016:bav114. [PMID: 26827237 PMCID: PMC4733329 DOI: 10.1093/database/bav114] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/01/2015] [Accepted: 11/16/2015] [Indexed: 11/14/2022]
Abstract
Macromolecular interactions are formed between proteins, DNA and RNA molecules. Being a principal building block in macromolecular assemblies and pathways, these interactions underlie most cellular functions. Malfunctioning of macromolecular interactions is also linked to a number of diseases. Structural knowledge of the macromolecular interaction allows one to understand the interaction's mechanism, determine its functional implications and characterize the effects of genetic variations, such as single nucleotide polymorphisms, on the interaction. Unfortunately, until now the interactions mediated by different types of macromolecules, e.g. protein-protein interactions or protein-DNA interactions, have been collected into individual and unrelated structural databases. This presents a significant obstacle in the analysis of macromolecular interactions. For instance, the homogeneous structural interaction databases prevent scientists from studying structural interactions of different types that occur in the same macromolecular complex. Here, we introduce DOMMINO 2.0, a structural Database Of Macro-Molecular INteractiOns. Compared to DOMMINO 1.0, a comprehensive database on protein-protein interactions, DOMMINO 2.0 includes the interactions between all three basic types of macromolecules extracted from PDB files. DOMMINO 2.0 is automatically updated on a weekly basis. It currently includes ∼1,040,000 interactions between two polypeptide subunits (e.g. domains, peptides, termini and interdomain linkers), ∼43,000 RNA-mediated interactions, and ∼12,000 DNA-mediated interactions. All protein structures in the database are annotated using SCOP and SUPERFAMILY family annotation. As a result, protein-mediated interactions involving protein domains, interdomain linkers, C- and N-termini, and peptides are identified. Our database provides an intuitive web interface, allowing one to investigate interactions at three different resolution levels: whole subunit network, binary interaction and interaction interface. Database URL: http://dommino.org.
Affiliation(s)
- Xingyan Kuang
- Informatics Institute, University of Missouri, Columbia, MO, USA
| | - Andi Dhroso
- Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA, USA
| | - Jing Ginger Han
- Informatics Institute, University of Missouri, Columbia, MO, USA
| | - Chi-Ren Shyu
- Informatics Institute, University of Missouri, Columbia, MO, USA, Department of Electrical and Computer Engineering, Department of Computer Science, University of Missouri, Columbia, MO, USA
| | - Dmitry Korkin
- Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA, USA
| |
|
23
|
Kiran M, Nagarajaram HA. Interaction and localization diversities of global and local hubs in human protein–protein interaction networks. MOLECULAR BIOSYSTEMS 2016; 12:2875-82. [DOI: 10.1039/c6mb00104a] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Hubs, the highly connected nodes in protein–protein interaction networks (PPINs), are associated with several characteristic properties and are known to perform vital roles in cells.
Affiliation(s)
- M. Kiran
- Laboratory of Computational Biology
- Centre for DNA Fingerprinting and Diagnostics
- Gruhakalpa
- Hyderabad 500 001
- India
| | - H. A. Nagarajaram
- Laboratory of Computational Biology
- Centre for DNA Fingerprinting and Diagnostics
- Gruhakalpa
- Hyderabad 500 001
- India
| |
|
24
|
Zhang W, Chang JW, Lin L, Minn K, Wu B, Chien J, Yong J, Zheng H, Kuang R. Network-Based Isoform Quantification with RNA-Seq Data for Cancer Transcriptome Analysis. PLoS Comput Biol 2015; 11:e1004465. [PMID: 26699225 PMCID: PMC4689380 DOI: 10.1371/journal.pcbi.1004465] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2015] [Accepted: 07/11/2015] [Indexed: 11/18/2022] Open
Abstract
High-throughput mRNA sequencing (RNA-Seq) is widely used for transcript quantification of gene isoforms. Since RNA-Seq data alone is often not sufficient to accurately identify the read origins from the isoforms for quantification, we propose to explore protein domain-domain interactions as prior knowledge for integrative analysis with RNA-Seq data. We introduce a Network-based method for RNA-Seq-based Transcript Quantification (Net-RSTQ) to integrate protein domain-domain interaction network with short read alignments for transcript abundance estimation. Based on our observation that the abundances of the neighboring isoforms by domain-domain interactions in the network are positively correlated, Net-RSTQ models the expression of the neighboring transcripts as Dirichlet priors on the likelihood of the observed read alignments against the transcripts in one gene. The transcript abundances of all the genes are then jointly estimated with alternating optimization of multiple EM problems. In simulation Net-RSTQ effectively improved isoform transcript quantifications when isoform co-expressions correlate with their interactions. qRT-PCR results on 25 multi-isoform genes in a stem cell line, an ovarian cancer cell line, and a breast cancer cell line also showed that Net-RSTQ estimated more consistent isoform proportions with RNA-Seq data. In the experiments on the RNA-Seq data in The Cancer Genome Atlas (TCGA), the transcript abundances estimated by Net-RSTQ are more informative for patient sample classification of ovarian cancer, breast cancer and lung cancer. All experimental results collectively support that Net-RSTQ is a promising approach for isoform quantification. Net-RSTQ toolbox is available at http://compbio.cs.umn.edu/Net-RSTQ/.
Affiliation(s)
- Wei Zhang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
| | - Jae-Woong Chang
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
| | - Lilong Lin
- Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, Peoples Republic of China
| | - Kay Minn
- Department of Cancer Biology, University of Kansas Medical Center, Kansas City, Kansas, United States of America
| | - Baolin Wu
- Division of Biostatistics, School of Public Health, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
| | - Jeremy Chien
- Department of Cancer Biology, University of Kansas Medical Center, Kansas City, Kansas, United States of America
| | - Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
| | - Hui Zheng
- Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, Peoples Republic of China
| | - Rui Kuang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, Minnesota, United States of America
| |
|
25
|
Ghoorah AW, Devignes MD, Smaïl-Tabbone M, Ritchie DW. Protein docking using case-based reasoning. Proteins 2013; 81:2150-8. [PMID: 24123156 DOI: 10.1002/prot.24433] [Citation(s) in RCA: 86] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Protein docking algorithms aim to calculate the three-dimensional (3D) structure of a protein complex starting from its unbound components. Although ab initio docking algorithms are improving, there is a growing need to use homology modeling techniques to exploit the rapidly increasing volumes of structural information that now exist. However, most current homology modeling approaches involve finding a pair of complete single-chain structures in a homologous protein complex to use as a 3D template, despite the fact that protein complexes are often formed from one or more domain-domain interactions (DDIs). To model 3D protein complexes by domain-domain homology, we have developed a case-based reasoning approach called KBDOCK which systematically identifies and reuses domain family binding sites from our database of nonredundant DDIs. When tested on 54 protein complexes from the Protein Docking Benchmark, our approach provides a near-perfect way to model single-domain protein complexes when full-homology templates are available, and it extends our ability to model more difficult cases when only partial or incomplete templates exist. These promising early results highlight the need for a new and diverse docking benchmark set, specifically designed to assess homology docking approaches.
Affiliation(s)
- Anisah W Ghoorah
- Department of Obstetrics and Gynaecology, University of Tuebingen, Tuebingen, Germany
| |
|
26
|
Surfing the Protein-Protein Interaction Surface Using Docking Methods: Application to the Design of PPI Inhibitors. Molecules 2015; 20:11569-603. [PMID: 26111183 PMCID: PMC6272567 DOI: 10.3390/molecules200611569] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2015] [Revised: 06/02/2015] [Accepted: 06/15/2015] [Indexed: 02/06/2023] Open
Abstract
Blocking protein-protein interactions (PPI) using small molecules or peptides modulates biochemical pathways and has therapeutic significance. PPI inhibition for designing drug-like molecules is a new area that has been explored extensively during the last decade. Considering the number of available PPI inhibitor databases and the limited number of 3D structures available for proteins, docking and scoring methods play a major role in designing PPI inhibitors as well as stabilizers. Docking methods are used in the design of PPI inhibitors at several stages of finding a lead compound, including modeling the protein complex, screening for hot spots on the protein-protein interaction interface and screening small molecules or peptides that bind to the PPI interface. There are three major challenges to the use of docking on the relatively flat surfaces of PPI. In this review we will provide some examples of the use of docking in PPI inhibitor design as well as its limitations. The combination of experimental and docking methods with improved scoring function has thus far resulted in few success stories of PPI inhibitors for therapeutic purposes. Docking algorithms used for PPI are in the early stages, however, and as more data are available docking will become a highly promising area in the design of PPI inhibitors or stabilizers.
|
27
|
Huo T, Liu W, Guo Y, Yang C, Lin J, Rao Z. Prediction of host - pathogen protein interactions between Mycobacterium tuberculosis and Homo sapiens using sequence motifs. BMC Bioinformatics 2015; 16:100. [PMID: 25887594 PMCID: PMC4456996 DOI: 10.1186/s12859-015-0535-y] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2014] [Accepted: 03/13/2015] [Indexed: 12/28/2022] Open
Abstract
Background Emergence of multiple drug resistant strains of M. tuberculosis (MDR-TB) threatens to derail global efforts aimed at reining in the pathogen. Co-infections of M. tuberculosis with HIV are difficult to treat. To counter these new challenges, it is essential to study the interactions between M. tuberculosis and the host to learn how these bacteria cause disease. Results We report a systematic workflow to predict the host-pathogen interactions (HPIs) between M. tuberculosis and Homo sapiens based on sequence motifs. First, protein sequences were used as initial input for identifying the HPIs by the ‘interolog’ method. HPIs were further filtered by prediction of domain-domain interactions (DDIs). Functional annotations of proteins and publicly available experimental results were applied to filter the remaining HPIs. Using such a strategy, 118 pairs of HPIs were identified, which involve 43 proteins from M. tuberculosis and 48 proteins from Homo sapiens. A biological interaction network between M. tuberculosis and Homo sapiens was then constructed using the predicted inter- and intra-species interactions based on the 118 pairs of HPIs. Finally, a web accessible database named PATH (Protein interactions of M. tuberculosis and Human) was constructed to store these predicted interactions and proteins. Conclusions This interaction network will facilitate the research on host-pathogen protein-protein interactions, and may throw light on how M. tuberculosis interacts with its host.
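The first two steps described in this abstract (interolog transfer, then DDI filtering) can be sketched as follows. This is an illustrative outline, not the authors' pipeline: the function names, the protein and domain identifiers, and the assumption that homology and domain assignments are already available as dictionaries are all hypothetical.

```python
def interolog_pairs(template_ppis, pathogen_homologs, host_homologs):
    """Transfer known interactions to pathogen-host pairs via homology.

    template_ppis: iterable of (a, b) known interacting template proteins.
    pathogen_homologs / host_homologs: template protein -> list of homologs.
    """
    predicted = set()
    for a, b in template_ppis:
        # An interaction is undirected, so try both orientations.
        for x, y in ((a, b), (b, a)):
            for p in pathogen_homologs.get(x, ()):
                for h in host_homologs.get(y, ()):
                    predicted.add((p, h))
    return predicted


def filter_by_ddi(pairs, domains, known_ddis):
    """Keep pairs supported by at least one known domain-domain interaction."""
    kept = set()
    for p, h in pairs:
        if any(
            frozenset((dp, dh)) in known_ddis
            for dp in domains.get(p, ())
            for dh in domains.get(h, ())
        ):
            kept.add((p, h))
    return kept


# Hypothetical toy data for illustration only.
template_ppis = [("T1", "T2"), ("T3", "T4")]
pathogen_homologs = {"T1": ["Mtb_a"], "T4": ["Mtb_b"]}
host_homologs = {"T2": ["Hs_x"], "T3": ["Hs_y"]}
domains = {"Mtb_a": ["D1"], "Hs_x": ["D2"], "Mtb_b": ["D3"], "Hs_y": ["D4"]}
known_ddis = {frozenset(("D1", "D2"))}

candidates = interolog_pairs(template_ppis, pathogen_homologs, host_homologs)
supported = filter_by_ddi(candidates, domains, known_ddis)
print(sorted(candidates))  # [('Mtb_a', 'Hs_x'), ('Mtb_b', 'Hs_y')]
print(sorted(supported))   # [('Mtb_a', 'Hs_x')]
```

Storing each DDI as a `frozenset` makes the lookup independent of domain order, which matches the undirected nature of domain-domain interactions.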
Affiliation(s)
- Tong Huo
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Life Sciences, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Wei Liu
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Life Sciences, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Yu Guo
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Pharmacy, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Cheng Yang
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Pharmacy, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Jianping Lin
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Pharmacy, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Zihe Rao
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Life Sciences, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| |
|
28
|
Emerson AI, Andrews S, Ahmed I, Azis TK, Malek JA. K-core decomposition of a protein domain co-occurrence network reveals lower cancer mutation rates for interior cores. J Clin Bioinforma 2015; 5:1. [PMID: 25767694 PMCID: PMC4357223 DOI: 10.1186/s13336-015-0016-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2014] [Accepted: 02/18/2015] [Indexed: 11/10/2022] Open
Abstract
Background Network biology currently focuses primarily on metabolic pathways, gene regulatory, and protein-protein interaction networks. While these approaches have yielded critical information, alternative approaches to network analysis will offer new perspectives on biological information. A little-explored area is the interactions between domains that can be captured using domain co-occurrence networks (DCN). A DCN can be used to study the function and interaction of proteins by representing protein domains and their co-existence in genes and by mapping cancer mutations to the individual protein domains to identify signals. Results The domain co-occurrence network was constructed for the human proteome based on PFAM domains in proteins. Highly connected domains in the central cores were identified using the k-core decomposition technique. We show that these domains are more evolutionarily conserved than the peripheral domains. The somatic mutations for ovarian, breast and prostate cancers were obtained from the TCGA database. We mapped the somatic mutations to the individual protein domains and the local false discovery rate was used to identify significantly mutated domains in each cancer type. Significantly mutated domains were found to be enriched in cancer disease pathways. However, we found that the inner cores of the DCN did not contain any of the significantly mutated domains. We observed that the inner core protein domains are highly conserved and these domains co-exist in large numbers with other protein domains. Conclusion Mutations and domain co-occurrence networks provide a framework for understanding hierarchical designs in protein function from a network perspective. This study provides evidence that a majority of protein domains in the inner core of the DCN have a lower mutation frequency and that protein domains present in the peripheral regions of the k-core contribute more heavily to the disease. These findings may contribute further to drug development.
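The two constructions named in this abstract, a domain co-occurrence network and its k-core decomposition, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the protein-to-domain mapping is hypothetical toy data, and the peeling loop is a simple quadratic-time version of the standard algorithm.

```python
from itertools import combinations


def cooccurrence_network(protein_domains):
    """Link two domains whenever they occur in the same protein."""
    adj = {}
    for domains in protein_domains.values():
        unique = set(domains)
        for d in unique:
            adj.setdefault(d, set())
        for a, b in combinations(sorted(unique), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def core_numbers(adj):
    """Return the core number of every node by repeated peeling.

    The node of lowest remaining degree is removed at each step; its core
    number is the largest such minimum degree seen so far.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical proteins and domain assignments for illustration only.
proteins = {
    "P1": ["SH2", "SH3", "Kinase"],
    "P2": ["Kinase", "PH"],
}

network = cooccurrence_network(proteins)
core = core_numbers(network)
print(core["Kinase"], core["PH"])  # 2 1
```

In this toy network SH2, SH3 and Kinase form the inner 2-core, while PH sits in the periphery; the abstract's analysis compares mutation frequency between such inner and peripheral domains.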
Affiliation(s)
- Arnold I Emerson
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
| | - Simeon Andrews
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
| | - Ikhlak Ahmed
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
| | - Thasni Ka Azis
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
| | - Joel A Malek
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
| |
|
29
|
Aumentado-Armstrong TT, Istrate B, Murgita RA. Algorithmic approaches to protein-protein interaction site prediction. Algorithms Mol Biol 2015; 10:7. [PMID: 25713596 PMCID: PMC4338852 DOI: 10.1186/s13015-015-0033-9] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2014] [Accepted: 01/07/2015] [Indexed: 12/19/2022] Open
Abstract
Interaction sites on protein surfaces mediate virtually all biological activities, and their identification holds promise for disease treatment and drug design. Novel algorithmic approaches for the prediction of these sites have been produced at a rapid rate, and the field has seen significant advancement over the past decade. However, the most current methods have not yet been reviewed in a systematic and comprehensive fashion. Herein, we describe the intricacies of the biological theory, datasets, and features required for modern protein-protein interaction site (PPIS) prediction, and present an integrative analysis of the state-of-the-art algorithms and their performance. First, the major sources of data used by predictors are reviewed, including training sets, evaluation sets, and methods for their procurement. Then, the features employed and their importance in the biological characterization of PPISs are explored. This is followed by a discussion of the methodologies adopted in contemporary prediction programs, as well as their relative performance on the datasets most recently used for evaluation. In addition, the potential utility that PPIS identification holds for rational drug design, hotspot prediction, and computational molecular docking is described. Finally, an analysis of the most promising areas for future development of the field is presented.
|
30
|
Flexibility and small pockets at protein-protein interfaces: New insights into druggability. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2015; 119:2-9. [PMID: 25662442 PMCID: PMC4726663 DOI: 10.1016/j.pbiomolbio.2015.01.009] [Citation(s) in RCA: 98] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/05/2014] [Revised: 01/06/2015] [Accepted: 01/28/2015] [Indexed: 01/04/2023]
Abstract
The transient assembly of multiprotein complexes mediates many aspects of cell regulation and signalling in living organisms. Modulation of the formation of these complexes through targeting protein-protein interfaces can offer greater selectivity than the inhibition of protein kinases, proteases or other post-translational regulatory enzymes using substrate, co-factor or transition state mimetics. However, capitalising on protein-protein interaction interfaces as drug targets has been hindered by the nature of interfaces that tend to offer binding sites lacking the well-defined large cavities of classical drug targets. In this review we posit that interfaces formed by concerted folding and binding (disorder-to-order transitions on binding) of one partner and other examples of interfaces where a protein partner is bound through a continuous epitope from a surface-exposed helix, flexible loop or chain extension may be more tractable for the development of "orthosteric", competitive chemical modulators; these interfaces tend to offer small-volume but deep pockets and/or larger grooves that may be bound tightly by small chemical entities. We discuss examples of such protein-protein interaction interfaces for which successful chemical modulators are being developed.
|
31
|
A massively parallel pipeline to clone DNA variants and examine molecular phenotypes of human disease mutations. PLoS Genet 2014; 10:e1004819. [PMID: 25502805 PMCID: PMC4263371 DOI: 10.1371/journal.pgen.1004819] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2014] [Accepted: 10/14/2014] [Indexed: 12/13/2022] Open
Abstract
Understanding the functional relevance of DNA variants is essential for all exome and genome sequencing projects. However, current mutagenesis cloning protocols require Sanger sequencing, and thus are prohibitively costly and labor-intensive. We describe a massively-parallel site-directed mutagenesis approach, "Clone-seq", leveraging next-generation sequencing to rapidly and cost-effectively generate a large number of mutant alleles. Using Clone-seq, we further develop a comparative interactome-scanning pipeline integrating high-throughput GFP, yeast two-hybrid (Y2H), and mass spectrometry assays to systematically evaluate the functional impact of mutations on protein stability and interactions. We use this pipeline to show that disease mutations on protein-protein interaction interfaces are significantly more likely than those away from interfaces to disrupt corresponding interactions. We also find that mutation pairs with similar molecular phenotypes in terms of both protein stability and interactions are significantly more likely to cause the same disease than those with different molecular phenotypes, validating the in vivo biological relevance of our high-throughput GFP and Y2H assays, and indicating that both assays can be used to determine candidate disease mutations in the future. The general scheme of our experimental pipeline can be readily expanded to other types of interactome-mapping methods to comprehensively evaluate the functional relevance of all DNA variants, including those in non-coding regions.
|
32
|
Rolland T, Taşan M, Charloteaux B, Pevzner SJ, Zhong Q, Sahni N, Yi S, Lemmens I, Fontanillo C, Mosca R, Kamburov A, Ghiassian SD, Yang X, Ghamsari L, Balcha D, Begg BE, Braun P, Brehme M, Broly MP, Carvunis AR, Convery-Zupan D, Corominas R, Coulombe-Huntington J, Dann E, Dreze M, Dricot A, Fan C, Franzosa E, Gebreab F, Gutierrez BJ, Hardy MF, Jin M, Kang S, Kiros R, Lin GN, Luck K, MacWilliams A, Menche J, Murray RR, Palagi A, Poulin MM, Rambout X, Rasla J, Reichert P, Romero V, Ruyssinck E, Sahalie JM, Scholz A, Shah AA, Sharma A, Shen Y, Spirohn K, Tam S, Tejeda AO, Wanamaker SA, Twizere JC, Vega K, Walsh J, Cusick ME, Xia Y, Barabási AL, Iakoucheva LM, Aloy P, De Las Rivas J, Tavernier J, Calderwood MA, Hill DE, Hao T, Roth FP, Vidal M. A proteome-scale map of the human interactome network. Cell 2014; 159:1212-1226. [PMID: 25416956 PMCID: PMC4266588 DOI: 10.1016/j.cell.2014.10.050] [Citation(s) in RCA: 929] [Impact Index Per Article: 92.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2014] [Revised: 10/21/2014] [Accepted: 10/30/2014] [Indexed: 12/12/2022]
Abstract
Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ∼14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ∼30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a "broader" human interactome network than currently appreciated. The map also uncovers significant interconnectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high-quality interactome models will help "connect the dots" of the genomic revolution.
Affiliation(s)
- Thomas Rolland
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Murat Taşan
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON M5S 3E1, Canada; Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON M5G 1X5, Canada
| | - Benoit Charloteaux
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Samuel J Pevzner
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA; Boston University School of Medicine, Boston, MA 02118, USA
| | - Quan Zhong
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Department of Biological Sciences, Wright State University, Dayton, OH 45435, USA
| | - Nidhi Sahni
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Song Yi
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Irma Lemmens
- Department of Medical Protein Research, VIB, 9000 Ghent, Belgium
| | - Celia Fontanillo
- Cancer Research Center (Centro de Investigación del Cancer), University of Salamanca and Consejo Superior de Investigaciones Científicas, Salamanca 37008, Spain
| | - Roberto Mosca
- Joint IRB-BSC Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona 08028, Spain
| | - Atanas Kamburov
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Susan D Ghiassian
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Center for Complex Network Research (CCNR) and Department of Physics, Northeastern University, Boston, MA 02115, USA
| | - Xinping Yang
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Lila Ghamsari
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Dawit Balcha
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Bridget E Begg
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Pascal Braun
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Marc Brehme
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Martin P Broly
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Anne-Ruxandra Carvunis
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Dan Convery-Zupan
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Roser Corominas
- Department of Psychiatry, University of California, San Diego, La Jolla, CA 92093, USA
| | - Jasmin Coulombe-Huntington
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Bioengineering, McGill University, Montreal, QC H3A 0C3, Canada
| | - Elizabeth Dann
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Matija Dreze
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Amélie Dricot
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Changyu Fan
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Eric Franzosa
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Bioengineering, McGill University, Montreal, QC H3A 0C3, Canada
| | - Fana Gebreab
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Bryan J Gutierrez
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Madeleine F Hardy
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Mike Jin
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Shuli Kang
- Department of Psychiatry, University of California, San Diego, La Jolla, CA 92093, USA
| | - Ruth Kiros
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Guan Ning Lin
- Department of Psychiatry, University of California, San Diego, La Jolla, CA 92093, USA
| | - Katja Luck
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Andrew MacWilliams
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Jörg Menche
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Center for Complex Network Research (CCNR) and Department of Physics, Northeastern University, Boston, MA 02115, USA
| | - Ryan R Murray
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Alexandre Palagi
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Matthew M Poulin
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Xavier Rambout
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Protein Signaling and Interactions Lab, GIGA-R, University of Liege, 4000 Liege, Belgium
| | - John Rasla
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Patrick Reichert
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Viviana Romero
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Elien Ruyssinck
- Department of Medical Protein Research, VIB, 9000 Ghent, Belgium
| | - Julie M Sahalie
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Annemarie Scholz
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Akash A Shah
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Amitabh Sharma
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Center for Complex Network Research (CCNR) and Department of Physics, Northeastern University, Boston, MA 02115, USA
| | - Yun Shen
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Kerstin Spirohn
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Stanley Tam
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Alexander O Tejeda
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Shelly A Wanamaker
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Jean-Claude Twizere
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Protein Signaling and Interactions Lab, GIGA-R, University of Liege, 4000 Liege, Belgium
| | - Kerwin Vega
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Jennifer Walsh
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Michael E Cusick
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Yu Xia
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Bioengineering, McGill University, Montreal, QC H3A 0C3, Canada
| | - Albert-László Barabási
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Center for Complex Network Research (CCNR) and Department of Physics, Northeastern University, Boston, MA 02115, USA; Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Lilia M Iakoucheva
- Department of Psychiatry, University of California, San Diego, La Jolla, CA 92093, USA
| | - Patrick Aloy
- Joint IRB-BSC Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona 08028, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona 08010, Spain
| | - Javier De Las Rivas
- Cancer Research Center (Centro de Investigación del Cancer), University of Salamanca and Consejo Superior de Investigaciones Científicas, Salamanca 37008, Spain
| | - Jan Tavernier
- Department of Medical Protein Research, VIB, 9000 Ghent, Belgium
| | - Michael A Calderwood
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - David E Hill
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Tong Hao
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Frederick P Roth
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON M5S 3E1, Canada; Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON M5G 1X5, Canada; Canadian Institute for Advanced Research, Toronto M5G 1Z8, Canada.
| | - Marc Vidal
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.
| |
|
33
|
Betts MJ, Lu Q, Jiang Y, Drusko A, Wichmann O, Utz M, Valtierra-Gutiérrez IA, Schlesner M, Jaeger N, Jones DT, Pfister S, Lichter P, Eils R, Siebert R, Bork P, Apic G, Gavin AC, Russell RB. Mechismo: predicting the mechanistic impact of mutations and modifications on molecular interactions. Nucleic Acids Res 2014; 43:e10. [PMID: 25392414 PMCID: PMC4333368 DOI: 10.1093/nar/gku1094] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023] Open
Abstract
Systematic interrogation of mutation or protein modification data is important to identify sites with functional consequences and to deduce global consequences from large data sets. Mechismo (mechismo.russellab.org) enables simultaneous consideration of thousands of 3D structures and biomolecular interactions to predict rapidly mechanistic consequences for mutations and modifications. As useful functional information often only comes from homologous proteins, we benchmarked the accuracy of predictions as a function of protein/structure sequence similarity, which permits the use of relatively weak sequence similarities with an appropriate confidence measure. For protein–protein, protein–nucleic acid and a subset of protein–chemical interactions, we also developed and benchmarked a measure of whether modifications are likely to enhance or diminish the interactions, which can assist the detection of modifications with specific effects. Analysis of high-throughput sequencing data shows that the approach can identify interesting differences between cancers, and application to proteomics data finds potential mechanistic insights for how post-translational modifications can alter biomolecular interactions.
Affiliation(s)
- Matthew J Betts
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Qianhao Lu
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - YingYing Jiang
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Armin Drusko
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Oliver Wichmann
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Mathias Utz
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Ilse A Valtierra-Gutiérrez
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Matthias Schlesner
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
| | - Natalie Jaeger
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
| | - David T Jones
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
| | - Stefan Pfister
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
| | - Peter Lichter
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
| | - Roland Eils
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany; Department for Bioinformatics and Functional Genomics, Institute for Pharmacy and Molecular Biotechnology (IPMB), University of Heidelberg, Heidelberg, Germany
| | - Reiner Siebert
- Institut für Humangenetik, Universitätsklinikum Schleswig-Holstein, Christian-Albrechts-Universität zu Kiel, Arnold Heller Straße 3, 24105 Kiel, Germany
| | - Peer Bork
- EMBL, Meyerhofstrasse 1, 69117 Heidelberg, Germany
| | - Gordana Apic
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Cambridge Cell Networks Ltd, St John's Innovation Centre, Cowley Road, CB3 0WS, Cambridge, UK
| | | | - Robert B Russell
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany; Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| |
|
34
|
Wang J, Yang J, Mao S, Chai X, Hu Y, Hou X, Tang Y, Bi C, Li X. MitProNet: A knowledgebase and analysis platform of proteome, interactome and diseases for mammalian mitochondria. PLoS One 2014; 9:e111187. [PMID: 25347823 PMCID: PMC4210245 DOI: 10.1371/journal.pone.0111187] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2014] [Accepted: 09/26/2014] [Indexed: 12/18/2022] Open
Abstract
Mitochondrion plays a central role in diverse biological processes in most eukaryotes, and its dysfunctions are critically involved in a large number of diseases and the aging process. A systematic identification of mitochondrial proteomes and characterization of functional linkages among mitochondrial proteins are fundamental in understanding the mechanisms underlying biological functions and human diseases associated with mitochondria. Here we present a database MitProNet which provides a comprehensive knowledgebase for mitochondrial proteome, interactome and human diseases. First an inventory of mammalian mitochondrial proteins was compiled by widely collecting proteomic datasets, and the proteins were classified by machine learning to achieve a high-confidence list of mitochondrial proteins. The current version of MitProNet covers 1124 high-confidence proteins, and the remainders were further classified as middle- or low-confidence. An organelle-specific network of functional linkages among mitochondrial proteins was then generated by integrating genomic features encoded by a wide range of datasets including genomic context, gene expression profiles, protein-protein interactions, functional similarity and metabolic pathways. The functional-linkage network should be a valuable resource for the study of biological functions of mitochondrial proteins and human mitochondrial diseases. Furthermore, we utilized the network to predict candidate genes for mitochondrial diseases using prioritization algorithms. All proteins, functional linkages and disease candidate genes in MitProNet were annotated according to the information collected from their original sources including GO, GEO, OMIM, KEGG, MIPS, HPRD and so on. MitProNet features a user-friendly graphic visualization interface to present functional analysis of linkage networks. 
As an up-to-date database and analysis platform, MitProNet should be particularly helpful in comprehensive studies of complicated biological mechanisms underlying mitochondrial functions and human mitochondrial diseases. MitProNet is freely accessible at http://bio.scu.edu.cn:8085/MitProNet.
Affiliation(s)
- Jiabin Wang
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| | - Jian Yang
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| | - Song Mao
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| | - Xiaoqiang Chai
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| | - Yuling Hu
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| | - Xugang Hou
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| | - Yiheng Tang
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| | - Cheng Bi
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| | - Xiao Li
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
| |
|
35
|
Aran M, Smal C, Pellizza L, Gallo M, Otero LH, Klinke S, Goldbaum FA, Ithurralde ER, Bercovich A, Mac Cormack WP, Turjanski AG, Cicero DO. Solution and crystal structure of BA42, a protein from the Antarctic bacterium Bizionia argentinensis comprised of a stand-alone TPM domain. Proteins 2014; 82:3062-78. [DOI: 10.1002/prot.24667] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2014] [Revised: 08/01/2014] [Accepted: 08/06/2014] [Indexed: 11/11/2022]
Affiliation(s)
- Martin Aran
- Fundación Instituto Leloir, IIBBA-CONICET, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
| | - Clara Smal
- Fundación Instituto Leloir, IIBBA-CONICET, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
| | - Leonardo Pellizza
- Fundación Instituto Leloir, IIBBA-CONICET, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
| | - Mariana Gallo
- Fundación Instituto Leloir, IIBBA-CONICET, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
| | - Lisandro H. Otero
- Fundación Instituto Leloir, IIBBA-CONICET, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
- Plataforma Argentina de Biología Estructural y Metabolómica PLABEM, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
| | - Sebastián Klinke
- Fundación Instituto Leloir, IIBBA-CONICET, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
- Plataforma Argentina de Biología Estructural y Metabolómica PLABEM, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
| | - Fernando A. Goldbaum
- Fundación Instituto Leloir, IIBBA-CONICET, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
- Plataforma Argentina de Biología Estructural y Metabolómica PLABEM, Patricias Argentinas 435 (C1405BWE); Buenos Aires Argentina
| | - Esteban R. Ithurralde
- Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales; Universidad de Buenos Aires, e INQUIMAE-CONICET, Intendente Güiraldes 2160 (C1428EGA); Buenos Aires Argentina
| | - Andrés Bercovich
- Biosidus S.A., Constitución 4234 (C1254ABX); Buenos Aires Argentina
| | | | - Adrián G. Turjanski
- Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales; Universidad de Buenos Aires, e INQUIMAE-CONICET, Intendente Güiraldes 2160 (C1428EGA); Buenos Aires Argentina
| | - Daniel O. Cicero
- Dipartimento di Scienze e Tecnologie Chimiche; Università di Roma “Tor Vergata”, via della Ricerca Scientifica SNC (00133); Rome Italy
| |
|
36
|
Nchongboh CG, Wu GW, Hong N, Wang GP. Protein–protein interactions between proteins of Citrus tristeza virus isolates. Virus Genes 2014; 49:456-65. [DOI: 10.1007/s11262-014-1100-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2014] [Accepted: 06/20/2014] [Indexed: 12/01/2022]
|
37
|
van der Lee R, Buljan M, Lang B, Weatheritt RJ, Daughdrill GW, Dunker AK, Fuxreiter M, Gough J, Gsponer J, Jones D, Kim PM, Kriwacki R, Oldfield CJ, Pappu RV, Tompa P, Uversky VN, Wright P, Babu MM. Classification of intrinsically disordered regions and proteins. Chem Rev 2014; 114:6589-631. [PMID: 24773235 PMCID: PMC4095912 DOI: 10.1021/cr400525m] [Citation(s) in RCA: 1440] [Impact Index Per Article: 144.0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2013] [Indexed: 12/11/2022]
Affiliation(s)
- Robin van der Lee
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
- Centre for Molecular and Biomolecular Informatics, Radboud University Medical Centre, 6500 HB Nijmegen, The Netherlands
| | - Marija Buljan
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Benjamin Lang
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Robert J. Weatheritt
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Gary W. Daughdrill
- Department of Cell Biology, Microbiology, and Molecular Biology, University of South Florida, 3720 Spectrum Boulevard, Suite 321, Tampa, Florida 33612, United States
| | - A. Keith Dunker
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
| | - Monika Fuxreiter
- MTA-DE Momentum Laboratory of Protein Dynamics, Department of Biochemistry and Molecular Biology, University of Debrecen, H-4032 Debrecen, Nagyerdei krt 98, Hungary
| | - Julian Gough
- Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, United Kingdom
| | - Joerg Gsponer
- Department of Biochemistry and Molecular Biology, Centre for High-Throughput Biology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - David T. Jones
- Bioinformatics Group, Department of Computer Science, University College London, London, WC1E 6BT, United Kingdom
| | - Philip M. Kim
- Terrence Donnelly Centre for Cellular and Biomolecular Research, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Richard W. Kriwacki
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, Tennessee 38105, United States
| | - Christopher J. Oldfield
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
| | - Rohit V. Pappu
- Department of Biomedical Engineering and Center for Biological Systems Engineering, Washington University in St. Louis, St. Louis, Missouri 63130, United States
| | - Peter Tompa
- VIB Department of Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
| | - Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, United States
- Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
| | - Peter E. Wright
- Department of Integrative Structural and Computational Biology and Skaggs Institute of Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - M. Madan Babu
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| |
|
38
|
Das J, Fragoza R, Lee HR, Cordero NA, Guo Y, Meyer MJ, Vo TV, Wang X, Yu H. Exploring mechanisms of human disease through structurally resolved protein interactome networks. Mol Biosyst 2014; 10:9-17. [PMID: 24096645 DOI: 10.1039/c3mb70225a] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
The study of the molecular basis of human disease has gained increasing attention over the past decade. With significant improvements in sequencing efficiency and throughput, a wealth of genotypic data has become available. However the translation of this information into concrete advances in diagnostic and clinical setups has proved far more challenging. Two major reasons for this are the lack of functional annotation for genomic variants and the complex nature of genotype-to-phenotype relationships. One fundamental approach to bypass these issues is to examine the effects of genetic variation at the level of proteins as they are directly involved in carrying out biological functions. Within the cell, proteins function by interacting with other proteins as a part of an underlying interactome network. This network can be determined using interactome mapping - a combination of high-throughput experimental toolkits and curation from small-scale studies. Integrating structural information from co-crystals with the network allows generation of a structurally resolved network. Within the context of this network, the structural principles of disease mutations can be examined and used to generate reliable mechanistic hypotheses regarding disease pathogenesis.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Robert Fragoza
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Hao Ran Lee
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Nicolas A Cordero
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Yu Guo
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Michael J Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY 10065, USA
- Tommy V Vo
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Xiujuan Wang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
|
39
|
Murakami Y, Mizuguchi K. Homology-based prediction of interactions between proteins using Averaged One-Dependence Estimators. BMC Bioinformatics 2014; 15:213. [PMID: 24953126 PMCID: PMC4229973 DOI: 10.1186/1471-2105-15-213] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 01/30/2014] [Accepted: 06/17/2014] [Indexed: 02/02/2023]
Abstract
Background: Identification of protein-protein interactions (PPIs) is essential for a better understanding of biological processes, pathways and functions. However, experimental identification of the complete set of PPIs in a cell/organism (“an interactome”) is still a difficult task. To circumvent limitations of current high-throughput experimental techniques, it is necessary to develop high-performance computational methods for predicting PPIs. Results: In this article, we propose a new computational method to predict interaction between a given pair of protein sequences using features derived from known homologous PPIs. The proposed method is capable of predicting interaction between two proteins (of unknown structure) using Averaged One-Dependence Estimators (AODE) and three features calculated for the protein pair: (a) sequence similarities to a known interacting protein pair (FSeq), (b) statistical propensities of domain pairs observed in interacting proteins (FDom) and (c) a sum of edge weights along the shortest path between homologous proteins in a PPI network (FNet). Feature vectors were defined to lie in a half-space of the symmetrical high-dimensional feature space to make them independent of the protein order. The predictability of the method was assessed by a 10-fold cross validation on a recently created human PPI dataset with randomly sampled negative data, and the best model achieved an Area Under the Curve of 0.79 (pAUC0.5% = 0.16). In addition, the AODE trained on all three features (named PSOPIA) showed better prediction performance on a separate independent data set than a recently reported homology-based method. Conclusions: Our results suggest that FNet, a feature representing proximity in a known PPI network between two proteins that are homologous to a target protein pair, contributes to the prediction of whether the target proteins interact or not. PSOPIA will help identify novel PPIs and estimate complete PPI networks.
The method proposed in this article is freely available on the web at http://mizuguchilab.org/PSOPIA.
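The order-independence requirement mentioned in this abstract can be illustrated with a minimal Python sketch. It is not PSOPIA's actual construction, which the abstract does not spell out. It assumes, purely for illustration, that each protein has its own numeric feature tuple, and it sorts the two tuples before concatenating them so that the pairs (A, B) and (B, A) map to the same vector. The function name `pair_features` and the toy values are hypothetical.

```python
from typing import Sequence, Tuple


def pair_features(a: Sequence[float], b: Sequence[float]) -> Tuple[float, ...]:
    """Encode an unordered protein pair as one feature vector.

    Assumption (not from the paper): each protein is described by its own
    numeric features. Sorting the two tuples lexicographically before
    concatenating them keeps every pair in one "half" of the symmetric
    feature space, so swapping the proteins cannot change the result.
    """
    first, second = sorted((tuple(a), tuple(b)))
    return first + second


# Hypothetical toy features for two proteins.
protein_a = (0.8, 0.1)
protein_b = (0.3, 0.9)

forward = pair_features(protein_a, protein_b)
reverse = pair_features(protein_b, protein_a)
```

Under this canonical ordering a classifier sees one representation per unordered pair, which is one simple way to obtain the property the abstract describes.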
Affiliation(s)
- Yoichi Murakami
- Bioinformatics Project, National Institute of Biomedical Innovation, 7-6-8 Saito-Asagi, Ibaraki, Osaka 567-0085, Japan.
|
40
|
Lu HC, Fornili A, Fraternali F. Protein-protein interaction networks studies and importance of 3D structure knowledge. Expert Rev Proteomics 2014; 10:511-20. [PMID: 24206225 DOI: 10.1586/14789450.2013.856764] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/18/2023]
Abstract
Protein-protein interaction networks (PPINs) are a powerful tool to study biological processes in living cells. In this review, we present the progress of PPIN studies from abstract to more detailed representations. We will focus on 3D interactome networks, which offer detailed information at the atomic level. This information can be exploited in understanding not only the underlying cellular mechanisms, but also how human variants and disease-causing mutations affect protein functions and the stability of complexes. Recent studies have also used structural information on PPINs to understand the molecular mechanisms of binding partner selection. We will address the challenges in generating 3D PPINs due to the restricted number of solved protein structures. Finally, some current uses of 3D PPINs will be discussed, highlighting their contribution to studies of genotype-phenotype relationships and to the optimization of targeted studies to design novel chemical compounds for medical treatments.
Affiliation(s)
- Hui-Chun Lu
- Randall Division of Cell and Molecular Biophysics, King's College London, New Hunt's House, London SE1 1UL, UK
|
41
|
Van Roey K, Uyar B, Weatheritt RJ, Dinkel H, Seiler M, Budd A, Gibson TJ, Davey NE. Short Linear Motifs: Ubiquitous and Functionally Diverse Protein Interaction Modules Directing Cell Regulation. Chem Rev 2014; 114:6733-78. [DOI: 10.1021/cr400585q] [Citation(s) in RCA: 293] [Impact Index Per Article: 29.3] [Indexed: 02/08/2023]
Affiliation(s)
- Kim Van Roey
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Bora Uyar
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Robert J. Weatheritt
- MRC Laboratory of Molecular Biology (LMB), Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, United Kingdom
- Holger Dinkel
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Markus Seiler
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Aidan Budd
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Toby J. Gibson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Norman E. Davey
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Department of Physiology, University of California, San Francisco, San Francisco, California 94143, United States
|
42
|
The domain landscape of virus-host interactomes. Biomed Res Int 2014; 2014:867235. [PMID: 24991570 PMCID: PMC4065681 DOI: 10.1155/2014/867235] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 02/04/2014] [Accepted: 03/19/2014] [Indexed: 12/31/2022]
Abstract
Viral infections result in millions of deaths in the world today. A thorough analysis of virus-host interactomes may reveal insights into viral infection and pathogenic strategies. In this study, we presented a landscape of virus-host interactomes based on protein domain interaction. Compared to analysis at the protein level, this domain-domain interactome provided a unique abstraction of the protein-protein interactome. Through comparisons among DNA, RNA, and retrotranscribing viruses, we identified a core of human domains that viruses used to hijack the cellular machinery and evade the immune system and that might be promising antiviral drug targets. We showed that viruses preferentially interacted with host hub and bottleneck domains, and that the degree and betweenness centrality among the three categories of viruses are significantly different. Further analysis at the functional level highlighted that different viruses perturbed the host cellular molecular network by common and unique strategies. Most importantly, we proposed a viral disease network among viral domains, human domains and the corresponding diseases, which uncovered several unknown virus-disease relationships that need further verification. Overall, it is expected that the findings will deepen understanding of viral infection and contribute to the development of antiviral therapy.
|
43
|
Villoutreix BO, Kuenemann MA, Poyet JL, Bruzzoni-Giovanelli H, Labbé C, Lagorce D, Sperandio O, Miteva MA. Drug-Like Protein-Protein Interaction Modulators: Challenges and Opportunities for Drug Discovery and Chemical Biology. Mol Inform 2014; 33:414-437. [PMID: 25254076 PMCID: PMC4160817 DOI: 10.1002/minf.201400040] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 03/24/2014] [Accepted: 04/21/2014] [Indexed: 12/13/2022]
Abstract
Fundamental processes in living cells are largely controlled by macromolecular interactions. Among them, protein-protein interactions (PPIs) have a critical role, and their dysregulation can contribute to the pathogenesis of numerous diseases. Although PPIs were already considered attractive pharmaceutical targets some years ago, they have thus far been largely unexploited for therapeutic interventions with low molecular weight compounds. Several limiting factors, from technological hurdles to conceptual barriers, are known, which, taken together, explain why research in this area has been relatively slow. However, over the last decade, the scientific community has challenged the dogma and become more enthusiastic about the modulation of PPIs with small drug-like molecules. In fact, several success stories were reported at both the preclinical and clinical stages. In this review article, written for the 2014 International Summer School in Chemoinformatics (Strasbourg, France), we discuss in silico tools (essentially post 2012) and databases that can assist the design of low molecular weight PPI modulators (these tools can be found at www.vls3d.com). We first introduce the field of protein-protein interaction research, discuss key challenges and comment on recently reported in silico packages, protocols and databases dedicated to PPIs. Then, we illustrate how in silico methods can be used and combined with experimental work to identify PPI modulators.
Affiliation(s)
- Bruno O Villoutreix
- Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
- Inserm, U973, Paris 75013, France
- CDithem, Faculté de Pharmacie, 1 rue du Prof Laguesse, 59000 Lille, France
- Melaine A Kuenemann
- Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
- Inserm, U973, Paris 75013, France
- Jean-Luc Poyet
- Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
- Inserm, U973, Paris 75013, France
- IUH, Hôpital Saint-Louis, Paris, France
- CDithem, Faculté de Pharmacie, 1 rue du Prof Laguesse, 59000 Lille, France
- Heriberto Bruzzoni-Giovanelli
- Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
- Inserm, U973, Paris 75013, France
- CIC, Clinical investigation center, Hôpital Saint-Louis, Paris, France
- Céline Labbé
- Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
- Inserm, U973, Paris 75013, France
- David Lagorce
- Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
- Inserm, U973, Paris 75013, France
- Olivier Sperandio
- Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
- Inserm, U973, Paris 75013, France
- CDithem, Faculté de Pharmacie, 1 rue du Prof Laguesse, 59000 Lille, France
- Maria A Miteva
- Université Paris Diderot, Sorbonne Paris Cité, UMRS 973 Inserm, Paris 75013, France
- Inserm, U973, Paris 75013, France
|
44
|
Das J, Lee HR, Sagar A, Fragoza R, Liang J, Wei X, Wang X, Mort M, Stenson PD, Cooper DN, Yu H. Elucidating common structural features of human pathogenic variations using large-scale atomic-resolution protein networks. Hum Mutat 2014; 35:585-93. [PMID: 24599843 DOI: 10.1002/humu.22534] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 04/26/2013] [Accepted: 02/14/2014] [Indexed: 01/24/2023]
Abstract
With the rapid growth of structural genomics, numerous protein crystal structures have become available. However, the parallel increase in knowledge of the functional principles underlying biological processes, and more specifically the underlying molecular mechanisms of disease, has been less dramatic. This notwithstanding, the study of complex cellular networks has made possible the inference of protein functions on a large scale. Here, we combine the scale of network systems biology with the resolution of traditional structural biology to generate a large-scale atomic-resolution interactome-network comprising 3,398 interactions between 2,890 proteins with a well-defined interaction interface and interface residues for each interaction. Within the framework of this atomic-resolution network, we have explored the structural principles underlying variations causing human-inherited disease. We find that in-frame pathogenic variations are enriched at both the interface and in the interacting domain, suggesting that variations not only at interface "hot-spots," but in the entire interacting domain can result in alterations of interactions. Further, the sites of pathogenic variations are closely related to the biophysical strength of the interactions they perturb. Finally, we show that biochemical alterations consequent to these variations are considerably more disruptive than evolutionary changes, with the most significant alterations at the protein interaction interface.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
|
45
|
Extended interaction network of procollagen C-proteinase enhancer-1 in the extracellular matrix. Biochem J 2014; 457:137-49. [PMID: 24117177 DOI: 10.1042/bj20130295] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 11/17/2022]
Abstract
PCPE-1 (procollagen C-proteinase enhancer-1) is an extracellular matrix glycoprotein that can stimulate procollagen processing by procollagen C-proteinases such as BMP-1 (bone morphogenetic protein 1). PCPE-1 interacts with several proteins in addition to procollagens and BMP-1, suggesting that it could be involved in biological processes other than collagen maturation. We thus searched for additional partners of PCPE-1 in the extracellular matrix, which could provide new insights into its biological roles. We identified 17 new partners of PCPE-1 by SPR (surface plasmon resonance) imaging. PCPE-1 forms a transient complex with the β-amyloid peptide, whereas it forms high or very high affinity complexes with laminin-111 (KD=58.8 pM), collagen VI (KD=9.5 nM), TSP-1 (thrombospondin-1) (KD1=19.9 pM, KD2=14.5 nM), collagen IV (KD=49.4 nM) and endostatin, a fragment of collagen XVIII (KD1=0.30 nM, KD2=1.1 nM). Endostatin binds to the NTR (netrin-like) domain of PCPE-1 and decreases the degree of superstimulation of PCPE-1 enhancing activity by heparin. The analysis of the PCPE-1 interaction network based on Gene Ontology terms suggests that, besides its role in collagen deposition, PCPE-1 might be involved in tumour growth, neurodegenerative diseases and angiogenesis. In vitro assays have indeed shown that the CUB1CUB2 (where CUB is complement protein subcomponents C1r/C1s, urchin embryonic growth factor and BMP-1) fragment of PCPE-1 inhibits angiogenesis.
|
46
|
Damle NP, Mohanty D. Deciphering kinase–substrate relationships by analysis of domain-specific phosphorylation network. Bioinformatics 2014; 30:1730-8. [DOI: 10.1093/bioinformatics/btu112] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023]
|
47
|
Trends in structural coverage of the protein universe and the impact of the Protein Structure Initiative. Proc Natl Acad Sci U S A 2014; 111:3733-8. [PMID: 24567391 DOI: 10.1073/pnas.1321614111] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Indexed: 11/18/2022]
Abstract
The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins--including proteins for which reliable homology models can be generated--on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long.
|
48
|
Cukuroglu E, Gursoy A, Nussinov R, Keskin O. Non-redundant unique interface structures as templates for modeling protein interactions. PLoS One 2014; 9:e86738. [PMID: 24475173 PMCID: PMC3903793 DOI: 10.1371/journal.pone.0086738] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 09/28/2013] [Accepted: 12/18/2013] [Indexed: 01/16/2023]
Abstract
Improvements in experimental techniques increasingly provide structural data relating to protein-protein interactions. Classification of structural details of protein-protein interactions can provide valuable insights for modeling and abstracting design principles. Here, we aim to cluster protein-protein interactions by their interface structures, and to exploit these clusters to obtain and study shared and distinct protein binding sites. We find that there are 22604 unique interface structures in the PDB. These unique interfaces, which provide a rich resource of structural data of protein-protein interactions, can be used for template-based docking. We test the specificity of these non-redundant unique interface structures by finding protein pairs which have multiple binding sites. We suggest that residues with more than 40% relative accessible surface area should be considered as surface residues in template-based docking studies. This comprehensive study of protein interface structures can serve as a resource for the community. The dataset can be accessed at http://prism.ccbb.ku.edu.tr/piface.
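The 40% relative accessible surface area criterion suggested in this abstract can be sketched in a few lines of Python. This is only an illustration of the thresholding step, not the authors' pipeline. It assumes that absolute per-residue ASA values have already been computed by some external tool and that the caller supplies reference maximum ASA values per residue type. The function name `surface_residues` and all numbers in the example are hypothetical placeholders, not measured or published values.

```python
from typing import Dict, List, Tuple

SURFACE_THRESHOLD = 0.40  # relative ASA cutoff suggested in the abstract


def surface_residues(
    residues: List[Tuple[str, float]],
    max_asa: Dict[str, float],
    threshold: float = SURFACE_THRESHOLD,
) -> List[int]:
    """Return indices of residues whose relative ASA exceeds the threshold.

    residues: (residue type, absolute ASA in square angstroms) per position.
    max_asa:  reference maximum ASA per residue type, supplied by the caller.
    """
    surface = []
    for index, (res_type, asa) in enumerate(residues):
        relative = asa / max_asa[res_type]  # fraction of the reference maximum
        if relative > threshold:
            surface.append(index)
    return surface


# Hypothetical reference values and measurements, for illustration only.
example_max_asa = {"ALA": 100.0, "LYS": 200.0}
example_residues = [("ALA", 55.0), ("LYS", 60.0), ("ALA", 30.0)]
exposed = surface_residues(example_residues, example_max_asa)
```

Keeping the reference table as a parameter leaves the choice of normalization scale to the user, since several published scales exist and the abstract does not say which one was used.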
Affiliation(s)
- Engin Cukuroglu
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
- Attila Gursoy
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
- Ruth Nussinov
- National Cancer Institute, Cancer and Inflammation Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., National Cancer Institute, Frederick, Maryland, United States of America
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Ozlem Keskin
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
|
49
|
Garbutt CC, Bangalore PV, Kannar P, Mukhtar MS. Getting to the edge: protein dynamical networks as a new frontier in plant-microbe interactions. Front Plant Sci 2014; 5:312. [PMID: 25071795 PMCID: PMC4074768 DOI: 10.3389/fpls.2014.00312] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 04/16/2014] [Accepted: 06/11/2014] [Indexed: 05/18/2023]
Abstract
A systems perspective on diverse phenotypes, mechanisms of infection, and responses to environmental stresses can lead to considerable advances in agriculture and medicine. A significant promise of systems biology within plants is the development of disease-resistant crop varieties, which would maximize yield output for food, clothing, building materials, and biofuel production. A systems or "-omics" perspective frames the next frontier in the search for enhanced knowledge of plant network biology. The functional understanding of network structure and dynamics is vital to expanding our knowledge of how the intercellular communication processes are executed. This review article will systematically discuss various levels of organization of systems biology beginning with the building blocks termed "-omes" and ending with complex transcriptional and protein-protein interaction networks. We will also highlight the prevailing computational modeling approaches of biological regulatory network dynamics. The latest developments in the "-omics" approach will be reviewed and discussed to underline and highlight novel technologies and research directions in plant network biology.
Affiliation(s)
- Cassandra C. Garbutt
- Department of Biology, The University of Alabama at Birmingham, Birmingham, AL, USA
- Purushotham V. Bangalore
- Department of Computer and Information Sciences, The University of Alabama at Birmingham, Birmingham, AL, USA
- Pegah Kannar
- Department of Biology, The University of Alabama at Birmingham, Birmingham, AL, USA
- M. S. Mukhtar
- Department of Biology, The University of Alabama at Birmingham, Birmingham, AL, USA
- Nutrition Obesity Research Center, The University of Alabama at Birmingham, Birmingham, AL, USA
- *Correspondence: M. S. Mukhtar, Department of Biology, The University of Alabama at Birmingham, Campbell Hall 369, 1300 University Boulevard, Birmingham, AL 35294-1170, USA e-mail:
|
50
|
On the use of knowledge-based potentials for the evaluation of models of protein-protein, protein-DNA, and protein-RNA interactions. Adv Protein Chem Struct Biol 2014; 94:77-120. [PMID: 24629186 DOI: 10.1016/b978-0-12-800168-4.00004-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
Abstract
Proteins are the bricks and mortar of cells, playing structural and functional roles. In order to perform their function, they interact with each other as well as with other biomolecules such as DNA or RNA. Therefore, to fathom the function of a protein, we need to know its partners and the atomic details of its interactions (i.e., the structure of the complex). However, the number of protein interactions with an experimentally determined three-dimensional structure is small. Therefore, computational techniques such as homology modeling are essential to fill this gap. Protein interactions can be modeled using as templates the interactions of homologous proteins, if the structure of the complex is known, or using docking methods. In both approaches, the estimation of the quality of models is essential. There are several ways to address this problem. In this review, we focus on the use of knowledge-based potentials for the analysis of protein interactions. We describe the procedure to derive statistical potentials and split them into different energetic terms that can be used for different purposes. We extensively discuss the fields where knowledge-based potentials have been successfully applied to (1) model protein-protein, protein-DNA, and protein-RNA interactions and (2) predict binding sites (in the protein and in the DNA). Moreover, we provide ready-to-use resources for docking and benchmarking protein interactions.
|