1
Su Z, Dhusia K, Wu Y. Encoding the space of protein-protein binding interfaces by artificial intelligence. Comput Biol Chem 2024; 110:108080. [PMID: 38643609 DOI: 10.1016/j.compbiolchem.2024.108080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 04/03/2024] [Accepted: 04/17/2024] [Indexed: 04/23/2024]
Abstract
The physical interactions between proteins are largely determined by the structural properties at their binding interfaces. It was found that the binding interfaces in distinctive protein complexes are highly similar. The structural properties underlying different binding interfaces could be further captured by artificial intelligence. In order to test this hypothesis, we broke protein-protein binding interfaces into pairs of interacting fragments. We employed a generative model to encode these interface fragment pairs in a low-dimensional latent space. After training, new conformations of interface fragment pairs were generated. We found that, by only using a small number of interface fragment pairs that were generated by artificial intelligence, we were able to guide the assembly of protein complexes into their native conformations. These results demonstrate that the conformational space of fragment pairs at protein-protein binding interfaces is highly degenerate. Features in this degenerate space can be well characterized by artificial intelligence. In summary, our machine learning method will be potentially useful to search for and predict the conformations of unknown protein-protein interactions.
Affiliation(s)
- Zhaoqian Su
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN 37212, USA
- Kalyani Dhusia
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA.
2
Yuan Q, Tian C, Yang Y. Genome-scale annotation of protein binding sites via language model and geometric deep learning. eLife 2024; 13:RP93695. [PMID: 38630609 PMCID: PMC11023698 DOI: 10.7554/elife.93695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024] Open
Abstract
Revealing protein binding sites with other molecules, such as nucleic acids, peptides, or small ligands, sheds light on disease mechanism elucidation and novel drug design. With the explosive growth of proteins in sequence databases, how to accurately and efficiently identify these binding sites from sequences becomes essential. However, current methods mostly rely on expensive multiple sequence alignments or experimental protein structures, limiting their genome-scale applications. Besides, these methods haven't fully explored the geometry of the protein structures. Here, we propose GPSite, a multi-task network for simultaneously predicting binding residues of DNA, RNA, peptide, protein, ATP, HEM, and metal ions on proteins. GPSite was trained on informative sequence embeddings and predicted structures from protein language models, while comprehensively extracting residual and relational geometric contexts in an end-to-end manner. Experiments demonstrate that GPSite substantially surpasses state-of-the-art sequence-based and structure-based approaches on various benchmark datasets, even when the structures are not well-predicted. The low computational cost of GPSite enables rapid genome-scale binding residue annotations for over 568,000 sequences, providing opportunities to unveil unexplored associations of binding sites with molecular functions, biological processes, and genetic variants. The GPSite webserver and annotation database can be freely accessed at https://bio-web1.nscc-gz.cn/app/GPSite.
Affiliation(s)
- Qianmu Yuan
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Chong Tian
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
3
Peng CX, Liang F, Xia YH, Zhao KL, Hou MH, Zhang GJ. Recent Advances and Challenges in Protein Structure Prediction. J Chem Inf Model 2024; 64:76-95. [PMID: 38109487 DOI: 10.1021/acs.jcim.3c01324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
Artificial intelligence has made significant advances in the field of protein structure prediction in recent years. In particular, DeepMind's end-to-end model, AlphaFold2, has demonstrated the capability to predict three-dimensional structures of numerous unknown proteins with accuracy levels comparable to those of experimental methods. This breakthrough has opened up new possibilities for understanding protein structure and function as well as accelerating drug discovery and other applications in the field of biology and medicine. Despite the remarkable achievements of artificial intelligence in the field, there are still some challenges and limitations. In this Review, we discuss the recent progress and some of the challenges in protein structure prediction. These challenges include predicting multidomain protein structures, protein complex structures, multiple conformational states of proteins, and protein folding pathways. Furthermore, we highlight directions in which further improvements can be conducted.
Affiliation(s)
- Chun-Xiang Peng
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Fang Liang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yu-Hao Xia
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Kai-Long Zhao
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Ming-Hua Hou
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Gui-Jun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
4
Aybey E, Gümüş Ö. SENSDeep: An Ensemble Deep Learning Method for Protein-Protein Interaction Sites Prediction. Interdiscip Sci 2023; 15:55-87. [PMID: 36346583 DOI: 10.1007/s12539-022-00543-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2022] [Revised: 10/15/2022] [Accepted: 10/17/2022] [Indexed: 11/09/2022]
Abstract
PURPOSE: The determination of which amino acid in a protein interacts with other proteins is important in understanding the functional mechanism of that protein. Although there are experimental methods to detect protein-protein interaction sites (PPISs), these are costly, time-consuming, and require expertise. Therefore, many computational methods have been proposed to accelerate this type of research, but they are generally insufficient to predict PPISs accurately. There is a need for development in this field.
METHODS: In this study, we introduce a new PPISs prediction method. This method is a sequence-based Stacking ENSemble Deep (SENSDeep) learning method that has an ensemble learning model including the models of RNN, CNN, GRU sequence to sequence (GRUs2s), GRU sequence to sequence with an attention layer (GRUs2satt) and a multilayer perceptron. Two embedded features, secondary structure, and protein sequence information are added to the training data set in addition to twelve existing features to improve the prediction performance of the method.
RESULTS: SENSDeep trained on the training data set without two extra features obtains a better performance on some of the independent testing data sets than that of the other methods in the literature, especially on scoring metrics of sensitivity, F1, MCC, and AUPRC, having increments up to 63.5%, 19.3%, 18.5%, 11.4%, respectively. It is shown that the added extra features improve the performance of the method by having almost the same performance with less data as the method trained on the data set without these added features. On the other hand, different sizes of the sliding window are tried on the data sets and an optimal sliding window size for SENSDeep is found. Moreover, SENSDeep has also been compared to structure-based methods. Some of these methods have been found to perform better. Using SENSDeep obtained by training with both training data sets, PPISs prediction examples of various proteins that are not in these training data sets are also presented. Furthermore, execution times for SENSDeep and its submodels are shown.
AVAILABILITY AND IMPLEMENTATION: https://github.com/enginaybey/SENSDeep.
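The abstract mentions trying different sliding window sizes but does not say how the windows are built. As a rough illustration only, the sketch below shows one common way to cut a fixed-size window around each residue of a sequence; the function name `sliding_windows`, the padding symbol `X`, and the toy sequence are assumptions made for this example and are not taken from the SENSDeep code.

```python
def sliding_windows(sequence, size, pad="X"):
    """Return one window of `size` residues centred on each position.

    Assumption for illustration: the ends are padded with `pad` so that
    every residue gets a full-length window.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    half = size // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + size] for i in range(len(sequence))]


# Hypothetical toy sequence; each window would become one model input.
for window in sliding_windows("MKTAY", 3):
    print(window)
```

Larger windows give each residue more sequence context at the cost of more padding near the termini, which is one reason the window size is usually tuned rather than fixed in advance.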
Affiliation(s)
- Engin Aybey
  - Department of Health Bioinformatics, Ege University, 35100, Bornova, Izmir, Turkey.
  - Rectorate, Marmara University, 34722, Kadıköy, Istanbul, Turkey.
- Özgür Gümüş
  - Department of Computer Engineering, Ege University, 35100, Bornova, Izmir, Turkey
5
Sarkar C, Das B, Rawat VS, Wahlang JB, Nongpiur A, Tiewsoh I, Lyngdoh NM, Das D, Bidarolli M, Sony HT. Artificial Intelligence and Machine Learning Technology Driven Modern Drug Discovery and Development. Int J Mol Sci 2023; 24:ijms24032026. [PMID: 36768346 PMCID: PMC9916967 DOI: 10.3390/ijms24032026] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 11/29/2022] [Revised: 12/27/2022] [Accepted: 12/28/2022] [Indexed: 01/22/2023] Open
Abstract
The discovery and advances of medicines may be considered as the ultimate relevant translational science effort that adds to human invulnerability and happiness. But advancing a fresh medication is a quite convoluted, costly, and protracted operation, normally costing USD ~2.6 billion and consuming a mean time span of 12 years. Methods to cut back expenditure and hasten new drug discovery have prompted an arduous and compelling brainstorming exercise in the pharmaceutical industry. The engagement of Artificial Intelligence (AI), including the deep-learning (DL) component in particular, has been facilitated by the employment of classified big data, in concert with strikingly reinforced computing prowess and cloud storage, across all fields. AI has energized computer-facilitated drug discovery. An unrestricted espousing of machine learning (ML), especially DL, in many scientific specialties, and the technological refinements in computing hardware and software, in concert with various aspects of the problem, sustain this progress. ML algorithms have been extensively engaged for computer-facilitated drug discovery. DL methods, such as artificial neural networks (ANNs) comprising multiple buried processing layers, have of late seen a resurgence due to their capability to power automatic attribute elicitations from the input data, coupled with their ability to obtain nonlinear input-output pertinencies. Such features of DL methods augment classical ML techniques which bank on human-contrived molecular descriptors. A major part of the early reluctance concerning utility of AI in pharmaceutical discovery has begun to melt, thereby advancing medicinal chemistry. AI, along with modern experimental technical knowledge, is anticipated to invigorate the quest for new and improved pharmaceuticals in an expeditious, economical, and increasingly compelling manner. DL-facilitated methods have just initiated kickstarting for some integral issues in drug discovery. 
Many technological advances, such as "message-passing paradigms", "spatial-symmetry-preserving networks", "hybrid de novo design", and other ingenious ML exemplars, will definitely come to be pervasively widespread and help dissect many of the biggest, and most intriguing inquiries. Open data allocation and model augmentation will exert a decisive hold during the progress of drug discovery employing AI. This review will address the impending utilizations of AI to refine and bolster the drug discovery operation.
Affiliation(s)
- Chayna Sarkar
  - Department of Pharmacology, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
- Biswadeep Das
  - Department of Pharmacology, All India Institute of Medical Sciences (AIIMS), Virbhadra Road, Rishikesh 249203, Uttarakhand, India
  - Correspondence: Tel./Fax: +91-135-708-856-0009
- Vikram Singh Rawat
  - Department of Psychiatry, All India Institute of Medical Sciences (AIIMS), Virbhadra Road, Rishikesh 249203, Uttarakhand, India
- Julie Birdie Wahlang
  - Department of Pharmacology, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
- Arvind Nongpiur
  - Department of Psychiatry, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
- Iadarilang Tiewsoh
  - Department of Medicine, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
- Nari M. Lyngdoh
  - Department of Anesthesiology, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
- Debasmita Das
  - Department of Computer Science and Engineering, Vellore Institute of Technology, Vellore Campus, Tiruvalam Road, Katpadi, Vellore 632014, Tamil Nadu, India
- Manjunath Bidarolli
  - Department of Pharmacology, All India Institute of Medical Sciences (AIIMS), Virbhadra Road, Rishikesh 249203, Uttarakhand, India
- Hannah Theresa Sony
  - Department of Pharmacology, All India Institute of Medical Sciences (AIIMS), Virbhadra Road, Rishikesh 249203, Uttarakhand, India
6
Vora DS, Kalakoti Y, Sundar D. Computational Methods and Deep Learning for Elucidating Protein Interaction Networks. Methods Mol Biol 2023; 2553:285-323. [PMID: 36227550 DOI: 10.1007/978-1-0716-2617-7_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Protein interactions play a critical role in all biological processes, but experimental identification of protein interactions is a time- and resource-intensive process. The advances in next-generation sequencing and multi-omics technologies have greatly benefited large-scale predictions of protein interactions using machine learning methods. A wide range of tools have been developed to predict protein-protein, protein-nucleic acid, and protein-drug interactions. Here, we discuss the applications, methods, and challenges faced when employing the various prediction methods. We also briefly describe ways to overcome the challenges and prospective future developments in the field of protein interaction biology.
Affiliation(s)
- Dhvani Sandip Vora
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Yogesh Kalakoti
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Durai Sundar
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India.
  - School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India.
7
Li Y, Zhang R, Wang C, Forouhar F, Clarke OB, Vorobiev S, Singh S, Montelione GT, Szyperski T, Xu Y, Hunt JF. Oligomeric interactions maintain active-site structure in a noncooperative enzyme family. EMBO J 2022; 41:e108368. [PMID: 35801308 DOI: 10.15252/embj.2021108368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2021] [Revised: 04/07/2022] [Accepted: 04/16/2022] [Indexed: 11/09/2022] Open
Abstract
The evolutionary benefit accounting for widespread conservation of oligomeric structures in proteins lacking evidence of intersubunit cooperativity remains unclear. Here, crystal and cryo-EM structures, and enzymological data, demonstrate that a conserved tetramer interface maintains the active-site structure in one such class of proteins, the short-chain dehydrogenase/reductase (SDR) superfamily. Phylogenetic comparisons support a significantly longer polypeptide being required to maintain an equivalent active-site structure in the context of a single subunit. Oligomerization therefore enhances evolutionary fitness by reducing the metabolic cost of enzyme biosynthesis. The large surface area of the structure-stabilizing oligomeric interface yields a synergistic gain in fitness by increasing tolerance to activity-enhancing yet destabilizing mutations. We demonstrate that two paralogous SDR superfamily enzymes with different specificities can form mixed heterotetramers that combine their individual enzymological properties. This suggests that oligomerization can also diversify the functions generated by a given metabolic investment, enhancing the fitness advantage provided by this architectural strategy.
Affiliation(s)
- Yaohui Li
  - Key Laboratory of Industrial Biotechnology of Ministry of Education and School of Biotechnology, Jiangnan University, Wuxi, China; Department of Biological Sciences, 702 Sherman Fairchild Center, MC2434, Columbia University, New York, NY, USA
- Rongzhen Zhang
  - Key Laboratory of Industrial Biotechnology of Ministry of Education and School of Biotechnology, Jiangnan University, Wuxi, China
- Chi Wang
  - Department of Biological Sciences, 702 Sherman Fairchild Center, MC2434, Columbia University, New York, NY, USA; Cryo-Electron Microscopy Center, Columbia University Irving Medical Center, New York, NY, USA
- Farhad Forouhar
  - Department of Biological Sciences, 702 Sherman Fairchild Center, MC2434, Columbia University, New York, NY, USA; Macromolecular Crystallography Shared Resource, Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
- Oliver B Clarke
  - Department of Physiology and Cellular Biophysics and Department of Anesthesiology, Columbia University Irving Medical Center, New York, NY, USA
- Sergey Vorobiev
  - Department of Biological Sciences, 702 Sherman Fairchild Center, MC2434, Columbia University, New York, NY, USA
- Shikha Singh
  - Department of Biological Sciences, 702 Sherman Fairchild Center, MC2434, Columbia University, New York, NY, USA
- Gaetano T Montelione
  - Department of Chemistry & Chemical Biology and Center for Biotechnology & Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, USA
- Thomas Szyperski
  - Department of Chemistry, State University of New York at Buffalo, Buffalo, NY, USA
- Yan Xu
  - Key Laboratory of Industrial Biotechnology of Ministry of Education and School of Biotechnology, Jiangnan University, Wuxi, China
- John F Hunt
  - Department of Biological Sciences, 702 Sherman Fairchild Center, MC2434, Columbia University, New York, NY, USA
8
Tubiana J, Schneidman-Duhovny D, Wolfson HJ. ScanNet: A web server for structure-based prediction of protein binding sites with geometric deep learning. J Mol Biol 2022; 434:167758. [DOI: 10.1016/j.jmb.2022.167758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Revised: 07/18/2022] [Accepted: 07/19/2022] [Indexed: 11/28/2022]
9
ScanNet: an interpretable geometric deep learning model for structure-based protein binding site prediction. Nat Methods 2022; 19:730-739. [DOI: 10.1038/s41592-022-01490-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/05/2021] [Accepted: 04/12/2022] [Indexed: 11/08/2022]
10
Gao M, Nakajima An D, Parks JM, Skolnick J. AF2Complex predicts direct physical interactions in multimeric proteins with deep learning. Nat Commun 2022; 13:1744. [PMID: 35365655 PMCID: PMC8975832 DOI: 10.1038/s41467-022-29394-2] [Citation(s) in RCA: 107] [Impact Index Per Article: 53.5] [Received: 11/08/2021] [Accepted: 03/15/2022] [Indexed: 12/20/2022] Open
Abstract
Accurate descriptions of protein-protein interactions are essential for understanding biological systems. Remarkably accurate atomic structures have been recently computed for individual proteins by AlphaFold2 (AF2). Here, we demonstrate that the same neural network models from AF2 developed for single protein sequences can be adapted to predict the structures of multimeric protein complexes without retraining. In contrast to common approaches, our method, AF2Complex, does not require paired multiple sequence alignments. It achieves higher accuracy than some complex protein-protein docking strategies and provides a significant improvement over AF-Multimer, a development of AlphaFold for multimeric proteins. Moreover, we introduce metrics for predicting direct protein-protein interactions between arbitrary protein pairs and validate AF2Complex on some challenging benchmark sets and the E. coli proteome. Lastly, using the cytochrome c biogenesis system I as an example, we present high-confidence models of three sought-after assemblies formed by eight members of this system. Here the authors present AF2Complex and show that application to the E. coli cytochrome biogenesis system I yields confident computational models for three sought-after assemblies.
Affiliation(s)
- Mu Gao
  - Center for the Study of Systems Biology, School of Biological Sciences, Atlanta, GA, USA.
- Davi Nakajima An
  - School of Computer Science, Georgia Institute of Technology, Atlanta, GA, USA
- Jerry M Parks
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Jeffrey Skolnick
  - Center for the Study of Systems Biology, School of Biological Sciences, Atlanta, GA, USA.
11
Casadio R, Martelli PL, Savojardo C. Machine learning solutions for predicting protein–protein interactions. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Rita Casadio
  - Biocomputing Group, University of Bologna, Bologna, Italy
12
Sridharan R, Krishnaswamy V, Kumar PS, Vidhya TA, Sivamurugan V, Kumar DT, Doss CGP, Vo DVN. Analysis and effective separation of toxic pollutants from water resources using MBBR: Pathway prediction using alkaliphilic P. mendocina. The Science of the Total Environment 2021; 797:149135. [PMID: 34311373 DOI: 10.1016/j.scitotenv.2021.149135] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/12/2021] [Revised: 07/11/2021] [Accepted: 07/14/2021] [Indexed: 06/13/2023]
Abstract
Azo dyes are highly toxic and act as notable mutagens and carcinogens. This has a significant effect on human health, plants, animals, and aquatic and terrestrial environments. Thus, the degradation of the azo dyes is exclusively studied using the conventional methods, of which biodegradation is an eco-friendly approach. Hence, the present study is focused on the elucidation of the reactive mixed azo dye degradation pathway using MBBR and the laccase enzyme produced by an alkaliphilic bacterium, P. mendocina. Synthetic wastewater treatment performed using MBBR was very effective, reducing the COD and BOD to 90 mg/L and 460 mg/L. The potential degrader P. mendocina was isolated and laccase enzyme was screened. Finally, the degradation pathway was elucidated. The in silico toxicity analysis predicted Reactive Red and Reactive Brown as developmental toxicants, whereas Reactive Black was predicted as a developmental non-toxicant. Docking studies were performed to understand the interaction of laccase with compounds evolved from the dyes.
Affiliation(s)
- Rajalakshmi Sridharan
  - Department of Biotechnology, Stella Maris College (Autonomous) Affiliated to University of Madras, Chennai, Tamil Nadu 600 086, India
- Veenagayathri Krishnaswamy
  - Department of Biotechnology, Stella Maris College (Autonomous) Affiliated to University of Madras, Chennai, Tamil Nadu 600 086, India.
- P Senthil Kumar
  - Department of Chemical Engineering, Sri Sivasubramaniya Nadar College of Engineering, Chennai 603110, India; Centre of Excellence in Water Research (CEWAR), Sri Sivasubramaniya Nadar College of Engineering, Chennai 603110, India.
- T Akshaya Vidhya
  - Department of Biotechnology, Stella Maris College (Autonomous) Affiliated to University of Madras, Chennai, Tamil Nadu 600 086, India
- D Thirumal Kumar
  - Department of Bioinformatics, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai 602 105, India
- C George Priya Doss
  - School of BioSciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu 632 014, India
- Dai-Viet N Vo
  - Institute of Environmental Sciences, Nguyen Tat Thanh University, Ho Chi Minh City, Viet Nam
13
Abstract
This review provides the feasible literature on drug discovery through ML tools and techniques that are enforced in every phase of drug development to accelerate the research process and deduce the risk and expenditure in clinical trials. Machine learning techniques improve the decision-making in pharmaceutical data across various applications like QSAR analysis, hit discoveries, de novo drug architectures to retrieve accurate outcomes. Target validation, prognostic biomarkers, digital pathology are considered under problem statements in this review. ML challenges must be applicable for the main cause of inadequacy in interpretability outcomes that may restrict the applications in drug discovery. In clinical trials, absolute and methodological data must be generated to tackle many puzzles in validating ML techniques, improving decision-making, promoting awareness in ML approaches, and deducing risk failures in drug discovery.
Affiliation(s)
- Suresh Dara
  - Department of Computer Science and Engineering, B V Raju Institute of Technology, Narsapur, Medak, 502313, Telangana, India
- Swetha Dhamercherla
  - Department of Computer Science and Engineering, B V Raju Institute of Technology, Narsapur, Medak, 502313, Telangana, India
- Surender Singh Jadav
  - Centre for Molecular Cancer Research (CMCR) and Vishnu Institute of Pharmaceutical Education and Research (VIPER), Narsapur, Medak, 502313, Telangana, India
- CH Madhu Babu
  - Department of Computer Science and Engineering, B V Raju Institute of Technology, Narsapur, Medak, 502313, Telangana, India
- Mohamed Jawed Ahsan
  - Department of Pharmaceutical Chemistry, Maharishi Arvind College of Pharmacy, Jaipur, 302023, Rajasthan, India
14
Matos-Filipe P, Preto AJ, Koukos PI, Mourão J, Bonvin AMJJ, Moreira IS. MENSAdb: a thorough structural analysis of membrane protein dimers. Database (Oxford) 2021; 2021:baab013. [PMID: 33822911 PMCID: PMC8023553 DOI: 10.1093/database/baab013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2020] [Revised: 01/19/2021] [Accepted: 03/01/2021] [Indexed: 11/14/2022]
Abstract
Membrane proteins (MPs) are key players in a variety of different cellular processes and constitute the target of around 60% of all Food and Drug Administration-approved drugs. Despite their importance, there is still a massive lack of relevant structural, biochemical and mechanistic information mainly due to their localization within the lipid bilayer. To help fulfil this gap, we developed the MEmbrane protein dimer Novel Structure Analyser database (MENSAdb). This interactive web application summarizes the evolutionary and physicochemical properties of dimeric MPs to expand the available knowledge on the fundamental principles underlying their formation. Currently, MENSAdb contains features of 167 unique MPs (63% homo- and 37% heterodimers) and brings insights into the conservation of residues, accessible solvent area descriptors, average B-factors, intermolecular contacts at 2.5 Å and 4.0 Å distance cut-offs, hydrophobic contacts, hydrogen bonds, salt bridges, π-π stacking, T-stacking and cation-π interactions. The regular update and organization of all these data into a unique platform will allow a broad community of researchers to collect and analyse a large number of features efficiently, thus facilitating their use in the development of prediction models associated with MPs. Database URL: http://www.moreiralab.com/resources/mensadb.
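To make the idea of intermolecular contacts at distance cut-offs concrete, here is a minimal sketch that counts atom pairs between two chains lying within 2.5 Å and 4.0 Å of each other. It is not MENSAdb's pipeline: the function name `count_contacts` and the toy coordinates are hypothetical, and a real analysis would read atom positions from a structure file.

```python
from itertools import product
from math import dist


def count_contacts(chain_a, chain_b, cutoff):
    """Count atom pairs (one atom from each chain) at most `cutoff` Å apart."""
    return sum(1 for a, b in product(chain_a, chain_b) if dist(a, b) <= cutoff)


# Hypothetical toy coordinates in Å, chosen only to illustrate the cut-offs.
chain_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
chain_b = [(0.0, 2.0, 0.0), (5.0, 0.0, 0.0), (3.0, 0.0, 3.9)]

for cutoff in (2.5, 4.0):
    print(f"{cutoff} Å: {count_contacts(chain_a, chain_b, cutoff)} contacts")
```

The all-pairs loop is quadratic in the number of atoms, which is fine for a toy case; for full structures a spatial index such as a k-d tree is the usual choice.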
Affiliation(s)
- Pedro Matos-Filipe
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
- António J Preto
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
  - PhD Programme in Experimental Biology and Biomedicine, Institute for Interdisciplinary Research, University of Coimbra, Coimbra, 3030-789, Portugal
- Panagiotis I Koukos
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584 CH, Netherlands
- Joana Mourão
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
- Alexandre M J J Bonvin
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584 CH, Netherlands
- Irina S Moreira
  - Department of Life Sciences, University of Coimbra, Coimbra, 3000-456, Portugal
  - Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
15
He FF, Xin YY, Ma YX, Yang S, Fei H. Rational design to enhance the catalytic activity of 2-deoxy-D-ribose-5-phosphate aldolase from Pseudomonas syringae pv. syringae B728a. Protein Expr Purif 2021; 183:105863. [PMID: 33677085 DOI: 10.1016/j.pep.2021.105863] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/06/2020] [Revised: 02/26/2021] [Accepted: 02/28/2021] [Indexed: 11/24/2022]
Abstract
The 2-Deoxy-d-ribose-5-phosphate aldolase (DERA) enzyme in psychrophilic bacteria has gradually attracted the attention of researchers. A novel gene, deoC (681 bp), encoding DERAPsy, was identified in Pseudomonas syringae pv. syringae B728a, recombinantly expressed in E. coli BL21 and purified via affinity chromatography, which yielded a homodimeric enzyme of 23 kDa. The specific activity of DERAPsy toward 2-deoxy-d-ribose-5-phosphate (DR5P) was 7.37 ± 0.03 U/mg, and 61.32% of its initial activity remained after incubation in 300 mM acetaldehyde at 25 °C for 2 h. Based on the calculation results (dock binding free energy) with the ligand chloroacetaldehyde (CAH), five target substitutions (T16L, F69R, V66K, S188V, and G189R) were identified, in which the DERAPsy mutant (G189R) exhibited higher catalytic activity toward DR5P than DERAPsy. Only the DERAPsy mutant (V66K) exhibited 12% higher activity toward chloroacetaldehyde and acetaldehyde condensation reactions than DERAPsy. Fortunately, the aldehyde tolerance of these mutants exhibited no significant decline compared with the wild type. These results indicate an effective strategy for enhancing DERA activity.
Affiliation(s)
- Fei-Fan He
- College of Life Sciences and Medicine, Zhejiang Sci-Tech University, 310018, China; Zhejiang Provincial Key Laboratory of Silkworm Bioreactor and Biomedicine, Zhejiang Sci-Tech University, Hangzhou, 310018, China
| | - Yi-Yao Xin
- College of Life Sciences and Medicine, Zhejiang Sci-Tech University, 310018, China
| | - Yuan-Xin Ma
- College of Life Sciences and Medicine, Zhejiang Sci-Tech University, 310018, China; Zhejiang Provincial Key Laboratory of Silkworm Bioreactor and Biomedicine, Zhejiang Sci-Tech University, Hangzhou, 310018, China
| | - Shun Yang
- College of Life Sciences and Medicine, Zhejiang Sci-Tech University, 310018, China.
| | - Hui Fei
- College of Life Sciences and Medicine, Zhejiang Sci-Tech University, 310018, China; Zhejiang Provincial Key Laboratory of Silkworm Bioreactor and Biomedicine, Zhejiang Sci-Tech University, Hangzhou, 310018, China.
|
16
|
Slater O, Miller B, Kontoyianni M. Decoding Protein-protein Interactions: An Overview. Curr Top Med Chem 2021; 20:855-882. [PMID: 32101126 DOI: 10.2174/1568026620666200226105312] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/13/2019] [Revised: 11/27/2019] [Accepted: 11/27/2019] [Indexed: 12/24/2022]
Abstract
Drug discovery has focused on the paradigm "one drug, one target" for a long time. However, small molecules can act at multiple macromolecular targets, which serves as the basis for drug repurposing. In an effort to expand the target space, and given advances in X-ray crystallography, protein-protein interactions have become an emerging focus area of drug discovery enterprises. Proteins interact with other biomolecules and it is this intricate network of interactions that determines the behavior of the system and its biological processes. In this review, we briefly discuss networks in disease, followed by computational methods for protein-protein complex prediction. Computational methodologies and techniques employed towards objectives such as protein-protein docking, protein-protein interactions, and interface predictions are described extensively. Docking aims at producing a complex between proteins, while interface predictions identify a subset of residues on one protein that could interact with a partner, and protein-protein interaction sites address whether two proteins interact. In addition, approaches to predict hot spots and binding sites are presented along with a representative example of our internal project on the chemokine CXC receptor 3 B-isoform and predictive modeling with IP10 and PF4.
Affiliation(s)
- Olivia Slater
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
| | - Bethany Miller
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
| | - Maria Kontoyianni
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
|
17
|
Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. Wiley Interdiscip Rev Comput Mol Sci 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Affiliation(s)
- Jessica Andreani
- Université Paris‐Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif‐sur‐Yvette, France
| | - Chloé Quignot
- Université Paris‐Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif‐sur‐Yvette, France
| | - Raphael Guerois
- Université Paris‐Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif‐sur‐Yvette, France
|
18
|
Chakravarty D, McElfresh GW, Kundrotas PJ, Vakser IA. How to choose templates for modeling of protein complexes: Insights from benchmarking template-based docking. Proteins 2020; 88:1070-1081. [PMID: 31994759 DOI: 10.1002/prot.25875] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/18/2019] [Revised: 01/07/2020] [Accepted: 01/22/2020] [Indexed: 01/01/2023]
Abstract
Comparative docking is based on experimentally determined structures of protein-protein complexes (templates), following the paradigm that proteins with similar sequences and/or structures form similar complexes. Modeling that uses the structural similarity of target monomers to template complexes significantly expands structural coverage of the interactome. Template-based docking by structure alignment can be performed for the entire structures or by aligning targets to the bound interfaces of the experimentally determined complexes. Systematic benchmarking of docking protocols based on full and interface structure alignment showed that both protocols perform similarly, with a top-1 docking success rate of 26%. However, in terms of model quality, interface-based docking performed marginally better. Interface-based docking is preferable when a significant conformational change in the full protein structure upon binding is suspected, for example, a rearrangement of the domains in multidomain proteins. Importantly, if the same structure is selected as the top template by both full and interface alignment, the docking success rate increases 2-fold for both top-1 and top-10 predictions. Matching structural annotations of the target and template proteins for template detection, as a computationally less expensive alternative to structural alignment, did not improve the docking performance. Sophisticated remote sequence homology detection added templates to the pool of those identified by structure-based alignment, suggesting that for practical docking, combining the structure alignment protocols with remote sequence homology detection may be useful to avoid potential flaws in the generation of the structural template library.
Affiliation(s)
| | - G W McElfresh
- Computational Biology Program, The University of Kansas, Lawrence, Kansas
| | - Petras J Kundrotas
- Computational Biology Program, The University of Kansas, Lawrence, Kansas
| | - Ilya A Vakser
- Computational Biology Program, The University of Kansas, Lawrence, Kansas; Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
|
19
|
Kundrotas PJ, Kotthoff I, Choi SW, Copeland MM, Vakser IA. Dockground Tool for Development and Benchmarking of Protein Docking Procedures. Methods Mol Biol 2020; 2165:289-300. [PMID: 32621232 DOI: 10.1007/978-1-0716-0708-4_17] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
Databases of protein-protein complexes are essential for the development of protein modeling/docking techniques. Such databases provide a knowledge base for docking algorithms, intermolecular potentials, search procedures, scoring functions, and refinement protocols. Development of docking techniques requires systematic validation of the modeling protocols on carefully curated benchmark sets of complexes. We present a description and a guide to the DOCKGROUND resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions. The resource integrates various datasets of protein complexes and other data for the development and testing of protein docking techniques. The sets include bound complexes, experimentally determined unbound, simulated unbound, model-model complexes, and docking decoys. The datasets are available to the user community through a Web interface.
Affiliation(s)
- Petras J Kundrotas
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA.
| | - Ian Kotthoff
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA
| | - Sherman W Choi
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA
| | - Matthew M Copeland
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA
| | - Ilya A Vakser
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA.
|
20
|
Gemovic B, Sumonja N, Davidovic R, Perovic V, Veljkovic N. Mapping of Protein-Protein Interactions: Web-Based Resources for Revealing Interactomes. Curr Med Chem 2019; 26:3890-3910. [PMID: 29446725 DOI: 10.2174/0929867325666180214113704] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/26/2017] [Revised: 09/14/2017] [Accepted: 01/29/2018] [Indexed: 01/04/2023]
Abstract
BACKGROUND: The significant number of protein-protein interactions (PPIs) discovered by harnessing concomitant advances in the fields of sequencing, crystallography, spectrometry and two-hybrid screening suggests astonishing prospects for remodelling drug discovery. The PPI space, which includes up to 650 000 entities, is a remarkable reservoir of potential therapeutic targets for every human disease. To allow modern drug discovery programs to leverage this, we should be able to discern complete PPI maps associated with a specific disorder and the corresponding normal physiology. OBJECTIVE: Here, we review community-available computational programs for predicting PPIs and web-based resources for storing experimentally annotated interactions. METHODS: We compared the capacities of the prediction tools iLoops, Struct2Net, HOMCOS, COTH, PrePPI, InterPreTS and PRISM to predict recently discovered protein interactions. RESULTS: We described sequence-based and structure-based PPI prediction tools and addressed their peculiarities. Additionally, since the usefulness of prediction algorithms critically depends on the quality and quantity of the experimental data they are built on, we extensively discussed community resources for protein interactions. We focused on the active and recently updated primary and secondary PPI databases, repositories specialized by subject or species, as well as databases that include both experimental and predicted PPIs. CONCLUSION: PPI complexes are the basis of important physiological processes and are therefore possible targets for cell-penetrating ligands. Reliable computational PPI predictions can speed up new target discoveries through prioritization of therapeutically relevant protein-protein complexes for experimental studies.
Affiliation(s)
- Branislava Gemovic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
| | - Neven Sumonja
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
| | - Radoslav Davidovic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
| | - Vladimir Perovic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
| | - Nevena Veljkovic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
|
21
|
Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R, Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 2018; 46:W296-W303. [PMID: 29788355 PMCID: PMC6030848 DOI: 10.1093/nar/gky427] [Citation(s) in RCA: 7048] [Impact Index Per Article: 1409.6] [Received: 02/09/2018] [Accepted: 05/07/2018] [Indexed: 11/13/2022] Open
Abstract
Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.
Affiliation(s)
- Andrew Waterhouse
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Martino Bertoni
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Stefan Bienert
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Gabriel Studer
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Gerardo Tauriello
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Rafal Gumienny
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Florian T Heer
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Tjaart A P de Beer
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Christine Rempfer
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Lorenza Bordoli
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Rosalba Lepore
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
|
22
|
Verma R, Pandit SB. Unraveling the structural landscape of intra-chain domain interfaces: Implication in the evolution of domain-domain interactions. PLoS One 2019; 14:e0220336. [PMID: 31374091 PMCID: PMC6677297 DOI: 10.1371/journal.pone.0220336] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/12/2019] [Accepted: 07/12/2019] [Indexed: 12/22/2022] Open
Abstract
Intra-chain domain interactions are known to play a significant role in the function and stability of multidomain proteins. These interactions are mediated through physical contact at domain-domain interfaces (DDIs). To understand the evolution of interfaces, we have investigated similarities among DDIs. Even though interfaces of protein-protein interactions (PPIs) have previously been studied by structurally aligning interfaces, similar analyses have not yet been performed on DDIs of either multidomain proteins or PPIs. To study the structural landscape of DDIs, we used iAlign to structurally align intra-chain domain interfaces. The interface alignment of spatially constrained domains (due to inter-domain linkers) showed that ~88% of these could identify a structurally matching interface with similar C-alpha geometry and contact pattern, even though the aligned domain pairs are not structurally related. Moreover, the mean interface similarity score (IS-score) is 0.307, which is higher than the average random IS-score (0.207), suggesting that domain interfaces are not random. The structural space of DDIs is highly connected: ~84% of all possible directed edges among interfaces have a path length of at most 8 when the IS-score threshold is 0.26. At this threshold, ~83% of interfaces form the largest strongly connected component. This suggests that the structural space of intra-chain domain interfaces is degenerate and highly connected, as has been found for PPI interfaces. Interestingly, searching for structural neighbors of inter-chain interfaces among intra-chain interfaces showed that ~86% could find a statistically significant match to an intra-chain interface, with a mean IS-score of 0.311. This implies that domain interfaces are degenerate whether formed within a protein or between proteins. The interface degeneracy is most likely due to the limited number of possible ways of packing secondary structures. In principle, interface similarities can be exploited to accurately model domain interfaces in structure prediction of multidomain proteins.
Affiliation(s)
- Rivi Verma
- Department of Biological Sciences, Indian Institute of Science Education and Research, Mohali, India
| | - Shashi Bhushan Pandit
- Department of Biological Sciences, Indian Institute of Science Education and Research, Mohali, India
|
23
|
Wang L, Wang HF, Liu SR, Yan X, Song KJ. Predicting Protein-Protein Interactions from Matrix-Based Protein Sequence Using Convolution Neural Network and Feature-Selective Rotation Forest. Sci Rep 2019; 9:9848. [PMID: 31285519 PMCID: PMC6614364 DOI: 10.1038/s41598-019-46369-4] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 03/05/2019] [Accepted: 06/10/2019] [Indexed: 01/09/2023] Open
Abstract
Proteins are an essential component of living organisms. The prediction of protein-protein interactions (PPIs) has important implications for understanding the behavioral processes of life, preventing diseases, and developing new drugs. Although high-throughput technology makes it possible to identify PPIs in large-scale biological experiments, the extensive use of experimental methods is restricted by constraints of time, cost, false positive rate and other conditions. Therefore, there is an urgent need for computational methods that supplement experimental methods to predict PPIs rapidly and accurately. In this paper, we propose a novel approach, namely CNN-FSRF, for predicting PPIs from protein sequence by combining a deep learning Convolution Neural Network (CNN) with a Feature-Selective Rotation Forest (FSRF). The proposed method first converts the protein sequence into a Position-Specific Scoring Matrix (PSSM) containing biological evolution information, then uses the CNN to objectively and efficiently extract the deeply hidden features of the protein, and finally removes redundant noise information with FSRF and gives the prediction results. On the Yeast and Helicobacter pylori PPI datasets, CNN-FSRF achieved prediction accuracies of 97.75% and 88.96%, respectively. To further evaluate the prediction performance, we compared CNN-FSRF with SVM and other existing methods. In addition, we verified the performance of CNN-FSRF on independent datasets. The experimental results indicate that CNN-FSRF can serve as a useful complement to biological experiments for identifying protein interactions.
Affiliation(s)
- Lei Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, Shandong, 277100, P.R. China; Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, P.R. China.
| | - Hai-Feng Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, Shandong, 277100, P.R. China
| | - San-Rong Liu
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, Shandong, 277100, P.R. China
| | - Xin Yan
- School of Foreign Languages, Zaozhuang University, Zaozhuang, Shandong, 277100, P.R. China.
| | - Ke-Jian Song
- School of information engineering, JiangXi University of Science and Technology, Ganzhou, Jiangxi, 341000, P.R. China
|
24
|
Heterologous expression and characterization of novel 2-Deoxy-d-ribose-5-phosphate aldolase (DERA) from Pyrobaculum calidifontis and Meiothermus ruber. Process Biochem 2019. [DOI: 10.1016/j.procbio.2019.02.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]
|
25
|
Guven-Maiorov E, Tsai CJ, Ma B, Nussinov R. Interface-Based Structural Prediction of Novel Host-Pathogen Interactions. Methods Mol Biol 2019; 1851:317-335. [PMID: 30298406 PMCID: PMC8192064 DOI: 10.1007/978-1-4939-8736-8_18] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/26/2022]
Abstract
About 20% of cancer cases worldwide have been estimated to be associated with infections. However, the molecular mechanisms by which infections contribute to host tumorigenesis are still unknown. To evade host defense, pathogens hijack host proteins at different levels: sequence, structure, motif, and binding surface, i.e., interface. Interface similarity allows pathogen proteins to compete with host counterparts to bind to a target protein, rewire physiological signaling, and cause persistent infections as well as cancer. Identification of host-pathogen interactions (HPIs), along with their structural details at atomic resolution, may provide mechanistic insight into pathogen-driven cancers and lead to innovative therapeutic interventions. HPI data that include structural details are scarce, and large-scale experimental detection is challenging. Therefore, there is an urgent and mounting need for efficient and robust computational approaches to predict HPIs and their complex (bound) structures. In this chapter, we review the first and currently only interface-based computational approach to identify novel HPIs. The concept of interface mimicry promises to identify more HPIs than complete sequence or structural similarity. We illustrate this concept with a case study on Kaposi's sarcoma herpesvirus (KSHV) to elucidate how it subverts host immunity and contributes to malignant transformation of the host cells.
Affiliation(s)
- Emine Guven-Maiorov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
| | - Chung-Jung Tsai
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
| | - Buyong Ma
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
| | - Ruth Nussinov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA.
- Department of Human Genetics and Molecular Medicine, Sackler Inst. of Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel.
|
26
|
Nadalin F, Carbone A. Protein-protein interaction specificity is captured by contact preferences and interface composition. Bioinformatics 2018; 34:459-468. [PMID: 29028884 PMCID: PMC5860360 DOI: 10.1093/bioinformatics/btx584] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 10/31/2016] [Accepted: 09/18/2017] [Indexed: 12/24/2022] Open
Abstract
Motivation: Large-scale computational docking will be increasingly used in future years to discriminate protein–protein interactions at residue resolution. Complete cross-docking experiments make in silico reconstruction of protein–protein interaction networks a feasible goal. They call for efficient and accurate screening of the millions of structural conformations produced by the calculations. Results: We propose CIPS (Combined Interface Propensity for decoy Scoring), a new pair potential combining interface composition with residue–residue contact preference. CIPS outperforms several other methods on screening docking solutions obtained with either all-atom or coarse-grain rigid docking. Further testing on 28 CAPRI targets corroborates the predictive power of CIPS over existing methods. By combining CIPS with atomic potentials, discrimination of correct conformations in all-atom structures reaches optimal accuracy. The drastic reduction of candidate solutions produced by thousands of proteins docked against each other makes large-scale docking accessible to analysis. Availability and implementation: CIPS source code is freely available at http://www.lcqb.upmc.fr/CIPS. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Francesca Nadalin
- Sorbonne Universités, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative-UMR 7238, 75005 Paris, France
| | - Alessandra Carbone
- Sorbonne Universités, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative-UMR 7238, 75005 Paris, France; Institut Universitaire de France, 75005 Paris, France
|
27
|
Fleming JR, Schupfner M, Busch F, Baslé A, Ehrmann A, Sterner R, Mayans O. Evolutionary Morphing of Tryptophan Synthase: Functional Mechanisms for the Enzymatic Channeling of Indole. J Mol Biol 2018; 430:5066-5079. [PMID: 30367843 DOI: 10.1016/j.jmb.2018.10.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/10/2018] [Revised: 08/29/2018] [Accepted: 10/19/2018] [Indexed: 10/28/2022]
Abstract
Tryptophan synthase (TrpS) is a heterotetrameric αββα enzyme that exhibits complex substrate channeling and allosteric mechanisms and is a model system in enzymology. In this work, we characterize proposed early and late evolutionary states of TrpS and show that they have distinct quaternary structures caused by insertions-deletions of sequence segments (indels) in the β-subunit. Remarkably, indole hydrophobic channels that connect α and β active sites have re-emerged in both TrpS types, yet they follow different paths through the β-subunit fold. Also, both TrpS geometries activate the α-subunit through the rearrangement of loops flanking the active site. Our results link evolutionary sequence changes in the enzyme subunits with channeling and allostery in the TrpS enzymes. The findings demonstrate that indels allow protein quaternary architectures to escape "minima" in the evolutionary landscape, thereby overcoming the conservational constraints imposed by existing functional interfaces and being free to morph into new mechanistic enzymes.
Affiliation(s)
| | - Michael Schupfner
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040 Regensburg, Germany
| | - Florian Busch
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040 Regensburg, Germany
| | - Arnaud Baslé
- Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK
| | - Alexander Ehrmann
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040 Regensburg, Germany
| | - Reinhard Sterner
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040 Regensburg, Germany
| | - Olga Mayans
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany; Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK.
|
28
|
Macalino SJY, Basith S, Clavio NAB, Chang H, Kang S, Choi S. Evolution of In Silico Strategies for Protein-Protein Interaction Drug Discovery. Molecules 2018; 23:E1963. [PMID: 30082644 PMCID: PMC6222862 DOI: 10.3390/molecules23081963] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 07/17/2018] [Revised: 08/03/2018] [Accepted: 08/04/2018] [Indexed: 12/14/2022] Open
Abstract
The advent of advanced molecular modeling software, big data analytics, and high-speed processing units has led to the exponential evolution of modern drug discovery and better insights into complex biological processes and disease networks. This has progressively steered current research interests toward understanding protein-protein interaction (PPI) systems that are related to a number of relevant diseases, such as cancer, neurological illnesses, and metabolic disorders. However, targeting PPIs is challenging because of their "undruggable" binding interfaces. In this review, we focus on the current obstacles that impede PPI drug discovery, and on how recent discoveries and advances in in silico approaches can alleviate these barriers to expedite the search for potential leads, as shown in several exemplary studies. We also discuss currently available information on PPI compounds and systems, along with their usefulness in molecular modeling. Finally, we conclude by presenting the limits of in silico application in drug discovery and offer a perspective on the field of computer-aided PPI drug discovery.
Affiliation(s)
- Stephani Joy Y Macalino
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Shaherin Basith
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Nina Abigail B Clavio
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Hyerim Chang
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Soosung Kang
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| | - Sun Choi
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea.
| |
|
29
|
Göktepe YE, Kodaz H. Prediction of Protein-Protein Interactions Using An Effective Sequence Based Combined Method. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.03.062] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 10/17/2022]
|
30
|
Artificial intelligence in drug design. SCIENCE CHINA-LIFE SCIENCES 2018; 61:1191-1204. [PMID: 30054833 DOI: 10.1007/s11427-018-9342-2] [Citation(s) in RCA: 89] [Impact Index Per Article: 14.8] [Received: 04/24/2018] [Accepted: 05/22/2018] [Indexed: 12/27/2022]
Abstract
Thanks to fast improvements in computing power and the rapid development of computational chemistry and biology, computer-aided drug design techniques have been successfully applied in almost every stage of the drug discovery and development pipeline to speed up research and reduce the cost and risk related to preclinical and clinical trials. Owing to the development of machine learning theory and the accumulation of pharmacological data, artificial intelligence (AI) technology, as a powerful data mining tool, has made its mark in various fields of drug design, such as virtual screening, activity scoring, quantitative structure-activity relationship (QSAR) analysis, de novo drug design, and in silico evaluation of absorption, distribution, metabolism, excretion and toxicity (ADME/T) properties. Although it is still challenging to provide a physical explanation of AI-based models, AI has been a powerful aid to drug discovery through its versatile frameworks. Recently, owing to their strong generalization ability and powerful feature extraction capability, deep learning methods have been employed to predict molecular properties as well as to generate desired molecules, which will further promote the application of AI technologies in the field of drug design.
|
31
|
Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R, Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 2018. [PMID: 29788355 DOI: 10.1093/nar/gky427] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 04/25/2023] Open
Abstract
Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.
Affiliation(s)
- Andrew Waterhouse
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Martino Bertoni
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Stefan Bienert
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Gabriel Studer
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Gerardo Tauriello
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Rafal Gumienny
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Florian T Heer
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Tjaart A P de Beer
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Christine Rempfer
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Lorenza Bordoli
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Rosalba Lepore
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
| |
|
32
|
Geometric and amino acid type determinants for protein-protein interaction interfaces. QUANTITATIVE BIOLOGY 2018. [DOI: 10.1007/s40484-018-0138-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
|
33
|
Heterodimer Binding Scaffolds Recognition via the Analysis of Kinetically Hot Residues. Pharmaceuticals (Basel) 2018; 11:ph11010029. [PMID: 29547506 PMCID: PMC5874725 DOI: 10.3390/ph11010029] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 12/27/2017] [Revised: 03/06/2018] [Accepted: 03/08/2018] [Indexed: 12/13/2022] Open
Abstract
Physical interactions between proteins are often difficult to decipher. The aim of this paper is to present an algorithm that is designed to recognize binding patches and supporting structural scaffolds of interacting heterodimer proteins using the Gaussian Network Model (GNM). The recognition is based on the (self) adjustable identification of kinetically hot residues and their connection to possible binding scaffolds. The kinetically hot residues are residues with the lowest entropy, i.e., the highest contribution to the weighted sum of the fastest modes per chain extracted via GNM. The algorithm adjusts the number of fast modes in the GNM's weighted sum calculation using the ratio of predicted and expected numbers of target residues (contact and the neighboring first-layer residues). This approach produces very good results when applied to dimers with high protein sequence length ratios. The protocol's ability to recognize near-native decoys was compared to the ability of the residue-level statistical potential of Lu and Skolnick using the Sternberg and Vakser decoy dimer sets. The statistical potential produced better overall results, but in a number of cases its predicting ability was comparable, or even inferior, to the prediction ability of the adjustable GNM approach. The results presented in this paper suggest that in heterodimers at least one protein has an interacting scaffold determined by the immovable, kinetically hot residues. In many cases, interacting proteins (especially if of noticeably different sizes) either behave as a rigid lock and key or, presumably, exhibit the opposite dynamic behavior. While the binding surface of one protein is rigid and stable, its partner's interacting scaffold is more flexible and adaptable.
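The fast-mode scoring mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Cα coordinates as input, a 7 Å contact cutoff, a fixed number of fast modes, and eigenvalue-proportional weights, none of which are specified in the abstract, and it does not reproduce the adaptive choice of the number of modes. The function names `kirchhoff` and `fast_mode_score` are hypothetical.

```python
import numpy as np


def kirchhoff(coords, cutoff=7.0):
    """Build the GNM Kirchhoff (connectivity) matrix from C-alpha coordinates.

    The 7 Angstrom cutoff is an assumed, commonly used value.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances between all residues.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Two distinct residues are in contact when they lie within the cutoff.
    contact = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    # Diagonal holds each residue's contact count, so every row sums to zero.
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma


def fast_mode_score(coords, n_fast=3, cutoff=7.0):
    """Per-residue contribution to a weighted sum of the fastest GNM modes.

    Weights proportional to the eigenvalues are an illustrative assumption.
    """
    gamma = kirchhoff(coords, cutoff)
    # eigh returns eigenvalues in ascending order; the fastest modes are last.
    evals, evecs = np.linalg.eigh(gamma)
    fast_vals = evals[-n_fast:]
    fast_vecs = evecs[:, -n_fast:]
    weights = fast_vals / fast_vals.sum()
    # Squared eigenvector components, combined across the selected modes.
    return (fast_vecs ** 2) @ weights


# Toy input: six residues on a straight line, 3.8 Angstrom apart.
toy_coords = [[3.8 * i, 0.0, 0.0] for i in range(6)]
toy_scores = fast_mode_score(toy_coords)
```

Residues with the largest scores would be the candidates for "kinetically hot" residues under these assumptions; because the eigenvectors are normalized and the weights sum to one, the scores over a chain sum to one.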
|
34
|
Škrbić T, Zamuner S, Hong R, Seno F, Laio A, Trovato A. Vibrational entropy estimation can improve binding affinity prediction for non-obligatory protein complexes. Proteins 2018; 86:393-404. [DOI: 10.1002/prot.25454] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/11/2017] [Revised: 12/22/2017] [Accepted: 01/05/2018] [Indexed: 01/10/2023]
Affiliation(s)
- Tatjana Škrbić
- Faculty of Physics; International School for Advanced Studies (SISSA/ISAS); Trieste Italy
- Department of Physics and Astronomy “Galileo Galilei”; University of Padova; Padova Italy
| | - Stefano Zamuner
- Department of Physics and Astronomy “Galileo Galilei”; University of Padova; Padova Italy
| | - Rolando Hong
- Faculty of Physics; International School for Advanced Studies (SISSA/ISAS); Trieste Italy
| | - Flavio Seno
- Department of Physics and Astronomy “Galileo Galilei”; University of Padova; Padova Italy
- Padova Section, National Institute of Nuclear Physics (INFN); Padova Italy
| | - Alessandro Laio
- Faculty of Physics; International School for Advanced Studies (SISSA/ISAS); Trieste Italy
| | - Antonio Trovato
- Department of Physics and Astronomy “Galileo Galilei”; University of Padova; Padova Italy
- Padova Section, National Institute of Nuclear Physics (INFN); Padova Italy
| |
|
35
|
Structure-based prediction of ligand-protein interactions on a genome-wide scale. Proc Natl Acad Sci U S A 2017; 114:13685-13690. [PMID: 29229851 DOI: 10.1073/pnas.1705381114] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Indexed: 11/18/2022] Open
Abstract
We report a template-based method, LT-scanner, which scans the human proteome using protein structural alignment to identify proteins that are likely to bind ligands that are present in experimentally determined complexes. A scoring function that rapidly accounts for binding site similarities between the template and the proteins being scanned is a crucial feature of the method. The overall approach is first tested based on its ability to predict the residues on the surface of a protein that are likely to bind small-molecule ligands. The algorithm that we present, LBias, is shown to compare very favorably to existing algorithms for binding site residue prediction. LT-scanner's performance is evaluated based on its ability to identify known targets of Food and Drug Administration (FDA)-approved drugs and it too proves to be highly effective. The specificity of the scoring function that we use is demonstrated by the ability of LT-scanner to identify the known targets of FDA-approved kinase inhibitors based on templates involving other kinases. Combining sequence with structural information further improves LT-scanner performance. The approach we describe is extendable to the more general problem of identifying binding partners of known ligands even if they do not appear in a structurally determined complex, although this will require the integration of methods that combine protein structure and chemical compound databases.
|
36
|
Guven-Maiorov E, Tsai CJ, Ma B, Nussinov R. Prediction of Host-Pathogen Interactions for Helicobacter pylori by Interface Mimicry and Implications to Gastric Cancer. J Mol Biol 2017; 429:3925-3941. [PMID: 29106933 PMCID: PMC7906438 DOI: 10.1016/j.jmb.2017.10.023] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 07/14/2017] [Revised: 10/16/2017] [Accepted: 10/16/2017] [Indexed: 02/07/2023]
Abstract
There is a strong correlation between some pathogens and certain cancer types. One example is Helicobacter pylori and gastric cancer. Exactly how they contribute to host tumorigenesis is, however, a mystery. Pathogens often interact with the host through proteins. To subvert defense, they may mimic host proteins at the sequence, structure, motif, or interface levels. Interface similarity permits pathogen proteins to compete with those of the host for a target protein and thereby alter the host signaling. Detection of host-pathogen interactions (HPIs) and mapping the re-wired superorganism HPI network, with structural details, can provide unprecedented clues to the underlying mechanisms and help therapeutics. Here, we describe the first computational approach exploiting solely interface mimicry to model potential HPIs. Interface mimicry can identify more HPIs than sequence or complete structural similarity since it appears more common than the other mimicry types. We illustrate the usefulness of this concept by modeling HPIs of H. pylori to understand how they modulate host immunity, persist lifelong, and contribute to tumorigenesis. H. pylori proteins interfere with multiple host pathways as they target several host hub proteins. Our results help illuminate the structural basis of resistance to apoptosis, immune evasion, and loss of cell junctions seen in H. pylori-infected host cells.
Affiliation(s)
- Emine Guven-Maiorov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA.
| | - Chung-Jung Tsai
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA.
| | - Buyong Ma
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA.
| | - Ruth Nussinov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA; Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel.
| |
|
37
|
Jelínek J, Škoda P, Hoksza D. Utilizing knowledge base of amino acids structural neighborhoods to predict protein-protein interaction sites. BMC Bioinformatics 2017; 18:492. [PMID: 29244012 PMCID: PMC5731498 DOI: 10.1186/s12859-017-1921-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Protein-protein interactions (PPI) play a key role in an investigation of various biochemical processes, and their identification is thus of great importance. Although computational prediction of which amino acids take part in a PPI has been an active field of research for some time, the quality of in-silico methods is still far from perfect. RESULTS: We have developed a novel prediction method called INSPiRE which benefits from a knowledge base built from data available in Protein Data Bank. All proteins involved in PPIs were converted into labeled graphs with nodes corresponding to amino acids and edges to pairs of neighboring amino acids. A structural neighborhood of each node was then encoded into a bit string and stored in the knowledge base. When predicting PPIs, INSPiRE labels amino acids of unknown proteins as interface or non-interface based on how often their structural neighborhood appears as interface or non-interface in the knowledge base. We evaluated INSPiRE's behavior with respect to different types and sizes of the structural neighborhood. Furthermore, we examined the suitability of several different features for labeling the nodes. Our evaluations showed that INSPiRE clearly outperforms existing methods with respect to Matthews correlation coefficient. CONCLUSION: In this paper we introduce a new knowledge-based method for identification of protein-protein interaction sites called INSPiRE. Its knowledge base utilizes structural patterns of known interaction sites in the Protein Data Bank which are then used for PPI prediction. Extensive experiments on several well-established datasets show that INSPiRE significantly surpasses existing PPI approaches.
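The frequency-based labeling described in this abstract can be sketched as a simple lookup table. This is an illustration only, not the INSPiRE code: the class name `NeighborhoodKnowledgeBase`, the use of plain strings as stand-ins for the encoded bit strings, the 0.5 decision threshold, and the treatment of unseen neighborhoods as non-interface are all assumptions.

```python
from collections import defaultdict


class NeighborhoodKnowledgeBase:
    """Counts how often each encoded neighborhood is seen as interface or not."""

    def __init__(self):
        # key -> [non_interface_count, interface_count]
        self._counts = defaultdict(lambda: [0, 0])

    def add(self, neighborhood_key, is_interface):
        """Record one observation of a neighborhood from a known structure."""
        self._counts[neighborhood_key][int(is_interface)] += 1

    def interface_fraction(self, neighborhood_key):
        """Fraction of observations labeled interface, or None if never seen."""
        if neighborhood_key not in self._counts:
            return None
        non_interface, interface = self._counts[neighborhood_key]
        return interface / (interface + non_interface)

    def predict(self, neighborhood_key, threshold=0.5):
        """Label a residue as interface when the stored fraction reaches the threshold.

        Unseen neighborhoods default to non-interface (an assumed policy).
        """
        fraction = self.interface_fraction(neighborhood_key)
        return fraction is not None and fraction >= threshold


# Hypothetical bit-string keys standing in for encoded structural neighborhoods.
kb = NeighborhoodKnowledgeBase()
kb.add("1011", True)
kb.add("1011", True)
kb.add("1011", False)
kb.add("0001", False)
```

With these toy observations, `kb.predict("1011")` returns `True` because two of the three stored occurrences are interface residues, while `kb.predict("0001")` and any unseen key return `False`.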
Affiliation(s)
- Jan Jelínek
- Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Ke Karlovu 3, Prague 2, Czech Republic
| | - Petr Škoda
- Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Ke Karlovu 3, Prague 2, Czech Republic
| | - David Hoksza
- Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Ke Karlovu 3, Prague 2, Czech Republic
| |
|
38
|
Ur Rehman H, Bari I, Ali A, Mahmood H. A Bayesian approach for estimating protein-protein interactions by integrating structural and non-structural biological data. MOLECULAR BIOSYSTEMS 2017; 13:2592-2602. [PMID: 29028065 DOI: 10.1039/c7mb00484b] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]
Abstract
Accurate elucidation of genome wide protein-protein interactions is crucial for understanding the regulatory processes of the cell. High-throughput techniques, such as the yeast-2-hybrid (Y2H) assay, co-immunoprecipitation (co-IP), mass spectrometric (MS) protein complex identification, affinity purification (AP) etc., are generally relied upon to determine protein interactions. Unfortunately, each type of method is inherently subject to different types of noise and results in false positive interactions. On the other hand, precise understanding of proteins, especially knowledge of their functional associations is necessary for understanding how complex molecular machines function. To solve this problem, computational techniques are generally relied upon to precisely predict protein interactions. In this work, we present a novel method that combines structural and non-structural biological data to precisely predict protein interactions. The conceptual novelty of our approach lies in identifying and precisely associating biological information that provides substantial interaction clues. Our model combines structural and non-structural information using Bayesian statistics to calculate the likelihood of each interaction. The proposed model is tested on Saccharomyces cerevisiae's interactions extracted from the DIP and IntAct databases and provides substantial improvements in terms of accuracy, precision, recall and F1 score, as compared with the most widely used related state-of-the-art techniques.
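As a rough illustration of how several evidence sources can be combined with Bayesian statistics, the sketch below multiplies prior odds by one likelihood ratio per evidence type. It assumes the evidence sources are conditionally independent (a naive Bayes simplification that the abstract does not state), and the function name and the example numbers are hypothetical rather than taken from the paper.

```python
def posterior_interaction_probability(prior, likelihood_ratios):
    """Combine a prior interaction probability with per-evidence likelihood ratios.

    Each ratio is P(evidence | interacting) / P(evidence | not interacting).
    Treating the evidence sources as independent is an assumption of this sketch.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        if ratio < 0.0:
            raise ValueError("likelihood ratios must be non-negative")
        odds *= ratio
    # Convert posterior odds back to a probability.
    return odds / (1.0 + odds)


# Hypothetical values: a 10% prior, one structural and one non-structural clue.
example_posterior = posterior_interaction_probability(0.1, [3.0, 3.0])
```

In this toy case the prior odds of 1:9 are multiplied by 3 twice, giving posterior odds of 1:1, that is, a probability of about 0.5.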
Affiliation(s)
- Hafeez Ur Rehman
- Department of Computer Science, FAST National University of Computer & Emerging Sciences, Peshawar, Pakistan.
| |
|
39
|
Kundrotas PJ, Anishchenko I, Dauzhenka T, Kotthoff I, Mnevets D, Copeland MM, Vakser IA. Dockground: A comprehensive data resource for modeling of protein complexes. Protein Sci 2017; 27:172-181. [PMID: 28891124 DOI: 10.1002/pro.3295] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Received: 06/30/2017] [Revised: 09/06/2017] [Accepted: 09/07/2017] [Indexed: 12/28/2022]
Abstract
Characterization of life processes at the molecular level requires structural details of protein interactions. The number of experimentally determined structures of protein-protein complexes accounts only for a fraction of known protein interactions. This gap in structural description of the interactome has to be bridged by modeling. An essential part of the development of structural modeling/docking techniques for protein interactions is databases of protein-protein complexes. They are necessary for studying protein interfaces, providing a knowledge base for docking algorithms, and developing intermolecular potentials, search procedures, and scoring functions. Development of protein-protein docking techniques requires thorough benchmarking of different parts of the docking protocols on carefully curated sets of protein-protein complexes. We present a comprehensive description of the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions, including previously unpublished unbound docking benchmark set 4, and the X-ray docking decoy set 2. The resource offers a variety of interconnected datasets of protein-protein complexes and other data for the development and testing of different aspects of protein docking methodologies. Based on protein-protein complexes extracted from the PDB biounit files, Dockground offers sets of X-ray unbound, simulated unbound, model, and docking decoy structures. All datasets are freely available for download, as a whole or selecting specific structures, through a user-friendly interface on one integrated website.
Affiliation(s)
- Petras J Kundrotas
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
| | - Ivan Anishchenko
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
| | - Taras Dauzhenka
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
| | - Ian Kotthoff
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
| | - Daniil Mnevets
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
| | - Matthew M Copeland
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
| | - Ilya A Vakser
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045; Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66045
| |
|
40
|
Abstract
Hundreds of different species colonize multicellular organisms making them "metaorganisms". A growing body of data supports the role of microbiota in health and in disease. Grasping the principles of host-microbiota interactions (HMIs) at the molecular level is important since it may provide insights into the mechanisms of infections. The crosstalk between the host and the microbiota may help resolve puzzling questions such as how a microorganism can contribute to both health and disease. Integrated superorganism networks that consider host and microbiota as a whole may uncover their code, clarifying perhaps the most fundamental question: how they modulate immune surveillance. Within this framework, structural HMI networks can uniquely identify potential microbial effectors that target distinct host nodes or interfere with endogenous host interactions, as well as how mutations on either host or microbial proteins affect the interaction. Furthermore, structural HMIs can help identify master host cell regulator nodes and modules whose tweaking by the microbes promotes aberrant activity. Collectively, these data can delineate pathogenic mechanisms and thereby help maximize beneficial therapeutics. To date, challenges in experimental techniques limit large-scale characterization of HMIs. Here we highlight an area in its infancy which we believe will increasingly engage the computational community: predicting interactions across kingdoms, and mapping these on the host cellular networks to figure out how commensal and pathogenic microbiota modulate the host signaling and broadly cross-species consequences.
Affiliation(s)
- Emine Guven-Maiorov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, United States of America
| | - Chung-Jung Tsai
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, United States of America
| | - Ruth Nussinov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, United States of America
- Sackler Inst. of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| |
|
41
|
Bertoni M, Kiefer F, Biasini M, Bordoli L, Schwede T. Modeling protein quaternary structure of homo- and hetero-oligomers beyond binary interactions by homology. Sci Rep 2017; 7:10480. [PMID: 28874689 PMCID: PMC5585393 DOI: 10.1038/s41598-017-09654-8] [Citation(s) in RCA: 479] [Impact Index Per Article: 68.4] [Received: 03/09/2017] [Accepted: 07/28/2017] [Indexed: 01/01/2023] Open
Abstract
Cellular processes often depend on interactions between proteins and the formation of macromolecular complexes. The impairment of such interactions can lead to deregulation of pathways resulting in disease states, and it is hence crucial to gain insights into the nature of macromolecular assemblies. Detailed structural knowledge about complexes and protein-protein interactions is growing, but experimentally determined three-dimensional multimeric assemblies are outnumbered by complexes supported by non-structural experimental evidence. Here, we aim to fill this gap by modeling multimeric structures by homology, only using amino acid sequences to infer the stoichiometry and the overall structure of the assembly. We ask which properties of proteins within a family can assist in the prediction of correct quaternary structure. Specifically, we introduce a description of protein-protein interface conservation as a function of evolutionary distance to reduce the noise in deep multiple sequence alignments. We also define a distance measure to structurally compare homologous multimeric protein complexes. This allows us to hierarchically cluster protein structures and quantify the diversity of alternative biological assemblies known today. We find that a combination of conservation scores, structural clustering, and classical interface descriptors, can improve the selection of homologous protein templates leading to reliable models of protein complexes.
Affiliation(s)
- Martino Bertoni
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Florian Kiefer
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Marco Biasini
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Lorenza Bordoli
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Torsten Schwede
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland.
| |
|
42
|
Voitenko OS, Dhroso A, Feldmann A, Korkin D, Kalinina OV. Patterns of amino acid conservation in human and animal immunodeficiency viruses. Bioinformatics 2017; 32:i685-i692. [PMID: 27587690 DOI: 10.1093/bioinformatics/btw441] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/08/2023] Open
Abstract
MOTIVATION: Due to their high genomic variability, RNA viruses and retroviruses present a unique opportunity for detailed study of molecular evolution. Lentiviruses, with HIV being a notable example, are one of the best studied viral groups: hundreds of thousands of sequences are available together with experimentally resolved three-dimensional structures for most viral proteins. In this work, we use these data to study specific patterns of evolution of the viral proteins, and their relationship to protein interactions and immunogenicity. RESULTS: We propose a method for identification of two types of surface residue clusters with abnormal conservation: extremely conserved and extremely variable clusters. We identify them on the surface of proteins from HIV and other animal immunodeficiency viruses. Both types of clusters are overrepresented on the interaction interfaces of viral proteins with other proteins, nucleic acids or low molecular-weight ligands, both in the viral particle and between the virus and its host. In the immunodeficiency viruses, the interaction interfaces are not more conserved than the corresponding proteins on average, and we show that extremely conserved clusters coincide with protein-protein interaction hotspots, predicted as the residues with the largest energetic contribution to the interaction. Extremely variable clusters have been identified here for the first time. In the HIV-1 envelope protein gp120, they overlap with known antigenic sites. These antigenic sites also contain many residues from extremely conserved clusters, hence representing a unique interacting interface enriched both in extremely conserved and in extremely variable clusters of residues. This observation may have important implications for antiretroviral vaccine development.
AVAILABILITY AND IMPLEMENTATION: A Python package is available at https://bioinf.mpi-inf.mpg.de/publications/viral-ppi-pred/ CONTACT: voitenko@mpi-inf.mpg.de or kalinina@mpi-inf.mpg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Olga S Voitenko
- Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, Saarbrücken 66123, Germany, Graduate School for Computer Science, Saarland University, Campus E1 3, Saarbrücken 66123, Germany
| | - Andi Dhroso
- Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA 01609, USA
| | - Anna Feldmann
- Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, Saarbrücken 66123, Germany, Graduate School for Computer Science, Saarland University, Campus E1 3, Saarbrücken 66123, Germany
| | - Dmitry Korkin
- Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA 01609, USA
| | - Olga V Kalinina
- Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, Saarbrücken 66123, Germany
|
43
|
Garland J. Unravelling the complexity of signalling networks in cancer: A review of the increasing role for computational modelling. Crit Rev Oncol Hematol 2017; 117:73-113. [PMID: 28807238 DOI: 10.1016/j.critrevonc.2017.06.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 11/25/2016] [Revised: 06/01/2017] [Accepted: 06/08/2017] [Indexed: 02/06/2023] Open
Abstract
Cancer induction is a highly complex process involving hundreds of different inducers but whose eventual outcome is the same. Clearly, it is essential to understand how signalling pathways and networks generated by these inducers interact to regulate cell behaviour and create the cancer phenotype. While enormous strides have been made in identifying key networking profiles, the amount of data generated far exceeds our ability to understand how it all "fits together". The number of potential interactions is astronomically large and requires novel approaches and extreme computation methods to dissect them out. However, such methodologies have high intrinsic mathematical and conceptual content which is difficult to follow. This review explains how computation modelling is progressively finding solutions and also revealing unexpected and unpredictable nano-scale molecular behaviours extremely relevant to how signalling and networking are coherently integrated. It is divided into linked sections illustrated by numerous figures from the literature describing different approaches and offering visual portrayals of networking and major conceptual advances in the field. First, the problem of signalling complexity and data collection is illustrated for only a small selection of known oncogenes. Next, new concepts from biophysics, molecular behaviours, kinetics, organisation at the nano level and predictive models are presented. These areas include: visual representations of networking, Energy Landscapes and energy transfer/dissemination (entropy); diffusion, percolation; molecular crowding; protein allostery; quinary structure and fractal distributions; energy management, metabolism and re-examination of the Warburg effect. The importance of unravelling complex network interactions is then illustrated for some widely-used drugs in cancer therapy whose interactions are very extensive. 
Finally, use of computational modelling to develop micro- and nano- functional models ("bottom-up" research) is highlighted. The review concludes that computational modelling is an essential part of cancer research and is vital to understanding network formation and molecular behaviours that are associated with it. Its role is increasingly essential because it is unravelling the huge complexity of cancer induction otherwise unattainable by any other approach.
Affiliation(s)
- John Garland
- Manchester Interdisciplinary Biocentre, Manchester University, Manchester, UK.
|
44
|
Mirabello C, Wallner B. InterPred: A pipeline to identify and model protein-protein interactions. Proteins 2017; 85:1159-1170. [DOI: 10.1002/prot.25280] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 10/07/2016] [Revised: 02/27/2017] [Accepted: 03/01/2017] [Indexed: 12/22/2022]
Affiliation(s)
- Claudio Mirabello
- Division of Bioinformatics, Department of Physics, Chemistry and Biology; Linköping University; Linköping 581 83 Sweden
| | - Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology; Linköping University; Linköping 581 83 Sweden
|
45
|
Chen J, Xie ZR, Wu Y. Understand protein functions by comparing the similarity of local structural environments. Biochim Biophys Acta Proteins Proteom 2016; 1865:142-152. [PMID: 27884635 DOI: 10.1016/j.bbapap.2016.11.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/17/2016] [Revised: 11/03/2016] [Accepted: 11/17/2016] [Indexed: 12/20/2022]
Abstract
The three-dimensional structures of proteins play an essential role in regulating binding between proteins and their partners, offering a direct relationship between structures and functions of proteins. It is widely accepted that the function of a protein can be determined if its structure is similar to other proteins whose functions are known. However, it is also observed that proteins with similar global structures do not necessarily correspond to the same function, while proteins with very different folds can share similar functions. This indicates that function similarity originates from the local structural information of proteins instead of their global shapes. We assume that proteins with similar local environments prefer binding to similar types of molecular targets. In order to test this assumption, we designed a new structural indicator to define the similarity of local environment between residues in different proteins. This indicator was further used to calculate the probability that a given residue binds to a specific type of structural neighbor, including DNA, RNA, small molecules and proteins. After applying the method to a large-scale non-redundant database of proteins, we show that the positive signal of binding probability calculated from the local structural indicator is statistically meaningful. In summary, our studies suggest that the local environment of residues in a protein is a good indicator to recognize specific binding partners of the protein. The new method could be a potential addition to a suite of existing template-based approaches for protein function prediction.
Affiliation(s)
- Jiawen Chen
- Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY 10461, United States
| | - Zhong-Ru Xie
- Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY 10461, United States
| | - Yinghao Wu
- Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY 10461, United States.
|
46
|
Tonddast-Navaei S, Skolnick J. Are protein-protein interfaces special regions on a protein's surface? J Chem Phys 2016; 143:243149. [PMID: 26723634 DOI: 10.1063/1.4937428] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 01/09/2023] Open
Abstract
Protein-protein interactions (PPIs) are involved in many cellular processes. Experimentally obtained protein quaternary structures provide the location of protein-protein interfaces, the surface region of a given protein that interacts with another. These regions are termed half-interfaces (HIs). Canonical HIs cover roughly one third of a protein's surface and were found to have more hydrophobic residues than the non-interface surface region. In addition, the classical view of protein HIs was that there are a few (if not one) HIs per protein that are structurally and chemically unique. However, on average, a given protein interacts with at least a dozen others. This raises the question of whether they use the same or other HIs. By copying HIs from monomers with the same folds in solved quaternary structures, we introduce the concept of geometric HIs (HIs whose geometry has a significant match to other known interfaces) and show that on average they cover three quarters of a protein's surface. We then demonstrate that in some cases, these geometric HIs could result in real physical interactions (which may or may not be biologically relevant). The composition of the new HIs is on average more charged compared to most known ones, suggesting that the current protein interface database is biased towards more hydrophobic, possibly more obligate, complexes. Finally, our results provide evidence for interface fuzziness and PPI promiscuity. Thus, the classical view of unique, well-defined HIs needs to be revisited as HIs are another example of coarse-graining that is used by nature.
Affiliation(s)
- Sam Tonddast-Navaei
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street N.W., Atlanta, Georgia 30318, USA
| | - Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street N.W., Atlanta, Georgia 30318, USA
|
47
|
Im W, Liang J, Olson A, Zhou HX, Vajda S, Vakser IA. Challenges in structural approaches to cell modeling. J Mol Biol 2016; 428:2943-64. [PMID: 27255863 PMCID: PMC4976022 DOI: 10.1016/j.jmb.2016.05.024] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 02/21/2016] [Revised: 05/19/2016] [Accepted: 05/24/2016] [Indexed: 11/17/2022]
Abstract
Computational modeling is essential for structural characterization of biomolecular mechanisms across the broad spectrum of scales. Adequate understanding of biomolecular mechanisms inherently involves our ability to model them. Structural modeling of individual biomolecules and their interactions has been rapidly progressing. However, in terms of the broader picture, the focus is shifting toward larger systems, up to the level of a cell. Such modeling involves a more dynamic and realistic representation of the interactomes in vivo, in a crowded cellular environment, as well as membranes and membrane proteins, and other cellular components. Structural modeling of a cell complements computational approaches to cellular mechanisms based on differential equations, graph models, and other techniques to model biological networks, imaging data, etc. Structural modeling along with other computational and experimental approaches will provide a fundamental understanding of life at the molecular level and lead to important applications to biology and medicine. A cross section of diverse approaches presented in this review illustrates the developing shift from the structural modeling of individual molecules to that of cell biology. Studies in several related areas are covered: biological networks; automated construction of three-dimensional cell models using experimental data; modeling of protein complexes; prediction of non-specific and transient protein interactions; thermodynamic and kinetic effects of crowding; cellular membrane modeling; and modeling of chromosomes. The review presents an expert opinion on the current state-of-the-art in these various aspects of structural modeling in cellular biology, and the prospects of future developments in this emerging field.
Affiliation(s)
- Wonpil Im
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, United States.
| | - Jie Liang
- Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607, United States.
| | - Arthur Olson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States.
| | - Huan-Xiang Zhou
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306, United States.
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, United States.
| | - Ilya A Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, United States.
|
48
|
Gao Y, Hao W, Gu J, Liu D, Fan C, Chen Z, Deng L. PredPhos: an ensemble framework for structure-based prediction of phosphorylation sites. J Biol Res (Thessalon) 2016; 23:12. [PMID: 27437197 PMCID: PMC4943517 DOI: 10.1186/s40709-016-0042-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/20/2022]
Abstract
Background: Post-translational modifications (PTMs) occur on almost all proteins and often strongly affect the functions of modified proteins. Phosphorylation is a crucial PTM mechanism with important regulatory functions in biological systems. Identifying the potential phosphorylation sites of a target protein may increase our understanding of the molecular processes in which it takes part.
Results: In this paper, we propose PredPhos, a computational method that can accurately predict both kinase-specific and non-kinase-specific phosphorylation sites by using optimally selected properties. The optimal combination of features was selected from a set of 153 novel structural neighborhood properties by a two-step feature selection method consisting of a random forest algorithm and a sequential backward elimination method. To overcome the class imbalance problem, we adopt an ensemble method, which combines a bootstrap resampling technique, support vector machine-based fusion classifiers and a majority voting strategy. We evaluate the proposed method using both tenfold cross validation and an independent test. Results show that our method achieves a significant improvement in prediction performance for both kinase-specific and non-kinase-specific phosphorylation sites.
Conclusions: The experimental results demonstrate that the proposed method is quite effective in predicting phosphorylation sites. Promising results are derived from the new structural neighborhood properties, the novel way of feature selection, as well as the ensemble method.
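The ensemble strategy named in this abstract (bootstrap resampling to counter class imbalance, several base classifiers, and majority voting) can be illustrated with a minimal sketch. This is not the PredPhos implementation: the base learner here is a toy nearest-centroid classifier swapped in for the support vector machines the abstract mentions, the feature selection step is omitted, and all names (`NearestCentroid`, `balanced_bootstrap`, `train_ensemble`, `predict_majority`) and the toy data are hypothetical.

```python
import random
from collections import Counter


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Toy base learner (assumption: stands in for the SVMs named in the abstract)."""

    def fit(self, X, y):
        # One centroid per class label seen in the training sample.
        self.centroids = {
            label: centroid([x for x, t in zip(X, y) if t == label])
            for label in set(y)
        }
        return self

    def predict_one(self, x):
        # Predict the label whose centroid is closest to x.
        return min(self.centroids, key=lambda lab: sq_dist(x, self.centroids[lab]))


def balanced_bootstrap(X, y, rng):
    """Draw, with replacement, equally many samples from every class."""
    by_class = {}
    for x, t in zip(X, y):
        by_class.setdefault(t, []).append(x)
    size = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for _ in range(size):
            Xb.append(rng.choice(rows))
            yb.append(label)
    return Xb, yb


def train_ensemble(X, y, n_models=11, seed=0):
    """Fit one base learner per balanced bootstrap sample."""
    rng = random.Random(seed)
    return [NearestCentroid().fit(*balanced_bootstrap(X, y, rng))
            for _ in range(n_models)]


def predict_majority(models, x):
    """Return the label predicted by the largest number of base learners."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical imbalanced toy data: 2 positive sites, 6 negative sites.
X = [[0.9, 1.0], [1.1, 0.8],
     [0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [-0.1, 0.0], [0.0, -0.2], [0.1, 0.1]]
y = [1, 1, 0, 0, 0, 0, 0, 0]

models = train_ensemble(X, y)
print(predict_majority(models, [1.0, 0.9]))    # query near the positive examples
print(predict_majority(models, [0.05, 0.05]))  # query near the negative examples
```

Using an odd number of base learners avoids tied votes in the two-class case; each learner sees a class-balanced sample, so the minority class is not swamped during training.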
Affiliation(s)
- Yong Gao
- School of Software, Central South University, No. 22 Shaoshan South RD., Changsha, 410075 China
| | - Weilin Hao
- School of Software, Central South University, No. 22 Shaoshan South RD., Changsha, 410075 China; School of Electronics Engineering and Computer Science, Peking University, No. 5 Yiheyuan Road, Beijing, 100871 China
| | - Jing Gu
- School of Software, Central South University, No. 22 Shaoshan South RD., Changsha, 410075 China
| | - Diwei Liu
- School of Software, Central South University, No. 22 Shaoshan South RD., Changsha, 410075 China
| | - Chao Fan
- School of Software, Central South University, No. 22 Shaoshan South RD., Changsha, 410075 China
| | - Zhigang Chen
- School of Software, Central South University, No. 22 Shaoshan South RD., Changsha, 410075 China
| | - Lei Deng
- School of Software, Central South University, No. 22 Shaoshan South RD., Changsha, 410075 China; Shanghai Key Laboratory of Intelligent Information Processing, No. 220 Handan Road, Shanghai, 200433 China
|
49
|
Champeimont R, Laine E, Hu SW, Penin F, Carbone A. Coevolution analysis of Hepatitis C virus genome to identify the structural and functional dependency network of viral proteins. Sci Rep 2016; 6:26401. [PMID: 27198619 PMCID: PMC4873791 DOI: 10.1038/srep26401] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 10/02/2015] [Accepted: 05/03/2016] [Indexed: 12/20/2022] Open
Abstract
A novel computational approach of coevolution analysis allowed us to reconstruct the protein-protein interaction network of the Hepatitis C Virus (HCV) at the residue resolution. For the first time, coevolution analysis of an entire viral genome was realized, based on a limited set of protein sequences with high sequence identity within genotypes. The identified coevolving residues constitute highly relevant predictions of protein-protein interactions for further experimental identification of HCV protein complexes. The method can be used to analyse other viral genomes and to predict the associated protein interaction networks.
Affiliation(s)
- Raphaël Champeimont
- Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l’Ecole de Médecine, 75006 Paris, France
| | - Elodie Laine
- Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l’Ecole de Médecine, 75006 Paris, France
| | - Shuang-Wei Hu
- Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l’Ecole de Médecine, 75006 Paris, France
| | - Francois Penin
- CNRS, UMR5086, Bases Moléculaires et Structurales des Systèmes Infectieux, Institut de Biologie et Chimie des Protéines, 7 Passage du Vercors, Cedex 07, F-69367 Lyon, France
- LABEX Ecofect, Université de Lyon, Lyon, France
| | - Alessandra Carbone
- Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l’Ecole de Médecine, 75006 Paris, France
- Institut Universitaire de France, 75005, Paris, France
|
50
|
Maheshwari S, Brylinski M. Template-based identification of protein–protein interfaces using eFindSitePPI. Methods 2016; 93:64-71. [DOI: 10.1016/j.ymeth.2015.07.017] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 03/31/2015] [Revised: 07/12/2015] [Accepted: 07/29/2015] [Indexed: 11/26/2022] Open
|