1
Fallah Atanaki F, Behrouzi S, Ariaeenejad S, Boroomand A, Kavousi K. BIPEP: Sequence-based Prediction of Biofilm Inhibitory Peptides Using a Combination of NMR and Physicochemical Descriptors. ACS Omega 2020; 5:7290-7297. [PMID: 32280870 PMCID: PMC7144140 DOI: 10.1021/acsomega.9b04119] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 12/03/2019] [Accepted: 03/12/2020] [Indexed: 05/26/2023]
Abstract
Biofilms are biological systems formed by a community of microorganisms in which microbial cells are connected on a surface within a self-produced matrix of extracellular polymeric substance. Microorganisms can use biofilms to protect themselves against the host immune system and the surrounding environment, which increases their chances of surviving various antimicrobial agents. Biofilms play a crucial role in medicine and industry because of the problems they cause. Discovering and validating agents that inhibit bacterial biofilm formation in the laboratory is costly and time-consuming. Therefore, computational tools for predicting biofilm inhibitory peptides are needed. Here, we present a computational prediction tool to screen large numbers of peptide sequences and select candidate peptides for further lab experiments and validation. In this learning model, different feature vectors extracted from the peptide primary structure are used to learn patterns from the sequences of biofilm inhibitory peptides. Various classification algorithms, including SVM, random forest, and k-nearest neighbor, were examined to evaluate their performance. Overall, our approach showed better prediction performance than other prediction methods. In this study, for the first time, we applied features extracted from NMR spectra of amino acids along with physicochemical features. Although each group of features showed good discrimination potential alone, we used a combination of features to enhance the performance of our method. Our prediction tool is freely available.
Affiliation(s)
- Fereshteh Fallah Atanaki
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 1417466191, Iran
- Saman Behrouzi
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 1417466191, Iran
- Shohreh Ariaeenejad
  - Department of Systems and Synthetic Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education, and Extension Organization (AREEO), Karaj 31535-1897, Iran
- Amin Boroomand
  - School of Natural Sciences, University of California Merced, Merced 95343-5001, California, United States of America
- Kaveh Kavousi
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran 1417466191, Iran
2
Revisiting the "satisfaction of spatial restraints" approach of MODELLER for protein homology modeling. PLoS Comput Biol 2019; 15:e1007219. [PMID: 31846452 PMCID: PMC6938380 DOI: 10.1371/journal.pcbi.1007219] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/25/2019] [Revised: 12/31/2019] [Accepted: 11/13/2019] [Indexed: 01/02/2023] Open
Abstract
The most frequently used approach for protein structure prediction is currently homology modeling. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool to perform this task is MODELLER. This program implements the “modeling by satisfaction of spatial restraints” strategy, and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective, yet computationally light strategies to improve its 3D modeling performance. Firstly, we have investigated how the accuracy with which structural variability between a target protein and its templates is estimated, in the form of σ values, profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated with the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program’s predictions, especially in multiple-template modeling. Secondly, we have examined how incorporating statistical potential terms (such as the DOPE potential) in MODELLER’s objective function positively impacts 3D modeling quality, providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is substantial room for improving MODELLER in terms of 3D modeling quality, and we propose strategies that could be pursued to further increase its performance.
Proteins are fundamental biological molecules that carry out countless activities in living beings. Since the function of proteins is dictated by their three-dimensional atomic structures, acquiring structural details of proteins provides deep insights into their function. Currently, the most frequently used computational approach for protein structure prediction is template-based modeling. In this approach, a target protein is modeled using the experimentally derived structural information of a template protein assumed to have a similar structure to the target. MODELLER is the most frequently used program for template-based 3D model building. Despite its success, its predictions are not always accurate enough to be useful in biomedical research. Here, we show that it is possible to greatly increase the performance of MODELLER by modifying two aspects of its algorithm. First, we demonstrate that providing the program with accurate estimates of local target-template structural divergence greatly increases the quality of its predictions. Additionally, we show that modifying MODELLER’s scoring function with statistical potential energy terms also helps to improve modeling quality. This work will be useful in future research, since it reports practical strategies to improve the performance of this core tool in structural bioinformatics.
3
Cheng Q, Joung I, Lee J, Kuwajima K, Lee J. Exploring the Folding Mechanism of Small Proteins GB1 and LB1. J Chem Theory Comput 2019; 15:3432-3449. [DOI: 10.1021/acs.jctc.8b01163] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Affiliation(s)
- Qianyi Cheng
  - Department of Chemistry, University of Memphis, Memphis, Tennessee 38152, United States
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, South Korea
- InSuk Joung
  - Department of Chemistry, Kangwon National University, Chuncheon 24341, South Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, South Korea
- Juyong Lee
  - Department of Chemistry, Kangwon National University, Chuncheon 24341, South Korea
- Kunihiro Kuwajima
  - Department of Physics, University of Tokyo, Tokyo 113-0033, Japan
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, South Korea
- Jooyoung Lee
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 02455, South Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, South Korea
4
Manavalan B, Shin TH, Kim MO, Lee G. PIP-EL: A New Ensemble Learning Method for Improved Proinflammatory Peptide Predictions. Front Immunol 2018; 9:1783. [PMID: 30108593 PMCID: PMC6079197 DOI: 10.3389/fimmu.2018.01783] [Citation(s) in RCA: 88] [Impact Index Per Article: 14.7] [Received: 03/08/2018] [Accepted: 07/19/2018] [Indexed: 02/03/2023] Open
Abstract
Proinflammatory cytokines have the capacity to increase the inflammatory reaction and play a central role in the first line of defence against invading pathogens. Proinflammatory inducing peptides (PIPs) have been used as antineoplastic agents, antibacterial agents, and vaccines in immunization therapies. Advances in sequencing technologies have resulted in an avalanche of protein sequence data. Therefore, it is necessary to develop an automated computational method to enable fast and accurate identification of novel PIPs within the vast number of candidate proteins and peptides. To address this, we proposed a new predictor, PIP-EL, for predicting PIPs using the strategy of ensemble learning (EL). Our benchmarking dataset is imbalanced. Thus, we applied a random under-sampling technique to generate 10 balanced models for each composition. Technically, PIP-EL is the fusion of 50 independent random forest (RF) models, where each of the five different compositions (amino acid, dipeptide, composition-transition-distribution, physicochemical properties, and amino acid index) contributes 10 RF models. PIP-EL achieves a Matthews correlation coefficient (MCC) of 0.435 in a 5-fold cross-validation test, which is ~2-5% higher than that of the individual classifiers and the hybrid feature-based classifier. Furthermore, we evaluate the performance of PIP-EL on the independent dataset, showing that our method outperforms the existing method and two different machine learning methods developed in this study, with an MCC of 0.454. These results indicate that PIP-EL will be a useful tool for predicting PIPs and for researchers working in the field of peptide therapeutics and immunotherapy. The user-friendly web server, PIP-EL, is freely accessible.
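The under-sampling and fusion steps described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PIP-EL implementation: `balanced_subsets` and `ensemble_score` are hypothetical names, the sample identifiers are made up, and averaging per-model probabilities is only one plausible fusion rule (the abstract does not say how the 50 RF models are combined).

```python
import random
from statistics import mean


def balanced_subsets(positives, negatives, n_models=10, seed=0):
    """Build n_models balanced (positives, negatives) training sets.

    Every subset keeps all minority-class samples and draws an equally sized
    random sample, without replacement, from the majority class.
    """
    rng = random.Random(seed)
    # sorted() is stable, so ties keep the positives as the "minority" class.
    minority, majority = sorted((positives, negatives), key=len)
    minority_is_positive = minority is positives
    subsets = []
    for _ in range(n_models):
        drawn = rng.sample(majority, len(minority))
        if minority_is_positive:
            subsets.append((list(minority), drawn))
        else:
            subsets.append((drawn, list(minority)))
    return subsets


def ensemble_score(model_scores):
    """Fuse per-model probabilities by simple averaging (an assumption)."""
    return mean(model_scores)


if __name__ == "__main__":
    pos = [f"P{i}" for i in range(5)]    # hypothetical positive peptides
    neg = [f"N{i}" for i in range(20)]   # hypothetical negative peptides
    for p, n in balanced_subsets(pos, neg, n_models=3, seed=42):
        print(len(p), len(n), n)
```

In a full pipeline, one classifier would be trained per balanced subset and per feature encoding, and `ensemble_score` would receive the probabilities those classifiers assign to a query peptide.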
Affiliation(s)
- Tae Hwan Shin
  - Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
  - Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Myeong Ok Kim
  - Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
- Gwang Lee
  - Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
  - Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
5
Manavalan B, Shin TH, Kim MO, Lee G. AIPpred: Sequence-Based Prediction of Anti-inflammatory Peptides Using Random Forest. Front Pharmacol 2018; 9:276. [PMID: 29636690 PMCID: PMC5881105 DOI: 10.3389/fphar.2018.00276] [Citation(s) in RCA: 117] [Impact Index Per Article: 19.5] [Received: 01/09/2018] [Accepted: 03/12/2018] [Indexed: 12/31/2022] Open
Abstract
The use of therapeutic peptides in various inflammatory diseases and autoimmune disorders has received considerable attention; however, the identification of anti-inflammatory peptides (AIPs) through wet-lab experimentation is expensive and often time consuming. Therefore, the development of novel computational methods is needed to identify potential AIP candidates prior to in vitro experimentation. In this study, we proposed a random forest (RF)-based method for predicting AIPs, called AIPpred (AIP predictor in primary amino acid sequences), which was trained with 354 optimal features. First, we systematically studied the contribution of individual composition [amino acid-, dipeptide composition (DPC), amino acid index, chain-transition-distribution, and physicochemical properties] in AIP prediction. Since the performance of the DPC-based model is significantly better than that of other composition-based models, we applied a feature selection protocol on this model and identified the optimal features. AIPpred achieved an area under the curve (AUC) value of 0.801 in a 5-fold cross-validation test, which was ∼2% higher than that of the control RF predictor trained with all DPC composition features, indicating the efficiency of the feature selection protocol. Furthermore, we evaluated the performance of AIPpred on an independent dataset, with results showing that our method outperformed an existing method, as well as 3 different machine learning methods developed in this study, with an AUC value of 0.814. These results indicated that AIPpred will be a useful tool for predicting AIPs and might efficiently assist the development of AIP therapeutics and biomedical research. AIPpred is freely accessible at www.thegleelab.org/AIPpred.
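Dipeptide composition (DPC), the encoding this abstract singles out, can be illustrated with a short Python sketch. It assumes the commonly used definition, in which each of the 400 ordered residue pairs is counted over adjacent positions and divided by the number of counted pairs; the function name `dipeptide_composition` is hypothetical, and the exact normalisation and feature selection used by AIPpred are not given in the abstract.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Adjacent residue pairs containing non-standard letters are skipped.
    The frequencies sum to 1 whenever at least one valid pair exists.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    vector = dipeptide_composition("ACAC")  # pairs: AC, CA, AC
    print(len(vector), vector[DIPEPTIDES.index("AC")])
```

A feature selection step such as the one the abstract mentions would then keep only a subset of these 400 columns.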
Affiliation(s)
- Tae H Shin
  - Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
  - Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Myeong O Kim
  - Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
- Gwang Lee
  - Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
  - Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
6
Manavalan B, Lee J. SVMQA: support-vector-machine-based protein single-model quality assessment. Bioinformatics 2018; 33:2496-2503. [PMID: 28419290 DOI: 10.1093/bioinformatics/btx222] [Citation(s) in RCA: 130] [Impact Index Per Article: 21.7] [Received: 12/15/2016] [Accepted: 04/12/2017] [Indexed: 01/03/2023] Open
Abstract
Motivation: Accurately ranking predicted structural models and selecting the best model from a given candidate pool remain open problems in the field of structural bioinformatics. The quality assessment (QA) methods used to address these problems can be grouped into two categories: consensus methods and single-model methods. Consensus methods in general perform better and attain higher correlation between predicted and true quality measures. However, these methods frequently fail to generate proper quality scores for native-like structures that are distinct from the rest of the pool. Conversely, single-model methods do not suffer from this drawback and are better suited for real-life applications where many models from various sources may not be readily available.
Results: In this study, we developed a support-vector-machine-based single-model global quality assessment (SVMQA) method. For a given protein model, the SVMQA method predicts TM-score and GDT_TS score based on a feature vector containing statistical potential energy terms and consistency-based terms between the actual structural features (extracted from the three-dimensional coordinates) and predicted values (from primary sequence). We trained SVMQA using CASP8, CASP9 and CASP10 targets and determined the machine parameters by 10-fold cross-validation. We evaluated the performance of our SVMQA method on various benchmarking datasets. Results show that SVMQA outperformed the existing best single-model QA methods both in ranking provided protein models and in selecting the best model from the pool. According to the CASP12 assessment, SVMQA was the best method in selecting good-quality models from decoys in terms of GDTloss.
Availability and implementation: The SVMQA method can be freely downloaded from http://lee.kias.re.kr/SVMQA/SVMQA_eval.tar.gz.
Contact: jlee@kias.re.kr.
Supplementary information: Supplementary data are available at Bioinformatics online.
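The model-selection criterion mentioned in this abstract can be made concrete with a small Python sketch. It assumes the usual reading of a selection loss: the true quality of the best model in the pool minus the true quality of the model that the QA method ranks first. The function name `selection_loss`, the model identifiers, and the scores below are all hypothetical and are not taken from SVMQA or CASP data.

```python
def selection_loss(predicted, true):
    """Quality lost by trusting the predicted ranking.

    predicted: dict mapping model id -> predicted quality score
    true:      dict mapping model id -> true quality (e.g. GDT_TS)
    Returns best true quality minus true quality of the top-predicted model;
    0 means the QA method picked the best model in the pool.
    """
    chosen = max(predicted, key=predicted.get)
    best_true = max(true.values())
    return best_true - true[chosen]


if __name__ == "__main__":
    # Hypothetical pool of three decoys for one target.
    predicted = {"model_1": 0.70, "model_2": 0.90, "model_3": 0.50}
    true = {"model_1": 0.80, "model_2": 0.75, "model_3": 0.40}
    print(round(selection_loss(predicted, true), 3))
```

Averaging this quantity over many targets gives a single number per QA method, with lower values indicating better selection.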
Affiliation(s)
- Balachandran Manavalan
  - Center for In Silico Protein Science and School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Jooyoung Lee
  - Center for In Silico Protein Science and School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea
7
Manavalan B, Basith S, Shin TH, Choi S, Kim MO, Lee G. MLACP: machine-learning-based prediction of anticancer peptides. Oncotarget 2017; 8:77121-77136. [PMID: 29100375 PMCID: PMC5652333 DOI: 10.18632/oncotarget.20365] [Citation(s) in RCA: 170] [Impact Index Per Article: 24.3] [Received: 05/16/2017] [Accepted: 07/13/2017] [Indexed: 01/25/2023] Open
Abstract
Cancer is the second leading cause of death globally, and use of therapeutic peptides to target and kill cancer cells has received considerable attention in recent years. Identification of anticancer peptides (ACPs) through wet-lab experimentation is expensive and often time consuming; therefore, development of an efficient computational method is essential to identify potential ACP candidates prior to in vitro experimentation. In this study, we developed support vector machine- and random forest-based machine-learning methods for the prediction of ACPs using the features calculated from the amino acid sequence, including amino acid composition, dipeptide composition, atomic composition, and physicochemical properties. We trained our methods using the Tyagi-B dataset and determined the machine parameters by 10-fold cross-validation. Furthermore, we evaluated the performance of our methods on two benchmarking datasets, with our results showing that the random forest-based method outperformed the existing methods with an average accuracy and Matthews correlation coefficient value of 88.7% and 0.78, respectively. To assist the scientific community, we also developed a publicly accessible web server at www.thegleelab.org/MLACP.html.
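The two metrics reported in this abstract, accuracy and the Matthews correlation coefficient (MCC), follow directly from the binary confusion matrix. The Python sketch below uses the standard definitions; the helper names and the toy labels are illustrative and are not taken from the MLACP benchmark.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0]  # toy labels
    y_pred = [1, 1, 0, 0, 0, 1]  # toy predictions
    counts = confusion_counts(y_true, y_pred)
    print(counts, round(accuracy(*counts), 3), round(mcc(*counts), 3))
```

Unlike accuracy, MCC stays informative when the classes are imbalanced, which is one reason peptide-prediction studies tend to report both.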
Affiliation(s)
- Shaherin Basith
  - College of Pharmacy, Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul, Republic of Korea
- Tae Hwan Shin
  - Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
  - Institute of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
- Sun Choi
  - College of Pharmacy, Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul, Republic of Korea
- Myeong Ok Kim
  - Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, Republic of Korea
- Gwang Lee
  - Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
  - Institute of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
8
Karczyńska AS, Czaplewski C, Krupa P, Mozolewska MA, Joo K, Lee J, Liwo A. Ergodicity and model quality in template-restrained canonical and temperature/Hamiltonian replica exchange coarse-grained molecular dynamics simulations of proteins. J Comput Chem 2017; 38:2730-2746. [DOI: 10.1002/jcc.25070] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/15/2017] [Revised: 07/10/2017] [Accepted: 09/01/2017] [Indexed: 01/22/2023]
Affiliation(s)
- Agnieszka S. Karczyńska
  - Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
  - Center for In Silico Protein Science; Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
  - School of Computational Sciences; Korea Institute for Advanced Study, 85 Hoegiro Dongdaemun-gu; Seoul 02455 Republic of Korea
- Cezary Czaplewski
  - Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
- Paweł Krupa
  - Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotników 32/46; Warsaw PL 02668 Poland
- Magdalena A. Mozolewska
  - Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
  - Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5; Warsaw 01-248 Poland
- Keehyoung Joo
  - School of Computational Sciences; Korea Institute for Advanced Study, 85 Hoegiro Dongdaemun-gu; Seoul 02455 Republic of Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
- Jooyoung Lee
  - Center for In Silico Protein Science; Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
  - School of Computational Sciences; Korea Institute for Advanced Study, 85 Hoegiro Dongdaemun-gu; Seoul 02455 Republic of Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
- Adam Liwo
  - Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
  - Center for In Silico Protein Science; Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
  - School of Computational Sciences; Korea Institute for Advanced Study, 85 Hoegiro Dongdaemun-gu; Seoul 02455 Republic of Korea
9
Mozolewska MA, Krupa P, Zaborowski B, Liwo A, Lee J, Joo K, Czaplewski C. Use of Restraints from Consensus Fragments of Multiple Server Models To Enhance Protein-Structure Prediction Capability of the UNRES Force Field. J Chem Inf Model 2016; 56:2263-2279. [DOI: 10.1021/acs.jcim.6b00189] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/11/2022]
Affiliation(s)
- Paweł Krupa
  - Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Adam Liwo
  - Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
  - Center for In Silico Protein Structure and School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
- Jooyoung Lee
  - Center for In Silico Protein Structure and School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
- Keehyoung Joo
  - Center for Advanced Computation, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
- Cezary Czaplewski
  - Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
10
Huang BFF, Boutros PC. The parameter sensitivity of random forests. BMC Bioinformatics 2016; 17:331. [PMID: 27586051 PMCID: PMC5009551 DOI: 10.1186/s12859-016-1228-x] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 12/15/2015] [Accepted: 08/26/2016] [Indexed: 02/07/2023] Open
Abstract
Background: The Random Forest (RF) algorithm for supervised machine learning is an ensemble learning method widely used in science and many other fields. Its popularity has been increasing, but relatively few studies address the parameter selection process, a critical step in model fitting. Due to numerous assertions regarding the performance reliability of the default parameters, many RF models are fit using these values. However, there has not yet been a thorough examination of the parameter sensitivity of RFs in computational genomic studies. We address this gap here.
Results: We examined the effects of parameter selection on classification performance using the RF machine learning algorithm on two biological datasets with distinct p/n ratios: sequencing summary statistics (low p/n) and microarray-derived data (high p/n). Here, p refers to the number of variables and n to the number of samples. Our findings demonstrate that parameterization is highly correlated with prediction accuracy and variable importance measures (VIMs). Further, we demonstrate that different parameters are critical in tuning different datasets, and that parameter optimization significantly improves upon the default parameters.
Conclusions: Parameter performance demonstrated wide variability on both low and high p/n data. Therefore, there is significant benefit to be gained by tuning RF models away from their default parameter settings.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1228-x) contains supplementary material, which is available to authorized users.
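Tuning away from default settings, as this abstract recommends, usually starts with an exhaustive sweep over a small parameter grid. The Python sketch below shows only that sweep. The parameter names `n_trees` and `max_features`, the helper `grid_search`, and the `toy_score` function are hypothetical placeholders; in practice `evaluate` would fit an RF with the given parameters and return a cross-validated score, and the abstract does not list which parameters the authors varied.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Score every parameter combination and return results, best first.

    param_grid: dict mapping parameter name -> list of candidate values
    evaluate:   callable taking a dict of parameters, returning a score
                where higher is better
    """
    names = sorted(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((evaluate(params), params))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def toy_score(params):
    """Placeholder objective peaking at n_trees=500, max_features=4."""
    return (-abs(params["n_trees"] - 500) / 1000
            - abs(params["max_features"] - 4) / 10)


if __name__ == "__main__":
    grid = {"n_trees": [100, 500, 1000], "max_features": [2, 4, 8]}
    for score, params in grid_search(grid, toy_score)[:3]:
        print(round(score, 3), params)
```

Comparing the spread of scores across the grid, rather than only the best entry, gives a rough picture of how sensitive a dataset is to each parameter.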
Affiliation(s)
- Barbara F F Huang
  - Informatics and Bio-computing Program, Ontario Institute for Cancer Research, Toronto, Canada
- Paul C Boutros
  - Informatics and Bio-computing Program, Ontario Institute for Cancer Research, Toronto, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Canada
  - Department of Pharmacology and Toxicology, University of Toronto, Toronto, Canada
  - MaRS Centre, 661 University Avenue, Suite 510, Toronto, Ontario, M5G 0A3, Canada
11
Joung I, Lee SY, Cheng Q, Kim JY, Joo K, Lee SJ, Lee J. Template-free modeling by LEE and LEER in CASP11. Proteins 2015; 84 Suppl 1:118-30. [DOI: 10.1002/prot.24944] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/15/2015] [Revised: 08/26/2015] [Accepted: 10/11/2015] [Indexed: 12/25/2022]
Affiliation(s)
- InSuk Joung
  - Center for In Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - School of Computational Sciences; Korea Institute for Advanced Study; Seoul 130-722 Korea
- Sun Young Lee
  - Center for In Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
- Qianyi Cheng
  - Center for In Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - School of Computational Sciences; Korea Institute for Advanced Study; Seoul 130-722 Korea
- Jong Yun Kim
  - Center for In Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
- Keehyoung Joo
  - Center for In Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study; Seoul 130-722 Korea
- Sung Jong Lee
  - Center for In Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - Department of Physics; University of Suwon; Hwaseong-Si Gyeonggi-Do 445-743 Korea
- Jooyoung Lee
  - Center for In Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - School of Computational Sciences; Korea Institute for Advanced Study; Seoul 130-722 Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study; Seoul 130-722 Korea
12
Joo K, Joung I, Lee SY, Kim JY, Cheng Q, Manavalan B, Joung JY, Heo S, Lee J, Nam M, Lee IH, Lee SJ, Lee J. Template based protein structure modeling by global optimization in CASP11. Proteins 2015; 84 Suppl 1:221-32. [PMID: 26329522 DOI: 10.1002/prot.24917] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 05/15/2015] [Revised: 08/04/2015] [Accepted: 08/21/2015] [Indexed: 11/11/2022]
Abstract
For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Keehyoung Joo
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- InSuk Joung
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- Sun Young Lee
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- Jong Yun Kim
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- Qianyi Cheng
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- Balachandran Manavalan
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- Jong Young Joung
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- Seungryong Heo
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- Juyong Lee
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20852
- Mikyung Nam
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- In-Ho Lee
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
  - Korea Research Institute of Standards and Science (KRISS), Seoul, 305-600, Korea
- Sung Jong Lee
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
  - Department of Physics, University of Suwon, Hwaseong-Si, Gyeonggi-Do, 445-743, Korea
- Jooyoung Lee
  - Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study, Seoul, 130-722, Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
13
Lee J, Joo K, Brooks BR, Lee J. The Atomistic Mechanism of Conformational Transition of Adenylate Kinase Investigated by Lorentzian Structure-Based Potential. J Chem Theory Comput 2015; 11:3211-24. [PMID: 26575758 DOI: 10.1021/acs.jctc.5b00268] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
Abstract
We present a new all-atom structure-based method to study protein conformational transitions using Lorentzian attractive interactions based on native structures. The variability of each native contact is estimated based on evolutionary information using a machine learning method. To test the validity of this approach, we have investigated the conformational transition of adenylate kinase (ADK). The intrinsic boundedness of the Lorentzian attractive interactions facilitated frequent conformational transitions, and consequently we were able to observe more than 1000 structural interconversions between the open and closed states of ADK out of a total of 6 μs MD simulations. ADK has three domains: the nucleoside monophosphate (NMP) binding domain, the LID-domain, and the CORE domain, which catalyze the interconversion between ATP and ADP. We identified two transition states: a more frequent LID-closed-NMP-open (TS1) state and a less frequent LID-open-NMP-closed (TS2) state. The transition was found to be symmetric in both directions via TS1. We also obtained an off-pathway metastable state that was previously observed with physics-based all-atom simulations but not with coarse-grained models. In the metastable state, the LID domain was slightly twisted and formed contacts with the NMP domain. Our model correctly identified a total of 14 out of the top 16 residues with highest fluctuation by NMR experiment, thus showing excellent agreement with experimental NMR relaxation data and overwhelmingly better results than existing models.
Affiliation(s)
- Juyong Lee
  - School of Computational Sciences, Korea Institute for Advanced Study, Dongdaemun-gu, Seoul 130-722, Korea
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20852, United States
- Keehyoung Joo
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Dongdaemun-gu, Seoul 130-722, Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study, Dongdaemun-gu, Seoul 130-722, Korea
- Bernard R Brooks
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20852, United States
- Jooyoung Lee
  - School of Computational Sciences, Korea Institute for Advanced Study, Dongdaemun-gu, Seoul 130-722, Korea
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Dongdaemun-gu, Seoul 130-722, Korea