1
|
Su J, Zhou P. Musical protein: Mapping the time sequence of music onto the spatial architecture of proteins. Comput Methods Programs Biomed 2024; 252:108233. [PMID: 38781810 DOI: 10.1016/j.cmpb.2024.108233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2024] [Revised: 05/13/2024] [Accepted: 05/16/2024] [Indexed: 05/25/2024]
Abstract
BACKGROUND AND OBJECTIVE: Music, the ubiquitous language across human cultures, is traditionally considered a form of art but has been linked to biomolecules in recent years. However, previous efforts have addressed only the sonification of nucleic acids and proteins to produce so-called life music, the soundscape from the basic building blocks of life. In this study, we attempted, for the first time, to reverse this process, i.e. conversion of music to protein (CoMtP). METHODS: A novel notion termed musical protein (MP), the protein defined by music, was proposed and, on this basis, we described a computational strategy to map the time sequence of music onto the spatial architecture of proteins, in which each note in the stave of a piece of music (target) is characterized by two acoustical quantities and each residue in the primary sequence of a protein (hit) is represented by amino acid descriptors. RESULTS: A simulated annealing (SA) algorithm was applied to iteratively generate the best-matched MP hit for a music target, and structural bioinformatics was then used to model the advanced spatial structure of the resulting MP. We also demonstrated that some small MPs derived from music segments may have potential biological functions; for example, they can serve as antimicrobial peptides (AMPs) that inhibit clinical bacterial strains with moderate or high antibacterial potency. CONCLUSIONS: This work may be beneficial in several respects; for example, it would open a door for hearing-impaired persons to 'listen' to music in a biological vision and could be a means of exposing students to the concepts of biomolecules at an earlier age through the use of auditory characteristics. The CoMtP would also facilitate the rational design of proteins with biological and medicinal significance.
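The note-to-residue matching described in the abstract can be illustrated with a minimal sketch. The abstract does not state which two acoustical quantities or which amino acid descriptors were used, so everything here is an assumption: notes are taken as (MIDI pitch, duration in beats) pairs, residues are described by the Kyte-Doolittle hydropathy index and the number of side-chain heavy atoms, and the names `anneal`, `cost`, and `NOTES` are hypothetical. This toy objective is separable per position and could be solved greedily; simulated annealing is used only to mirror the strategy named in the abstract.

```python
import math
import random

# Stand-in descriptor 1: Kyte-Doolittle hydropathy index (one-letter codes).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Stand-in descriptor 2: number of non-hydrogen side-chain atoms.
SIDE_CHAIN_ATOMS = {
    "G": 0, "A": 1, "S": 2, "C": 2, "V": 3, "T": 3, "P": 3,
    "I": 4, "L": 4, "D": 4, "N": 4, "M": 4, "E": 5, "Q": 5,
    "K": 5, "H": 6, "F": 7, "R": 7, "Y": 8, "W": 10,
}
AMINO_ACIDS = sorted(HYDROPATHY)


def scale(values):
    """Min-max scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def residue_features():
    """Return {residue: (scaled hydropathy, scaled side-chain size)}."""
    hyd = scale([HYDROPATHY[a] for a in AMINO_ACIDS])
    size = scale([SIDE_CHAIN_ATOMS[a] for a in AMINO_ACIDS])
    return {a: (hyd[i], size[i]) for i, a in enumerate(AMINO_ACIDS)}


def cost(seq, targets, feats):
    """Sum of squared distances between residue and note features."""
    return sum(
        (feats[a][0] - t[0]) ** 2 + (feats[a][1] - t[1]) ** 2
        for a, t in zip(seq, targets)
    )


def anneal(notes, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Search for a residue sequence whose features track the notes."""
    rng = random.Random(seed)
    feats = residue_features()
    targets = list(zip(scale([n[0] for n in notes]),
                       scale([n[1] for n in notes])))
    seq = [rng.choice(AMINO_ACIDS) for _ in notes]
    current = cost(seq, targets, feats)
    initial = current
    best, best_cost = list(seq), current
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (k / (steps - 1))
        i = rng.randrange(len(seq))
        old = seq[i]
        seq[i] = rng.choice(AMINO_ACIDS)  # propose a point mutation
        new = cost(seq, targets, feats)
        # Metropolis criterion: always accept improvements,
        # accept worse moves with probability exp(-delta / temp).
        if new <= current or rng.random() < math.exp((current - new) / temp):
            current = new
            if current < best_cost:
                best, best_cost = list(seq), current
        else:
            seq[i] = old  # reject and restore
    return "".join(best), initial, best_cost


# Hypothetical target: an ascending C major scale as (MIDI pitch, beats).
NOTES = [(60, 1.0), (62, 0.5), (64, 0.5), (65, 1.0),
         (67, 2.0), (69, 1.0), (71, 0.5), (72, 2.0)]

mp_hit, start_cost, final_cost = anneal(NOTES)
print(mp_hit, round(start_cost, 3), round(final_cost, 3))
```

The function returns the best sequence seen during the run rather than the last accepted one, so the reported cost never exceeds that of the random starting sequence.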
Affiliation(s)
- Jun Su
- College of Music, Chengdu Normal University, No.99 Haike Road East Section, Wenjiang District, Chengdu 611130, China.
| | - Peng Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No.2006 Xiyuan Ave West Hi-Tech Zone, Chengdu 611731, China.
| |
|
2
|
Hess R, Faessler J, Yun D, Mama A, Saleh D, Grosch JH, Wang G, Schwab T, Hubbuch J. Predicting multimodal chromatography of therapeutic antibodies using multiscale modeling. J Chromatogr A 2024; 1718:464706. [PMID: 38335881 DOI: 10.1016/j.chroma.2024.464706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Revised: 01/30/2024] [Accepted: 01/31/2024] [Indexed: 02/12/2024]
Abstract
Multimodal chromatography has emerged as a powerful method for the purification of therapeutic antibodies. However, process development of this separation technique remains challenging because of the intricate and molecule-specific interactions with multimodal ligands, leading to time-consuming and costly experimental optimization. This study presents a multiscale modeling approach to predict the multimodal chromatographic behavior of therapeutic antibodies based on their sequence information. Linear gradient elution (LGE) experiments were performed on an anionic multimodal resin for 59 full-length antibodies, including five different antibody formats, at pH 5.0, 6.0, and 7.0; these experiments were used for parameter determination of a linear adsorption model at low loading density conditions. Quantitative structure-property relationship (QSPR) modeling was utilized to correlate the adsorption parameters with up to 1374 global and local physicochemical descriptors calculated from antibody homology models. The final QSPR models employed fewer than eight descriptors per model and demonstrated high training accuracy (R² > 0.93) and reasonable test set prediction accuracy (Q² > 0.83) for the adsorption parameters. Model evaluation revealed the significance of electrostatic interaction and hydrophobicity in determining the chromatographic behavior of antibodies, as well as the importance of the HFR3 region in antibody binding to the multimodal resin. Chromatographic simulations using the predicted adsorption parameters showed good agreement with the experimental data for the vast majority of antibodies not employed during the model training. The results of this study demonstrate the potential of sequence-based prediction for determining chromatographic behavior in therapeutic antibody purification. This approach leads to more efficient and cost-effective process development, providing a valuable tool for the biopharmaceutical industry.
Affiliation(s)
- Rudger Hess
- Karlsruhe Institute of Technology (KIT), Institute of Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe, Germany; DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Jan Faessler
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Doil Yun
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Ahmed Mama
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - David Saleh
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Jan-Hendrik Grosch
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Gang Wang
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Thomas Schwab
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Jürgen Hubbuch
- Karlsruhe Institute of Technology (KIT), Institute of Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe, Germany.
| |
|
3
|
Hess R, Faessler J, Yun D, Saleh D, Grosch JH, Schwab T, Hubbuch J. Antibody sequence-based prediction of pH gradient elution in multimodal chromatography. J Chromatogr A 2023; 1711:464437. [PMID: 37865026 DOI: 10.1016/j.chroma.2023.464437] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/11/2023] [Revised: 10/03/2023] [Accepted: 10/05/2023] [Indexed: 10/23/2023]
Abstract
Multimodal chromatography has emerged as a promising technique for antibody purification, owing to its capacity to selectively capture and separate target molecules. However, the optimization of chromatography parameters remains a challenge due to the intricate nature of protein-ligand interactions. To tackle this issue, efficient predictive tools are essential for the development and optimization of multimodal chromatography processes. In this study, we introduce a methodology that predicts the elution behavior of antibodies in multimodal chromatography based on their amino acid sequences. We analyzed a total of 64 full-length antibodies, including IgG1, IgG4, and IgG-like multispecific formats, which were eluted using linear pH gradients from pH 9.0 to 4.0 on the anionic mixed-mode resin Capto adhere. Homology models were constructed, and 1312 antibody-specific physicochemical descriptors were calculated for each molecule. Our analysis identified six key structural features of the multimodal antibody interaction, which were correlated with the elution behavior, emphasizing the antibody variable region. The results show that our methodology can predict pH gradient elution for a diverse range of antibodies and antibody formats, with a test set R² of 0.898. The developed model can inform process development by predicting initial conditions for multimodal elution, thereby reducing trial and error during process optimization. Furthermore, the model holds the potential to enable an in silico manufacturability assessment by screening target antibodies that adhere to standardized purification conditions. In conclusion, this study highlights the feasibility of using structure-based prediction to enhance antibody purification in the biopharmaceutical industry. This approach can lead to more efficient and cost-effective process development while increasing process understanding.
Affiliation(s)
- Rudger Hess
- Institute of Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany; DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Jan Faessler
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Doil Yun
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - David Saleh
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Jan-Hendrik Grosch
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Thomas Schwab
- DSP Development, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
| | - Jürgen Hubbuch
- Institute of Engineering in Life Sciences, Section IV: Biomolecular Separation Engineering, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany.
| |
|
4
|
Lin J, Wen L, Zhou Y, Wang S, Ye H, Su J, Li J, Shu J, Huang J, Zhou P. PepQSAR: a comprehensive data source and information platform for peptide quantitative structure-activity relationships. Amino Acids 2023; 55:235-242. [PMID: 36474016 DOI: 10.1007/s00726-022-03219-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/27/2022] [Accepted: 11/23/2022] [Indexed: 12/12/2022]
Abstract
Peptide quantitative structure-activity relationships (pQSARs) have been widely applied to the statistical modeling and empirical prediction of peptide activities, properties and features. In the procedure, the peptide structure is characterized at the sequence level using amino acid descriptors (AADs) and then correlated with observations by machine learning methods (MLMs), resulting in a variety of quantitative regression models used to explain the structural factors that govern peptide activities, to generalize peptide properties from known to unknown samples, and to design new peptides with desired features. In this study, we developed a comprehensive platform, termed the PepQSAR database, which is a systematic collection and decomposition of various data sources and abundant information regarding pQSARs, including AADs, MLMs, data sets, peptide sequences, measured activities, model statistics, and literature. The database also provides a comparison function for the various previously built pQSAR models reported by different groups via distinct approaches. The structured and searchable PepQSAR database is expected to provide a useful resource and powerful tool for the computational peptidology community, and is freely available at http://i.uestc.edu.cn/PQsarDB .
Affiliation(s)
- Jing Lin
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
| | - Li Wen
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
| | - Yuwei Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
| | - Shaozhou Wang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
| | - Haiyang Ye
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
| | - Jun Su
- College of Music, Chengdu Normal University, Chengdu, 611130, China
| | - Juelin Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
| | - Jianping Shu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
| | - Jian Huang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China.
| | - Peng Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), No. 2006 Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China.
| |
|
5
|
Shao X, Kong W, Li Y, Zhang S. Quantitative structure-activity relationship modeling reveals the minimal sequence requirement and amino acid preference of sirtuin-1's deacetylation substrates in diabetes mellitus. J Bioinform Comput Biol 2022; 20:2250008. [PMID: 35451939 DOI: 10.1142/s0219720022500081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Sirtuin 1 (SIRT1) is a nicotinamide adenine dinucleotide (NAD+)-dependent deacetylase involved in multiple glucose metabolism pathways and plays an important role in the pathogenesis of diabetes mellitus (DM). The enzyme specifically recognizes its deacetylation substrates' peptide segments containing a central acetyl-lysine residue as well as a number of amino acids flanking the central residue. In this study, we attempted to ascertain the minimal sequence requirement (MSR) around the central acetyl-lysine residue of SIRT1 substrate-recognition sites as well as the amino acid preference (AAP) at different residues of the MSR window through a quantitative structure-activity relationship (QSAR) strategy, which would benefit our understanding of SIRT1 substrate specificity at the molecular level and also help to rationally design substrate-mimicking peptidic agents against DM by competitively targeting the SIRT1 active site. In this procedure, a large-scale dataset containing 6801 13-mer acetyl-lysine peptides (and their SIRT1-catalyzed deacetylation activities) was compiled to train 10 QSAR regression models developed by systematic combination of two machine learning methods (PLS and SVM) and five amino acid descriptors (DPPS, T-scale, MolSurf, [Formula: see text]-score, and FASGAI). The two best QSAR models (PLS+FASGAI and SVM+DPPS) were then employed to statistically examine the contribution of residue positions to the deacetylation activity of acetyl-lysine peptide substrates, revealing that the MSR can be represented by 5-mer acetyl-lysine peptides that meet a consensus motif X[Formula: see text]X[Formula: see text]X[Formula: see text](AcK)0X[Formula: see text].
Structural analysis found that the X[Formula: see text] and (AcK)0 residues are tightly packed against the enzyme active site and confer both stability and specificity for the enzyme-substrate complex, whereas the X[Formula: see text], X[Formula: see text] and X[Formula: see text] residues are partially exposed to solvent but can also effectively stabilize the complex system. Subsequently, a systematic deacetylation activity change profile (SDACP) was created based on QSAR modeling, from which the AAP for each residue position of MSR was depicted. With the profile, we were able to rationally design an SDACP combinatorial library with promising deacetylation activity, from which nine MSR acetyl-lysine peptides as well as two known SIRT1 acetyl-lysine peptide substrates were tested by using SIRT1 deacetylation assay. It is revealed that the designed peptides exhibit a comparable or even higher activity than the controls, although the former is considerably shorter than the latter.
Affiliation(s)
- X Shao
- Department of Nephrology, Suzhou Kowloon Hospital, Shanghai Jiao Tong University, School of Medicine, Suzhou 215000, P. R. China
| | - W Kong
- Department of Nephrology, Suzhou Kowloon Hospital, Shanghai Jiao Tong University, School of Medicine, Suzhou 215000, P. R. China
| | - Y Li
- Department of Nephrology, Suzhou Kowloon Hospital, Shanghai Jiao Tong University, School of Medicine, Suzhou 215000, P. R. China
| | - S Zhang
- Department of Nephrology, Suzhou Kowloon Hospital, Shanghai Jiao Tong University, School of Medicine, Suzhou 215000, P. R. China
| |
|
6
|
Zhou P, Wen L, Lin J, Mei L, Liu Q, Shang S, Li J, Shu J. Integrated unsupervised-supervised modeling and prediction of protein-peptide affinities at structural level. Brief Bioinform 2022; 23:6555404. [PMID: 35352094 DOI: 10.1093/bib/bbac097] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/04/2022] [Revised: 02/15/2022] [Accepted: 02/23/2022] [Indexed: 12/24/2022] Open
Abstract
Cell signal networks are orchestrated directly or indirectly by various peptide-mediated protein-protein interactions, which are normally weak and transient and thus ideal for biological regulation and medicinal intervention. Here, we develop a general-purpose method for modeling and predicting the binding affinities of protein-peptide interactions (PpIs) at the structural level. The method is a hybrid strategy that employs an unsupervised approach to derive a layered PpI atom-residue interaction (ulPpI[a-r]) potential between different protein atom types and peptide residue types from thousands of solved PpI complex structures and then statistically correlates the potential descriptors with experimental affinities (KD values) over hundreds of known PpI samples in a supervised manner to create an integrated unsupervised-supervised PpI affinity (usPpIA) predictor. Although both the ulPpI[a-r] potential and the usPpIA predictor can be used to calculate PpI affinities from their complex structures, the latter seems to perform much better than the former, suggesting that the unsupervised potential can be improved substantially with a further correction by supervised statistical learning. We examine the robustness and fault tolerance of the usPpIA predictor when applied to the coarse-grained PpI complex structures modeled computationally by sophisticated peptide docking and dynamics simulation. It is revealed that, despite being developed solely on the basis of solved structures, the integrated unsupervised-supervised method is also applicable to locally docked structures, for which it reaches a quantitative prediction, but can only give a qualitative prediction on globally docked structures. The dynamics refinement does not seem to change (or improve) the predictive results essentially, although it is computationally expensive and time-consuming relative to peptide docking.
We also perform extrapolation of usPpIA predictor to the indirect affinity quantities of HLA-A*0201 binding epitope peptides and NHERF PDZ binding scaffold peptides, consequently resulting in a good and moderate correlation of the predicted KD with experimental IC50 and BLU on the two peptide sets, with Pearson's correlation coefficients Rp = 0.635 and 0.406, respectively.
Affiliation(s)
- Peng Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| | - Li Wen
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| | - Jing Lin
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| | - Li Mei
- Institute of Culinary, Sichuan Tourism University, Chengdu 610100, China
| | - Qian Liu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| | - Shuyong Shang
- Institute of Ecological Environment Protection, Chengdu Normal University, Chengdu 611130, China
| | - Juelin Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| | - Jianping Shu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
| |
|
7
|
Liu Q, Lin J, Wen L, Wang S, Zhou P, Mei L, Shang S. Systematic Modeling, Prediction, and Comparison of Domain-Peptide Affinities: Does it Work Effectively With the Peptide QSAR Methodology? Front Genet 2022; 12:800857. [PMID: 35096016 PMCID: PMC8795790 DOI: 10.3389/fgene.2021.800857] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 10/24/2021] [Accepted: 12/14/2021] [Indexed: 11/17/2022] Open
Abstract
The protein-protein association in cellular signaling networks (CSNs) often acts as a weak, transient, and reversible domain-peptide interaction (DPI), in which a flexible peptide segment on the surface of one protein is recognized and bound by a rigid peptide-recognition domain from another. Reliable modeling and accurate prediction of DPI binding affinities would help to ascertain the diverse biological events involved in CSNs and benefit our understanding of various biological implications underlying DPIs. Traditionally, the peptide quantitative structure-activity relationship (pQSAR) has been widely used to model and predict the biological activity of oligopeptides; it employs amino acid descriptors (AADs) to characterize peptide structures at the sequence level and then statistically correlates the resulting descriptor vector with observed activity data via regression. However, QSAR has not yet been widely applied to the direct binding behavior of large-scale peptide ligands to their protein receptors. In this work, we attempted to clarify whether the pQSAR methodology can work effectively for modeling and predicting DPI affinities in a high-throughput manner. Over twenty thousand short linear motif (SLiM)-containing peptide segments involved in SH3, PDZ and 14-3-3 domain-mediated CSNs were compiled to define a comprehensive sequence-based data set of DPI affinities, which were represented by the Boehringer light units (BLUs) derived from previous arbitrary light intensity assays following SPOT peptide synthesis. Four sophisticated machine learning methods (MLMs) were then utilized to perform pQSAR modeling on the set described with different AADs to systematically create a variety of linear and nonlinear predictors, which were then verified by rigorous statistical tests.
It is revealed that the genome-wide DPI events can only be modeled qualitatively or semiquantitatively with the traditional pQSAR strategy due to the intrinsic disorder of peptide conformation and the potential interplay between different peptide residues. In addition, the arbitrary BLUs used to characterize DPI affinity values were measured via an indirect approach, which may not be very reliable and may involve strong noise, thus leading to a considerable bias in the modeling. A value of Rprd² = 0.7 can be considered the upper limit of the external generalization ability of the pQSAR methodology working on large-scale DPI affinity data.
Affiliation(s)
- Qian Liu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
| | - Jing Lin
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
| | - Li Wen
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
| | - Shaozhou Wang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
| | - Peng Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
| | - Li Mei
- Institute of Culinary, Sichuan Tourism University, Chengdu, China
| | - Shuyong Shang
- Institute of Ecological Environment Protection, Chengdu Normal University, Chengdu, China
| |
|
8
|
Kolmar SS, Grulke CM. The effect of noise on the predictive limit of QSAR models. J Cheminform 2021; 13:92. [PMID: 34823605 PMCID: PMC8613965 DOI: 10.1186/s13321-021-00571-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/15/2021] [Accepted: 11/14/2021] [Indexed: 01/09/2023] Open
Abstract
A key challenge in the field of Quantitative Structure Activity Relationships (QSAR) is how to effectively treat experimental error in the training and evaluation of computational models. It is often assumed in the field of QSAR that models cannot produce predictions which are more accurate than their training data. Additionally, it is implicitly assumed, by necessity, that data points in test sets or validation sets do not contain error, and that each data point is a population mean. This work proposes the hypothesis that QSAR models can make predictions which are more accurate than their training data and that the error-free test set assumption leads to a significant misevaluation of model performance. This work used 8 datasets with six different common QSAR endpoints, because different endpoints should have different amounts of experimental error associated with varying complexity of the measurements. Up to 15 levels of simulated Gaussian distributed random error were added to the datasets, and models were built on the error-laden datasets using five different algorithms. The models were trained on the error-laden data, evaluated on error-laden test sets, and evaluated on error-free test sets. The results show that for each level of added error, the RMSE for evaluation on the error-free test sets was always better. The results support the hypothesis that, at least under the conditions of Gaussian distributed random error, QSAR models can make predictions which are more accurate than their training data, and that the evaluation of models on error-laden test and validation sets may give a flawed measure of model performance. These results have implications for how QSAR models are evaluated, especially for disciplines where experimental error is very large, such as in computational toxicology.
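The evaluation effect described in this abstract can be reproduced in miniature. The sketch below is not the authors' protocol (they used 8 datasets and five algorithms); it assumes a hypothetical one-descriptor linear "endpoint" y = 2x + 1, adds Gaussian error to the training labels, fits an ordinary least-squares line, and compares the RMSE against an error-laden and an error-free copy of the same test set. The names `simulate`, `fit_line`, and `rmse` are illustrative.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length lists."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def simulate(sigma, n=200, seed=0):
    """Return (RMSE on error-laden test set, RMSE on error-free test set)."""
    rng = random.Random(seed)

    def true_value(x):
        # Hypothetical noise-free endpoint.
        return 2.0 * x + 1.0

    x_train = [rng.uniform(0.0, 10.0) for _ in range(n)]
    x_test = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Training labels carry simulated Gaussian experimental error.
    y_train = [true_value(x) + rng.gauss(0.0, sigma) for x in x_train]
    y_test_clean = [true_value(x) for x in x_test]
    y_test_noisy = [y + rng.gauss(0.0, sigma) for y in y_test_clean]

    slope, intercept = fit_line(x_train, y_train)
    predictions = [slope * x + intercept for x in x_test]
    return rmse(predictions, y_test_noisy), rmse(predictions, y_test_clean)


for sigma in (0.5, 1.0, 2.0):
    noisy, clean = simulate(sigma)
    print(f"sigma={sigma}: RMSE vs noisy test={noisy:.3f}, vs clean test={clean:.3f}")
```

In this toy setting the RMSE against the error-laden test set stays close to the added error level, while the RMSE against the error-free values reflects only how far the fitted line is from the true one.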
Affiliation(s)
- Scott S Kolmar
- Center for Computational Toxicology and Exposure, Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, NC, USA.
| | - Christopher M Grulke
- Center for Computational Toxicology and Exposure, Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, NC, USA
| |
|
9
|
Bell DR, Chen SH. Toward Guided Mutagenesis: Gaussian Process Regression Predicts MHC Class II Antigen Mutant Binding. J Chem Inf Model 2021; 61:4857-4867. [PMID: 34375111 DOI: 10.1021/acs.jcim.1c00458] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Abstract
Antigen-specific immunotherapies (ASI) require successful loading and presentation of antigen peptides into the major histocompatibility complex (MHC) binding cleft. One route of ASI design is to mutate native antigens for either stronger or weaker binding interaction to MHC. Exploring all possible mutations is costly both experimentally and computationally. To reduce experimental and computational expense, here we investigate the minimal amount of prior data required to accurately predict the relative binding affinity of point mutations for peptide-MHC class II (pMHCII) binding. Using data from different residue subsets, we interpolate pMHCII mutant binding affinities by Gaussian process (GP) regression of residue volume and hydrophobicity. We apply GP regression to an experimental data set from the Immune Epitope Database, and theoretical data sets from NetMHCIIpan and Free Energy Perturbation calculations. We find that GP regression can predict binding affinities of nine neutral residues from a six-residue subset with an average R² coefficient of determination value of 0.62 ± 0.04 (±95% CI), average error of 0.09 ± 0.01 kcal/mol (±95% CI), and with a receiver operating characteristic (ROC) AUC value of 0.92 for binary classification of enhanced or diminished binding affinity. Similarly, metrics increase to an R² value of 0.69 ± 0.04, average error of 0.07 ± 0.01 kcal/mol, and an ROC AUC value of 0.94 for predicting seven neutral residues from an eight-residue subset. Our work finds that prediction is most accurate for neutral residues at anchor residue sites without register shift. This work holds relevance to predicting pMHCII binding and accelerating ASI design.
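As a rough illustration of GP regression over two residue descriptors, the sketch below implements the posterior mean of a zero-centred GP with an RBF kernel in plain Python. It is not the authors' implementation: the kernel choice, length scale, and noise level are assumptions, and the six training points in `X_TRAIN` (standardized volume, standardized hydrophobicity) and `Y_TRAIN` (relative binding free energies in kcal/mol) are made-up placeholder numbers, not data from the paper.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two descriptor vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (low @ low.T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_fit(xs, ys, noise=1e-4, length=1.0):
    """Return the weight vector alpha = (K + noise*I)^-1 (y - mean) and the mean."""
    kernel = [[rbf(a, b, length) + (noise if i == j else 0.0)
               for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    mean_y = sum(ys) / len(ys)
    alpha = solve_cholesky(cholesky(kernel), [y - mean_y for y in ys])
    return alpha, mean_y


def gp_predict(x_new, xs, alpha, mean_y, length=1.0):
    """Posterior mean at x_new."""
    return mean_y + sum(rbf(x_new, x, length) * a for x, a in zip(xs, alpha))


# Placeholder training subset: (volume, hydrophobicity), both standardized.
X_TRAIN = [(-1.2, 0.3), (-0.5, 1.1), (0.0, -0.8),
           (0.6, 0.9), (1.1, -0.2), (1.5, 1.4)]
# Placeholder relative binding free energies in kcal/mol.
Y_TRAIN = [0.10, -0.05, 0.20, -0.15, 0.05, -0.20]

alpha, mean_y = gp_fit(X_TRAIN, Y_TRAIN)
# Interpolate a residue that was not in the training subset.
print(round(gp_predict((0.3, 0.2), X_TRAIN, alpha, mean_y), 3))
```

A predicted value below zero would be read as enhanced binding and above zero as diminished binding under the sign convention assumed here, which is the basis for the kind of binary classification the abstract reports.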
Affiliation(s)
- David R Bell
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Serena H Chen
- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
| |
|
10
|
Zhou P, Liu Q, Wu T, Miao Q, Shang S, Wang H, Chen Z, Wang S, Wang H. Systematic Comparison and Comprehensive Evaluation of 80 Amino Acid Descriptors in Peptide QSAR Modeling. J Chem Inf Model 2021; 61:1718-1731. [DOI: 10.1021/acs.jcim.0c01370] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Indexed: 02/07/2023]
Affiliation(s)
- Peng Zhou
  - Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Qian Liu
  - Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Ting Wu
  - School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Qingqing Miao
  - Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Shuyong Shang
  - College of Chemistry and Life Science, Chengdu Normal University, Chengdu 611130, China
- Heyi Wang
  - Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Zheng Chen
  - Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Shaozhou Wang
  - School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Heyan Wang
  - School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
11
Li Q, Xing S, Chen Y, Liao Q, Xiong B, He S, Lu W, Liu Y, Yang H, Li Q, Feng F, Liu W, Chen Y, Sun H. Discovery and Biological Evaluation of a Novel Highly Potent Selective Butyrylcholinsterase Inhibitor. J Med Chem 2020; 63:10030-10044. [PMID: 32787113 DOI: 10.1021/acs.jmedchem.0c01129] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Indexed: 12/20/2022]
Abstract
To discover novel BChE inhibitors, a hierarchical virtual screening protocol followed by biochemical evaluation was applied. The most potent compound 8012-9656 (eqBChE IC50 = 0.18 ± 0.03 μM, hBChE IC50 = 0.32 ± 0.07 μM) was purchased and synthesized. It inhibited BChE in a noncompetitive manner and could occupy the binding pocket forming diverse interactions with the target. 8012-9656 was proven to be safe in vivo and in vitro and showed comparable performance in ameliorating the scopolamine-induced cognition impairment to tacrine. Additionally, treatment with 8012-9656 could almost entirely recover the Aβ1-42 (icv)-impaired cognitive function to the normal level and showed better behavioral performance than donepezil. The evaluation of the Aβ1-42 total amount confirmed its anti-amyloidogenic profile. Moreover, 8012-9656 possessed blood-brain barrier (BBB) penetrating ability, a long T1/2, and low intrinsic clearance. Hence, the novel potential BChE inhibitor 8012-9656 can be considered as a promising lead compound for further investigation of anti-AD agents.
Affiliation(s)
- Qi Li
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Shuaishuai Xing
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Ying Chen
  - Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Qinghong Liao
  - Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Baichen Xiong
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Siyu He
  - State Key Laboratory of Natural Medicines, Jiangsu Key Laboratory of Carcinogenesis and Intervention, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 210009, People's Republic of China
- Weixuan Lu
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Yang Liu
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Hongyu Yang
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Qihang Li
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Feng Feng
  - Department of Natural Medicinal Chemistry, China Pharmaceutical University, Nanjing 211198, People's Republic of China
  - Jiangsu Food and Pharmaceutical Science College, No. 4 Meicheng Road, Huai'an 223003, People's Republic of China
- Wenyuan Liu
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
- Yao Chen
  - School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, People's Republic of China
- Haopeng Sun
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, People's Republic of China
12
Cortés-Ciriano I, Bender A. Reliable Prediction Errors for Deep Neural Networks Using Test-Time Dropout. J Chem Inf Model 2019; 59:3330-3339. [PMID: 31241929 DOI: 10.1021/acs.jcim.9b00297] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/14/2022]
Abstract
While the use of deep learning in drug discovery is gaining increasing attention, the lack of methods to compute reliable errors in prediction for Neural Networks prevents their application to guide decision making in domains where identifying unreliable predictions is essential, e.g., precision medicine. Here, we present a framework to compute reliable errors in prediction for Neural Networks using Test-Time Dropout and Conformal Prediction. Specifically, the algorithm consists of training a single Neural Network using dropout, and then applying it N times to both the validation and test sets, also employing dropout in this step. Therefore, for each instance in the validation and test sets an ensemble of predictions are generated. The residuals and absolute errors in prediction for the validation set are then used to compute prediction errors for the test set instances using Conformal Prediction. We show using 24 bioactivity data sets from ChEMBL 23 that Dropout Conformal Predictors are valid (i.e., the fraction of instances whose true value lies within the predicted interval strongly correlates with the confidence level) and efficient, as the predicted confidence intervals span a narrower set of values than those computed with Conformal Predictors generated using Random Forest (RF) models. Lastly, we show in retrospective virtual screening experiments that dropout and RF-based Conformal Predictors lead to comparable retrieval rates of active compounds. Overall, we propose a computationally efficient framework (as only N extra forward passes are required in addition to training a single network) to harness Test-Time Dropout and the Conformal Prediction framework, which is generally applicable to generate reliable prediction errors for Deep Neural Networks in drug discovery and beyond.
Affiliation(s)
- Isidro Cortés-Ciriano
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Andreas Bender
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
13
Li Z, Miao Q, Yan F, Meng Y, Zhou P. Machine Learning in Quantitative Protein–peptide Affinity Prediction: Implications for Therapeutic Peptide Design. Curr Drug Metab 2019; 20:170-176. [DOI: 10.2174/1389200219666181012151944] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 10/01/2017] [Revised: 11/07/2017] [Accepted: 08/20/2018] [Indexed: 01/03/2023]
Abstract
Background: Protein–peptide recognition plays an essential role in the orchestration and regulation of cell signaling networks, which is estimated to be responsible for up to 40% of biological interaction events in the human interactome and has recently been recognized as a new and attractive druggable target for drug development and disease intervention. Methods: We present a systematic review on the application of machine learning techniques in the quantitative modeling and prediction of protein–peptide binding affinity, particularly focusing on its implications for therapeutic peptide design. We also briefly introduce the physical quantities used to characterize protein–peptide affinity and attempt to extend the content of generalized machine learning methods. Results: Existing issues and future perspectives on the statistical modeling and regression prediction of protein–peptide binding affinity are discussed. Conclusion: There is still a long way to go before the establishment of general, reliable and efficient machine learning-based protein–peptide affinity predictors.
Affiliation(s)
- Zhongyan Li
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Qingqing Miao
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Fugang Yan
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Yang Meng
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Peng Zhou
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
14
Li Q, Yang H, Mo J, Chen Y, Wu Y, Kang C, Sun Y, Sun H. Identification by shape-based virtual screening and evaluation of new tyrosinase inhibitors. PeerJ 2018; 6:e4206. [PMID: 29383286 PMCID: PMC5788061 DOI: 10.7717/peerj.4206] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 10/03/2017] [Accepted: 12/08/2017] [Indexed: 12/17/2022] [Open Access]
Abstract
Targeting tyrosinase is considered to be an effective way to control the production of melanin. Tyrosinase inhibitor is anticipated to provide new therapy to prevent skin pigmentation, melanoma and neurodegenerative diseases. Herein, we report our results in identifying new tyrosinase inhibitors. The shape-based virtual screening was performed to discover new tyrosinase inhibitors. Thirteen potential hits derived from virtual screening were tested by biological determinations. Compound 5186-0429 exhibited the most potent inhibitory activity. It dose-dependently inhibited the activity of tyrosinase, with the IC50 values 6.2 ± 2.0 µM and 10.3 ± 5.4 µM on tyrosine and L-Dopa formation, respectively. The kinetic study of 5186-0429 demonstrated that this compound acted as a competitive inhibitor. We believe the discoveries here could serve as a good starting point for further design of potent tyrosinase inhibitor.
Affiliation(s)
- Qi Li
  - Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Hongyu Yang
  - Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Jun Mo
  - Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Yao Chen
  - School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, China
- Yue Wu
  - Nanjing Duoyuan Biochemistry Co., Ltd., Nanjing, China
- Chen Kang
  - Department of Internal Medicine, Carver College of Medicine, University of Iowa, Iowa City, IA, United States of America
- Yuan Sun
  - Department of Biochemistry and Molecular Medicine, University of California, Davis, Sacramento, CA, United States of America
- Haopeng Sun
  - Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
15
He Y, He X. Molecular design and genetic optimization of antimicrobial peptides containing unnatural amino acids against antibiotic-resistant bacterial infections. Biopolymers 2017; 106:746-56. [PMID: 27258330 DOI: 10.1002/bip.22885] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 02/01/2016] [Revised: 04/30/2016] [Accepted: 05/31/2016] [Indexed: 01/25/2023]
Abstract
Antimicrobial peptides (AMPs) have been the focus of intense research towards the finding of a viable alternative to current small-molecule antibiotics, owing to their commonly observed and naturally occurring resistance against pathogens. However, natural peptides have many problems such as low bioavailability and high allergenicity that largely limit the clinical applications of AMPs. In the present study, an integrative protocol that combined chemoinformatics modeling, molecular dynamics simulations, and in vitro susceptibility test was described to design AMPs containing unnatural amino acids (AMP-UAAs). To fulfill this, a large panel of synthetic AMPs with determined activity was collected and used to perform quantitative structure-activity relationship (QSAR) modeling. The obtained QSAR predictors were then employed to direct genetic algorithm (GA)-based optimization of AMP-UAA population, to which a number of commercially available, structurally diverse unnatural amino acids were introduced during the optimization process. Subsequently, several designed AMP-UAAs were confirmed to have high antibacterial potency against two antibiotic-resistant strains, i.e. multidrug-resistant Pseudomonas aeruginosa (MDRPA) and methicillin-resistant Staphylococcus aureus (MRSA), with minimum inhibitory concentration (MIC) < 10 μg/ml. Structural dynamics characterizations revealed that the most potent AMP-UAA peptide is an amphipathic helix that can spontaneously embed into an artificial lipid bilayer and exhibits a strong destructuring tendency associated with the embedding process. © 2016 Wiley Periodicals, Inc. Biopolymers (Pept Sci) 106: 746-756, 2016.
Affiliation(s)
- Yongkang He
  - Department of Infectious Diseases, Taixing People's Hospital, Yangzhou University, Taixing, 225400, China
- Xiaofeng He
  - Department of Infectious Diseases, Taixing People's Hospital, Yangzhou University, Taixing, 225400, China
16
Xue X, Zhao NY, Yu HT, Sun Y, Kang C, Huang QB, Sun HP, Wang XL, Li NG. Discovery of novel inhibitors disrupting HIF-1α/von Hippel-Lindau interaction through shape-based screening and cascade docking. PeerJ 2016; 4:e2757. [PMID: 27994971 PMCID: PMC5162400 DOI: 10.7717/peerj.2757] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/09/2016] [Accepted: 11/04/2016] [Indexed: 01/20/2023] [Open Access]
Abstract
Major research efforts have been devoted to the discovery and development of new chemical entities that could inhibit the protein–protein interaction between HIF-1α and the von Hippel–Lindau protein (pVHL), which serves as the substrate recognition subunit of an E3 ligase and is regarded as a crucial drug target in cancer, chronic anemia, and ischemia. Currently there is only one class of compounds available to interdict the HIF-1α/pVHL interaction, urging the need to discover chemical inhibitors with more diversified structures. We report here a strategy combining shape-based virtual screening and cascade docking to identify new chemical scaffolds for the designing of novel inhibitors. Based on this strategy, nine active hits have been identified and the most active hit, 9 (ZINC13466751), showed comparable activity to pVHL with an IC50 of 2.0 ± 0.14 µM, showing the great potential of utilizing these compounds for further optimization and serving as drug candidates for the inhibition of HIF-1α/von Hippel–Lindau interaction.
Affiliation(s)
- Xin Xue
  - Department of Medicinal Chemistry, Nanjing University of Chinese Medicine, Nanjing, China
- Ning-Yi Zhao
  - Department of Pharmacy, Nanjing Health-Innovating Biotechnology Co., Ltd., Nanjing, China
- Hai-Tao Yu
  - Department of Medicinal Chemistry, Nanjing University of Chinese Medicine, Nanjing, China
- Yuan Sun
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, United States
- Chen Kang
  - Division of Pharmacology, College of Pharmacy, Ohio State University, Columbus, OH, United States
- Qiong-Bin Huang
  - Department of Medicinal Chemistry, Nanjing University of Chinese Medicine, Nanjing, China
- Hao-Peng Sun
  - Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing, China
- Xiao-Long Wang
  - Department of Medicinal Chemistry, Nanjing University of Chinese Medicine, Nanjing, China
- Nian-Guang Li
  - Department of Medicinal Chemistry, Nanjing University of Chinese Medicine, Nanjing, China
17
Ni Z, Chen H, Lin X, Jin R. Insight into molecular mechanism underlying the transesterification catalysed by penicillin G amidase (PGA) using a combination protocol of experimental assay and theoretical analysis. MOLECULAR SIMULATION 2014. [DOI: 10.1080/08927022.2013.850500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
18
Cortes-Ciriano I, van Westen GJ, Lenselink EB, Murrell DS, Bender A, Malliavin T. Proteochemometric modeling in a Bayesian framework. J Cheminform 2014; 6:35. [PMID: 25045403 PMCID: PMC4083135 DOI: 10.1186/1758-2946-6-35] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 03/05/2014] [Accepted: 06/18/2014] [Indexed: 11/10/2022] [Open Access]
Abstract
Proteochemometrics (PCM) is an approach for bioactivity predictive modeling which models the relationship between protein and chemical information. Gaussian Processes (GP), based on Bayesian inference, provide the most objective estimation of the uncertainty of the predictions, thus permitting the evaluation of the applicability domain (AD) of the model. Furthermore, the experimental error on bioactivity measurements can be used as input for this probabilistic model. In this study, we apply GP implemented with a panel of kernels on three various (and multispecies) PCM datasets. The first dataset consisted of information from 8 human and rat adenosine receptors with 10,999 small molecule ligands and their binding affinity. The second consisted of the catalytic activity of four dengue virus NS3 proteases on 56 small peptides. Finally, we have gathered bioactivity information of small molecule ligands on 91 aminergic GPCRs from 9 different species, leading to a dataset of 24,593 datapoints with a matrix completeness of only 2.43%. GP models trained on these datasets are statistically sound, at the same level of statistical significance as Support Vector Machines (SVM), with R₀² values on the external dataset ranging from 0.68 to 0.92, and RMSEP values close to the experimental error. Furthermore, the best GP models obtained with the normalized polynomial and radial kernels provide intervals of confidence for the predictions in agreement with the cumulative Gaussian distribution. GP models were also interpreted on the basis of individual targets and of ligand descriptors. In the dengue dataset, the model interpretation in terms of the amino-acid positions in the tetra-peptide ligands gave biologically meaningful results.
Affiliation(s)
- Isidro Cortes-Ciriano
  - Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3825; Département de Biologie Structurale et Chimie
- Gerard Jp van Westen
  - ChEMBL Group, European Molecular Biology Laboratory European Bioinformatics Institute, Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK
- Eelke Bart Lenselink
  - Division of Medicinal Chemistry, Leiden Academic Center for Drug Research, Leiden, The Netherlands
- Daniel S Murrell
  - Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Andreas Bender
  - Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Thérèse Malliavin
  - Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3825; Département de Biologie Structurale et Chimie
19
To Determine Biologically Important Mutations in Oxytocin. Int J Pept Res Ther 2014. [DOI: 10.1007/s10989-014-9412-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
20
He P, Wu W, Wang HD, Liao KL, Zhang W, Lv FL, Yang K. Why ligand cross-reactivity is high within peptide recognition domain families? A case study on human c-Src SH3 domain. J Theor Biol 2014; 340:30-7. [DOI: 10.1016/j.jtbi.2013.08.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/15/2013] [Revised: 07/26/2013] [Accepted: 08/21/2013] [Indexed: 10/26/2022]
21
Ren Y, Wang Q, Chen S, Cao H. Integrating Computational Modeling and Experimental Assay to Discover New Potent ACE-Inhibitory Peptides. Mol Inform 2013; 33:43-52. [DOI: 10.1002/minf.201300131] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/03/2013] [Accepted: 09/11/2013] [Indexed: 02/05/2023]
22
Zhou Y, Ni Z, Chen K, Liu H, Chen L, Lian C, Yan L. Modeling Protein–Peptide Recognition Based on Classical Quantitative Structure–Affinity Relationship Approach: Implication for Proteome-Wide Inference of Peptide-Mediated Interactions. Protein J 2013; 32:568-78. [DOI: 10.1007/s10930-013-9519-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/18/2022]
23
Borkar MR, Pissurlenkar RRS, Coutinho EC. HomoSAR: Bridging comparative protein modeling with quantitative structural activity relationship to design new peptides. J Comput Chem 2013; 34:2635-46. [DOI: 10.1002/jcc.23436] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 07/15/2013] [Revised: 08/17/2013] [Accepted: 08/21/2013] [Indexed: 12/19/2022]
Affiliation(s)
- Mahesh R. Borkar
  - Department of Pharmaceutical Chemistry; Bombay College of Pharmacy; Kalina, Santacruz (East) Mumbai 400098 India
- Raghuvir R. S. Pissurlenkar
  - Department of Pharmaceutical Chemistry; Bombay College of Pharmacy; Kalina, Santacruz (East) Mumbai 400098 India
- Evans C. Coutinho
  - Department of Pharmaceutical Chemistry; Bombay College of Pharmacy; Kalina, Santacruz (East) Mumbai 400098 India
24
Guo T, Yang J, Zeng L, Wang H, Tong Q, Li X. Does there exist an intrinsic relationship between the flexibility and self-assembly of pepfactants? MOLECULAR SIMULATION 2013. [DOI: 10.1080/08927022.2013.817673] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 02/08/2023]
25
Tian F, Tan R, Guo T, Zhou P, Yang L. Fast and reliable prediction of domain–peptide binding affinity using coarse-grained structure models. Biosystems 2013; 113:40-9. [DOI: 10.1016/j.biosystems.2013.04.004] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 01/04/2013] [Revised: 04/15/2013] [Accepted: 04/20/2013] [Indexed: 10/26/2022]
26
Biomacromolecular quantitative structure–activity relationship (BioQSAR): a proof-of-concept study on the modeling, prediction and interpretation of protein–protein binding affinity. J Comput Aided Mol Des 2013; 27:67-78. [DOI: 10.1007/s10822-012-9625-3] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.1] [Received: 07/30/2012] [Accepted: 12/12/2012] [Indexed: 01/22/2023]
27
Membrane curvature and its generation by BAR proteins. Trends Biochem Sci 2012; 37:526-33. [PMID: 23058040 DOI: 10.1016/j.tibs.2012.09.001] [Citation(s) in RCA: 190] [Impact Index Per Article: 15.8] [Received: 06/07/2012] [Revised: 09/12/2012] [Accepted: 09/13/2012] [Indexed: 01/26/2023]
Abstract
Membranes are flexible barriers that surround the cell and its compartments. To execute vital functions such as locomotion or receptor turnover, cells need to control the shapes of their membranes. In part, this control is achieved through membrane-bending proteins, such as the Bin/amphiphysin/Rvs (BAR) domain proteins. Many open questions remain about the mechanisms by which membrane-bending proteins function. Addressing this shortfall, recent structures of BAR protein:membrane complexes support existing mechanistic models, but also produced novel insights into how BAR domain proteins sense, stabilize, and generate curvature. Here we review these recent findings, focusing on how BAR proteins interact with the membrane, and how the resulting scaffold structures might aid the recruitment of other proteins to the sites where membranes are bent.
28
Wang X, Zhang A, Ren W, Chen C, Dong J. Genome-wide Inference of Transcription Factor-DNA Binding Specificity in Cell Regeneration Using a Combination Strategy. Chem Biol Drug Des 2012; 80:734-44. [DOI: 10.1111/cbdd.12013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/19/2022]
29
30
31
Characterization of PDZ domain–peptide interactions using an integrated protocol of QM/MM, PB/SA, and CFEA analyses. J Comput Aided Mol Des 2011; 25:947-58. [DOI: 10.1007/s10822-011-9474-5] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Received: 04/25/2011] [Accepted: 09/13/2011] [Indexed: 01/04/2023]
32
Structure-based characterization of the binding of peptide to the human endophilin-1 Src homology 3 domain using position-dependent noncovalent potential analysis. J Mol Model 2011; 18:2153-61. [DOI: 10.1007/s00894-011-1197-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/28/2011] [Accepted: 07/20/2011] [Indexed: 02/05/2023]
33
Characterization of the binding profile of peptide to transporter associated with antigen processing (TAP) using Gaussian process regression. Comput Biol Med 2011; 41:865-70. [DOI: 10.1016/j.compbiomed.2011.07.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 06/16/2011] [Revised: 07/10/2011] [Accepted: 07/18/2011] [Indexed: 11/22/2022]
34
Prediction of protein ¹³Cα NMR chemical shifts using a combination scheme of statistical modeling and quantum-mechanical analysis. J Mol Struct 2011. [DOI: 10.1016/j.molstruc.2011.04.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/23/2022]
35
Zhang Y, Jin Q, Wang S, Ren R. Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches. Comput Biol Med 2011; 41:272-7. [DOI: 10.1016/j.compbiomed.2011.03.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 09/22/2010] [Revised: 12/29/2010] [Accepted: 03/02/2011] [Indexed: 10/18/2022]
36
Peng S, Jian-Wei Z, Peng Z, Lin X. QSPR modeling of bioconcentration factor of nonionic compounds using Gaussian processes and theoretical descriptors derived from electrostatic potentials on molecular surface. CHEMOSPHERE 2011; 83:1045-1052. [PMID: 21339002 DOI: 10.1016/j.chemosphere.2011.01.063] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 11/12/2010] [Revised: 01/21/2011] [Accepted: 01/29/2011] [Indexed: 05/30/2023]
Abstract
In the present study, geometrical structures were constructed and optimized for 122 nonionic organic compounds at the quantum-mechanical HF/6-31G level of theory. The electrostatic potentials and subsequent structural descriptors derived from them were obtained. Gaussian process (GP) and, for comparison, multiple linear regression (MLR) and support vector machine (SVM) were then employed to build the quantitative structure-bioconcentration factor relationships. Systematic validations, including internal leave-one-out cross-validation, validation on an external test set, and a more rigorous Monte Carlo cross-validation, were made to confirm the reliability of the constructed models. It has been found that the quantities derived from electrostatic potential, V(min) and ∑V(s,ind)(-), together with the molecular volume (V(mc)), dipole moment (μ) and the energy level of the highest occupied molecular orbital (E(HOMO)), can be used to express the quantitative structure-property relationship of this sample set. Both linear and nonlinear models can give satisfactory results, and the GP, which is capable of handling hybrid linear and nonlinear relationships through a mixed covariance function, appears to have better fitting and predictive abilities than the other two statistical methods. The coefficient of determination r²(pred) and root mean square error of prediction (RMSEP) for the external test set are 0.953 and 0.337, respectively.
Affiliation(s)
- Sang Peng
  - Department of Chemistry, Zhejiang University, Hangzhou 310027, China
37
|
He P, Wu W, Yang K, Jing T, Liao KL, Zhang W, Wang HD, Hua X. Exploring the activity space of peptides binding to diverse SH3 domains using principal property descriptors derived from amino acid rotamers. Biopolymers 2011; 96:288-301. [DOI: 10.1002/bip.21531] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
|
38
|
Ding Y, Lin Y, Shu M, Wang Y, Wang L, Cheng X, Lin Z. Quantitative Structure–Activity Relationship Model for Prediction of Protein–Peptide Interaction Binding Affinities between Human Amphiphysin-1 SH3 Domains and Their Peptide Ligands. Int J Pept Res Ther 2011. [DOI: 10.1007/s10989-011-9244-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
39
|
Tian F, Zhang C, Fan X, Yang X, Wang X, Liang H. Predicting the Flexibility Profile of Ribosomal RNAs. Mol Inform 2010; 29:707-15. [PMID: 27464014 DOI: 10.1002/minf.201000092] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 08/17/2010] [Accepted: 09/28/2010] [Indexed: 11/06/2022]
Abstract
Flexibility in biomolecules is an important determinant of biological function and can be measured quantitatively by the atomic Debye-Waller factor, or B-factor. Although numerous theoretical and computational studies have addressed the B-factor profiles of proteins, methods for predicting the B-factor values of nucleic acids remain unexplored, especially for the complicated ribosomal RNAs (rRNAs), which are functionally very similar to proteins in providing matrix structures and in catalyzing biochemical reactions. In this article, we present a quantitative structure-flexibility relationship (QSFR) study aimed at the quantitative prediction of rRNA B-factors from primary sequences (sequence-based) and advanced structures (structure-based), using both linear and nonlinear machine learning approaches, including partial least squares regression (PLS), least squares support vector machine (LSSVM), and Gaussian process (GP). By rigorously examining the performance and reliability of the constructed statistical models and by comparing our models in detail with those developed previously for protein B-factors, we demonstrate that (i) rRNA B-factors can be predicted at a level of accuracy similar to that of proteins, (ii) the structure-based approach performs much better than sequence-based methods in modeling rRNA B-factors, and (iii) rRNA flexibility is primarily governed by the local features of nonbonding potential landscapes, such as electrostatic and van der Waals forces.
Affiliation(s)
- Feifei Tian
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404; College of Bioengineering, Chongqing University, Chongqing 400044, China
- Chun Zhang
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- Xia Fan
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- Xue Yang
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- Xi Wang
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- Huaping Liang
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
|
40
|
Liu X, Liang J, Fan J, Shang Z. Prediction of Ion Drift Times for a Proteome-Wide Peptide Set Using Partial Least Squares Regression, Least-Squares Support Vector Machine and Gaussian Process. QSAR Comb Sci 2009. [DOI: 10.1002/qsar.200910075] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
|
41
|
Toward quantitative characterization of the binding profile between the human amphiphysin-1 SH3 domain and its peptide ligands. Amino Acids 2009; 38:1209-18. [DOI: 10.1007/s00726-009-0332-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 04/10/2009] [Accepted: 07/22/2009] [Indexed: 10/20/2022]
|
42
|
Tian F, Yang L, Lv F, Zhou P. Modeling and prediction of retention behavior of histidine-containing peptides in immobilized metal-affinity chromatography. J Sep Sci 2009; 32:2159-69. [DOI: 10.1002/jssc.200800739] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/06/2022]
|
43
|
Predicting liquid chromatographic retention times of peptides from the Drosophila melanogaster proteome by machine learning approaches. Anal Chim Acta 2009; 644:10-6. [DOI: 10.1016/j.aca.2009.04.010] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 11/18/2008] [Revised: 03/29/2009] [Accepted: 04/07/2009] [Indexed: 11/22/2022]
|
44
|
Gaussian process: an alternative approach for QSAM modeling of peptides. Amino Acids 2009; 38:199-212. [DOI: 10.1007/s00726-008-0228-1] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 08/01/2008] [Accepted: 12/18/2008] [Indexed: 10/21/2022]
|