1. Perspectives on the landscape and flux theory for describing emergent behaviors of the biological systems. J Biol Phys 2022; 48:1-36. [PMID: 34822073 PMCID: PMC8866630 DOI: 10.1007/s10867-021-09586-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/20/2021] [Accepted: 09/07/2021] [Indexed: 10/19/2022]
Abstract
We give a review on the landscape theory of the equilibrium biological systems and landscape-flux theory of the nonequilibrium biological systems as the global driving force. The emergences of the behaviors, the associated thermodynamics in terms of the entropy and free energy and dynamics in terms of the rate and paths have been quantitatively demonstrated. The hierarchical organization structures have been discussed. The biological applications ranging from protein folding, biomolecular recognition, specificity, biomolecular evolution and design for equilibrium systems as well as cell cycle, differentiation and development, cancer, neural networks and brain function, and evolution for nonequilibrium systems, cross-scale studies of genome structural dynamics and experimental quantifications/verifications of the landscape and flux are illustrated. Together, this gives an overall global physical and quantitative picture in terms of the landscape and flux for the behaviors, dynamics and functions of biological systems.
2. Chu WT, Yan Z, Chu X, Zheng X, Liu Z, Xu L, Zhang K, Wang J. Physics of biomolecular recognition and conformational dynamics. Rep Prog Phys 2021; 84:126601. [PMID: 34753115 DOI: 10.1088/1361-6633/ac3800] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2021] [Accepted: 11/09/2021] [Indexed: 06/13/2023]
Abstract
Biomolecular recognition usually leads to the formation of binding complexes, often accompanied by large-scale conformational changes. This process is fundamental to biological functions at the molecular and cellular levels. Uncovering the physical mechanisms of biomolecular recognition and quantifying the key biomolecular interactions are vital to understand these functions. The recently developed energy landscape theory has been successful in quantifying recognition processes and revealing the underlying mechanisms. Recent studies have shown that in addition to affinity, specificity is also crucial for biomolecular recognition. The proposed physical concept of intrinsic specificity based on the underlying energy landscape theory provides a practical way to quantify the specificity. Optimization of affinity and specificity can be adopted as a principle to guide the evolution and design of molecular recognition. This approach can also be used in practice for drug discovery using multidimensional screening to identify lead compounds. The energy landscape topography of molecular recognition is important for revealing the underlying flexible binding or binding-folding mechanisms. In this review, we first introduce the energy landscape theory for molecular recognition and then address four critical issues related to biomolecular recognition and conformational dynamics: (1) specificity quantification of molecular recognition; (2) evolution and design in molecular recognition; (3) flexible molecular recognition; (4) chromosome structural dynamics. The results described here and the discussions of the insights gained from the energy landscape topography can provide valuable guidance for further computational and experimental investigations of biomolecular recognition and conformational dynamics.
Affiliation(s)
- Wen-Ting Chu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Zhiqiang Yan
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Xiakun Chu
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794, United States of America
- Xiliang Zheng
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Zuojia Liu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Li Xu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Kun Zhang
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Jin Wang
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794, United States of America
3. Guedes IA, Barreto AMS, Marinho D, Krempser E, Kuenemann MA, Sperandio O, Dardenne LE, Miteva MA. New machine learning and physics-based scoring functions for drug discovery. Sci Rep 2021; 11:3198. [PMID: 33542326 PMCID: PMC7862620 DOI: 10.1038/s41598-021-82410-1] [Citation(s) in RCA: 79] [Impact Index Per Article: 26.3] [Received: 11/02/2020] [Accepted: 01/20/2021] [Indexed: 12/11/2022]
Abstract
Scoring functions are essential for modern in silico drug discovery. However, the accurate prediction of binding affinity by scoring functions remains a challenging task. The performance of scoring functions is very heterogeneous across different target classes. Scoring functions based on precise physics-based descriptors better representing protein–ligand recognition process are strongly needed. We developed a set of new empirical scoring functions, named DockTScore, by explicitly accounting for physics-based terms combined with machine learning. Target-specific scoring functions were developed for two important drug targets, proteases and protein–protein interactions, representing an original class of molecules for drug discovery. Multiple linear regression (MLR), support vector machine and random forest algorithms were employed to derive general and target-specific scoring functions involving optimized MMFF94S force-field terms, solvation and lipophilic interactions terms, and an improved term accounting for ligand torsional entropy contribution to ligand binding. DockTScore scoring functions demonstrated to be competitive with the current best-evaluated scoring functions in terms of binding energy prediction and ranking on four DUD-E datasets and will be useful for in silico drug design for diverse proteins as well as for specific targets such as proteases and protein–protein interactions. Currently, the MLR DockTScore is available at www.dockthor.lncc.br.
Affiliation(s)
- Isabella A Guedes
  - Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil; Inserm U973, Université Paris Diderot, Paris, France
- André M S Barreto
  - Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil
- Diogo Marinho
  - Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil
- Olivier Sperandio
  - Inserm U973, Université Paris Diderot, Paris, France; Structural Bioinformatics Unit, CNRS UMR3528, Institut Pasteur, 75015, Paris, France
- Laurent E Dardenne
  - Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil
- Maria A Miteva
  - Inserm U973, Université Paris Diderot, Paris, France; Inserm U1268 "Medicinal Chemistry and Translational Research", CiTCoM, UMR 8038, CNRS, Université de Paris, 75006, Paris, France
4. Macari G, Toti D, Pasquadibisceglie A, Polticelli F. DockingApp RF: A State-of-the-Art Novel Scoring Function for Molecular Docking in a User-Friendly Interface to AutoDock Vina. Int J Mol Sci 2020; 21:9548. [PMID: 33333976 PMCID: PMC7765429 DOI: 10.3390/ijms21249548] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/26/2020] [Revised: 12/11/2020] [Accepted: 12/11/2020] [Indexed: 11/28/2022]
Abstract
Motivation: Bringing a new drug to the market is expensive and time-consuming. To cut the costs and time, computer-aided drug design (CADD) approaches have been increasingly included in the drug discovery pipeline. However, despite traditional docking tools show a good conformational space sampling ability, they are still unable to produce accurate binding affinity predictions. This work presents a novel scoring function for molecular docking seamlessly integrated into DockingApp, a user-friendly graphical interface for AutoDock Vina. The proposed function is based on a random forest model and a selection of specific features to overcome the existing limits of Vina’s original scoring mechanism. A novel version of DockingApp, named DockingApp RF, has been developed to host the proposed scoring function and to automatize the rescoring procedure of the output of AutoDock Vina, even to nonexpert users. Results: By coupling intermolecular interaction, solvent accessible surface area features and Vina’s energy terms, DockingApp RF’s new scoring function is able to improve the binding affinity prediction of AutoDock Vina. Furthermore, comparison tests carried out on the CASF-2013 and CASF-2016 datasets demonstrate that DockingApp RF’s performance is comparable to other state-of-the-art machine-learning- and deep-learning-based scoring functions. The new scoring function thus represents a significant advancement in terms of the reliability and effectiveness of docking compared to AutoDock Vina’s scoring function. At the same time, the characteristics that made DockingApp appealing to a wide range of users are retained in this new version and have been complemented with additional features.
Affiliation(s)
- Gabriele Macari
  - Department of Sciences, Roma Tre University, 00146 Rome, Italy (G.M., A.P.)
- Daniele Toti
  - Faculty of Mathematical, Physical and Natural Sciences, Catholic University of the Sacred Heart, 25121 Brescia, Italy
- Fabio Polticelli
  - Department of Sciences, Roma Tre University, 00146 Rome, Italy (G.M., A.P.)
  - National Institute of Nuclear Physics, Roma Tre Section, 00146 Rome, Italy
5. Battisti A, Zamuner S, Sarti E, Laio A. Toward a unified scoring function for native state discrimination and drug-binding pocket recognition. Phys Chem Chem Phys 2018; 20:17148-17155. [PMID: 29900428 DOI: 10.1039/c7cp08170g] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/24/2023]
Abstract
Protein folding and receptor-ligand recognition are fundamental processes for any living organism. Although folding and ligand recognition are based on the same chemistry, the existing empirical scoring functions target just one problem: predicting the correct fold or the correct binding pose. We here introduce a statistical potential which considers moieties as fundamental units. The scoring function is able to deal with both folding and ligand pocket recognition problems with a performance comparable to the scoring functions specifically tailored for one of the two tasks. We foresee that the capability of the new scoring function to tackle both problems in a unified framework will be a key to deal with the induced fit phenomena, in which a target protein changes significantly its conformation upon binding. Moreover, the new scoring function might be useful in docking protocols towards intrinsically disordered proteins, whose flexibility cannot be handled with the available docking software.
Affiliation(s)
- Anna Battisti
  - International School for Advanced Studies (SISSA), Via Bonomea 265, I-34136 Trieste, Italy
6. Yan Z, Wang J. SPA-LN: a scoring function of ligand-nucleic acid interactions via optimizing both specificity and affinity. Nucleic Acids Res 2017; 45:e110. [PMID: 28431169 PMCID: PMC5499587 DOI: 10.1093/nar/gkx255] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 11/07/2016] [Accepted: 04/05/2017] [Indexed: 01/10/2023]
Abstract
Nucleic acids have been widely recognized as potential targets in drug discovery and aptamer selection. Quantifying the interactions between small molecules and nucleic acids is critical to discover lead compounds and design novel aptamers. Scoring function is normally employed to quantify the interactions in structure-based virtual screening. However, the predictive power of nucleic acid–ligand scoring functions is still a challenge compared to other types of biomolecular recognition. With the rapid growth of experimentally determined nucleic acid–ligand complex structures, in this work, we develop a knowledge-based scoring function of nucleic acid–ligand interactions, namely SPA-LN. SPA-LN is optimized by maximizing both the affinity and specificity of native complex structures. The development strategy is different from those of previous nucleic acid–ligand scoring functions which focus on the affinity only in the optimization. The native conformation is stabilized while non-native conformations are destabilized by our optimization, making the funnel-like binding energy landscape more biased toward the native state. The performance of SPA-LN validates the development strategy and provides a relatively more accurate way to score the nucleic acid–ligand interactions.
Affiliation(s)
- Zhiqiang Yan
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
- Jin Wang
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China; Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794-3400, USA
7. Liu Z, Su M, Han L, Liu J, Yang Q, Li Y, Wang R. Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions. Acc Chem Res 2017; 50:302-309. [PMID: 28182403 DOI: 10.1021/acs.accounts.6b00491] [Citation(s) in RCA: 207] [Impact Index Per Article: 29.6] [Indexed: 01/13/2023]
Abstract
In structure-based drug design, scoring functions are widely used for fast evaluation of protein-ligand interactions. They are often applied in combination with molecular docking and de novo design methods. Since the early 1990s, a whole spectrum of protein-ligand interaction scoring functions have been developed. Regardless of their technical difference, scoring functions all need data sets combining protein-ligand complex structures and binding affinity data for parametrization and validation. However, data sets of this kind used to be rather limited in terms of size and quality. On the other hand, standard metrics for evaluating scoring function used to be ambiguous. Scoring functions are often tested in molecular docking or even virtual screening trials, which do not directly reflect the genuine quality of scoring functions. Collectively, these underlying obstacles have impeded the invention of more advanced scoring functions. In this Account, we describe our long-lasting efforts to overcome these obstacles, which involve two related projects. On the first project, we have created the PDBbind database. It is the first database that systematically annotates the protein-ligand complexes in the Protein Data Bank (PDB) with experimental binding data. This database has been updated annually since its first public release in 2004. The latest release (version 2016) provides binding data for 16 179 biomolecular complexes in PDB. Data sets provided by PDBbind have been applied to many computational and statistical studies on protein-ligand interaction and various subjects. In particular, it has become a major data resource for scoring function development. On the second project, we have established the Comparative Assessment of Scoring Functions (CASF) benchmark for scoring function evaluation. Our key idea is to decouple the "scoring" process from the "sampling" process, so scoring functions can be tested in a relatively pure context to reflect their quality. 
In our latest work on this track, i.e. CASF-2013, the performance of a scoring function was quantified in four aspects, including "scoring power", "ranking power", "docking power", and "screening power". All four performance tests were conducted on a test set containing 195 high-quality protein-ligand complexes selected from PDBbind. A panel of 20 standard scoring functions were tested as demonstration. Importantly, CASF is designed to be an open-access benchmark, with which scoring functions developed by different researchers can be compared on the same grounds. Indeed, it has become a popular choice for scoring function validation in recent years. Despite the considerable progress that has been made so far, the performance of today's scoring functions still does not meet people's expectations in many aspects. There is a constant demand for more advanced scoring functions. Our efforts have helped to overcome some obstacles underlying scoring function development so that the researchers in this field can move forward faster. We will continue to improve the PDBbind database and the CASF benchmark in the future to keep them as useful community resources.
Affiliation(s)
- Zhihai Liu
  - State Key Laboratory of Bioorganic and Natural Products Chemistry, Collaborative Innovation Center of Chemistry for Life Sciences, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- Minyi Su
  - State Key Laboratory of Bioorganic and Natural Products Chemistry, Collaborative Innovation Center of Chemistry for Life Sciences, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- Li Han
  - State Key Laboratory of Bioorganic and Natural Products Chemistry, Collaborative Innovation Center of Chemistry for Life Sciences, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- Jie Liu
  - State Key Laboratory of Bioorganic and Natural Products Chemistry, Collaborative Innovation Center of Chemistry for Life Sciences, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- Qifan Yang
  - State Key Laboratory of Bioorganic and Natural Products Chemistry, Collaborative Innovation Center of Chemistry for Life Sciences, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- Yan Li
  - State Key Laboratory of Bioorganic and Natural Products Chemistry, Collaborative Innovation Center of Chemistry for Life Sciences, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- Renxiao Wang
  - State Key Laboratory of Bioorganic and Natural Products Chemistry, Collaborative Innovation Center of Chemistry for Life Sciences, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
  - State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macau, People’s Republic of China
8. Wang C, Zhang Y. Improving scoring-docking-screening powers of protein-ligand scoring functions using random forest. J Comput Chem 2017; 38:169-177. [PMID: 27859414 PMCID: PMC5140681 DOI: 10.1002/jcc.24667] [Citation(s) in RCA: 169] [Impact Index Per Article: 24.1] [Received: 07/04/2016] [Revised: 09/06/2016] [Accepted: 10/26/2016] [Indexed: 12/16/2022]
Abstract
The development of new protein-ligand scoring functions using machine learning algorithms, such as random forest, has been of significant interest. By efficiently utilizing expanded feature sets and a large set of experimental data, random forest based scoring functions (RFbScore) can achieve better correlations to experimental protein-ligand binding data with known crystal structures; however, more extensive tests indicate that such enhancement in scoring power comes with significant under-performance in docking and screening power tests compared to traditional scoring functions. In this work, to improve scoring-docking-screening powers of protein-ligand docking functions simultaneously, we have introduced a Δvina RF parameterization and feature selection framework based on random forest. Our developed scoring function Δvina RF20, which employs 20 descriptors in addition to the AutoDock Vina score, can achieve superior performance in all power tests of both CASF-2013 and CASF-2007 benchmarks compared to classical scoring functions. The Δvina RF20 scoring function and its code are freely available on the web at: https://www.nyu.edu/projects/yzhang/DeltaVina.
Affiliation(s)
- Cheng Wang
  - Department of Chemistry, New York University, New York, New York 10003
- Yingkai Zhang
  - Department of Chemistry, New York University, New York, New York 10003
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
9. Yan Z, Wang J. Incorporating specificity into optimization: evaluation of SPA using CSAR 2014 and CASF 2013 benchmarks. J Comput Aided Mol Des 2016; 30:219-27. [DOI: 10.1007/s10822-016-9897-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/28/2015] [Accepted: 01/28/2016] [Indexed: 01/04/2023]
10. Ain QU, Aleksandrova A, Roessler FD, Ballester PJ. Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening. Wiley Interdiscip Rev Comput Mol Sci 2015; 5:405-424. [PMID: 27110292 PMCID: PMC4832270 DOI: 10.1002/wcms.1225] [Citation(s) in RCA: 190] [Impact Index Per Article: 21.1] [Received: 06/03/2015] [Revised: 07/17/2015] [Accepted: 07/18/2015] [Indexed: 12/29/2022]
Abstract
Docking tools to predict whether and how a small molecule binds to a target can be applied if a structural model of such target is available. The reliability of docking depends, however, on the accuracy of the adopted scoring function (SF). Despite intense research over the years, improving the accuracy of SFs for structure-based binding affinity prediction or virtual screening has proven to be a challenging task for any class of method. New SFs based on modern machine-learning regression models, which do not impose a predetermined functional form and thus are able to exploit effectively much larger amounts of experimental data, have recently been introduced. These machine-learning SFs have been shown to outperform a wide range of classical SFs at both binding affinity prediction and virtual screening. The emerging picture from these studies is that the classical approach of using linear regression with a small number of expert-selected structural features can be strongly improved by a machine-learning approach based on nonlinear regression allied with comprehensive data-driven feature selection. Furthermore, the performance of classical SFs does not grow with larger training datasets and hence this performance gap is expected to widen as more training data becomes available in the future. Other topics covered in this review include predicting the reliability of a SF on a particular target class, generating synthetic data to improve predictive performance and modeling guidelines for SF development.
Affiliation(s)
- Qurrat Ul Ain
  - Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Cambridge, UK
- Florian D Roessler
  - Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Cambridge, UK
- Pedro J Ballester
  - Cancer Research Center of Marseille (INSERM U1068, Institut Paoli-Calmettes, Aix-Marseille Université, CNRS UMR7258), Marseille, France