1
Li X, Shen C, Zhu H, Yang Y, Wang Q, Yang J, Huang N. A High-Quality Data Set of Protein-Ligand Binding Interactions Via Comparative Complex Structure Modeling. J Chem Inf Model 2024; 64:2454-2466. [PMID: 38181418 DOI: 10.1021/acs.jcim.3c01170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2024]
Abstract
High-quality protein-ligand complex structures provide the basis for understanding the nature of noncovalent binding interactions at the atomic level and enable structure-based drug design. However, experimentally determined complex structures are scarce compared with the vast chemical space. In this study, we addressed this issue by constructing the BindingNet data set via comparative complex structure modeling, which contains 69,816 modeled high-quality protein-ligand complex structures with experimental binding affinity data. BindingNet provides valuable insights into investigating protein-ligand interactions, allowing visual inspection and interpretation of structural analogues' structure-activity relationships. It can also be used for evaluating machine-learning-based scoring functions. Our results indicate that machine learning models trained on BindingNet could reduce the bias caused by buried solvent-accessible surface area, as we previously found for models trained on the PDBbind data set. We also discussed strategies to improve BindingNet and its potential utilization for benchmarking the molecular docking methods and ligand binding free energy calculation approaches. The BindingNet complements PDBbind in constructing a sufficient and unbiased protein-ligand binding data set and is freely available at http://bindingnet.huanglab.org.cn.
Affiliation(s)
- Xuelian Li
  - National Institute of Biological Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, China
  - National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
- Cheng Shen
  - National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
- Hui Zhu
  - National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
  - Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing 102206, China
- Yujian Yang
  - National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
- Qing Wang
  - National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
- Jincai Yang
  - National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
- Niu Huang
  - National Institute of Biological Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, China
  - National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
  - Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing 102206, China
2
Ribeiro AJM, Riziotis IG, Borkakoti N, Thornton JM. Enzyme function and evolution through the lens of bioinformatics. Biochem J 2023; 480:1845-1863. [PMID: 37991346 PMCID: PMC10754289 DOI: 10.1042/bcj20220405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 11/09/2023] [Accepted: 11/14/2023] [Indexed: 11/23/2023]
Abstract
Enzymes have been shaped by evolution over billions of years to catalyse the chemical reactions that support life on earth. Dispersed in the literature, or organised in online databases, knowledge about enzymes can be structured in distinct dimensions, either related to their quality as biological macromolecules, such as their sequence and structure, or related to their chemical functions, such as the catalytic site, kinetics, mechanism, and overall reaction. The evolution of enzymes can only be understood when each of these dimensions is considered. In addition, many of the properties of enzymes only make sense in the light of evolution. We start this review by outlining the main paradigms of enzyme evolution, including gene duplication and divergence, convergent evolution, and evolution by recombination of domains. In the second part, we overview the current collective knowledge about enzymes, as organised by different types of data and collected in several databases. We also highlight some increasingly powerful computational tools that can be used to close gaps in understanding, in particular for types of data that require laborious experimental protocols. We believe that recent advances in protein structure prediction will be a powerful catalyst for the prediction of binding, mechanism, and ultimately, chemical reactions. A comprehensive mapping of enzyme function and evolution may be attainable in the near future.
Affiliation(s)
- Antonio J. M. Ribeiro
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
- Ioannis G. Riziotis
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
- Neera Borkakoti
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
- Janet M. Thornton
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
3
Ding K, Yin S, Li Z, Jiang S, Yang Y, Zhou W, Zhang Y, Huang B. Observing Noncovalent Interactions in Experimental Electron Density for Macromolecular Systems: A Novel Perspective for Protein–Ligand Interaction Research. J Chem Inf Model 2022; 62:1734-1743. [DOI: 10.1021/acs.jcim.1c01406] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Affiliation(s)
- Kang Ding
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Shiqiu Yin
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Zhongwei Li
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Shiju Jiang
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Yang Yang
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Wenbiao Zhou
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Yingsheng Zhang
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
- Bo Huang
  - Beijing StoneWise Technology Co Ltd., Haidian Street #15, Haidian District, Beijing 100080, China
4
Morado J, Mortenson PN, Nissink JWM, Verdonk ML, Ward RA, Essex JW, Skylaris CK. Generation of Quantum Configurational Ensembles Using Approximate Potentials. J Chem Theory Comput 2021; 17:7021-7042. [PMID: 34644088 DOI: 10.1021/acs.jctc.1c00532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Conformational analysis is of paramount importance in drug design: it is crucial to determine pharmacological properties, understand molecular recognition processes, and characterize the conformations of ligands when unbound. Molecular Mechanics (MM) simulation methods, such as Monte Carlo (MC) and molecular dynamics (MD), are usually employed to generate ensembles of structures due to their ability to extensively sample the conformational space of molecules. The accuracy of these MM-based schemes strongly depends on the functional form of the force field (FF) and its parametrization, components that often hinder their performance. High-level methods, such as ab initio MD, provide reliable structural information but are still too computationally expensive to allow for extensive sampling. Therefore, to overcome these limitations, we present a multilevel MC method that is capable of generating quantum configurational ensembles while keeping the computational cost at a minimum. We show that FF reparametrization is an efficient route to generate FFs that reproduce QM results more closely, which, in turn, can be used as low-cost models to achieve the gold standard QM accuracy. We demonstrate that the MC acceptance rate is strongly correlated with various phase space overlap measurements and that it constitutes a robust metric to evaluate the similarity between the MM and QM levels of theory. As a more advanced application, we present a self-parametrizing version of the algorithm, which combines sampling and FF parametrization in one scheme, and apply the methodology to generate the QM/MM distribution of a ligand in aqueous solution.
Affiliation(s)
- João Morado
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Paul N Mortenson
  - Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
- J Willem M Nissink
  - Medicinal Chemistry, Oncology R&D, AstraZeneca, Cambridge CB4 0WG, United Kingdom
- Marcel L Verdonk
  - Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
- Richard A Ward
  - Medicinal Chemistry, Oncology R&D, AstraZeneca, Cambridge CB4 0WG, United Kingdom
- Jonathan W Essex
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Chris-Kriton Skylaris
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
5
Baltoumas FA, Zafeiropoulou S, Karatzas E, Koutrouli M, Thanati F, Voutsadaki K, Gkonta M, Hotova J, Kasionis I, Hatzis P, Pavlopoulos GA. Biomolecule and Bioentity Interaction Databases in Systems Biology: A Comprehensive Review. Biomolecules 2021; 11:1245. [PMID: 34439912 PMCID: PMC8391349 DOI: 10.3390/biom11081245] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/28/2021] [Revised: 08/16/2021] [Accepted: 08/18/2021] [Indexed: 02/06/2023] Open
Abstract
Technological advances in high-throughput techniques have resulted in tremendous growth of complex biological datasets providing evidence regarding various biomolecular interactions. To cope with this data flood, computational approaches, web services, and databases have been implemented to deal with issues such as data integration, visualization, exploration, organization, scalability, and complexity. Nevertheless, as the number of such sets increases, it is becoming more and more difficult for an end user to know what the scope and focus of each repository is and how redundant the information between them is. Several repositories have a more general scope, while others focus on specialized aspects, such as specific organisms or biological systems. Unfortunately, many of these databases are self-contained or poorly documented and maintained. For a clearer view, in this article we provide a comprehensive categorization, comparison and evaluation of such repositories for different bioentity interaction types. We discuss most of the publicly available services based on their content, sources of information, data representation methods, user-friendliness, scope and interconnectivity, and we comment on their strengths and weaknesses. We aim for this review to reach a broad readership varying from biomedical beginners to experts and serve as a reference article in the field of Network Biology.
Affiliation(s)
- Fotis A. Baltoumas
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
- Sofia Zafeiropoulou
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
- Mikaela Koutrouli
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen, Denmark
- Foteini Thanati
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
- Kleanthi Voutsadaki
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
- Maria Gkonta
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
- Joana Hotova
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
- Ioannis Kasionis
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
- Pantelis Hatzis
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
  - Center for New Biotechnologies and Precision Medicine, School of Medicine, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Georgios A. Pavlopoulos
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center “Alexander Fleming”, 16672 Vari, Greece
  - Center for New Biotechnologies and Precision Medicine, School of Medicine, National and Kapodistrian University of Athens, 11527 Athens, Greece
6
Alshehri AS, Gani R, You F. Deep learning and knowledge-based methods for computer-aided molecular design—toward a unified approach: State-of-the-art and future directions. Comput Chem Eng 2020. [DOI: 10.1016/j.compchemeng.2020.107005] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Indexed: 12/20/2022]
7
Gojobori T, Ikeo K, Katayama Y, Kawabata T, Kinjo AR, Kinoshita K, Kwon Y, Migita O, Mizutani H, Muraoka M, Nagata K, Omori S, Sugawara H, Yamada D, Yura K. VaProS: a database-integration approach for protein/genome information retrieval. J Struct Funct Genomics 2016; 17:69-81. [PMID: 28012137 PMCID: PMC5274651 DOI: 10.1007/s10969-016-9211-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/03/2016] [Accepted: 12/05/2016] [Indexed: 01/01/2023]
Abstract
Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts' knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/ .
Affiliation(s)
- Takashi Gojobori
  - Computational Bioscience Research Center, Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Kazuho Ikeo
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Yukie Katayama
  - Graduate School of Agricultural and Life Sciences, University of Tokyo, Bunkyo, Tokyo, 113-8657, Japan
- Takeshi Kawabata
  - Institute for Protein Research, Osaka University, Suita, Osaka, 565-0871, Japan
- Akira R Kinjo
  - Institute for Protein Research, Osaka University, Suita, Osaka, 565-0871, Japan
- Kengo Kinoshita
  - Graduate School of Information Sciences, Tohoku University, Miyagi, Sendai, 980-8597, Japan
  - Tohoku Medical Megabank Organization, Tohoku University, Miyagi, Sendai, 980-8573, Japan
- Yeondae Kwon
  - Graduate School of Agricultural and Life Sciences, University of Tokyo, Bunkyo, Tokyo, 113-8657, Japan
- Ohsuke Migita
  - Department of Maternal-Fetal Biology, National Research Institute for Child Health and Development, Setagaya, Tokyo, 157-8535, Japan
  - Department of Pediatrics, St. Marianna University School of Medicine, Miyamae, Kawasaki, 216-8511, Japan
- Hisashi Mizutani
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Masafumi Muraoka
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Koji Nagata
  - Graduate School of Agricultural and Life Sciences, University of Tokyo, Bunkyo, Tokyo, 113-8657, Japan
- Satoshi Omori
  - Graduate School of Information Sciences, Tohoku University, Miyagi, Sendai, 980-8597, Japan
- Hideaki Sugawara
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
- Daichi Yamada
  - Center for Informational Biology, Ochanomizu University, 2-1-1, Otsuka, Bunkyo, Tokyo, 112-8610, Japan
- Kei Yura
  - National Institute of Genetics, Shizuoka, 411-8540, Mishima, Japan
  - Center for Informational Biology, Ochanomizu University, 2-1-1, Otsuka, Bunkyo, Tokyo, 112-8610, Japan