1
Carpenter KA, Altman RB. Databases of ligand-binding pockets and protein-ligand interactions. Comput Struct Biotechnol J 2024; 23:1320-1338. [PMID: 38585646 PMCID: PMC10997877 DOI: 10.1016/j.csbj.2024.03.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 03/16/2024] [Accepted: 03/17/2024] [Indexed: 04/09/2024] [Open Access]
Abstract
Many research groups and institutions have created a variety of databases curating experimental and predicted data related to protein-ligand binding. The landscape of available databases is dynamic, with new databases emerging and established databases becoming defunct. Here, we review the current state of databases that contain binding pockets and protein-ligand binding interactions. We have compiled a list of such databases, fifty-three of which are currently available for use. We discuss variation in how binding pockets are defined and summarize pocket-finding methods. We organize the fifty-three databases into subgroups based on goals and contents, and describe standard use cases. We also illustrate that pockets within the same protein are characterized differently across different databases. Finally, we assess critical issues of sustainability, accessibility and redundancy.
Affiliation(s)
- Kristy A. Carpenter
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Russ B. Altman
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Medicine, Stanford University, Stanford, CA 94305, USA
2
Xia Y, Pan X, Shen HB. A comprehensive survey on protein-ligand binding site prediction. Curr Opin Struct Biol 2024; 86:102793. [PMID: 38447285 DOI: 10.1016/j.sbi.2024.102793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 02/18/2024] [Accepted: 02/18/2024] [Indexed: 03/08/2024]
Abstract
Protein-ligand binding site prediction is critical for protein function annotation and drug discovery. Biological experiments are time-consuming and require significant equipment, materials, and labor resources. Developing accurate and efficient computational methods for protein-ligand interaction prediction is essential. Here, we summarize the key challenges associated with ligand binding site (LBS) prediction and introduce recently published methods from their input features, computational algorithms, and ligand types. Furthermore, we investigate the specificity of allosteric site identification as a particular LBS type. Finally, we discuss the prospective directions for machine learning-based LBS prediction in the near future.
Affiliation(s)
- Ying Xia
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
3
Wei H, Wang W, Peng Z, Yang J. Q-BioLiP: A Comprehensive Resource for Quaternary Structure-based Protein-ligand Interactions. Genomics Proteomics Bioinformatics 2024; 22:qzae001. [PMID: 38862427 DOI: 10.1093/gpbjnl/qzae001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 11/12/2023] [Accepted: 12/03/2023] [Indexed: 06/13/2024]
Abstract
Since its establishment in 2013, BioLiP has become one of the widely used resources for protein-ligand interactions. Nevertheless, several known issues occurred with it over the past decade. For example, the protein-ligand interactions are represented in the form of single chain-based tertiary structures, which may be inappropriate as many interactions involve multiple protein chains (known as quaternary structures). We sought to address these issues, resulting in Q-BioLiP, a comprehensive resource for quaternary structure-based protein-ligand interactions. The major features of Q-BioLiP include: (1) representing protein structures in the form of quaternary structures rather than single chain-based tertiary structures; (2) pairing DNA/RNA chains properly rather than separation; (3) providing both experimental and predicted binding affinities; (4) retaining both biologically relevant and irrelevant interactions to alleviate the wrong justification of ligands' biological relevance; and (5) developing a new quaternary structure-based algorithm for the modelling of protein-ligand complex structure. With these new features, Q-BioLiP is expected to be a valuable resource for studying biomolecule interactions, including protein-small molecule interaction, protein-metal ion interaction, protein-peptide interaction, protein-protein interaction, protein-DNA/RNA interaction, and RNA-small molecule interaction. Q-BioLiP is freely available at https://yanglab.qd.sdu.edu.cn/Q-BioLiP/.
Affiliation(s)
- Hong Wei
  - School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Wenkai Wang
  - School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Zhenling Peng
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Jianyi Yang
  - MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
4
Carbery A, Buttenschoen M, Skyner R, von Delft F, Deane CM. Learnt representations of proteins can be used for accurate prediction of small molecule binding sites on experimentally determined and predicted protein structures. J Cheminform 2024; 16:32. [PMID: 38486231 PMCID: PMC10941399 DOI: 10.1186/s13321-024-00821-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 03/01/2024] [Indexed: 03/17/2024] [Open Access]
Abstract
Protein-ligand binding site prediction is a useful tool for understanding the functional behaviour and potential drug-target interactions of a novel protein of interest. However, most binding site prediction methods are tested by providing crystallised ligand-bound (holo) structures as input. This testing regime is insufficient to understand the performance on novel protein targets where experimental structures are not available. An alternative option is to provide computationally predicted protein structures, but this is not commonly tested. However, due to the training data used, computationally-predicted protein structures tend to be extremely accurate, and are often biased toward a holo conformation. In this study we describe and benchmark IF-SitePred, a protein-ligand binding site prediction method which is based on the labelling of ESM-IF1 protein language model embeddings combined with point cloud annotation and clustering. We show that not only is IF-SitePred competitive with state-of-the-art methods when predicting binding sites on experimental structures, but it performs better on proxies for novel proteins where low accuracy has been simulated by molecular dynamics. Finally, IF-SitePred outperforms other methods if ensembles of predicted protein structures are generated.
Affiliation(s)
- Anna Carbery
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot, OX11 0DE, UK
- Martin Buttenschoen
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK
- Rachael Skyner
  - OMass Therapeutics, Building 4000, Chancellor Court, John Smith Drive, ARC Oxford, OX4 2GX, UK
- Frank von Delft
  - Diamond Light Source, Harwell Science and Innovation Campus, Didcot, OX11 0DE, UK
  - Centre for Medicines Discovery, University of Oxford, Oxford, OX3 7DQ, UK
  - Research Complex at Harwell, Harwell Science and Innovation Campus, Didcot, OX11 0FA, United Kingdom
  - Department of Biochemistry, University of Johannesburg, Johannesburg, 2006, South Africa
- Charlotte M Deane
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK
5
Utgés JS, MacGowan SA, Ives CM, Barton GJ. Classification of likely functional class for ligand binding sites identified from fragment screening. Commun Biol 2024; 7:320. [PMID: 38480979 PMCID: PMC10937669 DOI: 10.1038/s42003-024-05970-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 02/23/2024] [Indexed: 03/17/2024] [Open Access]
Abstract
Fragment screening is used to identify binding sites and leads in drug discovery, but it is often unclear which binding sites are functionally important. Here, data from 37 experiments, and 1309 protein structures binding to 1601 ligands were analysed. A method to group ligands by binding sites is introduced and sites clustered according to profiles of relative solvent accessibility. This identified 293 unique ligand binding sites, grouped into four clusters (C1-4). C1 includes larger, buried, conserved, and population missense-depleted sites, enriched in known functional sites. C4 comprises smaller, accessible, divergent, missense-enriched sites, depleted in functional sites. A site in C1 is 28 times more likely to be functional than one in C4. Seventeen sites, which to the best of our knowledge are novel, in 13 proteins are identified as likely to be functionally important with examples from human tenascin and 5-aminolevulinate synthase highlighted. A multi-layer perceptron, and K-nearest neighbours model are presented to predict cluster labels for ligand binding sites with an accuracy of 96% and 100%, respectively, so allowing functional classification of sites for proteins not in this set. Our findings will be of interest to those studying protein-ligand interactions and developing new drugs or function modulators.
Affiliation(s)
- Javier S Utgés
  - Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, Scotland, UK
- Stuart A MacGowan
  - Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, Scotland, UK
- Callum M Ives
  - Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, Scotland, UK
  - Department of Chemistry and Hamilton Institute, Maynooth University, Maynooth, Ireland
- Geoffrey J Barton
  - Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, Scotland, UK
6
Leyland B, Novichkova E, Dolui AK, Jallet D, Daboussi F, Legeret B, Li Z, Li-Beisson Y, Boussiba S, Khozin-Goldberg I. Acyl-CoA binding protein is required for lipid droplet degradation in the diatom Phaeodactylum tricornutum. Plant Physiol 2024; 194:958-981. [PMID: 37801606 DOI: 10.1093/plphys/kiad525] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/31/2023] [Revised: 06/28/2023] [Accepted: 07/15/2023] [Indexed: 10/08/2023]
Abstract
Diatoms (Bacillariophyceae) accumulate neutral storage lipids in lipid droplets during stress conditions, which can be rapidly degraded and recycled when optimal conditions resume. Since nutrient and light availability fluctuate in marine environments, storage lipid turnover is essential for diatom dominance of marine ecosystems. Diatoms have garnered attention for their potential to provide a sustainable source of omega-3 fatty acids. Several independent proteomic studies of lipid droplets isolated from the model oleaginous pennate diatom Phaeodactylum tricornutum have identified a previously uncharacterized protein with an acyl-CoA binding (ACB) domain, Phatrdraft_48778, here referred to as Phaeodactylum tricornutum acyl-CoA binding protein (PtACBP). We report the phenotypic effects of CRISPR-Cas9 targeted genome editing of PtACBP. ptacbp mutants were defective in lipid droplet and triacylglycerol degradation, as well as lipid and eicosapentaenoic acid synthesis, during recovery from nitrogen starvation. Transcription of genes responsible for peroxisomal β-oxidation, triacylglycerol lipolysis, and eicosapentaenoic acid synthesis was inhibited. A lipid-binding assay using a synthetic ACB domain from PtACBP indicated preferential binding specificity toward certain polar lipids. PtACBP fused to eGFP displayed an endomembrane-like pattern, which surrounded the periphery of lipid droplets. PtACBP is likely responsible for intracellular acyl transport, affecting cell division, development, photosynthesis, and stress response. A deeper understanding of the molecular mechanisms governing storage lipid turnover will be crucial for developing diatoms and other microalgae as biotechnological cell factories.
Affiliation(s)
- Ben Leyland
  - The Microalgal Biotechnology Laboratory, The French Associates Institute for Agriculture and Biotechnology, Jacob Blaustein Institute for Desert Research, Ben-Gurion University of the Negev, Sede Boker Campus 84990, Israel
- Ekaterina Novichkova
  - The Microalgal Biotechnology Laboratory, The French Associates Institute for Agriculture and Biotechnology, Jacob Blaustein Institute for Desert Research, Ben-Gurion University of the Negev, Sede Boker Campus 84990, Israel
- Achintya Kumar Dolui
  - The Microalgal Biotechnology Laboratory, The French Associates Institute for Agriculture and Biotechnology, Jacob Blaustein Institute for Desert Research, Ben-Gurion University of the Negev, Sede Boker Campus 84990, Israel
- Denis Jallet
  - Toulouse Biotechnology Institute Bio & Chemical Engineering, Institut National de la Recherche Agronomique, Institute National Des Sciences Appliquees, Le Centre national de la recherche scientifique, Toulouse 31077, France
- Fayza Daboussi
  - Toulouse Biotechnology Institute Bio & Chemical Engineering, Institut National de la Recherche Agronomique, Institute National Des Sciences Appliquees, Le Centre national de la recherche scientifique, Toulouse 31077, France
- Bertrand Legeret
  - Aix-Marseille University, CEA, CNRS, BIAM, Institut de Biosciences et Biotechnologies Aix-Marseille, CEA Cadarache, Saint Paul-Lez-Durance 13108, France
- Zhongze Li
  - Aix-Marseille University, CEA, CNRS, BIAM, Institut de Biosciences et Biotechnologies Aix-Marseille, CEA Cadarache, Saint Paul-Lez-Durance 13108, France
- Yonghua Li-Beisson
  - Aix-Marseille University, CEA, CNRS, BIAM, Institut de Biosciences et Biotechnologies Aix-Marseille, CEA Cadarache, Saint Paul-Lez-Durance 13108, France
- Sammy Boussiba
  - The Microalgal Biotechnology Laboratory, The French Associates Institute for Agriculture and Biotechnology, Jacob Blaustein Institute for Desert Research, Ben-Gurion University of the Negev, Sede Boker Campus 84990, Israel
- Inna Khozin-Goldberg
  - The Microalgal Biotechnology Laboratory, The French Associates Institute for Agriculture and Biotechnology, Jacob Blaustein Institute for Desert Research, Ben-Gurion University of the Negev, Sede Boker Campus 84990, Israel
7
Bagchi A. Molecular Modeling Techniques and In-Silico Drug Discovery. Methods Mol Biol 2024; 2719:1-11. [PMID: 37803109 DOI: 10.1007/978-1-0716-3461-5_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2023]
Abstract
Molecular modeling is the technique to determine the overall structure of an unknown molecule, be it a small one or a macromolecule. The technique encompasses the method of screening ligand libraries for the development of new candidate drug molecules. All these aspects have become an essential topic of research. This field is truly interdisciplinary and finds its applications in almost all fields of life science research. In this chapter, an overview of the protocol associated with molecular modeling techniques will be discussed.
Affiliation(s)
- Angshuman Bagchi
  - Department of Biochemistry and Biophysics, University of Kalyani, Kalyani, West Bengal, India
8
Kumar A, Hooda P, Puri A, Khatter R, Al-Dosari MS, Sinha N, Parvez MK, Sehgal D. Methotrexate, an anti-inflammatory drug, inhibits Hepatitis E viral replication. J Enzyme Inhib Med Chem 2023; 38:2280500. [PMID: 37975328 PMCID: PMC11003484 DOI: 10.1080/14756366.2023.2280500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Accepted: 10/30/2023] [Indexed: 11/19/2023] [Open Access]
Abstract
Hepatitis E Virus (HEV) is a positively oriented RNA virus having a 7.2 kb genome. HEV consists of three open reading frames (ORF1-3). Of these, ORF1 codes for the enzymes Methyltransferase (Mtase), Papain-like cysteine protease (PCP), RNA helicase, and RNA-dependent RNA polymerase (RdRp). Unavailability of a vaccine or effective drug against HEV and considering the side effects associated with the off-label use of ribavirin (RBV) and pegylated interferons, an alternative approach is required by the modulation of specific enzymes to prevent the infection. HEV helicase is involved in unwinding the double-stranded RNA, RNA processing, transcriptional regulation, and pre-mRNA processing. Therefore, we screened FDA-approved compounds from the ZINC15 database against the modelled 3D structure of HEV helicase and found that methotrexate and compound A (Pubchem ID BTB07890) inhibit the NTPase and dsRNA unwinding activity leading to inhibition of HEV RNA replication. This may be further authenticated by in vivo study.
Affiliation(s)
- Akash Kumar
  - Department of Life Sciences, Virology lab, Shiv Nadar Institution of Eminence, Greater Noida, India
- Preeti Hooda
  - Department of Life Sciences, Virology lab, Shiv Nadar Institution of Eminence, Greater Noida, India
- Anindita Puri
  - Department of Life Sciences, Virology lab, Shiv Nadar Institution of Eminence, Greater Noida, India
- Radhika Khatter
  - Department of Life Sciences, Virology lab, Shiv Nadar Institution of Eminence, Greater Noida, India
- Mohammed S. Al-Dosari
  - Department of Pharmacognosy, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Neha Sinha
  - Department of Infectious Diseases and Microbiology, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Mohammad K. Parvez
  - Department of Pharmacognosy, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Deepak Sehgal
  - Department of Life Sciences, Virology lab, Shiv Nadar Institution of Eminence, Greater Noida, India
9
Popov P, Kalinin R, Buslaev P, Kozlovskii I, Zaretckii M, Karlov D, Gabibov A, Stepanov A. Unraveling viral drug targets: a deep learning-based approach for the identification of potential binding sites. Brief Bioinform 2023; 25:bbad459. [PMID: 38113077 PMCID: PMC10783863 DOI: 10.1093/bib/bbad459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 11/10/2023] [Accepted: 11/22/2023] [Indexed: 12/21/2023] [Open Access]
Abstract
The coronavirus disease 2019 (COVID-19) pandemic has spurred a wide range of approaches to control and combat the disease. However, selecting an effective antiviral drug target remains a time-consuming challenge. Computational methods offer a promising solution by efficiently reducing the number of candidates. In this study, we propose a structure- and deep learning-based approach that identifies vulnerable regions in viral proteins corresponding to drug binding sites. Our approach takes into account the protein dynamics, accessibility and mutability of the binding site and the putative mechanism of action of the drug. We applied this technique to validate drug targeting toward severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike glycoprotein S. Our findings reveal a conformation- and oligomer-specific glycan-free binding site proximal to the receptor binding domain. This site comprises topologically important amino acid residues. Molecular dynamics simulations of Spike in complex with candidate drug molecules bound to the potential binding sites indicate an equilibrium shifted toward the inactive conformation compared with drug-free simulations. Small molecules targeting this binding site have the potential to prevent the closed-to-open conformational transition of Spike, thereby allosterically inhibiting its interaction with human angiotensin-converting enzyme 2 receptor. Using a pseudotyped virus-based assay with a SARS-CoV-2 neutralizing antibody, we identified a set of hit compounds that exhibited inhibition at micromolar concentrations.
Affiliation(s)
- Petr Popov
  - Tetra-d, Rheinweg 9, Schaffhausen, 8200, Switzerland
  - School of Science, Constructor University Bremen gGmbH, 28759, Bremen, Germany
- Roman Kalinin
  - M.M. Shemyakin and Yu.A. Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, 117997, Russia
- Pavel Buslaev
  - Nanoscience Center and Department of Chemistry, University of Jyväskylä, 40014, Jyväskylä, Finland
- Igor Kozlovskii
  - Tetra-d, Rheinweg 9, Schaffhausen, 8200, Switzerland
  - School of Science, Constructor University Bremen gGmbH, 28759, Bremen, Germany
- Mark Zaretckii
  - Tetra-d, Rheinweg 9, Schaffhausen, 8200, Switzerland
  - School of Science, Constructor University Bremen gGmbH, 28759, Bremen, Germany
- Dmitry Karlov
  - School of Pharmacy, Medical Biology Centre, Queen’s University Belfast, Street, Belfast, BT9 7BL Northern Ireland, UK
- Alexander Gabibov
  - M.M. Shemyakin and Yu.A. Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, 117997, Russia
- Alexey Stepanov
  - Department of Chemistry, The Scripps Research Institute, 10550 North Torrey Pines Road MB-10, La Jolla, 92037, CA, USA
10
Tan H, Wang Z, Hu G. GAABind: a geometry-aware attention-based network for accurate protein-ligand binding pose and binding affinity prediction. Brief Bioinform 2023; 25:bbad462. [PMID: 38102069 PMCID: PMC10724026 DOI: 10.1093/bib/bbad462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Revised: 11/19/2023] [Accepted: 11/22/2023] [Indexed: 12/17/2023] [Open Access]
Abstract
Protein-ligand interactions are increasingly profiled at high-throughput, playing a vital role in lead compound discovery and drug optimization. Accurate prediction of binding pose and binding affinity constitutes a pivotal challenge in advancing our computational understanding of protein-ligand interactions. However, inherent limitations still exist, including high computational cost for conformational search sampling in traditional molecular docking tools, and the unsatisfactory molecular representation learning and intermolecular interaction modeling in deep learning-based methods. Here we propose a geometry-aware attention-based deep learning model, GAABind, which effectively predicts the pocket-ligand binding pose and binding affinity within a multi-task learning framework. Specifically, GAABind comprehensively captures the geometric and topological properties of both binding pockets and ligands, and employs expressive molecular representation learning to model intramolecular interactions. Moreover, GAABind proficiently learns the intermolecular many-body interactions and simulates the dynamic conformational adaptations of the ligand during its interaction with the protein through meticulously designed networks. We trained GAABind on the PDBbindv2020 and evaluated it on the CASF2016 dataset; the results indicate that GAABind achieves state-of-the-art performance in binding pose prediction and shows comparable binding affinity prediction performance. Notably, GAABind achieves a success rate of 82.8% in binding pose prediction, and the Pearson correlation between predicted and experimental binding affinities reaches up to 0.803. Additionally, we assessed GAABind's performance on the severe acute respiratory syndrome coronavirus 2 main protease cross-docking dataset. In this evaluation, GAABind demonstrates a notable success rate of 76.5% in binding pose prediction and achieves the highest Pearson correlation coefficient in binding affinity prediction compared with all baseline methods.
Affiliation(s)
- Huishuang Tan
  - Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Zhixin Wang
  - Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute of Molecular Enzymology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215123, China
- Guang Hu
  - MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Key Laboratory of Pathogen Bioscience and Anti-infective Medicine, Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215123, China
  - Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Soochow University, Suzhou 215123, China
11
Di X, Rodriguez-Concepcion M. Exploring the Deoxy-D-xylulose-5-phosphate Synthase Gene Family in Tomato (Solanum lycopersicum). Plants (Basel) 2023; 12:3886. [PMID: 38005784 PMCID: PMC10675008 DOI: 10.3390/plants12223886] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/26/2023] [Revised: 11/10/2023] [Accepted: 11/15/2023] [Indexed: 11/26/2023]
Abstract
Isoprenoids are a wide family of metabolites including high-value chemicals, flavors, pigments, and drugs. Isoprenoids are particularly abundant and diverse in plants. The methyl-D-erythritol 4-phosphate (MEP) pathway produces the universal isoprenoid precursors isopentenyl diphosphate and dimethylallyl diphosphate in plant plastids for the downstream production of monoterpenes, diterpenes, and photosynthesis-related isoprenoids such as carotenoids, chlorophylls, tocopherols, phylloquinone, and plastoquinone. The enzyme deoxy-D-xylulose 5-phosphate synthase (DXS) is the first and main rate-determining enzyme of the MEP pathway. In tomato (Solanum lycopersicum), a plant with an active isoprenoid metabolism in several tissues, three genes encode DXS-like proteins (SlDXS1 to 3). Here, we show that the expression patterns of the three genes suggest distinct physiological roles without excluding that they might function together in some tissues. We also confirm that SlDXS1 and 2 are true DXS enzymes, whereas SlDXS3 lacks DXS activity. We further show that SlDXS1 and 2 co-localize in plastidial speckles and that they can be immunoprecipitated together, suggesting that they might form heterodimers in vivo in at least some tissues. These results provide novel insights for the biotechnological use of DXS isoforms in metabolic engineering strategies to up-regulate the MEP pathway flux.
Affiliation(s)
- Xueni Di
  - Institute for Plant Molecular and Cell Biology (IBMCP), CSIC-Universitat Politècnica de València, 46022 Valencia, Spain
- Manuel Rodriguez-Concepcion
  - Institute for Plant Molecular and Cell Biology (IBMCP), CSIC-Universitat Politècnica de València, 46022 Valencia, Spain
12
Canner SW, Shanker S, Gray JJ. Structure-based neural network protein-carbohydrate interaction predictions at the residue level. Front Bioinform 2023; 3:1186531. [PMID: 37409346 PMCID: PMC10318439 DOI: 10.3389/fbinf.2023.1186531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 05/31/2023] [Indexed: 07/07/2023] [Open Access]
Abstract
Carbohydrates dynamically and transiently interact with proteins for cell-cell recognition, cellular differentiation, immune response, and many other cellular processes. Despite the molecular importance of these interactions, there are currently few reliable computational tools to predict potential carbohydrate-binding sites on any given protein. Here, we present two deep learning (DL) models named CArbohydrate-Protein interaction Site IdentiFier (CAPSIF) that predicts non-covalent carbohydrate-binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). While both models outperform previous surrogate methods used for carbohydrate-binding site prediction, CAPSIF:V performs better than CAPSIF:G, achieving test Dice scores of 0.597 and 0.543 and test set Matthews correlation coefficients (MCCs) of 0.599 and 0.538, respectively. We further tested CAPSIF:V on AlphaFold2-predicted protein structures. CAPSIF:V performed equivalently on both experimentally determined structures and AlphaFold2-predicted structures. Finally, we demonstrate how CAPSIF models can be used in conjunction with local glycan-docking protocols, such as GlycanDock, to predict bound protein-carbohydrate structures.
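The Dice score and Matthews correlation coefficient (MCC) quoted in this abstract are both computed from a binary confusion matrix. The following is a minimal sketch of the two formulas applied to per-residue binding labels; the function names and the toy labels are illustrative assumptions, and the evaluation protocol actually used for CAPSIF is not described in the abstract.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for two equal-length sequences of 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t and p:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice_score(y_true, y_pred):
    """Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when both sets are empty."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def mcc(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels: 1 = residue contacts the carbohydrate, 0 = it does not.
true_sites = [1, 1, 0, 0, 1, 0, 0, 0]
pred_sites = [1, 0, 0, 0, 1, 1, 0, 0]
print(f"Dice = {dice_score(true_sites, pred_sites):.3f}")  # Dice = 0.667
print(f"MCC  = {mcc(true_sites, pred_sites):.3f}")         # MCC  = 0.467
```

Dice ignores true negatives, so it is not inflated by the many non-binding residues in a typical protein, whereas MCC uses all four cells of the confusion matrix; reporting both, as the abstract does, gives complementary views of the same predictions.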
Affiliation(s)
- Samuel W. Canner
  - Program in Molecular Biophysics, The Johns Hopkins University, Baltimore, MD, United States
- Sudhanshu Shanker
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, United States
- Jeffrey J. Gray
  - Program in Molecular Biophysics, The Johns Hopkins University, Baltimore, MD, United States
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, United States
13
Zhang J, Gao LX, Chen W, Zhong JJ, Qian C, Zhou WW. Rational Design of Daunorubicin C-14 Hydroxylase Based on the Understanding of Its Substrate-Binding Mechanism. Int J Mol Sci 2023; 24:8337. [PMID: 37176043 PMCID: PMC10179135 DOI: 10.3390/ijms24098337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 04/26/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023] [Open Access]
Abstract
Doxorubicin is one of the most widely used antitumor drugs and is currently produced via the chemical conversion method, which suffers from high production costs, complex product separation processes, and serious environmental pollution. Biocatalysis is considered a more efficient and environment-friendly method for drug production. The cytochrome daunorubicin C-14 hydroxylase (DoxA) is the essential enzyme catalyzing the conversion of daunorubicin to doxorubicin. Herein, the DoxA from Streptomyces peucetius subsp. caesius ATCC 27952 was expressed in Escherichia coli, and the rational design strategy was further applied to improve the enzyme activity. Eight amino acid residues were identified as the key sites via molecular docking. Using a constructed screening library, we obtained the mutant DoxA(P88Y) with a more rational protein conformation, and a 56% increase in bioconversion efficiency was achieved by the mutant compared to the wild-type DoxA. Molecular dynamics simulation was applied to understand the relationship between the enzyme's structural property and its substrate-binding efficiency. It was demonstrated that the mutant DoxA(P88Y) formed a new hydrophobic interaction with the substrate daunorubicin, which might have enhanced the binding stability and thus improved the catalytic activity. Our work lays a foundation for further exploration of DoxA and facilitates the industrial process of bio-production of doxorubicin.
Affiliation(s)
- Jing Zhang
- College of Biosystems Engineering and Food Science, Ningbo Research Institute, Zhejiang University, Hangzhou 310058, China
- School of Chemical and Biomolecular Engineering, The University of Sydney, Sydney, NSW 2006, Australia
- Ling-Xiao Gao
- College of Biosystems Engineering and Food Science, Ningbo Research Institute, Zhejiang University, Hangzhou 310058, China
- Wei Chen
- College of Biosystems Engineering and Food Science, Ningbo Research Institute, Zhejiang University, Hangzhou 310058, China
- Jian-Jiang Zhong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Chao Qian
- College of Chemical and Biological Engineering, Zhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, Zhejiang University, Hangzhou 310027, China
- Wen-Wen Zhou
- College of Biosystems Engineering and Food Science, Ningbo Research Institute, Zhejiang University, Hangzhou 310058, China
|
14
|
Grasso D, Galderisi S, Santucci A, Bernini A. Pharmacological Chaperones and Protein Conformational Diseases: Approaches of Computational Structural Biology. Int J Mol Sci 2023; 24:ijms24065819. [PMID: 36982893 PMCID: PMC10054308 DOI: 10.3390/ijms24065819] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/05/2023] [Revised: 03/09/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023] Open
Abstract
Whenever a protein fails to fold into its native structure, a profound detrimental effect is likely to occur, and a disease is often developed. Protein conformational disorders arise when proteins adopt abnormal conformations due to a pathological gene variant that turns into gain/loss of function or improper localization/degradation. Pharmacological chaperones are small molecules restoring the correct folding of a protein suitable for treating conformational diseases. Small molecules like these bind poorly folded proteins similarly to physiological chaperones, bridging non-covalent interactions (hydrogen bonds, electrostatic interactions, and van der Waals contacts) loosened or lost due to mutations. Pharmacological chaperone development involves, among other things, structural biology investigation of the target protein and its misfolding and refolding. Such research can take advantage of computational methods at many stages. Here, we present an up-to-date review of the computational structural biology tools and approaches regarding protein stability evaluation, binding pocket discovery and druggability, drug repurposing, and virtual ligand screening. The tools are presented as organized in an ideal workflow oriented at pharmacological chaperones' rational design, also with the treatment of rare diseases in mind.
Affiliation(s)
- Daniela Grasso
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
- Silvia Galderisi
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
- Annalisa Santucci
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
- Andrea Bernini
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
|
15
|
Canner SW, Shanker S, Gray JJ. Structure-Based Neural Network Protein-Carbohydrate Interaction Predictions at the Residue Level. bioRxiv 2023:2023.03.14.531382. [PMID: 36993750 PMCID: PMC10054975 DOI: 10.1101/2023.03.14.531382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2023]
Abstract
Carbohydrates dynamically and transiently interact with proteins for cell-cell recognition, cellular differentiation, immune response, and many other cellular processes. Despite the molecular importance of these interactions, there are currently few reliable computational tools to predict potential carbohydrate binding sites on any given protein. Here, we present two deep learning models named CArbohydrate-Protein interaction Site IdentiFier (CAPSIF) that predict carbohydrate binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). While both models outperform previous surrogate methods used for carbohydrate binding site prediction, CAPSIF:V performs better than CAPSIF:G, achieving test Dice scores of 0.597 and 0.543 and test set Matthews correlation coefficients (MCCs) of 0.599 and 0.538, respectively. We further tested CAPSIF:V on AlphaFold2-predicted protein structures. CAPSIF:V performed equivalently on both experimentally determined structures and AlphaFold2-predicted structures. Finally, we demonstrate how CAPSIF models can be used in conjunction with local glycan-docking protocols, such as GlycanDock, to predict bound protein-carbohydrate structures.
Affiliation(s)
- Samuel W. Canner
- Program in Molecular Biophysics, The Johns Hopkins University, Baltimore, MD, United States of America
- Sudhanshu Shanker
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, United States of America
- Jeffrey J. Gray
- Program in Molecular Biophysics, The Johns Hopkins University, Baltimore, MD, United States of America
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, United States of America
- Correspondence: Jeffrey J. Gray
|
16
|
Konc J, Janežič D. ProBiS-Fold Approach for Annotation of Human Structures from the AlphaFold Database with No Corresponding Structure in the PDB to Discover New Druggable Binding Sites. J Chem Inf Model 2022; 62:5821-5829. [PMID: 36269348 DOI: 10.1021/acs.jcim.2c00947] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/13/2022]
Abstract
ProBiS (Protein Binding Sites), a local structure-based comparison algorithm, is used in the new ProBiS-Fold web server to annotate human structures from the AlphaFold database without a corresponding structure in the Protein Data Bank (PDB) to discover new druggable binding sites. The ProBiS algorithm is used to compare each query protein structure predicted by the AlphaFold approach with the protein structures from the PDB to identify similarities between known binding sites found in the PDB and yet unknown binding sites in the AlphaFold database. Ligands bound in these identified similar PDB sites are then transferred to each query protein from the AlphaFold database, and binding sites are identified as ligand clusters on an AlphaFold protein. Small molecule binding sites are assigned druggability scores based on the similarity of their ligands to known drugs, allowing them to be ranked according to their perceived and actual potential for drug development. ProBiS-Fold provides interactive and downloadable binding sites for the entire human structural proteome, including more than 3000 new druggable binding sites that have no corresponding structure in the PDB, taking into account AlphaFold's model quality, to enable protein structure-function relationship studies and pharmaceutical drug discovery research. The web server is freely accessible to academic users at http://probis-fold.insilab.org.
Affiliation(s)
- Janez Konc
- Theory Department, National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia
- Dušanka Janežič
- Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Glagoljaška 8, SI-6000 Koper, Slovenia
|
17
|
Data-driven analysis and druggability assessment methods to accelerate the identification of novel cancer targets. Comput Struct Biotechnol J 2022; 21:46-57. [PMID: 36514341 PMCID: PMC9732000 DOI: 10.1016/j.csbj.2022.11.042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 11/21/2022] [Accepted: 11/21/2022] [Indexed: 11/27/2022] Open
Abstract
Over the past few decades, drug discovery has greatly improved the outcomes for patients, but several challenges continue to hinder the rapid development of novel drugs. Addressing unmet clinical needs requires the pursuit of drug targets that have a higher likelihood to lead to the development of successful drugs. Here we describe a bioinformatic approach for identifying novel cancer drug targets by performing statistical analysis to ascertain quantitative changes in expression levels between protein-coding genes, as well as co-expression networks to classify these genes into groups. Subsequently, we provide an overview of druggability assessment methodologies to prioritize and select the best targets to pursue.
|
18
|
Eguida M, Rognan D. Estimating the Similarity between Protein Pockets. Int J Mol Sci 2022; 23:12462. [PMID: 36293316 PMCID: PMC9604425 DOI: 10.3390/ijms232012462] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/06/2022] [Revised: 10/15/2022] [Accepted: 10/16/2022] [Indexed: 10/28/2023] Open
Abstract
With the exponential increase in publicly available protein structures, the comparison of protein binding sites naturally emerged as a scientific topic to explain observations or generate hypotheses for ligand design, notably to predict ligand selectivity for on- and off-targets, explain polypharmacology, and design target-focused libraries. The current review summarizes the state-of-the-art computational methods applied to pocket detection and comparison as well as structural druggability estimates. The major strengths and weaknesses of current pocket descriptors, alignment methods, and similarity search algorithms are presented. Lastly, an exhaustive survey of both retrospective and prospective applications in diverse medicinal chemistry scenarios illustrates the capability of the existing methods and the hurdle that still needs to be overcome for more accurate predictions.
Affiliation(s)
- Didier Rognan
- Laboratoire d’Innovation Thérapeutique, UMR7200 CNRS-Université de Strasbourg, 67400 Illkirch, France
|
19
|
Abstract
AlphaFold has burst into our lives. A powerful algorithm that underscores the strength of biological sequence data and artificial intelligence (AI). AlphaFold has appended projects and research directions. The database it has been creating promises an untold number of applications with vast potential impacts that are still difficult to surmise. AI approaches can revolutionize personalized treatments and usher in better-informed clinical trials. They promise to make giant leaps toward reshaping and revamping drug discovery strategies, selecting and prioritizing combinations of drug targets. Here, we briefly overview AI in structural biology, including in molecular dynamics simulations and prediction of microbiota–human protein–protein interactions. We highlight the advancements accomplished by the deep-learning-powered AlphaFold in protein structure prediction and their powerful impact on the life sciences. At the same time, AlphaFold does not resolve the decades-long protein folding challenge, nor does it identify the folding pathways. The models that AlphaFold provides do not capture conformational mechanisms like frustration and allostery, which are rooted in ensembles, and controlled by their dynamic distributions. Allostery and signaling are properties of populations. AlphaFold also does not generate ensembles of intrinsically disordered proteins and regions, instead describing them by their low structural probabilities. Since AlphaFold generates single ranked structures, rather than conformational ensembles, it cannot elucidate the mechanisms of allosteric activating driver hotspot mutations nor of allosteric drug resistance. However, by capturing key features, deep learning techniques can use the single predicted conformation as the basis for generating a diverse ensemble.
Affiliation(s)
- Ruth Nussinov
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Mingzhen Zhang
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
- Yonglan Liu
- Cancer Innovation Laboratory, National Cancer Institute, Frederick, Maryland 21702, United States
- Hyunbum Jang
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
|