1
|
Aldeghi M, Heifetz A, Bodkin MJ, Knapp S, Biggin PC. Accurate calculation of the absolute free energy of binding for drug molecules. Chem Sci 2016; 7:207-218. [PMID: 26798447 PMCID: PMC4700411 DOI: 10.1039/c5sc02678d] [Citation(s) in RCA: 242] [Impact Index Per Article: 26.9] [Received: 07/23/2015] [Accepted: 09/24/2015] [Indexed: 12/13/2022] Open
Abstract
Accurate prediction of binding affinities has been a central goal of computational chemistry for decades, yet remains elusive. Despite good progress, the required accuracy for use in a drug-discovery context has not been consistently achieved for drug-like molecules. Here, we perform absolute free energy calculations based on a thermodynamic cycle for a set of diverse inhibitors binding to bromodomain-containing protein 4 (BRD4) and demonstrate that a mean absolute error of 0.6 kcal mol⁻¹ can be achieved. We also show that a similar level of accuracy (1.0 kcal mol⁻¹) can be achieved in a pseudo-prospective approach. Bromodomains are epigenetic mark readers that recognize acetylation motifs and regulate gene transcription, and are currently being investigated as therapeutic targets for cancer and inflammation. The unprecedented accuracy offers the exciting prospect that the binding free energy of drug-like compounds can be predicted for pharmacologically relevant targets.
|
research-article |
9 |
242 |
2
|
Aldeghi M, Malhotra S, Selwood DL, Chan AWE. Two- and three-dimensional rings in drugs. Chem Biol Drug Des 2015; 83:450-61. [PMID: 24472495 PMCID: PMC4233953 DOI: 10.1111/cbdd.12260] [Citation(s) in RCA: 199] [Impact Index Per Article: 19.9] [Received: 05/31/2013] [Revised: 09/24/2013] [Accepted: 11/04/2013] [Indexed: 01/09/2023]
Abstract
Using small, flat aromatic rings as components of fragments or molecules is a common practice in fragment-based drug discovery and lead optimization. With an increasing focus on the exploration of novel biological and chemical space, and their improved synthetic accessibility, 3D fragments are attracting increasing interest. This study presents a detailed analysis of 3D and 2D ring fragments in marketed drugs. Several measures of properties were used, such as the type of ring assemblies and molecular shapes. The study also took into account the relationship between protein classes targeted by each ring fragment, providing target-specific information. The analysis shows the high structural and shape diversity of 3D ring systems and their importance in bioactive compounds. Major differences in 2D and 3D fragments are apparent in ligands that bind to the major drug targets such as GPCRs, ion channels, and enzymes.
|
Journal Article |
10 |
199 |
3
|
Gapsys V, Pérez-Benito L, Aldeghi M, Seeliger D, van Vlijmen H, Tresadern G, de Groot BL. Large scale relative protein ligand binding affinities using non-equilibrium alchemy. Chem Sci 2019; 11:1140-1152. [PMID: 34084371 PMCID: PMC8145179 DOI: 10.1039/c9sc03754c] [Citation(s) in RCA: 154] [Impact Index Per Article: 25.7] [Received: 07/25/2019] [Accepted: 12/01/2019] [Indexed: 12/14/2022] Open
Abstract
Ligand binding affinity calculations based on molecular dynamics (MD) simulations and non-physical (alchemical) thermodynamic cycles have shown great promise for structure-based drug design. However, their broad uptake and impact are held back by the notoriously complex setup of the calculations. Only a few tools other than the free energy perturbation approach by Schrödinger Inc. (referred to as FEP+) currently enable end-to-end application. Here, we present for the first time an approach based on the open-source software pmx that makes it easy to set up and run alchemical calculations for diverse sets of small molecules using the GROMACS MD engine. The method relies on theoretically rigorous non-equilibrium thermodynamic integration (TI) foundations, and its flexibility allows calculations with multiple force fields. In this study, results from the Amber and Charmm force fields were combined to yield a consensus outcome performing on par with the commercial FEP+ approach. A large dataset of 482 perturbations from 13 different protein-ligand datasets led to an average unsigned error (AUE) of 3.64 ± 0.14 kJ mol⁻¹, equivalent to Schrödinger's FEP+ AUE of 3.66 ± 0.14 kJ mol⁻¹. For the first time, a setup is presented for overall high precision and high accuracy relative protein-ligand alchemical free energy calculations based on open-source software.
|
research-article |
6 |
154 |
4
|
Aldeghi M, Heifetz A, Bodkin MJ, Knapp S, Biggin PC. Predictions of Ligand Selectivity from Absolute Binding Free Energy Calculations. J Am Chem Soc 2017; 139:946-957. [PMID: 28009512 PMCID: PMC5253712 DOI: 10.1021/jacs.6b11467] [Citation(s) in RCA: 132] [Impact Index Per Article: 16.5] [Indexed: 12/22/2022]
Abstract
Binding selectivity is a requirement for the development of a safe drug, and it is a critical property for chemical probes used in preclinical target validation. Engineering selectivity adds considerable complexity to the rational design of new drugs, as it involves the optimization of multiple binding affinities. Computationally, the prediction of binding selectivity is a challenge, and generally applicable methodologies are still not available to the computational and medicinal chemistry communities. Absolute binding free energy calculations based on alchemical pathways provide a rigorous framework for affinity predictions and could thus offer a general approach to the problem. We evaluated the performance of free energy calculations based on molecular dynamics for the prediction of selectivity by estimating the affinity profile of three bromodomain inhibitors across multiple bromodomain families, and by comparing the results to isothermal titration calorimetry data. Two case studies were considered. In the first one, the affinities of two similar ligands for seven bromodomains were calculated and returned excellent agreement with experiment (mean unsigned error of 0.81 kcal/mol and Pearson correlation of 0.75). In this test case, we also show how the preferred binding orientation of a ligand for different proteins can be estimated via free energy calculations. In the second case, the affinities of a broad-spectrum inhibitor for 22 bromodomains were calculated and returned a more modest accuracy (mean unsigned error of 1.76 kcal/mol and Pearson correlation of 0.48); however, the reparametrization of a sulfonamide moiety improved the agreement with experiment.
|
Research Support, Non-U.S. Gov't |
8 |
132 |
5
|
Fedorov O, Castex J, Tallant C, Owen DR, Martin S, Aldeghi M, Monteiro O, Filippakopoulos P, Picaud S, Trzupek JD, Gerstenberger BS, Bountra C, Willmann D, Wells C, Philpott M, Rogers C, Biggin PC, Brennan PE, Bunnage ME, Schüle R, Günther T, Knapp S, Müller S. Selective targeting of the BRG/PB1 bromodomains impairs embryonic and trophoblast stem cell maintenance. SCIENCE ADVANCES 2015; 1:e1500723. [PMID: 26702435 PMCID: PMC4681344 DOI: 10.1126/sciadv.1500723] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Received: 06/08/2015] [Accepted: 08/31/2015] [Indexed: 05/13/2023]
Abstract
Mammalian SWI/SNF [also called Brg/Brahma-associated factors (BAFs)] are evolutionarily conserved chromatin-remodeling complexes regulating gene transcription programs during development and stem cell differentiation. BAF complexes contain an ATP (adenosine 5'-triphosphate)-driven remodeling enzyme (either BRG1 or BRM) and multiple protein interaction domains including bromodomains, an evolutionarily conserved acetyl lysine-dependent protein interaction motif that recruits transcriptional regulators to acetylated chromatin. We report a potent and cell active protein interaction inhibitor, PFI-3, that selectively binds to essential BAF bromodomains. The high specificity of PFI-3 was achieved on the basis of a novel binding mode of a salicylic acid head group that led to the replacement of water molecules typically maintained in other bromodomain inhibitor complexes. We show that exposure of embryonic stem cells to PFI-3 led to deprivation of stemness and deregulated lineage specification. Furthermore, differentiation of trophoblast stem cells in the presence of PFI-3 was markedly enhanced. The data present a key function of BAF bromodomains in stem cell maintenance and differentiation, introducing a novel versatile chemical probe for studies on acetylation-dependent cellular processes controlled by BAF remodeling complexes.
|
research-article |
10 |
109 |
6
|
Bannigan P, Aldeghi M, Bao Z, Häse F, Aspuru-Guzik A, Allen C. Machine learning directed drug formulation development. Adv Drug Deliv Rev 2021; 175:113806. [PMID: 34019959 DOI: 10.1016/j.addr.2021.05.016] [Citation(s) in RCA: 100] [Impact Index Per Article: 25.0] [Received: 01/13/2021] [Revised: 03/31/2021] [Accepted: 05/14/2021] [Indexed: 12/12/2022]
Abstract
Machine learning (ML) has enabled ground-breaking advances in the healthcare and pharmaceutical sectors, from improvements in cancer diagnosis, to the identification of novel drugs and drug targets as well as protein structure prediction. Drug formulation is an essential stage in the discovery and development of new medicines. Through the design of drug formulations, pharmaceutical scientists can engineer important properties of new medicines, such as improved bioavailability and targeted delivery. The traditional approach to drug formulation development relies on iterative trial-and-error, requiring a large number of resource-intensive and time-consuming in vitro and in vivo experiments. This review introduces the basic concepts of ML-directed workflows and discusses how these tools can be used to aid in the development of various types of drug formulations. ML-directed drug formulation development offers unparalleled opportunities to fast-track development efforts, uncover new materials, innovative formulations, and generate new knowledge in drug formulation science. The review also highlights the latest artificial intelligence (AI) technologies, such as generative models, Bayesian deep learning, reinforcement learning, and self-driving laboratories, which have been gaining momentum in drug discovery and chemistry and have potential in drug formulation development.
|
Review |
4 |
100 |
7
|
Pollice R, dos Passos Gomes G, Aldeghi M, Hickman RJ, Krenn M, Lavigne C, Lindner-D’Addario M, Nigam A, Ser CT, Yao Z, Aspuru-Guzik A. Data-Driven Strategies for Accelerated Materials Design. Acc Chem Res 2021; 54:849-860. [PMID: 33528245 PMCID: PMC7893702 DOI: 10.1021/acs.accounts.0c00785] [Citation(s) in RCA: 100] [Impact Index Per Article: 25.0] [Received: 12/07/2020] [Indexed: 01/06/2023]
Abstract
The ongoing revolution of the natural sciences by the advent of machine learning and artificial intelligence sparked significant interest in the material science community in recent years. The intrinsically high dimensionality of the space of realizable materials makes traditional approaches ineffective for large-scale explorations. Modern data science and machine learning tools developed for increasingly complicated problems are an attractive alternative. An imminent climate catastrophe calls for a clean energy transformation by overhauling current technologies within only several years of possible action available. Tackling this crisis requires the development of new materials at an unprecedented pace and scale. For example, organic photovoltaics have the potential to replace existing silicon-based materials to a large extent and open up new fields of application. In recent years, organic light-emitting diodes have emerged as state-of-the-art technology for digital screens and portable devices and are enabling new applications with flexible displays. Reticular frameworks allow the atom-precise synthesis of nanomaterials and promise to revolutionize the field by the potential to realize multifunctional nanoparticles with applications from gas storage, gas separation, and electrochemical energy storage to nanomedicine. In the recent decade, significant advances in all these fields have been facilitated by the comprehensive application of simulation and machine learning for property prediction, property optimization, and chemical space exploration enabled by considerable advances in computing power and algorithmic efficiency. In this Account, we review the most recent contributions of our group in this thriving field of machine learning for material science. We start with a summary of the most important material classes our group has been involved in, focusing on small molecules as organic electronic materials and crystalline materials.
Specifically, we highlight the data-driven approaches we employed to speed up discovery and derive material design strategies. Subsequently, our focus lies on the data-driven methodologies our group has developed and employed, elaborating on high-throughput virtual screening, inverse molecular design, Bayesian optimization, and supervised learning. We discuss the general ideas, their working principles, and their use cases with examples of successful implementations in data-driven material discovery and design efforts. Furthermore, we elaborate on potential pitfalls and remaining challenges of these methods. Finally, we provide a brief outlook for the field as we foresee increasing adaptation and implementation of large scale data-driven approaches in material discovery and design campaigns.
|
research-article |
4 |
100 |
8
|
Aldeghi M, Bodkin MJ, Knapp S, Biggin PC. Statistical Analysis on the Performance of Molecular Mechanics Poisson-Boltzmann Surface Area versus Absolute Binding Free Energy Calculations: Bromodomains as a Case Study. J Chem Inf Model 2017; 57:2203-2221. [PMID: 28786670 PMCID: PMC5615372 DOI: 10.1021/acs.jcim.7b00347] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Indexed: 12/19/2022]
Abstract
Binding free energy calculations that make use of alchemical pathways are becoming increasingly feasible thanks to advances in hardware and algorithms. Although relative binding free energy (RBFE) calculations are starting to find widespread use, absolute binding free energy (ABFE) calculations are still being explored mainly in academic settings due to the high computational requirements and still uncertain predictive value. However, in some drug design scenarios, RBFE calculations are not applicable and ABFE calculations could provide an alternative. Computationally cheaper end-point calculations in implicit solvent, such as molecular mechanics Poisson-Boltzmann surface area (MMPBSA) calculations, could too be used if one is primarily interested in a relative ranking of affinities. Here, we compare MMPBSA calculations to previously performed absolute alchemical free energy calculations in their ability to correlate with experimental binding free energies for three sets of bromodomain-inhibitor pairs. Different MMPBSA approaches have been considered, including a standard single-trajectory protocol, a protocol that includes a binding entropy estimate, and protocols that take into account the ligand hydration shell. Despite the improvements observed with the latter two MMPBSA approaches, ABFE calculations were found to be overall superior in obtaining correlation with experimental affinities for the test cases considered. A difference in weighted average Pearson (r_p) and Spearman (r_s) correlations of 0.25 and 0.31 was observed when using a standard single-trajectory MMPBSA setup (r_p = 0.64 and r_s = 0.66 for ABFE; r_p = 0.39 and r_s = 0.35 for MMPBSA).
The best performing MMPBSA protocols returned weighted average Pearson and Spearman correlations that were about 0.1 inferior to ABFE calculations: r_p = 0.55 and r_s = 0.56 when including an entropy estimate, and r_p = 0.53 and r_s = 0.55 when including explicit water molecules. Overall, the study suggests that ABFE calculations are indeed the more accurate approach, yet there is also value in MMPBSA calculations considering the lower compute requirements, and if agreement to experimental affinities in absolute terms is not of interest. Moreover, for the specific protein-ligand systems considered in this study, we find that including an explicit ligand hydration shell or a binding entropy estimate in the MMPBSA calculations resulted in significant performance improvements at a negligible computational cost.
|
Research Support, Non-U.S. Gov't |
8 |
98 |
9
|
Yee AW, Aldeghi M, Blakeley MP, Ostermann A, Mas PJ, Moulin M, de Sanctis D, Bowler MW, Mueller-Dieckmann C, Mitchell EP, Haertlein M, de Groot BL, Boeri Erba E, Forsyth VT. A molecular mechanism for transthyretin amyloidogenesis. Nat Commun 2019; 10:925. [PMID: 30804345 PMCID: PMC6390107 DOI: 10.1038/s41467-019-08609-z] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Received: 08/03/2018] [Accepted: 01/14/2019] [Indexed: 01/12/2023] Open
Abstract
Human transthyretin (TTR) is implicated in several fatal forms of amyloidosis. Many mutations of TTR have been identified; most of these are pathogenic, but some offer protective effects. The molecular basis underlying the vastly different fibrillation behaviours of these TTR mutants is poorly understood. Here, on the basis of neutron crystallography, native mass spectrometry and modelling studies, we propose a mechanism whereby TTR can form amyloid fibrils via a parallel equilibrium of partially unfolded species that proceeds in favour of the amyloidogenic forms of TTR. It is suggested that unfolding events within the TTR monomer originate at the C-D loop of the protein, and that destabilising mutations in this region enhance the rate of TTR fibrillation. Furthermore, it is proposed that the binding of small molecule drugs to TTR stabilises non-amyloidogenic states of TTR in a manner similar to that occurring for the protective mutants of the protein.
|
research-article |
6 |
96 |
10
|
Heifetz A, Chudyk EI, Gleave L, Aldeghi M, Cherezov V, Fedorov DG, Biggin PC, Bodkin MJ. The Fragment Molecular Orbital Method Reveals New Insight into the Chemical Nature of GPCR–Ligand Interactions. J Chem Inf Model 2015; 56:159-72. [PMID: 26642258 DOI: 10.1021/acs.jcim.5b00644] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Indexed: 11/30/2022]
Abstract
Our interpretation of ligand-protein interactions is often informed by high-resolution structures, which represent the cornerstone of structure-based drug design. However, visual inspection and molecular mechanics approaches cannot explain the full complexity of molecular interactions. Quantum Mechanics approaches are often too computationally expensive, but one method, Fragment Molecular Orbital (FMO), offers an excellent compromise and has the potential to reveal key interactions that would otherwise be hard to detect. To illustrate this, we have applied the FMO method to 18 Class A GPCR-ligand crystal structures, representing different branches of the GPCR genome. Our work reveals key interactions that are often omitted from structure-based descriptions, including hydrophobic interactions, nonclassical hydrogen bonds, and the involvement of backbone atoms. This approach provides a more comprehensive picture of receptor-ligand interactions than is currently used and should prove useful for evaluation of the chemical nature of ligand binding and to support structure-based drug design.
|
|
10 |
79 |
11
|
Rizzi A, Jensen T, Slochower DR, Aldeghi M, Gapsys V, Ntekoumes D, Bosisio S, Papadourakis M, Henriksen NM, de Groot BL, Cournia Z, Dickson A, Michel J, Gilson MK, Shirts MR, Mobley DL, Chodera JD. The SAMPL6 SAMPLing challenge: assessing the reliability and efficiency of binding free energy calculations. J Comput Aided Mol Des 2020; 34:601-633. [PMID: 31984465 DOI: 10.1007/s10822-020-00290-5] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Received: 10/11/2019] [Accepted: 01/13/2020] [Indexed: 12/22/2022]
Abstract
Approaches for computing small molecule binding free energies based on molecular simulations are now regularly being employed by academic and industry practitioners to study receptor-ligand systems and prioritize the synthesis of small molecules for ligand design. Given the variety of methods and implementations available, it is natural to ask how the convergence rates and final predictions of these methods compare. In this study, we describe the concept and results for the SAMPL6 SAMPLing challenge, the first challenge from the SAMPL series focusing on the assessment of convergence properties and reproducibility of binding free energy methodologies. We provided parameter files, partial charges, and multiple initial geometries for two octa-acid (OA) and one cucurbit[8]uril (CB8) host-guest systems. Participants submitted binding free energy predictions as a function of the number of force and energy evaluations for seven different alchemical and physical-pathway (i.e., potential of mean force and weighted ensemble of trajectories) methodologies implemented with the GROMACS, AMBER, NAMD, or OpenMM simulation engines. To rank the methods, we developed an efficiency statistic based on bias and variance of the free energy estimates. For the two small OA binders, the free energy estimates computed with alchemical and potential of mean force approaches show relatively similar variance and bias as a function of the number of energy/force evaluations, with the attach-pull-release (APR), GROMACS expanded ensemble, and NAMD double decoupling submissions obtaining the greatest efficiency. The differences between the methods increase when analyzing the CB8-quinine system, where both the guest size and correlation times for system dynamics are greater. For this system, nonequilibrium switching (GROMACS/NS-DS/SB) obtained the overall highest efficiency. 
Surprisingly, the results suggest that specifying force field parameters and partial charges is insufficient to generally ensure reproducibility, and we observe differences between seemingly converged predictions ranging approximately from 0.3 to 1.0 kcal/mol, even with almost identical simulation parameters and system setup (e.g., Lennard-Jones cutoff, ionic composition). Further work will be required to completely identify the exact source of these discrepancies. Among the conclusions emerging from the data, we found that Hamiltonian replica exchange, while displaying very small variance, can be affected by a slowly-decaying bias that depends on the initial population of the replicas, that bidirectional estimators are significantly more efficient than unidirectional estimators for nonequilibrium free energy calculations for the systems considered, and that the Berendsen barostat introduces non-negligible artifacts in expanded ensemble simulations.
|
Research Support, U.S. Gov't, Non-P.H.S. |
5 |
73 |
12
|
Aldeghi M, Gapsys V, de Groot BL. Accurate Estimation of Ligand Binding Affinity Changes upon Protein Mutation. ACS CENTRAL SCIENCE 2018; 4:1708-1718. [PMID: 30648154 PMCID: PMC6311686 DOI: 10.1021/acscentsci.8b00717] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Received: 10/05/2018] [Indexed: 05/19/2023]
Abstract
The design of proteins with novel ligand-binding functions holds great potential for application in biomedicine and biotechnology. However, our ability to engineer ligand-binding proteins is still limited, and current approaches rely primarily on experimentation. Computation could reduce the cost of the development process and would allow rigorous testing of our understanding of the principles governing molecular recognition. While computational methods have proven successful in the early stages of the discovery process, optimization approaches that can quantitatively predict ligand affinity changes upon protein mutation are still lacking. Here, we assess the ability of free energy calculations based on first-principles statistical mechanics, as well as the latest Rosetta protocols, to quantitatively predict such affinity changes on a challenging set of 134 mutations. After evaluating different protocols with computational efficiency in mind, we investigate the performance of different force fields. We show that both the free energy calculations and Rosetta are able to quantitatively predict changes in ligand binding affinity upon protein mutations, yet the best predictions are the result of combining the estimates of both methods. These closely match the experimentally determined ΔΔG values, with a root-mean-square error of 1.2 kcal/mol for the full benchmark set and of 0.8 kcal/mol for a subset of protein systems providing the most reproducible results. The currently achievable accuracy offers the prospect of being able to employ computation for the optimization of ligand-binding proteins as well as the prediction of drug resistance.
|
research-article |
7 |
70 |
13
|
Heifetz A, Trani G, Aldeghi M, MacKinnon CH, McEwan PA, Brookfield FA, Chudyk EI, Bodkin M, Pei Z, Burch JD, Ortwine DF. Fragment Molecular Orbital Method Applied to Lead Optimization of Novel Interleukin-2 Inducible T-Cell Kinase (ITK) Inhibitors. J Med Chem 2016; 59:4352-63. [PMID: 26950250 DOI: 10.1021/acs.jmedchem.6b00045] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Indexed: 01/29/2023]
Abstract
Inhibition of inducible T-cell kinase (ITK), a nonreceptor tyrosine kinase, may represent a novel treatment for allergic asthma. In our previous reports, we described the discovery of sulfonylpyridine (SAP), benzothiazole (BZT), indazole (IND), and tetrahydroindazole (THI) series as novel ITK inhibitors and how computational tools such as dihedral scans and docking were used to support this process. X-ray crystallography and modeling were applied to provide essential insight into ITK-ligand interactions. However, "visual inspection" traditionally used for the rationalization of protein-ligand affinity cannot always explain the full complexity of the molecular interactions. The fragment molecular orbital (FMO) quantum-mechanical (QM) method provides a complete list of the interactions formed between the ligand and protein that are often omitted from traditional structure-based descriptions. FMO methodology was successfully used as part of a rational structure-based drug design effort to improve the ITK potency of high-throughput screening hits, ultimately delivering ligands with potency in the subnanomolar range.
|
Journal Article |
9 |
61 |
14
|
Bannigan P, Bao Z, Hickman RJ, Aldeghi M, Häse F, Aspuru-Guzik A, Allen C. Machine learning models to accelerate the design of polymeric long-acting injectables. Nat Commun 2023; 14:35. [PMID: 36627280 PMCID: PMC9832011 DOI: 10.1038/s41467-022-35343-w] [Citation(s) in RCA: 49] [Impact Index Per Article: 24.5] [Received: 05/04/2022] [Accepted: 11/28/2022] [Indexed: 01/11/2023] Open
Abstract
Long-acting injectables are considered one of the most promising therapeutic strategies for the treatment of chronic diseases as they can afford improved therapeutic efficacy, safety, and patient compliance. The use of polymer materials in such a drug formulation strategy can offer unparalleled diversity owing to the ability to synthesize materials with a wide range of properties. However, the interplay between multiple parameters, including the physicochemical properties of the drug and polymer, makes it very difficult to intuitively predict the performance of these systems. This necessitates the development and characterization of a wide array of formulation candidates through extensive and time-consuming in vitro experimentation. Machine learning is enabling leap-step advances in a number of fields including drug discovery and materials science. The current study takes a critical step towards data-driven drug formulation development with an emphasis on long-acting injectables. Here we show that machine learning algorithms can be used to predict experimental drug release from these advanced drug delivery systems. We also demonstrate that these trained models can be used to guide the design of new long-acting injectables. The implementation of the described data-driven approach has the potential to reduce the time and cost associated with drug formulation development.
|
research-article |
2 |
49 |
15
|
Khalak Y, Tresadern G, Aldeghi M, Baumann HM, Mobley DL, de Groot BL, Gapsys V. Alchemical absolute protein-ligand binding free energies for drug design. Chem Sci 2021; 12:13958-13971. [PMID: 34760182 PMCID: PMC8549785 DOI: 10.1039/d1sc03472c] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 06/25/2021] [Accepted: 09/23/2021] [Indexed: 12/13/2022] Open
Abstract
The recent advances in relative protein-ligand binding free energy calculations have shown the value of alchemical methods in drug discovery. Accurately assessing absolute binding free energies, although highly desired, remains a challenging endeavour, mostly limited to small model cases. Here, we demonstrate accurate first principles based absolute binding free energy estimates for 128 pharmaceutically relevant targets. We use a novel rigorous method to generate protein-ligand ensembles for the ligand in its decoupled state. Not only do the calculations deliver accurate protein-ligand binding affinity estimates, but they also provide detailed physical insight into the structural determinants of binding. We identify subtle rotamer rearrangements between apo and holo states of a protein that are crucial for binding. When compared to relative binding free energy calculations, obtaining absolute binding free energies is considerably more challenging in large part due to the need to explicitly account for the protein in its apo state. In this work we present several approaches to obtain apo state ensembles for accurate absolute ΔG calculations, thus outlining protocols for prospective application of the methods for drug discovery.
|
research-article |
4 |
49 |
16
|
Aldeghi M, Ross GA, Bodkin MJ, Essex JW, Knapp S, Biggin PC. Large-scale analysis of water stability in bromodomain binding pockets with grand canonical Monte Carlo. Commun Chem 2018; 1:19. [PMID: 29863194 PMCID: PMC5978690 DOI: 10.1038/s42004-018-0019-x] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 12/08/2017] [Accepted: 03/01/2018] [Indexed: 12/17/2022] Open
Abstract
Conserved water molecules are of interest in drug design, as displacement of such waters can lead to higher-affinity ligands and, in some cases, contribute towards selectivity. Bromodomains, small protein domains involved in the epigenetic regulation of gene transcription, display a network of four conserved water molecules in their binding pockets and have recently been the focus of intense medicinal chemistry efforts. Understanding why certain bromodomains have displaceable water molecules and others do not is extremely challenging, and it remains unclear which water molecules in a given bromodomain can be targeted for displacement. Here we estimate the stability of the conserved water molecules in 35 bromodomains via binding free energy calculations using all-atom grand canonical Monte Carlo simulations. Encouraging quantitative agreement with the available experimental evidence is found. We thus discuss the expected ease of water displacement in different bromodomains and the implications for ligand selectivity.
|
research-article |
7 |
48 |
17
|
Krenn M, Pollice R, Guo SY, Aldeghi M, Cervera-Lierta A, Friederich P, dos Passos Gomes G, Häse F, Jinich A, Nigam A, Yao Z, Aspuru-Guzik A. On scientific understanding with artificial intelligence. Nat Rev Phys 2022; 4:761-769. [PMID: 36247217 PMCID: PMC9552145 DOI: 10.1038/s42254-022-00518-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Accepted: 08/30/2022] [Indexed: 05/27/2023]
Abstract
An oracle that correctly predicts the outcome of every particle physics experiment, the products of every possible chemical reaction or the function of every protein would revolutionize science and technology. However, scientists would not be entirely satisfied because they would want to comprehend how the oracle made these predictions. This is scientific understanding, one of the main aims of science. With the increase in the available computational power and advances in artificial intelligence, a natural question arises: how can advanced computational systems, and specifically artificial intelligence, contribute to new scientific understanding or gain it autonomously? Trying to answer this question, we adopted a definition of 'scientific understanding' from the philosophy of science that enabled us to overview the scattered literature on the topic and, combined with dozens of anecdotes from scientists, map out three dimensions of computer-assisted scientific understanding. For each dimension, we review the existing state of the art and discuss future developments. We hope that this Perspective will inspire and focus research directions in this multidisciplinary emerging field.
|
Review |
3 |
38 |
18
|
Aldeghi M, Coley CW. A graph representation of molecular ensembles for polymer property prediction. Chem Sci 2022; 13:10486-10498. [PMID: 36277616 PMCID: PMC9473492 DOI: 10.1039/d2sc02839e] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 05/20/2022] [Accepted: 08/15/2022] [Indexed: 12/02/2022] Open
Abstract
Synthetic polymers are versatile and widely used materials. Similar to small organic molecules, a large chemical space of such materials is hypothetically accessible. Computational property prediction and virtual screening can accelerate polymer design by prioritizing candidates expected to have favorable properties. However, in contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules, which poses unique challenges to traditional chemical representations and machine learning approaches. Here, we introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction. We demonstrate that this approach captures critical features of polymeric materials, like chain architecture, monomer stoichiometry, and degree of polymerization, and achieves superior accuracy to off-the-shelf cheminformatics methodologies. While doing so, we built a dataset of simulated electron affinity and ionization potential values for >40k polymers with varying monomer composition, stoichiometry, and chain architecture, which may be used in the development of other tailored machine learning approaches. The dataset and machine learning models presented in this work pave the path toward new classes of algorithms for polymer informatics and, more broadly, introduce a framework for the modeling of molecular ensembles.
|
|
3 |
33 |
19
|
Aldeghi M, Gapsys V, de Groot BL. Predicting Kinase Inhibitor Resistance: Physics-Based and Data-Driven Approaches. ACS Cent Sci 2019; 5:1468-1474. [PMID: 31482130 PMCID: PMC6716344 DOI: 10.1021/acscentsci.9b00590] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 06/16/2019] [Indexed: 05/03/2023]
Abstract
Resistance to small molecule drugs often emerges in cancer cells, viruses, and bacteria as a result of the evolutionary pressure exerted by the therapy. Protein mutations that directly impair drug binding are frequently involved in resistance, and the ability to anticipate these mutations would be beneficial in drug development and clinical practice. Here, we evaluate the ability of three distinct computational methods to predict ligand binding affinity changes upon protein mutation for the cancer target Abl kinase. These structure-based approaches rely on first-principle statistical mechanics, mixed physics- and knowledge-based potentials, and machine learning, and were able to estimate binding affinity changes and identify resistant mutations with remarkable accuracy. We expect that these complementary approaches will enable the routine prediction of resistance-causing mutations in a variety of other target proteins.
|
research-article |
6 |
31 |
20
|
Aldeghi M, Bluck JP, Biggin PC. Absolute Alchemical Free Energy Calculations for Ligand Binding: A Beginner's Guide. Methods Mol Biol 2018; 1762:199-232. [PMID: 29594774 DOI: 10.1007/978-1-4939-7756-7_11] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 06/08/2023]
Abstract
Many thermodynamic quantities can be extracted from computer simulations that generate an ensemble of microstates according to the principles of statistical mechanics. Among these quantities is the free energy of binding of a small molecule to a macromolecule, such as a protein. Here, we present an introductory overview of a protocol that allows for the estimation of ligand binding free energies via molecular dynamics simulations. While we focus on the binding of organic molecules to proteins, the approach is in principle transferable to any pair of molecules.
|
|
7 |
30 |
21
|
Aldeghi M, de Groot BL, Gapsys V. Accurate Calculation of Free Energy Changes upon Amino Acid Mutation. Methods Mol Biol 2019; 1851:19-47. [PMID: 30298390 DOI: 10.1007/978-1-4939-8736-8_2] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 12/04/2022]
Abstract
Molecular dynamics based free energy calculations allow for a robust and accurate evaluation of free energy changes upon amino acid mutation in proteins. In this chapter we cover the basic theoretical concepts important for the use of calculations utilizing the non-equilibrium alchemical switching methodology. We further provide a detailed step-by-step protocol for estimating the effect of a single amino acid mutation on protein thermostability. In addition, the potential caveats and solutions to some frequently encountered issues concerning the non-equilibrium alchemical free energy calculations are discussed. The protocol comprises details for the hybrid structure/topology generation required for alchemical transitions, equilibrium simulation setup, and description of the fast non-equilibrium switching. Subsequently, the analysis of the obtained results is described. The steps in the protocol are complemented with an illustrative practical application: a destabilizing mutation in the Trp cage mini protein. The concepts that are described are generally applicable. The shown example makes use of the pmx software package for the free energy calculations using Gromacs as a molecular dynamics engine. Finally, we discuss how the current protocol can readily be adapted to carry out charge-changing or multiple mutations at once, as well as large-scale mutational scans.
|
|
6 |
23 |
22
|
Häse F, Aldeghi M, Hickman RJ, Roch LM, Christensen M, Liles E, Hein JE, Aspuru-Guzik A. Olympus: a benchmarking framework for noisy optimization and experiment planning. Mach Learn Sci Technol 2021. [DOI: 10.1088/2632-2153/abedc8] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/21/2022] Open
Abstract
Research challenges encountered across science, engineering, and economics can frequently be formulated as optimization tasks. In chemistry and materials science, recent growth in laboratory digitization and automation has sparked interest in optimization-guided autonomous discovery and closed-loop experimentation. Experiment planning strategies based on off-the-shelf optimization algorithms can be employed in fully autonomous research platforms to achieve desired experimentation goals with the minimum number of trials. However, the experiment planning strategy that is most suitable to a scientific discovery task is a priori unknown while rigorous comparisons of different strategies are highly time and resource demanding. As optimization algorithms are typically benchmarked on low-dimensional synthetic functions, it is unclear how their performance would translate to noisy, higher-dimensional experimental tasks encountered in chemistry and materials science. We introduce Olympus, a software package that provides a consistent and easy-to-use framework for benchmarking optimization algorithms against realistic experiments emulated via probabilistic deep-learning models. Olympus includes a collection of experimentally derived benchmark sets from chemistry and materials science and a suite of experiment planning strategies that can be easily accessed via a user-friendly Python interface. Furthermore, Olympus facilitates the integration, testing, and sharing of custom algorithms and user-defined datasets. In brief, Olympus mitigates the barriers associated with benchmarking optimization algorithms on realistic experimental scenarios, promoting data sharing and the creation of a standard framework for evaluating the performance of experiment planning strategies.
|
|
4 |
15 |
23
|
Heifetz A, James T, Southey M, Morao I, Aldeghi M, Sarrat L, Fedorov DG, Bodkin MJ, Townsend-Nicholson A. Characterising GPCR-ligand interactions using a fragment molecular orbital-based approach. Curr Opin Struct Biol 2019; 55:85-92. [PMID: 31022570 DOI: 10.1016/j.sbi.2019.03.021] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/17/2019] [Revised: 02/19/2019] [Accepted: 03/14/2019] [Indexed: 10/27/2022]
Abstract
There has been fantastic progress in solving GPCR crystal structures. However, the ability of X-ray crystallography to guide the drug discovery process for GPCR targets is limited by the availability of accurate tools to explore receptor-ligand interactions. Visual inspection and molecular mechanics approaches cannot explain the full complexity of molecular interactions. Quantum mechanical approaches (QM) are often too computationally expensive, but the fragment molecular orbital (FMO) method offers an excellent solution that combines accuracy, speed and the ability to reveal key interactions that would otherwise be hard to detect. Integration of GPCR crystallography or homology modelling with FMO reveals atomistic details of the individual contributions of each residue and water molecule towards ligand binding, including an analysis of their chemical nature.
|
Review |
6 |
13 |
24
|
Heifetz A, Morao I, Babu MM, James T, Southey MWY, Fedorov DG, Aldeghi M, Bodkin MJ, Townsend-Nicholson A. Characterizing Interhelical Interactions of G-Protein Coupled Receptors with the Fragment Molecular Orbital Method. J Chem Theory Comput 2020; 16:2814-2824. [PMID: 32096994 PMCID: PMC7161079 DOI: 10.1021/acs.jctc.9b01136] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/17/2022]
Abstract
G-protein coupled receptors (GPCRs) are the largest superfamily of membrane proteins, regulating almost every aspect of cellular activity and serving as key targets for drug discovery. We have identified an accurate and reliable computational method to characterize the strength and chemical nature of the interhelical interactions between the residues of transmembrane (TM) domains during different receptor activation states, something that cannot be characterized solely by visual inspection of structural information. Using the fragment molecular orbital (FMO) quantum mechanics method to analyze 35 crystal structures representing different branches of the class A GPCR family, we have identified 69 topologically equivalent TM residues that form a consensus network of 51 inter-TM interactions, providing novel results that are consistent with and help to rationalize experimental data. This discovery establishes a comprehensive picture of how defined molecular forces govern specific interhelical interactions which, in turn, support the structural stability, ligand binding, and activation of GPCRs.
|
research-article |
5 |
12 |
25
|
Heifetz A, Storer RI, McMurray G, James T, Morao I, Aldeghi M, Bodkin MJ, Biggin PC. Application of an Integrated GPCR SAR-Modeling Platform To Explain the Activation Selectivity of Human 5-HT2C over 5-HT2B. ACS Chem Biol 2016; 11:1372-82. [PMID: 26900768 DOI: 10.1021/acschembio.5b01045] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 02/08/2023]
Abstract
Agonism of the 5-HT2C serotonin receptor has been associated with the treatment of a number of diseases including obesity, psychiatric disorders, sexual health, and urology. However, the development of effective 5-HT2C agonists has been hampered by the difficulty in obtaining selectivity over the closely related 5-HT2B receptor, agonism of which is associated with irreversible cardiac valvulopathy. Understanding how to design selective agonists requires exploration of the structural features governing the functional uniqueness of the target receptor relative to related off targets. X-ray crystallography, the major experimental source of structural information, is a slow and challenging process for integral membrane proteins, and so is currently not feasible for every GPCR or GPCR-ligand complex. Therefore, the integration of existing ligand SAR data with GPCR modeling can be a practical alternative to provide this essential structural insight. To demonstrate this, we integrated SAR data from 39 azepine series 5-HT2C agonists, comprising both selective and unselective examples, with our hierarchical GPCR modeling protocol (HGMP). Through this work we have been able to demonstrate how relatively small differences in the amino acid sequences of GPCRs can lead to significant differences in secondary structure and function, as supported by experimental data. In particular, this study suggests that conformational differences in the tilt of TM7 between 5-HT2B and 5-HT2C, which result from differences in interhelical interactions, may be the major source of selectivity in G-protein activation between these two receptors. Our approach also demonstrates how the use of GPCR models in conjunction with SAR data can be used to explain activity cliffs.
|
Research Support, Non-U.S. Gov't |
9 |
11 |