1
Litman Y, Kapil V, Feldman YMY, Tisi D, Begušić T, Fidanyan K, Fraux G, Higer J, Kellner M, Li TE, Pós ES, Stocco E, Trenins G, Hirshberg B, Rossi M, Ceriotti M. i-PI 3.0: A flexible and efficient framework for advanced atomistic simulations. J Chem Phys 2024; 161:062504. [PMID: 39140447 DOI: 10.1063/5.0215869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Accepted: 07/11/2024] [Indexed: 08/15/2024] Open
Abstract
Atomic-scale simulations have progressed tremendously over the past decade, largely thanks to the availability of machine-learning interatomic potentials. These potentials combine the accuracy of electronic structure calculations with the ability to reach extensive length and time scales. The i-PI package facilitates integrating the latest developments in this field with advanced modeling techniques thanks to a modular software architecture based on inter-process communication through a socket interface. The choice of Python for implementation facilitates rapid prototyping but can add computational overhead. In this new release, we carefully benchmarked and optimized i-PI for several common simulation scenarios, making such overhead negligible when i-PI is used to model systems up to tens of thousands of atoms using widely adopted machine learning interatomic potentials, such as Behler-Parrinello, DeePMD, and MACE neural networks. We also present the implementation of several new features, including an efficient algorithm to model bosonic and fermionic exchange, a framework for uncertainty quantification to be used in conjunction with machine-learning potentials, a communication infrastructure that allows for deeper integration with electronic-driven simulations, and an approach to simulate coupled photon-nuclear dynamics in optical or plasmonic cavities.
Affiliation(s)
- Yair Litman
  - Y. Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Venkat Kapil
  - Y. Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
  - Department of Physics and Astronomy, University College London, 17-19 Gordon St, London WC1H 0AH, United Kingdom
  - Thomas Young Centre and London Centre for Nanotechnology, 19 Gordon St, London WC1H 0AH, United Kingdom
- Davide Tisi
  - Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Tomislav Begušić
  - Div. of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Karen Fidanyan
  - MPI for the Structure and Dynamics of Matter, Hamburg, Germany
- Guillaume Fraux
  - Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Jacob Higer
  - School of Physics, Tel Aviv University, Tel Aviv 6997801, Israel
- Matthias Kellner
  - Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Tao E Li
  - Department of Physics and Astronomy, University of Delaware, Newark, Delaware 19716, USA
- Eszter S Pós
  - MPI for the Structure and Dynamics of Matter, Hamburg, Germany
- Elia Stocco
  - MPI for the Structure and Dynamics of Matter, Hamburg, Germany
- George Trenins
  - MPI for the Structure and Dynamics of Matter, Hamburg, Germany
- Barak Hirshberg
  - School of Chemistry, Tel Aviv University, Tel Aviv 6997801, Israel
- Mariana Rossi
  - MPI for the Structure and Dynamics of Matter, Hamburg, Germany
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
2
Kubečka J, Ayoubi D, Tang Z, Knattrup Y, Engsvang M, Wu H, Elm J. Accurate modeling of the potential energy surface of atmospheric molecular clusters boosted by neural networks. ENVIRONMENTAL SCIENCE. ADVANCES 2024:d4va00255e. [PMID: 39176037 PMCID: PMC11334116 DOI: 10.1039/d4va00255e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Accepted: 08/09/2024] [Indexed: 08/24/2024]
Abstract
The computational cost of accurate quantum chemistry (QC) calculations of large molecular systems can often be unbearably high. Machine learning offers a lower computational cost compared to QC methods while maintaining their accuracy. In this study, we employ the polarizable atom interaction neural network (PaiNN) architecture to train and model the potential energy surface of molecular clusters relevant to atmospheric new particle formation, such as sulfuric acid-ammonia clusters. We compare the differences between PaiNN and previous kernel ridge regression modeling for the Clusteromics I-V data sets. We showcase three models capable of predicting electronic binding energies and interatomic forces with mean absolute errors of <0.3 kcal mol⁻¹ and <0.2 kcal mol⁻¹ Å⁻¹, respectively. Furthermore, we demonstrate that the error of the modeled properties remains below the chemical accuracy of 1 kcal mol⁻¹ even for clusters vastly larger than those in the training database (up to (H₂SO₄)₁₅(NH₃)₁₅ clusters, containing 30 molecules). Consequently, we emphasize the potential applications of these models for faster and more thorough configurational sampling and for boosting molecular dynamics studies of large atmospheric molecular clusters.
Affiliation(s)
- Jakub Kubečka
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark +420 724946622
- Daniel Ayoubi
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark +420 724946622
- Zeyuan Tang
  - Center for Interstellar Catalysis, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, 8000 Aarhus C, Denmark
- Yosef Knattrup
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark +420 724946622
- Morten Engsvang
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark +420 724946622
- Haide Wu
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark +420 724946622
- Jonas Elm
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark +420 724946622
3
Lange J, Anelli A, Alsenz J, Kuentz M, O’Dwyer PJ, Saal W, Wyttenbach N, Griffin BT. Comparative Analysis of Chemical Descriptors by Machine Learning Reveals Atomistic Insights into Solute-Lipid Interactions. Mol Pharm 2024; 21:3343-3355. [PMID: 38780534 PMCID: PMC11220795 DOI: 10.1021/acs.molpharmaceut.4c00080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 05/07/2024] [Accepted: 05/07/2024] [Indexed: 05/25/2024]
Abstract
This study explores the research area of drug solubility in lipid excipients, an area persistently complex despite recent advancements in understanding and predicting solubility based on molecular structure. To this end, this research investigated novel descriptor sets, employing machine learning techniques to understand the determinants governing interactions between solutes and medium-chain triglycerides (MCTs). Quantitative structure-property relationships (QSPR) were constructed on an extended solubility data set comprising 182 experimental values of structurally diverse drug molecules, including both development and marketed drugs to extract meaningful property relationships. Four classes of molecular descriptors, ranging from traditional representations to complex geometrical descriptions, were assessed and compared in terms of their predictive accuracy and interpretability. These include two-dimensional (2D) and three-dimensional (3D) descriptors, Abraham solvation parameters, extended connectivity fingerprints (ECFPs), and the smooth overlap of atomic position (SOAP) descriptor. Through testing three distinct regularized regression algorithms alongside various preprocessing schemes, the SOAP descriptor enabled the construction of a superior performing model in terms of interpretability and accuracy. Its atom-centered characteristics allowed contributions to be estimated at the atomic level, thereby enabling the ranking of prevalent molecular motifs and their influence on drug solubility in MCTs. The performance on a separate test set demonstrated high predictive accuracy (RMSE = 0.50) for 2D and 3D, SOAP, and Abraham Solvation descriptors. The model trained on ECFP4 descriptors resulted in inferior predictive accuracy. 
Lastly, uncertainty estimations for each model were introduced to assess their applicability domains and provide information on where the models may extrapolate in chemical space and, thus, where more data may be necessary to refine a data-driven approach to predict solubility in MCTs. Overall, the presented approaches further enable computationally informed formulation development by introducing a novel in silico approach for rational drug development and prediction of dose loading in lipids.
Affiliation(s)
- Justus Johann Lange
  - School of Pharmacy, University College Cork, College Road, Cork T12 R229, Cork County, Ireland
- Andrea Anelli
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Limited, Grenzacherstrasse 124, Basel 4070, Switzerland
- Jochem Alsenz
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Limited, Grenzacherstrasse 124, Basel 4070, Switzerland
- Martin Kuentz
  - Institute of Pharma Technology, University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz CH-4231, Basel City, Switzerland
- Patrick J. O’Dwyer
  - School of Pharmacy, University College Cork, College Road, Cork T12 R229, Cork County, Ireland
- Wiebke Saal
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Limited, Grenzacherstrasse 124, Basel 4070, Switzerland
- Nicole Wyttenbach
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Limited, Grenzacherstrasse 124, Basel 4070, Switzerland
- Brendan T. Griffin
  - School of Pharmacy, University College Cork, College Road, Cork T12 R229, Cork County, Ireland
4
Wan K, He J, Shi X. Construction of High Accuracy Machine Learning Interatomic Potential for Surface/Interface of Nanomaterials-A Review. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2024; 36:e2305758. [PMID: 37640376 DOI: 10.1002/adma.202305758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/24/2023] [Indexed: 08/31/2023]
Abstract
The inherent discontinuity and unique dimensional attributes of nanomaterial surfaces and interfaces bestow them with various exceptional properties. These properties, however, also introduce difficulties for both experimental and computational studies. The advent of machine learning interatomic potential (MLIP) addresses some of the limitations associated with empirical force fields, presenting a valuable avenue for accurate simulations of these surfaces/interfaces of nanomaterials. Central to this approach is the idea of capturing the relationship between system configuration and potential energy, leveraging the proficiency of machine learning (ML) to precisely approximate high-dimensional functions. This review offers an in-depth examination of MLIP principles and their execution and elaborates on their applications in the realm of nanomaterial surface and interface systems. The prevailing challenges faced by this potent methodology are also discussed.
Affiliation(s)
- Kaiwei Wan
  - Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Jianxin He
  - Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Xinghua Shi
  - Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
5
Žugec I, Geilhufe RM, Lončarić I. Global machine learning potentials for molecular crystals. J Chem Phys 2024; 160:154106. [PMID: 38624120 DOI: 10.1063/5.0196232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Accepted: 03/29/2024] [Indexed: 04/17/2024] Open
Abstract
Molecular crystals are difficult to model with accurate first-principles methods due to large unit cells. On the other hand, accurate modeling is required as polymorphs often differ by only 1 kJ/mol. Machine learning interatomic potentials promise to provide accuracy of the baseline first-principles methods with a cost lower by orders of magnitude. Using the existing databases of the density functional theory calculations for molecular crystals and molecules, we train global machine learning interatomic potentials, usable for any molecular crystal. We test the performance of the potentials on experimental benchmarks and show that they perform better than classical force fields and, in some cases, are comparable to the density functional theory calculations.
Affiliation(s)
- Ivan Žugec
  - Centro de Física de Materiales CFM/MPC (CSIC-UPV/EHU), Donostia-San Sebastián, Spain
- R Matthias Geilhufe
  - Department of Physics, Chalmers University of Technology, Gothenburg, Sweden
- Ivor Lončarić
  - Ruđer Bošković Institute, Bijenička 54, Zagreb, Croatia
6
Butler PV, Hafizi R, Day GM. Machine-Learned Potentials by Active Learning from Organic Crystal Structure Prediction Landscapes. J Phys Chem A 2024; 128:945-957. [PMID: 38277275 PMCID: PMC10860135 DOI: 10.1021/acs.jpca.3c07129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 01/04/2024] [Accepted: 01/11/2024] [Indexed: 01/28/2024]
Abstract
A primary challenge in organic molecular crystal structure prediction (CSP) is accurately ranking the energies of potential structures. While high-level solid-state density functional theory (DFT) methods allow for mostly reliable discrimination of the low-energy structures, their high computational cost is problematic because of the need to evaluate tens to hundreds of thousands of trial crystal structures to fully explore typical crystal energy landscapes. Consequently, lower-cost but less accurate empirical force fields are often used, sometimes as the first stage of a hierarchical scheme involving multiple stages of increasingly accurate energy calculations. Machine-learned interatomic potentials (MLIPs), trained to reproduce the results of ab initio methods with computational costs close to those of force fields, can improve the efficiency of the CSP by reducing or eliminating the need for costly DFT calculations. Here, we investigate active learning methods for training MLIPs with CSP datasets. The combination of active learning with the well-developed sampling methods from CSP yields potentials in a highly automated workflow that are relevant over a wide range of the crystal packing space. To demonstrate these potentials, we illustrate efficiently reranking large, diverse crystal structure landscapes to near-DFT accuracy from force field-based CSP, improving the reliability of the final energy ranking. Furthermore, we demonstrate how these potentials can be extended to more accurately model structures far from lattice energy minima through additional on-the-fly training within Monte Carlo simulations.
Affiliation(s)
- Roohollah Hafizi
  - School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
- Graeme M. Day
  - School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
7
Gelžinytė E, Öeren M, Segall MD, Csányi G. Transferable Machine Learning Interatomic Potential for Bond Dissociation Energy Prediction of Drug-like Molecules. J Chem Theory Comput 2024; 20:164-177. [PMID: 38108269 PMCID: PMC10782450 DOI: 10.1021/acs.jctc.3c00710] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/28/2023] [Revised: 11/30/2023] [Accepted: 11/30/2023] [Indexed: 12/19/2023]
Abstract
We present a transferable MACE interatomic potential that is applicable to open- and closed-shell drug-like molecules containing hydrogen, carbon, and oxygen atoms. Including an accurate description of radical species extends the scope of possible applications to bond dissociation energy (BDE) prediction, for example, in the context of cytochrome P450 (CYP) metabolism. The transferability of the MACE potential was validated on the COMP6 data set, containing only closed-shell molecules, where it reaches better accuracy than the readily available general ANI-2x potential. MACE achieves similar accuracy on two CYP metabolism-specific data sets, which include open- and closed-shell structures. This model enables us to calculate the aliphatic C-H BDE, which allows us to compare reaction energies of hydrogen abstraction, which is the rate-limiting step of the aliphatic hydroxylation reaction catalyzed by CYPs. On the "CYP 3A4" data set, MACE achieves a BDE RMSE of 1.37 kcal/mol and better prediction of BDE ranks than alternatives: the semiempirical AM1 and GFN2-xTB methods and the ALFABET model that directly predicts bond dissociation enthalpies. Finally, we highlight the smoothness of the MACE potential over paths of sp³ C-H bond elongation and show that a minimal extension is enough for the MACE model to start finding reasonable minimum energy paths of methoxy radical-mediated hydrogen abstraction. Altogether, this work lays the ground for further extensions of scope in terms of chemical elements, (CYP-mediated) reaction classes and modeling the full reaction paths, not only BDEs.
Affiliation(s)
- Elena Gelžinytė
  - Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K.
- Mario Öeren
  - Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K.
- Matthew D. Segall
  - Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K.
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K.
8
Coste A, Slejko E, Zavadlav J, Praprotnik M. Developing an Implicit Solvation Machine Learning Model for Molecular Simulations of Ionic Media. J Chem Theory Comput 2024; 20:411-420. [PMID: 38118122 PMCID: PMC10782447 DOI: 10.1021/acs.jctc.3c00984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 12/04/2023] [Accepted: 12/04/2023] [Indexed: 12/22/2023]
Abstract
Molecular dynamics (MD) simulations of biophysical systems require accurate modeling of their native environment, i.e., aqueous ionic solution, as it critically impacts the structure and function of biomolecules. On the other hand, the models should be computationally efficient to enable simulations of large spatiotemporal scales. Here, we present the deep implicit solvation model for sodium chloride solutions that satisfies both requirements. Owing to the use of the neural network potential, the model can capture the many-body potential of mean force, while the implicit water treatment renders the model inexpensive. We demonstrate our approach first for pure ionic solutions with concentrations ranging from physiological to 2 M. We then extend the model to capture the effective ion interactions in the vicinity and far away from a DNA molecule. In both cases, the structural properties are in good agreement with all-atom MD, showcasing a general methodology for the efficient and accurate modeling of ionic media.
Affiliation(s)
- Amaury Coste
  - Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
- Ema Slejko
  - Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
  - Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana SI-1000, Slovenia
- Julija Zavadlav
  - Professorship of Multiscale Modeling of Fluid Materials, TUM School of Engineering and Design, Technical University of Munich, Garching Near Munich DE-85748, Germany
- Matej Praprotnik
  - Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
  - Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana SI-1000, Slovenia
9
Beran GJO. Frontiers of molecular crystal structure prediction for pharmaceuticals and functional organic materials. Chem Sci 2023; 14:13290-13312. [PMID: 38033897 PMCID: PMC10685338 DOI: 10.1039/d3sc03903j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/02/2023] [Indexed: 12/02/2023] Open
Abstract
The reliability of organic molecular crystal structure prediction has improved tremendously in recent years. Crystal structure predictions for small, mostly rigid molecules are quickly becoming routine. Structure predictions for larger, highly flexible molecules are more challenging, but their crystal structures can also now be predicted with increasing rates of success. These advances are ushering in a new era where crystal structure prediction drives the experimental discovery of new solid forms. After briefly discussing the computational methods that enable successful crystal structure prediction, this perspective presents case studies from the literature that demonstrate how state-of-the-art crystal structure prediction can transform how scientists approach problems involving the organic solid state. Applications to pharmaceuticals, porous organic materials, photomechanical crystals, organic semi-conductors, and nuclear magnetic resonance crystallography are included. Finally, efforts to improve our understanding of which predicted crystal structures can actually be produced experimentally and other outstanding challenges are discussed.
Affiliation(s)
- Gregory J O Beran
  - Department of Chemistry, University of California Riverside, Riverside, CA 92521, USA
10
Essen CV, Luedeker D. In silico co-crystal design: Assessment of the latest advances. Drug Discov Today 2023; 28:103763. [PMID: 37689178 DOI: 10.1016/j.drudis.2023.103763] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/11/2022] [Revised: 08/18/2023] [Accepted: 08/31/2023] [Indexed: 09/11/2023]
Abstract
Pharmaceutical co-crystals represent a growing class of crystal forms in the context of pharmaceutical science. They are attractive to pharmaceutical scientists because they significantly expand the number of crystal forms that exist for an active pharmaceutical ingredient and can lead to improvements in physicochemical properties of clinical relevance. At the same time, machine learning is finding its way into all areas of drug discovery and delivers impressive results. In this review, we attempt to provide an overview of machine learning, deep learning and network-based recommendation approaches applied to pharmaceutical co-crystallization. We also present crystal structure prediction as an alternative to machine learning approaches.
11
Cersonsky RK, Pakhnova M, Engel EA, Ceriotti M. A data-driven interpretation of the stability of organic molecular crystals. Chem Sci 2023; 14:1272-1285. [PMID: 36756329 PMCID: PMC9891366 DOI: 10.1039/d2sc06198h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Accepted: 12/06/2022] [Indexed: 01/17/2023] Open
Abstract
Due to the subtle balance of intermolecular interactions that govern structure-property relations, predicting the stability of crystal structures formed from molecular building blocks is a highly non-trivial scientific problem. A particularly active and fruitful approach involves classifying the different combinations of interacting chemical moieties, as understanding the relative energetics of different interactions enables the design of molecular crystals and fine-tuning of their stabilities. While this is usually performed based on the empirical observation of the most commonly encountered motifs in known crystal structures, we propose to apply a combination of supervised and unsupervised machine-learning techniques to automate the construction of an extensive library of molecular building blocks. We introduce a structural descriptor tailored to the prediction of the binding (lattice) energy and apply it to a curated dataset of organic crystals, exploiting its atom-centered nature to obtain a data-driven assessment of the contribution of different chemical groups to the lattice energy of the crystal. We then interpret this library using a low-dimensional representation of the structure-energy landscape and discuss selected examples of the insights into crystal engineering that can be extracted from this analysis, providing a complete database to guide the design of molecular materials.
Affiliation(s)
- Rose K Cersonsky
  - Laboratory of Computational Science and Modeling (COSMO), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Maria Pakhnova
  - Laboratory of Computational Science and Modeling (COSMO), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Edgar A Engel
  - TCM Group, Trinity College, Cambridge University, Cambridge, UK
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling (COSMO), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland