1
Takiguchi Y, Nakane D, Akitsu T. The prediction of single-molecule magnet properties via deep learning. IUCrJ 2024; 11:182-189. [PMID: 38299376 PMCID: PMC10916298 DOI: 10.1107/s2052252524000770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 01/22/2024] [Indexed: 02/02/2024]
Abstract
This paper uses deep learning to present a proof-of-concept for data-driven chemistry in single-molecule magnets (SMMs). Previous discussions within SMM research have proposed links between molecular structures (crystal structures) and single-molecule magnetic properties; however, these have only interpreted the results. Therefore, this study introduces a data-driven approach to predict the properties of SMM structures using deep learning. The deep-learning model learns the structural features of the SMM molecules by extracting the single-molecule magnetic properties from the 3D coordinates presented in this paper. The model accurately determined whether a molecule was a single-molecule magnet, with an accuracy rate of approximately 70% in predicting the SMM properties. The deep-learning model found SMMs from 20 000 metal complexes extracted from the Cambridge Structural Database. Using deep-learning models for predicting SMM properties and guiding the design of novel molecules is promising.
Affiliation(s)
- Yuji Takiguchi
- Department of Chemistry, Tokyo University of Science, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 1628601, Japan
- Daisuke Nakane
- Department of Chemistry, Tokyo University of Science, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 1628601, Japan
- Takashiro Akitsu
- Department of Chemistry, Tokyo University of Science, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 1628601, Japan
2
Butler PV, Hafizi R, Day GM. Machine-Learned Potentials by Active Learning from Organic Crystal Structure Prediction Landscapes. J Phys Chem A 2024; 128:945-957. [PMID: 38277275 PMCID: PMC10860135 DOI: 10.1021/acs.jpca.3c07129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 01/04/2024] [Accepted: 01/11/2024] [Indexed: 01/28/2024]
Abstract
A primary challenge in organic molecular crystal structure prediction (CSP) is accurately ranking the energies of potential structures. While high-level solid-state density functional theory (DFT) methods allow for mostly reliable discrimination of the low-energy structures, their high computational cost is problematic because of the need to evaluate tens to hundreds of thousands of trial crystal structures to fully explore typical crystal energy landscapes. Consequently, lower-cost but less accurate empirical force fields are often used, sometimes as the first stage of a hierarchical scheme involving multiple stages of increasingly accurate energy calculations. Machine-learned interatomic potentials (MLIPs), trained to reproduce the results of ab initio methods with computational costs close to those of force fields, can improve the efficiency of the CSP by reducing or eliminating the need for costly DFT calculations. Here, we investigate active learning methods for training MLIPs with CSP datasets. The combination of active learning with the well-developed sampling methods from CSP yields potentials in a highly automated workflow that are relevant over a wide range of the crystal packing space. To demonstrate these potentials, we illustrate efficiently reranking large, diverse crystal structure landscapes to near-DFT accuracy from force field-based CSP, improving the reliability of the final energy ranking. Furthermore, we demonstrate how these potentials can be extended to more accurately model structures far from lattice energy minima through additional on-the-fly training within Monte Carlo simulations.
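The hierarchical scheme described in this abstract (rank everything with a cheap model, then re-evaluate only a shortlist with a more accurate one) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the function name `hierarchical_rerank`, the `keep` parameter, and the toy energy functions are all assumptions made for the example, and a real application would pass a force field and an MLIP or DFT evaluator in their place.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

S = TypeVar("S")  # any representation of a trial crystal structure


def hierarchical_rerank(
    structures: Sequence[S],
    cheap_energy: Callable[[S], float],
    accurate_energy: Callable[[S], float],
    keep: int,
) -> List[Tuple[S, float]]:
    """Rank all structures with the cheap model, then re-score only the
    `keep` lowest-energy candidates with the accurate model."""
    shortlist = sorted(structures, key=cheap_energy)[:keep]
    rescored = [(s, accurate_energy(s)) for s in shortlist]
    return sorted(rescored, key=lambda pair: pair[1])


# Toy stand-ins (hypothetical): a "structure" is one number, and the two
# energy models place their minima at slightly different positions.
trial_structures = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
ranked = hierarchical_rerank(
    trial_structures,
    cheap_energy=lambda x: abs(x - 1.0),     # stands in for a force field
    accurate_energy=lambda x: abs(x - 1.4),  # stands in for an MLIP or DFT
    keep=3,
)
best_structure, best_energy = ranked[0]
```

In this toy case the cheap model prefers the structure at 1.0, while the accurate model moves the structure at 1.5 to the top of the shortlist, which is the kind of reordering that re-ranking is meant to capture.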
Affiliation(s)
- Roohollah Hafizi
- School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
- Graeme M. Day
- School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
3
Kadan A, Ryczko K, Wildman A, Wang R, Roitberg A, Yamazaki T. Accelerated Organic Crystal Structure Prediction with Genetic Algorithms and Machine Learning. J Chem Theory Comput 2023; 19:9388-9402. [PMID: 38059458 DOI: 10.1021/acs.jctc.3c00853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
We present a high-throughput, end-to-end pipeline for organic crystal structure prediction (CSP), the problem of identifying the stable crystal structures that will form from a given molecule based only on its molecular composition. Our tool uses neural network potentials to allow for efficient screening and structural relaxation of generated crystal candidates. Our pipeline consists of two distinct stages: random search, whereby crystal candidates are randomly generated and screened, and optimization, where a genetic algorithm (GA) optimizes this screened population. We assess the performance of each stage of our pipeline on 21 molecules taken from the Cambridge Crystallographic Data Centre's CSP blind tests. We show that random search alone yields matches for ≈50% of targets. We then validate the potential of our full pipeline, making use of the GA to optimize the root-mean-square deviation between crystal candidates and the experimentally derived structure. With this approach, we are able to find matches for ≈80% of candidates with 10-100 times smaller initial population sizes than when using random search. Lastly, we run our full pipeline with an ANI model that is trained on a small data set of molecules extracted from crystal structures in the Cambridge Structural Database, generating ≈60% of targets. By leveraging machine learning models trained to predict energies at the density functional theory level, our pipeline has the potential to approach the accuracy of ab initio methods and the efficiency of empirical force fields.
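The two-stage structure described in this abstract (random generation followed by genetic-algorithm refinement against a root-mean-square deviation) can be illustrated with a toy sketch. Everything here is an assumption made for the example: candidates are plain coordinate vectors rather than crystals, `random_search` and `genetic_optimize` are hypothetical names, and the crossover, mutation, and elitism choices are generic GA ingredients, not the operators used by the authors.

```python
import random
from typing import Callable, List

Vector = List[float]


def rmsd(a: Vector, b: Vector) -> float:
    """Root-mean-square deviation between two equal-length vectors."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def random_search(n: int, dim: int, rng: random.Random) -> List[Vector]:
    """Stage 1: generate n random candidates in the box [-1, 1]^dim."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]


def genetic_optimize(
    population: List[Vector],
    cost: Callable[[Vector], float],
    rng: random.Random,
    generations: int = 50,
    mutation_scale: float = 0.1,
) -> Vector:
    """Stage 2: keep the better half as parents (elitism) and refill the
    population with mutated uniform-crossover children."""
    pop = sorted(population, key=cost)
    for _ in range(generations):
        parents = pop[: max(2, len(pop) // 2)]
        children: List[Vector] = []
        while len(parents) + len(children) < len(pop):
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) + rng.gauss(0.0, mutation_scale)
                     for pair in zip(a, b)]
            children.append(child)
        pop = sorted(parents + children, key=cost)
    return pop[0]


rng = random.Random(0)
target = [0.3, -0.2, 0.5]  # hypothetical stand-in for the experimental structure


def cost_to_target(candidate: Vector) -> float:
    return rmsd(candidate, target)


initial = random_search(20, 3, rng)
best_random = min(initial, key=cost_to_target)
best_ga = genetic_optimize(initial, cost_to_target, rng)
```

Because the parents are carried over unchanged in every generation, the best cost can never get worse than that of the initial random population; how much it improves depends on the mutation scale and the number of generations.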
Affiliation(s)
- Amit Kadan
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Kevin Ryczko
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Andrew Wildman
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Rodrigo Wang
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Adrian Roitberg
- Department of Chemistry, University of Florida, P.O. Box 117200, Gainesville, Florida 32611-7200, United States
- Takeshi Yamazaki
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
4
Beran GJO. Frontiers of molecular crystal structure prediction for pharmaceuticals and functional organic materials. Chem Sci 2023; 14:13290-13312. [PMID: 38033897 PMCID: PMC10685338 DOI: 10.1039/d3sc03903j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/02/2023] [Indexed: 12/02/2023] [Open Access]
Abstract
The reliability of organic molecular crystal structure prediction has improved tremendously in recent years. Crystal structure predictions for small, mostly rigid molecules are quickly becoming routine. Structure predictions for larger, highly flexible molecules are more challenging, but their crystal structures can also now be predicted with increasing rates of success. These advances are ushering in a new era where crystal structure prediction drives the experimental discovery of new solid forms. After briefly discussing the computational methods that enable successful crystal structure prediction, this perspective presents case studies from the literature that demonstrate how state-of-the-art crystal structure prediction can transform how scientists approach problems involving the organic solid state. Applications to pharmaceuticals, porous organic materials, photomechanical crystals, organic semi-conductors, and nuclear magnetic resonance crystallography are included. Finally, efforts to improve our understanding of which predicted crystal structures can actually be produced experimentally and other outstanding challenges are discussed.
Affiliation(s)
- Gregory J O Beran
- Department of Chemistry, University of California Riverside, Riverside, CA 92521, USA
5
Price AJA, Otero-de-la-Roza A, Johnson ER. XDM-corrected hybrid DFT with numerical atomic orbitals predicts molecular crystal lattice energies with unprecedented accuracy. Chem Sci 2023; 14:1252-1262. [PMID: 36756332 PMCID: PMC9891363 DOI: 10.1039/d2sc05997e] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 10/29/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022] [Open Access]
Abstract
Molecular crystals are important for many applications, including energetic materials, organic semiconductors, and the development and commercialization of pharmaceuticals. The exchange-hole dipole moment (XDM) dispersion model has shown good performance in the calculation of relative and absolute lattice energies of molecular crystals, although it has traditionally been applied in combination with plane-wave/pseudopotential approaches. This has limited XDM to use with semilocal functional approximations, which suffer from delocalization error and poor quality conformational energies, and to systems with a few hundreds of atoms at most due to unfavorable scaling. In this work, we combine XDM with numerical atomic orbitals, which enable the efficient use of XDM-corrected hybrid functionals for molecular crystals. We test the new XDM-corrected functionals for their ability to predict the lattice energies of molecular crystals for the X23 set and 13 ice phases, the latter being a particularly stringent test. A composite approach using an XDM-corrected, 25% hybrid functional based on B86bPBE achieves a mean absolute error of 0.48 kcal mol⁻¹ per molecule for the X23 set and 0.19 kcal mol⁻¹ for the total lattice energies of the ice phases, compared to recent diffusion Monte-Carlo data. These results make the new XDM-corrected hybrids not only far more computationally efficient than previous XDM implementations, but also the most accurate density-functional methods for molecular crystal lattice energies to date.
Affiliation(s)
- Alastair J. A. Price
- Department of Chemistry, Dalhousie University, 6274 Coburg Rd, Halifax, Nova Scotia B3H 4R2, Canada
- Alberto Otero-de-la-Roza
- Departamento de Química Física y Analítica and MALTA-Consolider Team, Facultad de Química, Universidad de Oviedo, Oviedo 33006, Spain
- Erin R. Johnson
- Department of Chemistry, Dalhousie University, 6274 Coburg Rd, Halifax, Nova Scotia B3H 4R2, Canada
6
Liao K, Dong S, Cheng Z, Li W, Li S. Combined fragment-based machine learning force field with classical force field and its application in the NMR calculations of macromolecules in solutions. Phys Chem Chem Phys 2022; 24:18559-18567. [PMID: 35916054 DOI: 10.1039/d2cp02192g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
We have developed a combined fragment-based machine learning (ML) force field and molecular mechanics (MM) force field for simulating the structures of macromolecules in solutions, and then compute its NMR chemical shifts with the generalized energy-based fragmentation (GEBF) approach at the level of density functional theory (DFT). In this work, we first construct Gaussian approximation potential based on GEBF subsystems of macromolecules for MD simulations and then a GEBF-based neural network (GEBF-NN) with deep potential model for the studied macromolecule. Then, we develop a GEBF-NN/MM force field for macromolecules in solutions by combining the GEBF-NN force field for the solute molecule and ff14SB force field for solvent molecules. Using the GEBF-NN/MM MD simulation to generate snapshot structures of solute/solvent clusters, we then perform the NMR calculations with the GEBF approach at the DFT level to calculate NMR chemical shifts of the solute molecule. Taking a heptamer of oligopyridine-dicarboxamides in chloroform solution as an example, our results show that the GEBF-NN force field is quite accurate for this heptamer by comparing with the reference DFT results. For this heptamer in chloroform solution, both the GEBF-NN/MM and classical MD simulations could lead to helical structures from the same initial extended structure. The GEBF-DFT NMR results indicate that the GEBF-NN/MM force field could lead to more accurate NMR chemical shifts on hydrogen atoms by comparing with the experimental NMR results. Therefore, the GEBF-NN/MM force field could be employed for predicting more accurate dynamical behaviors than the classical force field for complex systems in solutions.
Affiliation(s)
- Kang Liao
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
- Shiyu Dong
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
- Zheng Cheng
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
- Wei Li
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
- Shuhua Li
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
7
Xiouras C, Cameli F, Quilló GL, Kavousanakis ME, Vlachos DG, Stefanidis GD. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem Rev 2022; 122:13006-13042. [PMID: 35759465 DOI: 10.1021/acs.chemrev.2c00141] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
Abstract
Artificial intelligence and specifically machine learning applications are nowadays used in a variety of scientific applications and cutting-edge technologies, where they have a transformative impact. Such an assembly of statistical and linear algebra methods making use of large data sets is becoming more and more integrated into chemistry and crystallization research workflows. This review aims to present, for the first time, a holistic overview of machine learning and cheminformatics applications as a novel, powerful means to accelerate the discovery of new crystal structures, predict key properties of organic crystalline materials, simulate, understand, and control the dynamics of complex crystallization process systems, as well as contribute to high throughput automation of chemical process development involving crystalline materials. We critically review the advances in these new, rapidly emerging research areas, raising awareness in issues such as the bridging of machine learning models with first-principles mechanistic models, data set size, structure, and quality, as well as the selection of appropriate descriptors. At the same time, we propose future research at the interface of applied mathematics, chemistry, and crystallography. Overall, this review aims to increase the adoption of such methods and tools by chemists and scientists across industry and academia.
Affiliation(s)
- Christos Xiouras
- Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium
- Fabio Cameli
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
- Gustavo Lunardon Quilló
- Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium; Chemical and BioProcess Technology and Control, Department of Chemical Engineering, Faculty of Engineering Technology, KU Leuven, Gebroeders de Smetstraat 1, 9000 Ghent, Belgium
- Mihail E Kavousanakis
- School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
- Dionisios G Vlachos
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
- Georgios D Stefanidis
- School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece; Laboratory for Chemical Technology, Ghent University, Tech Lane Ghent Science Park 125, B-9052 Ghent, Belgium
8
Wengert S, Csányi G, Reuter K, Margraf JT. A Hybrid Machine Learning Approach for Structure Stability Prediction in Molecular Co-crystal Screenings. J Chem Theory Comput 2022; 18:4586-4593. [PMID: 35709378 PMCID: PMC9281391 DOI: 10.1021/acs.jctc.2c00343] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
Abstract
Co-crystals are a highly interesting material class as varying their components and stoichiometry in principle allows tuning supramolecular assemblies toward desired physical properties. The in silico prediction of co-crystal structures represents a daunting task, however, as they span a vast search space and usually feature large unit cells. This requires theoretical models that are accurate and fast to evaluate, a combination that can in principle be accomplished by modern machine-learned (ML) potentials trained on first-principles data. Crucially, these ML potentials need to account for the description of long-range interactions, which are essential for the stability and structure of molecular crystals. In this contribution, we present a strategy for developing Δ-ML potentials for co-crystals, which use a physical baseline model to describe long-range interactions. The applicability of this approach is demonstrated for co-crystals of variable composition consisting of an active pharmaceutical ingredient and various co-formers. We find that the Δ-ML approach offers a strong and consistent improvement over the density functional tight binding baseline. Importantly, this even holds true when extrapolating beyond the scope of the training set, for instance in molecular dynamics simulations under ambient conditions.
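The Δ-ML idea named in this abstract is to keep a cheap physical baseline and learn only the difference between the baseline and the reference method. A minimal sketch under strong simplifying assumptions: the descriptor is a single number, the hypothetical `baseline_energy` and `reference_energy` functions stand in for tight binding and first-principles calculations, and an ordinary least-squares line replaces the ML potential used in the actual work.

```python
from typing import List, Tuple


def fit_linear(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def baseline_energy(x: float) -> float:
    """Hypothetical cheap physical model."""
    return x ** 2


def reference_energy(x: float) -> float:
    """Hypothetical expensive reference: baseline plus a smooth correction."""
    return x ** 2 + 0.5 * x - 0.1


# Train the correction on the residual (reference minus baseline) only.
train_x = [0.0, 0.5, 1.0, 1.5, 2.0]
residuals = [reference_energy(x) - baseline_energy(x) for x in train_x]
slope, intercept = fit_linear(train_x, residuals)


def delta_ml_energy(x: float) -> float:
    """Baseline prediction plus the learned correction."""
    return baseline_energy(x) + slope * x + intercept


# Evaluate at a point outside the training range.
x_test = 3.0
baseline_error = abs(baseline_energy(x_test) - reference_energy(x_test))
delta_ml_error = abs(delta_ml_energy(x_test) - reference_energy(x_test))
```

The toy residual is exactly linear, so the fit recovers it almost perfectly; the point is only that the learned model has to represent the (usually smoother) difference rather than the full energy surface.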
Affiliation(s)
- Simon Wengert
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany; Chair of Theoretical Chemistry, Technische Universität München, 85747 Garching, Germany
- Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
- Karsten Reuter
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
- Johannes T Margraf
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
9
A complete description of thermodynamic stabilities of molecular crystals. Proc Natl Acad Sci U S A 2022; 119:2111769119. [PMID: 35131847 PMCID: PMC8832981 DOI: 10.1073/pnas.2111769119] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Accepted: 12/23/2021] [Indexed: 12/27/2022] [Open Access]
Abstract
Predicting stable polymorphs of molecular crystals remains one of the grand challenges of computational science. Current methods invoke approximations to electronic structure and statistical mechanics and thus fail to consistently reproduce the delicate balance of physical effects determining thermodynamic stability. We compute the rigorous ab initio Gibbs free energies for competing polymorphs of paradigmatic compounds, using machine learning to mitigate costs. The accurate description of electronic structure and full treatment of quantum statistical mechanics allow us to predict the experimentally observed phase behavior. This constitutes a key step toward the first-principles design of functional materials for applications from photovoltaics to pharmaceuticals. Predictions of relative stabilities of (competing) molecular crystals are of great technological relevance, most notably for the pharmaceutical industry. However, they present a long-standing challenge for modeling, as often minuscule free energy differences are sensitively affected by the description of electronic structure, the statistical mechanics of the nuclei and the cell, and thermal expansion. The importance of these effects has been individually established, but rigorous free energy calculations for general molecular compounds, which simultaneously account for all effects, have hitherto not been computationally viable. Here we present an efficient “end to end” framework that seamlessly combines state-of-the-art electronic structure calculations, machine-learning potentials, and advanced free energy methods to calculate ab initio Gibbs free energies for general organic molecular materials. The facile generation of machine-learning potentials for a diverse set of polymorphic compounds (benzene, glycine, and succinic acid) and predictions of thermodynamic stabilities in qualitative and quantitative agreement with experiments highlight that predictive thermodynamic studies of industrially relevant molecular materials are no longer a daunting task.
10
Wang J, Wang R, Yang M, Xu D. Understanding Zinc-Doped Hydroxyapatite Structures Using the First-Principles Calculations and Convolutional Neural Network Algorithm. J Mater Chem B 2022; 10:1281-1290. [DOI: 10.1039/d1tb02687a] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022]
Abstract
Element doping is widely used to improve the performance of materials by changing their intrinsic properties. However, the lack of direct crystallographic structures for dopants has restricted the effective high-throughput...
11
Dudek MK, Druzbicki K. Along the road to Crystal Structure Prediction (CSP) of pharmaceutical-like molecules. CrystEngComm 2022. [DOI: 10.1039/d1ce01564h] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
Computational methods used for predicting crystal structures of organic compounds are mature enough to be routinely used with many rigid and semi-rigid organic molecules. The usefulness of Crystal Structure Prediction...
12
Beran GJO, Sugden IJ, Greenwell C, Bowskill DH, Pantelides CC, Adjiman CS. How many more polymorphs of ROY remain undiscovered. Chem Sci 2022; 13:1288-1297. [PMID: 35222912 PMCID: PMC8809489 DOI: 10.1039/d1sc06074k] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/02/2021] [Accepted: 12/10/2021] [Indexed: 12/15/2022] [Open Access]
Abstract
With 12 crystal forms, 5-methyl-2-[(2-nitrophenyl)amino]-3-thiophenecarbonitrile (a.k.a. ROY) holds the current record for the largest number of fully characterized organic crystal polymorphs. Four of these polymorph structures have been reported since 2019, raising the question of how many more ROY polymorphs await future discovery. Employing crystal structure prediction and accurate energy rankings derived from conformational energy-corrected density functional theory, this study presents the first crystal energy landscape for ROY that agrees well with experiment. The lattice energies suggest that the seven most stable ROY polymorphs (and nine of the twelve lowest-energy forms) on the Z′ = 1 landscape have already been discovered experimentally. Discovering any new polymorphs at ambient pressure will likely require specialized crystallization techniques capable of trapping metastable forms. At pressures above 10 GPa, however, a new crystal form is predicted to become enthalpically more stable than all known polymorphs, suggesting that further high-pressure experiments on ROY may be warranted. This work highlights the value of high-accuracy crystal structure prediction for solid-form screening and demonstrates how pragmatic conformational energy corrections can overcome the limitations of conventional density functionals for conformational polymorphs. Crystal structure prediction suggests that the low-energy polymorphs of ROY have already been found, but a new high-pressure form is predicted.
Affiliation(s)
- Gregory J. O. Beran
- Department of Chemistry, University of California Riverside, Riverside, CA 92521, USA
- Isaac J. Sugden
- Department of Chemical Engineering, Sargent Centre for Process Systems Engineering, Imperial College London, London, SW7 2AZ, UK
- Chandler Greenwell
- Department of Chemistry, University of California Riverside, Riverside, CA 92521, USA
- David H. Bowskill
- Department of Chemical Engineering, Sargent Centre for Process Systems Engineering, Imperial College London, London, SW7 2AZ, UK
- Constantinos C. Pantelides
- Department of Chemical Engineering, Sargent Centre for Process Systems Engineering, Imperial College London, London, SW7 2AZ, UK
- Claire S. Adjiman
- Department of Chemical Engineering, Sargent Centre for Process Systems Engineering, Imperial College London, London, SW7 2AZ, UK
13
Carpenter JE, Grünwald M. Pre-Nucleation Clusters Predict Crystal Structures in Models of Chiral Molecules. J Am Chem Soc 2021; 143:21580-21593. [PMID: 34918909 DOI: 10.1021/jacs.1c09321] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/30/2022]
Abstract
Kinetics can play an important role in the crystallization of molecules and can give rise to polymorphism, the tendency of molecules to form more than one crystal structure. Current computational methods of crystal structure prediction, however, focus almost exclusively on identifying the thermodynamically stable polymorph. Kinetic factors of nucleation and growth are often neglected because the underlying microscopic processes can be complex and accurate rate calculations are numerically cumbersome. In this work, we use molecular dynamics computer simulations to study simple molecular models that reproduce the crystallization behavior of real chiral molecules, including the formation of enantiopure and racemic crystals, as well as polymorphism. A significant fraction of these molecules forms crystals that do not have the lowest free energy. We demonstrate that at high supersaturation crystal formation can be accurately predicted by considering the similarities between oligomeric species in solution and molecular motifs in the crystal structure. For the case of racemic mixtures, we even find that knowledge of crystal free energies is not necessary and kinetic considerations are sufficient to determine if the system will undergo spontaneous chiral separation. Our results suggest conceptually simple ways of improving current crystal structure prediction methods.
Affiliation(s)
- John E Carpenter
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Michael Grünwald
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
14
Heng T, Yang D, Wang R, Zhang L, Lu Y, Du G. Progress in Research on Artificial Intelligence Applied to Polymorphism and Cocrystal Prediction. ACS Omega 2021; 6:15543-15550. [PMID: 34179597 PMCID: PMC8223226 DOI: 10.1021/acsomega.1c01330] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/11/2021] [Accepted: 05/28/2021] [Indexed: 06/13/2023]
Abstract
Artificial intelligence (AI) is a technology that builds an artificial system with certain intelligence and uses computer software and hardware to simulate intelligent human behavior. When combined with drug research and development, AI can considerably shorten this cycle, improve research efficiency, and minimize costs. The use of machine learning to discover novel materials and predict material properties has become a new research direction. On the basis of the current status of worldwide research on the combination of AI and crystal form and cocrystal, this mini-review analyzes and explores the application of AI in polymorphism prediction, crystal structure analysis, crystal property prediction, cocrystal former (CCF) screening, cocrystal composition prediction, and cocrystal formation prediction. This study provides insights into the future applications of AI in related fields.
Affiliation(s)
- Tianyu Heng
- Beijing City Key Laboratory of Polymorphic Drugs, Center of Pharmaceutical Polymorphs, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Dezhi Yang
- Beijing City Key Laboratory of Polymorphic Drugs, Center of Pharmaceutical Polymorphs, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Ruonan Wang
- Beijing City Key Laboratory of Polymorphic Drugs, Center of Pharmaceutical Polymorphs, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Li Zhang
- Beijing City Key Laboratory of Polymorphic Drugs, Center of Pharmaceutical Polymorphs, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Yang Lu
- Beijing City Key Laboratory of Polymorphic Drugs, Center of Pharmaceutical Polymorphs, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
- Guanhua Du
- Beijing City Key Laboratory of Drug Target and Screening Research, National Center for Pharmaceutical Screening, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, P.R. China
15
Affiliation(s)
- Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
16
Aina AA, Misquitta AJ, Price SL. A non-empirical intermolecular force-field for trinitrobenzene and its application in crystal structure prediction. J Chem Phys 2021; 154:094123. [PMID: 33685142 DOI: 10.1063/5.0043746] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] [Open Access]
Abstract
An anisotropic atom-atom distributed intermolecular force-field (DIFF) for rigid trinitrobenzene (TNB) is developed using distributed multipole moments, dipolar polarizabilities, and dispersion coefficients derived from the charge density of the isolated molecule. The short-range parameters of the force-field are fitted to first- and second-order symmetry-adapted perturbation theory dimer interaction energy calculations using the distributed density-overlap model to guide the parameterization of the short-range anisotropy. The second-order calculations are used for fitting the damping coefficients of the long-range dispersion and polarization and also for relaxing the isotropic short-range coefficients in the final model, DIFF-srL2(rel). We assess the accuracy of the unrelaxed model, DIFF-srL2(norel), and its equivalent without short-range anisotropy, DIFF-srL0(norel), as these models are easier to derive. The model potentials are contrasted with empirical models for the repulsion-dispersion fitted to organic crystal structures with multipoles of iterated stockholder atoms (ISAs), FIT(ISA,L4), and with Gaussian Distributed Multipole Analysis (GDMA) multipoles, FIT(GDMA,L4), commonly used in modeling organic crystals. The potentials are tested for their ability to model the solid state of TNB. The non-empirical models provide more reasonable relative lattice energies of the three polymorphs of TNB and propose more sensible hypothetical structures than the empirical force-field (FIT). The DIFF-srL2(rel) model successfully has the most stable structure as one of the many structures that match the coordination sphere of form III. The neglect of the conformational flexibility of the nitro-groups is a significant approximation. This methodology provides a step toward force-fields capable of representing all phases of a molecule in molecular dynamics simulations.
Affiliation(s)
- Alex A Aina
- Department of Chemistry, University College London, 20 Gordon St., London WC1H 0AJ, United Kingdom
| | - Alston J Misquitta
- School of Physics and Astronomy and The Thomas Young Centre for Theory and Simulation of Materials at Queen Mary, University of London, London E1 4NS, United Kingdom
| | - Sarah L Price
- Department of Chemistry, University College London, 20 Gordon St., London WC1H 0AJ, United Kingdom
| |
|
17
|
Greenaway RL, Jelfs KE. Integrating Computational and Experimental Workflows for Accelerated Organic Materials Discovery. Adv Mater 2021; 33:e2004831. [PMID: 33565203 DOI: 10.1002/adma.202004831] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/14/2020] [Revised: 09/28/2020] [Indexed: 06/12/2023]
Abstract
Organic materials find application in a range of areas, including optoelectronics, sensing, encapsulation, molecular separations, and photocatalysis. The discovery of materials is frustratingly slow however, particularly when contrasted to the vast chemical space of possibilities based on the near limitless options for organic molecular precursors. The difficulty in predicting the material assembly, and consequent properties, of any molecule is another significant roadblock to targeted materials design. There has been significant progress in the development of computational approaches to screen large numbers of materials, for both their structure and properties, helping guide synthetic researchers toward promising materials. In particular, artificial intelligence techniques have the potential to make significant impact in many elements of the discovery process. Alongside this, automation and robotics are increasing the scale and speed with which materials synthesis can be realized. Herein, the focus is on demonstrating the power of integrating computational and experimental materials discovery programmes, including both a summary of key situations where approaches can be combined and a series of case studies that demonstrate recent successes.
Affiliation(s)
- Rebecca L Greenaway
- Department of Chemistry, Imperial College London, Molecular Sciences Research Hub, White City Campus, Wood Lane, London, W12 0BZ, UK
| | - Kim E Jelfs
- Department of Chemistry, Imperial College London, Molecular Sciences Research Hub, White City Campus, Wood Lane, London, W12 0BZ, UK
| |
|
18
|
Wengert S, Csányi G, Reuter K, Margraf JT. Data-efficient machine learning for molecular crystal structure prediction. Chem Sci 2021; 12:4536-4546. [PMID: 34163719 PMCID: PMC8179468 DOI: 10.1039/d0sc05765g] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 10/19/2020] [Accepted: 02/05/2021] [Indexed: 12/16/2022]
Abstract
The combination of modern machine learning (ML) approaches with high-quality data from quantum mechanical (QM) calculations can yield models with an unrivalled accuracy/cost ratio. However, such methods are ultimately limited by the computational effort required to produce the reference data. In particular, reference calculations for periodic systems with many atoms can become prohibitively expensive for higher levels of theory. This trade-off is critical in the context of organic crystal structure prediction (CSP). Here, a data-efficient ML approach would be highly desirable, since screening a huge space of possible polymorphs in a narrow energy range requires the assessment of a large number of trial structures with high accuracy. In this contribution, we present tailored Δ-ML models that allow screening a wide range of crystal candidates while adequately describing the subtle interplay between intermolecular interactions such as H-bonding and many-body dispersion effects. This is achieved by enhancing a physics-based description of long-range interactions at the density functional tight binding (DFTB) level (for which an efficient implementation is available) with a short-range ML model trained on high-quality first-principles reference data. The presented workflow is broadly applicable to different molecular materials, without the need for a single periodic calculation at the reference level of theory. We show that this even allows the use of wavefunction methods in CSP.
Affiliation(s)
- Simon Wengert
- Chair of Theoretical Chemistry, Technische Universität München, 85747 Garching, Germany
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, UK
| | - Karsten Reuter
- Chair of Theoretical Chemistry, Technische Universität München, 85747 Garching, Germany
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
| | - Johannes T Margraf
- Chair of Theoretical Chemistry, Technische Universität München, 85747 Garching, Germany
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
| |
|
19
|
Unzueta PA, Greenwell CS, Beran GJO. Predicting Density Functional Theory-Quality Nuclear Magnetic Resonance Chemical Shifts via Δ-Machine Learning. J Chem Theory Comput 2021; 17:826-840. [DOI: 10.1021/acs.jctc.0c00979] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/12/2022]
Affiliation(s)
- Pablo A. Unzueta
- Department of Chemistry, University of California, Riverside, Riverside, California 92521, United States
| | - Chandler S. Greenwell
- Department of Chemistry, University of California, Riverside, Riverside, California 92521, United States
| | - Gregory J. O. Beran
- Department of Chemistry, University of California, Riverside, Riverside, California 92521, United States
| |
|
20
|
Woodley SM, Day GM, Catlow R. Structure prediction of crystals, surfaces and nanoparticles. Philos Trans A Math Phys Eng Sci 2020; 378:20190600. [PMID: 33100162 DOI: 10.1098/rsta.2019.0600] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 05/12/2023]
Abstract
We review the current techniques used in the prediction of crystal structures and their surfaces and of the structures of nanoparticles. The main classes of search algorithm and energy function are summarized, and we discuss the growing role of methods based on machine learning. We illustrate the current status of the field with examples taken from metallic, inorganic and organic systems. This article is part of a discussion meeting issue 'Dynamic in situ microscopy relating structure and function'.
Affiliation(s)
- Scott M Woodley
- Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
| | - Graeme M Day
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Southampton SO17 1BJ, UK
| | - R Catlow
- Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
- School of Chemistry, Cardiff University, Park Place, Cardiff CF10 3AT, UK
| |
|
21
|
Abstract
We introduce new and robust decompositions of mean-field Hartree-Fock and Kohn-Sham density functional theory relying on the use of localized molecular orbitals and physically sound charge population protocols. The new lossless property decompositions, which allow for partitioning one-electron reduced density matrices into either bond-wise or atomic contributions, are compared to alternatives from the literature with regard to both molecular energies and dipole moments. Besides commenting on possible applications as an interpretative tool in the rationalization of certain electronic phenomena, we demonstrate how decomposed mean-field theory makes it possible to expose and amplify compositional features in the context of machine-learned quantum chemistry. This is made possible by improving upon the granularity of the underlying data. On the basis of our preliminary proof-of-concept results, we conjecture that many of the structure-property inferences in existence today may be further refined by efficiently leveraging an increase in dataset complexity and richness.
Affiliation(s)
- Janus J Eriksen
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom
| |
|
22
|
Rinderspacher BC. Heuristic Global Optimization in Chemical Compound Space. J Phys Chem A 2020; 124:9044-9060. [DOI: 10.1021/acs.jpca.0c05941] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- B. Christopher Rinderspacher
- Materials Discovery and Technology Branch, US Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
| |
|
23
|
Egorova O, Hafizi R, Woods DC, Day GM. Multifidelity Statistical Machine Learning for Molecular Crystal Structure Prediction. J Phys Chem A 2020; 124:8065-8078. [PMID: 32881496 DOI: 10.1021/acs.jpca.0c05006] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 11/29/2022]
Abstract
The prediction of crystal structures from first-principles requires highly accurate energies for large numbers of putative crystal structures. High accuracy of solid state density functional theory (DFT) calculations is often required, but hundreds or more structures can be present in the low energy region of interest, so that the associated computational costs are prohibitive. Here, we apply statistical machine learning to predict expensive hybrid functional DFT (PBE0) calculations using a multifidelity approach to re-evaluate the energies of crystal structures predicted with an inexpensive force field. The method uses an autoregressive Gaussian process, making use of less expensive GGA DFT (PBE) calculations to bridge the gap between the force field and PBE0 energies. The method is benchmarked on the crystal structure landscapes of three small, hydrogen-bonded organic molecules and shown to produce accurate predictions of energies and crystal structure ranking using small numbers of the most expensive calculations; the PBE0 energies can be predicted with errors of less than 1 kJ mol⁻¹ with between 4.2 and 6.8% of the cost of the full calculations. As the model that we have developed is probabilistic, we discuss how the uncertainties in predicted energies impact the assessment of the energetic ranking of crystal structures.
Affiliation(s)
- Olga Egorova
- Statistical Sciences Research Institute, University of Southampton, Southampton, SO17 1BJ, U.K
| | - Roohollah Hafizi
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Southampton, SO17 1BJ, U.K
| | - David C Woods
- Statistical Sciences Research Institute, University of Southampton, Southampton, SO17 1BJ, U.K
| | - Graeme M Day
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Southampton, SO17 1BJ, U.K
| |
|
24
|
Cheng CY, Campbell JE, Day GM. Evolutionary chemical space exploration for functional materials: computational organic semiconductor discovery. Chem Sci 2020; 11:4922-4933. [PMID: 34122948 PMCID: PMC8159259 DOI: 10.1039/d0sc00554a] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 01/29/2020] [Accepted: 04/21/2020] [Indexed: 11/26/2022]
Abstract
Computational methods, including crystal structure and property prediction, have the potential to accelerate the materials discovery process by enabling structure prediction and screening of possible molecular building blocks prior to their synthesis. However, the discovery of new functional molecular materials is still limited by the need to identify promising molecules from a vast chemical space. We describe an evolutionary method which explores a user specified region of chemical space to identify promising molecules, which are subsequently evaluated using crystal structure prediction. We demonstrate the methods for the exploration of aza-substituted pentacenes with the aim of finding small molecule organic semiconductors with high charge carrier mobilities, where the space of possible substitution patterns is too large to exhaustively search using a high throughput approach. The method efficiently explores this large space, typically requiring calculations on only ∼1% of molecules during a search. The results reveal two promising structural motifs: aza-substituted naphtho[1,2-a]anthracenes with reorganisation energies as low as pentacene and a series of pyridazine-based molecules having both low reorganisation energies and high electron affinities.
Affiliation(s)
- Chi Y Cheng
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Highfield, Southampton SO17 1NX, UK
| | - Josh E Campbell
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Highfield, Southampton SO17 1NX, UK
| | - Graeme M Day
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Highfield, Southampton SO17 1NX, UK
| |
|
25
|
LeBlanc LM, Johnson ER. Crystal-energy landscapes of active pharmaceutical ingredients using composite approaches. CrystEngComm 2019. [DOI: 10.1039/c9ce00895k] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 01/02/2023]
Abstract
Composite methods employing dispersion-corrected DFT consistently identify experimentally isolated polymorphs as the lowest-energy crystal structures of common APIs.
Affiliation(s)
- Luc M. LeBlanc
- Department of Chemistry, Dalhousie University, Halifax, Canada
| | | |
|