1. Kadan A, Ryczko K, Wildman A, Wang R, Roitberg A, Yamazaki T. Accelerated Organic Crystal Structure Prediction with Genetic Algorithms and Machine Learning. J Chem Theory Comput 2023; 19:9388-9402. [PMID: 38059458 DOI: 10.1021/acs.jctc.3c00853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
We present a high-throughput, end-to-end pipeline for organic crystal structure prediction (CSP), the problem of identifying the stable crystal structures that will form from a given molecule based only on its molecular composition. Our tool uses neural network potentials to allow for efficient screening and structural relaxation of generated crystal candidates. Our pipeline consists of two distinct stages: random search, whereby crystal candidates are randomly generated and screened, and optimization, where a genetic algorithm (GA) optimizes this screened population. We assess the performance of each stage of our pipeline on 21 molecules taken from the Cambridge Crystallographic Data Centre's CSP blind tests. We show that random search alone yields matches for ≈50% of targets. We then validate the potential of our full pipeline, making use of the GA to optimize the root-mean-square deviation between crystal candidates and the experimentally derived structure. With this approach, we are able to find matches for ≈80% of candidates with 10-100 times smaller initial population sizes than when using random search. Lastly, we run our full pipeline with an ANI model that is trained on a small data set of molecules extracted from crystal structures in the Cambridge Structural Database, generating ≈60% of targets. By leveraging machine learning models trained to predict energies at the density functional theory level, our pipeline has the potential to approach the accuracy of ab initio methods and the efficiency of empirical force fields.
Affiliation(s)
- Amit Kadan
  - Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Kevin Ryczko
  - Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Andrew Wildman
  - Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Rodrigo Wang
  - Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Adrian Roitberg
  - Department of Chemistry, University of Florida, P.O. Box 117200, Gainesville, Florida 32611-7200, United States
- Takeshi Yamazaki
  - Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
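The two-stage design described in this abstract (random search followed by GA optimization of the screened population) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_energy` is a hypothetical stand-in for a neural network potential, candidates are plain coordinate vectors rather than crystals, and the crossover, mutation, and elitist selection are generic GA choices, not the authors' operators.

```python
import random


def toy_energy(x):
    """Hypothetical stand-in for a learned potential: lower is better."""
    return sum((xi - 0.3) ** 2 for xi in x)


def random_search(n_candidates, n_keep, dim, rng):
    """Stage 1: generate random candidates and keep the n_keep lowest in energy."""
    pool = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_candidates)]
    return sorted(pool, key=toy_energy)[:n_keep]


def genetic_algorithm(population, n_generations, rng, sigma=0.05):
    """Stage 2: evolve the screened population (crossover, mutation, elitism)."""
    pop = sorted(population, key=toy_energy)
    size = len(pop)
    for _ in range(n_generations):
        children = []
        while len(children) < size:
            a, b = rng.sample(pop, 2)  # pick two distinct parents
            # Uniform crossover per coordinate, then Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0.0, sigma) for pair in zip(a, b)]
            children.append(child)
        # Elitist selection: parents and children compete for the next generation.
        pop = sorted(pop + children, key=toy_energy)[:size]
    return pop


rng = random.Random(0)
screened = random_search(n_candidates=200, n_keep=20, dim=3, rng=rng)
final = genetic_algorithm(screened, n_generations=30, rng=rng)
print(toy_energy(screened[0]), toy_energy(final[0]))
```

Because selection is elitist, the best candidate after the GA stage can never be worse than the best candidate that survived screening, which is the property the second stage relies on in this toy setting.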
2. Beran GJO. Frontiers of molecular crystal structure prediction for pharmaceuticals and functional organic materials. Chem Sci 2023; 14:13290-13312. [PMID: 38033897 PMCID: PMC10685338 DOI: 10.1039/d3sc03903j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/02/2023] [Indexed: 12/02/2023] [Open Access]
Abstract
The reliability of organic molecular crystal structure prediction has improved tremendously in recent years. Crystal structure predictions for small, mostly rigid molecules are quickly becoming routine. Structure predictions for larger, highly flexible molecules are more challenging, but their crystal structures can also now be predicted with increasing rates of success. These advances are ushering in a new era where crystal structure prediction drives the experimental discovery of new solid forms. After briefly discussing the computational methods that enable successful crystal structure prediction, this perspective presents case studies from the literature that demonstrate how state-of-the-art crystal structure prediction can transform how scientists approach problems involving the organic solid state. Applications to pharmaceuticals, porous organic materials, photomechanical crystals, organic semiconductors, and nuclear magnetic resonance crystallography are included. Finally, efforts to improve our understanding of which predicted crystal structures can actually be produced experimentally and other outstanding challenges are discussed.
Affiliation(s)
- Gregory J O Beran
  - Department of Chemistry, University of California Riverside, Riverside, CA 92521, USA
3. O’Connor D, Bier I, Tom R, Hiszpanski AM, Steele BA, Marom N. Ab Initio Crystal Structure Prediction of the Energetic Materials LLM-105, RDX, and HMX. Cryst Growth Des 2023; 23:6275-6289. [PMID: 38173900 PMCID: PMC10763925 DOI: 10.1021/acs.cgd.3c00027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2023] [Revised: 08/02/2023] [Indexed: 01/05/2024]
Abstract
Crystal structure prediction (CSP) is performed for the energetic materials (EMs) LLM-105 and α-RDX, as well as the α and β conformational polymorphs of 1,3,5,7-tetranitro-1,3,5,7-tetraazacyclooctane (HMX), using the genetic algorithm (GA) code, GAtor, and its associated random structure generator, Genarris. Genarris and GAtor successfully generate the experimental structures of all targets. GAtor's symmetric crossover scheme, where the space group symmetries of parent structures are treated as genes inherited by offspring, is found to be particularly effective. However, conducting several GA runs with different settings is still important for achieving diverse samplings of the potential energy surface. For LLM-105 and α-RDX, the experimental structure is ranked as the most stable, with all of the dispersion-inclusive density functional theory (DFT) methods used here. For HMX, the α form was persistently ranked as more stable than the β form, in contrast to experimental observations, even when correcting for vibrational contributions and thermal expansion. This may be attributed to insufficient accuracy of dispersion-inclusive DFT methods or to kinetic effects not considered here. In general, the ranking of some putative structures is found to be sensitive to the choice of the DFT functional and the dispersion method. For LLM-105, GAtor generates a putative structure with a layered packing motif, which is desirable thanks to its correlation with low sensitivity. Our results demonstrate that CSP is a useful tool for studying the ubiquitous polymorphism of EMs and shows promise of becoming an integral part of the EM development pipeline.
Affiliation(s)
- Dana O’Connor
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Imanuel Bier
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Rithwik Tom
  - Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Anna M. Hiszpanski
  - Materials Science Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Brad A. Steele
  - Materials Science Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Noa Marom
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
4. Wang J, Gao H, Han Y, Ding C, Pan S, Wang Y, Jia Q, Wang HT, Xing D, Sun J. MAGUS: machine learning and graph theory assisted universal structure searcher. Natl Sci Rev 2023; 10:nwad128. [PMID: 37332628 PMCID: PMC10275355 DOI: 10.1093/nsr/nwad128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Revised: 03/30/2023] [Accepted: 04/28/2023] [Indexed: 06/20/2023] [Open Access]
Abstract
Crystal structure predictions based on first-principles calculations have achieved great success in materials science and solid state physics. However, the remaining challenges still limit their applications in systems with a large number of atoms, especially the complexity of conformational space and the cost of local optimizations for big systems. Here, we introduce a crystal structure prediction method, MAGUS, based on the evolutionary algorithm, which addresses the above challenges with machine learning and graph theory. Techniques used in the program are summarized in detail and benchmark tests are provided. With intensive tests, we demonstrate that on-the-fly machine-learning potentials can be used to significantly reduce the number of expensive first-principles calculations, and the crystal decomposition based on graph theory can efficiently decrease the required configurations in order to find the target structures. We also summarize the representative applications of this method on several research topics, including unexpected compounds in the interior of planets and their exotic states at high pressure and high temperature (superionic, plastic, partially diffusive state, etc.) and new functional materials (superhard, high-energy-density, superconducting, photoelectric materials, etc.). These successful applications demonstrate that the MAGUS code can help to accelerate the discovery of interesting materials and phenomena, as well as the significant value of crystal structure predictions in general.
Affiliation(s)
- Chi Ding
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Shuning Pan
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Yong Wang
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Qiuhan Jia
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Hui-Tian Wang
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Dingyu Xing
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
5. Kilgour M, Rogal J, Tuckerman M. Geometric Deep Learning for Molecular Crystal Structure Prediction. J Chem Theory Comput 2023. [PMID: 37053511 PMCID: PMC10373482 DOI: 10.1021/acs.jctc.3c00031] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 04/15/2023]
Abstract
We develop and test new machine learning strategies for accelerating molecular crystal structure ranking and crystal property prediction using tools from geometric deep learning on molecular graphs. Leveraging developments in graph-based learning and the availability of large molecular crystal data sets, we train models for density prediction and stability ranking which are accurate, fast to evaluate, and applicable to molecules of widely varying size and composition. Our density prediction model, MolXtalNet-D, achieves state-of-the-art performance, with lower than 2% mean absolute error on a large and diverse test data set. Our crystal ranking tool, MolXtalNet-S, correctly discriminates experimental samples from synthetically generated fakes and is further validated through analysis of the submissions to the Cambridge Structural Database Blind Tests 5 and 6. Our new tools are computationally cheap and flexible enough to be deployed within an existing crystal structure prediction pipeline both to reduce the search space and score/filter crystal structure candidates.
Affiliation(s)
- Michael Kilgour
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Jutta Rogal
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - Fachbereich Physik, Freie Universität Berlin, 14195 Berlin, Germany
- Mark Tuckerman
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, United States
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Rd. North, Shanghai 200062, China
  - Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
6. Tom R, Gao S, Yang Y, Zhao K, Bier I, Buchanan EA, Zaykov A, Havlas Z, Michl J, Marom N. Inverse Design of Tetracene Polymorphs with Enhanced Singlet Fission Performance by Property-Based Genetic Algorithm Optimization. Chem Mater 2023; 35:1373-1386. [PMID: 36999121 PMCID: PMC10042130 DOI: 10.1021/acs.chemmater.2c03444] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/17/2022] [Revised: 01/06/2023] [Indexed: 06/19/2023]
Abstract
The efficiency of solar cells may be improved by using singlet fission (SF), in which one singlet exciton splits into two triplet excitons. SF occurs in molecular crystals. A molecule may crystallize in more than one form, a phenomenon known as polymorphism. Crystal structure may affect SF performance. In the common form of tetracene, SF is experimentally known to be slightly endoergic. A second, metastable polymorph of tetracene has been found to exhibit better SF performance. Here, we conduct inverse design of the crystal packing of tetracene using a genetic algorithm (GA) with a fitness function tailored to simultaneously optimize the SF rate and the lattice energy. The property-based GA successfully generates more structures predicted to have higher SF rates and provides insight into packing motifs associated with improved SF performance. We find a putative polymorph predicted to have superior SF performance to the two forms of tetracene, whose structures have been determined experimentally. The putative structure has a lattice energy within 1.5 kJ/mol of the most stable common form of tetracene.
Affiliation(s)
- Rithwik Tom
  - Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Siyu Gao
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Yi Yang
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Kaiji Zhao
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Imanuel Bier
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Eric A. Buchanan
  - Department of Chemistry, University of Colorado, Boulder, Colorado 80309, United States
- Alexandr Zaykov
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 16610 Prague 6, Czech Republic
  - Department of Physical Chemistry, University of Chemistry and Technology, 166 28 Prague 6, Czech Republic
- Zdeněk Havlas
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 16610 Prague 6, Czech Republic
- Josef Michl
  - Department of Chemistry, University of Colorado, Boulder, Colorado 80309, United States
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 16610 Prague 6, Czech Republic
- Noa Marom
  - Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
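The abstract of this entry describes a GA fitness function that simultaneously optimizes the SF rate and the lattice energy. One generic way to combine two such objectives is a weighted sum of normalized scores, sketched below. The normalization, the weight `weight_sf`, and the numbers in the example are all hypothetical and chosen only for illustration; the abstract does not specify the functional form the authors used.

```python
def normalize(values, higher_is_better):
    """Scale values to [0, 1] so that 1 is always the most desirable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)  # no spread: every candidate scores the same
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def combined_fitness(sf_rates, lattice_energies, weight_sf=0.5):
    """Weighted sum of a rate score (higher is better) and an energy score (lower is better)."""
    rate_scores = normalize(sf_rates, higher_is_better=True)
    energy_scores = normalize(lattice_energies, higher_is_better=False)
    return [
        weight_sf * r + (1.0 - weight_sf) * e
        for r, e in zip(rate_scores, energy_scores)
    ]


# Made-up values for three hypothetical candidate structures.
sf_rates = [1.0, 2.0, 3.0]                # arbitrary units
lattice_energies = [-100.0, -101.0, -99.0]  # arbitrary units, lower is more stable
fitness = combined_fitness(sf_rates, lattice_energies)
best = max(range(len(fitness)), key=fitness.__getitem__)
print(fitness, best)
```

With equal weights, the middle candidate wins here because it is the most stable and has an intermediate rate; shifting `weight_sf` toward 1 would favor the fastest but least stable candidate instead.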
7. Petsev ND, Nikoubashman A, Latinwo F, Stillinger FH, Debenedetti PG. Crystal Prediction via Genetic Algorithms in a Model Chiral System. J Phys Chem B 2022; 126:7771-7780. [PMID: 36162405 DOI: 10.1021/acs.jpcb.2c04501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Chiral crystals and their constituent molecules play a prominent role in theories about the origin of biological homochirality and in drug discovery, design, and stability. Although the prediction and identification of stable chiral crystal structures is crucial for numerous technologies, including separation processes and polymorph selection and control, predictive ability is often complicated by a combination of many-body interactions and molecular complexity and handedness. In this work, we address these challenges by applying genetic algorithms to predict the ground-state crystal lattices formed by a chiral tetramer molecular model, which we have previously shown to exhibit complex fluid-phase behavior. Using this approach, we explore the relative stability and structures of the model's conglomerate and racemic crystals, and present a structural phase diagram for the stable Bravais crystal types in the zero-temperature limit.
Affiliation(s)
- Nikolai D Petsev
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Arash Nikoubashman
  - Institute of Physics, Johannes Gutenberg University Mainz, Staudingerweg 7, 55128 Mainz, Germany
- Folarin Latinwo
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
  - Synopsys Inc., Austin, Texas 78746, United States
- Frank H Stillinger
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Pablo G Debenedetti
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
8. Hofmann OT, Zojer E, Hörmann L, Jeindl A, Maurer RJ. First-principles calculations of hybrid inorganic-organic interfaces: from state-of-the-art to best practice. Phys Chem Chem Phys 2021; 23:8132-8180. [PMID: 33875987 PMCID: PMC8237233 DOI: 10.1039/d0cp06605b] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 12/22/2020] [Accepted: 03/05/2021] [Indexed: 12/18/2022]
Abstract
The computational characterization of inorganic-organic hybrid interfaces is arguably one of the technically most challenging applications of density functional theory. Due to the fundamentally different electronic properties of the inorganic and the organic components of a hybrid interface, the proper choice of the electronic structure method, of the algorithms to solve these methods, and of the parameters that enter these algorithms is highly non-trivial. In fact, computational choices that work well for one of the components often perform poorly for the other. As a consequence, default settings for one materials class are typically inadequate for the hybrid system, which makes calculations employing such settings inefficient and sometimes even prone to erroneous results. To address this issue, we discuss how to choose appropriate atomistic representations for the system under investigation, we highlight the role of the exchange-correlation functional and the van der Waals correction employed in the calculation and we provide tips and tricks how to efficiently converge the self-consistent field cycle and to obtain accurate geometries. We particularly focus on potentially unexpected pitfalls and the errors they incur. As a summary, we provide a list of best practice rules for interface simulations that should especially serve as a useful starting point for less experienced users and newcomers to the field.
Affiliation(s)
- Oliver T Hofmann
  - Institute of Solid State Physics, Graz University of Technology, NAWI Graz, Petersgasse 16/II, 8010 Graz, Austria
- Egbert Zojer
  - Institute of Solid State Physics, Graz University of Technology, NAWI Graz, Petersgasse 16/II, 8010 Graz, Austria
- Lukas Hörmann
  - Institute of Solid State Physics, Graz University of Technology, NAWI Graz, Petersgasse 16/II, 8010 Graz, Austria
- Andreas Jeindl
  - Institute of Solid State Physics, Graz University of Technology, NAWI Graz, Petersgasse 16/II, 8010 Graz, Austria
- Reinhard J Maurer
  - Department of Chemistry, University of Warwick, Coventry, CV4 7AL, UK
9. Bowskill DH, Sugden IJ, Konstantinopoulos S, Adjiman CS, Pantelides CC. Crystal Structure Prediction Methods for Organic Molecules: State of the Art. Annu Rev Chem Biomol Eng 2021; 12:593-623. [PMID: 33770462 DOI: 10.1146/annurev-chembioeng-060718-030256] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 11/09/2022]
Abstract
The prediction of the crystal structures that a given organic molecule is likely to form is an important theoretical problem of significant interest for the pharmaceutical and agrochemical industries, among others. As evidenced by a series of six blind tests organized over the past 2 decades, methodologies for crystal structure prediction (CSP) have witnessed substantial progress and have now reached a stage of development where they can begin to be applied to systems of practical significance. This article reviews the state of the art in general-purpose methodologies for CSP, placing them within a common framework that highlights both their similarities and their differences. The review discusses specific areas that constitute the main focus of current research efforts toward improving the reliability and widening applicability of these methodologies, and offers some perspectives for the evolution of this technology over the next decade.
Affiliation(s)
- David H Bowskill
  - Molecular Systems Engineering Group, Centre for Process Systems Engineering, Department of Chemical Engineering, and Institute for Molecular Science and Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
- Isaac J Sugden
  - Molecular Systems Engineering Group, Centre for Process Systems Engineering, Department of Chemical Engineering, and Institute for Molecular Science and Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
- Stefanos Konstantinopoulos
  - Molecular Systems Engineering Group, Centre for Process Systems Engineering, Department of Chemical Engineering, and Institute for Molecular Science and Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
- Claire S Adjiman
  - Molecular Systems Engineering Group, Centre for Process Systems Engineering, Department of Chemical Engineering, and Institute for Molecular Science and Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
- Constantinos C Pantelides
  - Molecular Systems Engineering Group, Centre for Process Systems Engineering, Department of Chemical Engineering, and Institute for Molecular Science and Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
10. Wengert S, Csányi G, Reuter K, Margraf JT. Data-efficient machine learning for molecular crystal structure prediction. Chem Sci 2021; 12:4536-4546. [PMID: 34163719 PMCID: PMC8179468 DOI: 10.1039/d0sc05765g] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 10/19/2020] [Accepted: 02/05/2021] [Indexed: 12/16/2022] [Open Access]
Abstract
The combination of modern machine learning (ML) approaches with high-quality data from quantum mechanical (QM) calculations can yield models with an unrivalled accuracy/cost ratio. However, such methods are ultimately limited by the computational effort required to produce the reference data. In particular, reference calculations for periodic systems with many atoms can become prohibitively expensive for higher levels of theory. This trade-off is critical in the context of organic crystal structure prediction (CSP). Here, a data-efficient ML approach would be highly desirable, since screening a huge space of possible polymorphs in a narrow energy range requires the assessment of a large number of trial structures with high accuracy. In this contribution, we present tailored Δ-ML models that allow screening a wide range of crystal candidates while adequately describing the subtle interplay between intermolecular interactions such as H-bonding and many-body dispersion effects. This is achieved by enhancing a physics-based description of long-range interactions at the density functional tight binding (DFTB) level (for which an efficient implementation is available) with a short-range ML model trained on high-quality first-principles reference data. The presented workflow is broadly applicable to different molecular materials, without the need for a single periodic calculation at the reference level of theory. We show that this even allows the use of wavefunction methods in CSP.
Affiliation(s)
- Simon Wengert
  - Chair of Theoretical Chemistry, Technische Universität München, 85747 Garching, Germany
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, UK
- Karsten Reuter
  - Chair of Theoretical Chemistry, Technische Universität München, 85747 Garching, Germany
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
- Johannes T Margraf
  - Chair of Theoretical Chemistry, Technische Universität München, 85747 Garching, Germany
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
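The Δ-ML idea in this entry's abstract (a cheap baseline model plus a learned correction toward a more accurate reference) can be shown with a deliberately small sketch. The functions `baseline_energy` and `reference_energy`, the single descriptor `d`, and the linear correction are all toy assumptions; the abstract describes a DFTB baseline with a short-range ML model, which is far richer than this.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def baseline_energy(d):
    """Hypothetical cheap model (stand-in for a low-cost baseline method)."""
    return (d - 1.0) ** 2


def reference_energy(d):
    """Hypothetical expensive model (stand-in for high-level reference data)."""
    return (d - 1.0) ** 2 + 0.2 * d - 0.1


# Train only on the difference between reference and baseline.
train = [0.8, 1.0, 1.2, 1.4]
residuals = [reference_energy(d) - baseline_energy(d) for d in train]
slope, intercept = fit_line(train, residuals)


def delta_ml_energy(d):
    """Baseline prediction plus the learned correction."""
    return baseline_energy(d) + slope * d + intercept


print(delta_ml_energy(1.1), reference_energy(1.1))
```

The correction model only has to capture the (here linear) residual, which is simpler than the full energy; that is the reason Δ-learning can get by with little reference data.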
11. Jesus WS, Prudente FV, Marques JMC, Pereira FB. Modeling microsolvation clusters with electronic-structure calculations guided by analytical potentials and predictive machine learning techniques. Phys Chem Chem Phys 2021; 23:1738-1749. [PMID: 33427847 DOI: 10.1039/d0cp05200k] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/16/2023]
Abstract
We propose a new methodology to study, at the density functional theory (DFT) level, the clusters resulting from the microsolvation of alkali-metal ions with rare-gas atoms. The workflow begins with a global optimization search to generate a pool of low-energy minimum structures for different cluster sizes. This is achieved by employing an analytical potential energy surface (PES) and an evolutionary algorithm (EA). The next main stage of the methodology is devoted to establishing an adequate DFT approach to treat the microsolvation system, through a systematic benchmark study involving several combinations of functionals and basis sets, in order to characterize the global minimum structures of the smaller clusters. In the next stage, we apply machine learning (ML) classification algorithms to predict how the low-energy minima of the analytical PES map to the DFT ones. An early and accurate detection of likely DFT local minima is extremely important to guide the choice of the most promising low-energy minima of large clusters to be re-optimized at the DFT level of theory. In this work, the methodology was applied to the Li⁺Krₙ (n = 2-14 and 16) microsolvation clusters for which the most competitive DFT approach was found to be the B3LYP-D3/aug-pcseg-1. Additionally, the ML classifier was able to accurately predict most of the solutions to be re-optimized at the DFT level of theory, thereby greatly enhancing the efficiency of the process and allowing its applicability to larger clusters.
Affiliation(s)
- W S Jesus
  - Instituto de Física, Universidade Federal da Bahia, 40170-115 Salvador, BA, Brazil
- F V Prudente
  - Instituto de Física, Universidade Federal da Bahia, 40170-115 Salvador, BA, Brazil
- J M C Marques
  - CQC, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- F B Pereira
  - Coimbra Polytechnic - ISEC, Coimbra, Portugal
  - Centro de Informática e Sistemas da Universidade de Coimbra (CISUC), Coimbra, Portugal
12. Bier I, O'Connor D, Hsieh YT, Wen W, Hiszpanski AM, Han TYJ, Marom N. Crystal structure prediction of energetic materials and a twisted arene with Genarris and GAtor. CrystEngComm 2021. [DOI: 10.1039/d1ce00745a] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022]
Abstract
A molecular crystal structure prediction workflow, based on the random structure generator, Genarris, and the genetic algorithm (GA), GAtor, is successfully applied to two energetic materials and a chiral arene.
Affiliation(s)
- Imanuel Bier
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Dana O'Connor
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Yun-Ting Hsieh
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Wen Wen
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Anna M. Hiszpanski
  - Materials Science Division, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- T. Yong-Jin Han
  - Materials Science Division, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Noa Marom
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
  - Department of Physics, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
13. Bier I, Marom N. Machine Learned Model for Solid Form Volume Estimation Based on Packing-Accessible Surface and Molecular Topological Fragments. J Phys Chem A 2020; 124:10330-10345. [DOI: 10.1021/acs.jpca.0c06791] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022]
Affiliation(s)
- Imanuel Bier
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Noa Marom
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
|
14
|
Roberts J, Song Y, Crocker M, Risko C. A Genetic Algorithmic Approach to Determine the Structure of Li-Al Layered Double Hydroxides. J Chem Inf Model 2020; 60:4845-4855. [PMID: 32794767 DOI: 10.1021/acs.jcim.0c00493] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Layered double hydroxides (LDH) demonstrate significant potential across a range of applications, including as catalysts, delivery vehicles for pharmaceuticals, environmental remediation, and supercapacitors. Explaining the mechanism of LDH action at the atomic scale in these and other applications is challenging, however, due to the difficulty in precisely defining the bulk and surface structure and chemical compositions. Here, we focus on the determination of the structure of lithium-aluminum (Li-Al) LDH, which has shown promise in the catalytic depolymerization of lignin, both directly as the catalyst and as a support for gold nanoparticles. While the relative positions of the Li and Al metals are generally well resolved by X-ray crystallography, it is the structures of the anionic layers, consisting of water and carbonate, that are less well established. Combinatorial analyses of all possible positions and rotations of the water and carbonate in the three-layered Li-Al LDH polytope reveal that the phase space is much too large to examine in any reasonable time frame in a one-by-one structure exploration. To overcome this limitation, we develop and deploy a genetic algorithm (GA) wherein fitness is determined by matching a calculated X-ray diffraction (XRD) pattern for a given structure to the known experimental XRD pattern. The GA approach results in structures of high fitness that portend the bulk Li-Al LDH structure. Importantly, the GA approach offers the potential to determine the structures of other LDH, and more generally layered materials, which are generally difficult to describe given the large chemical and structural space to be explored.
Affiliation(s)
- Josiah Roberts
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Yang Song
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Mark Crocker
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Chad Risko
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
|
15
|
Zhang P, Shen L, Yang W. Solvation Free Energy Calculations with Quantum Mechanics/Molecular Mechanics and Machine Learning Models. J Phys Chem B 2019; 123:901-908. [PMID: 30557020 PMCID: PMC6448400 DOI: 10.1021/acs.jpcb.8b11905] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/28/2022]
Abstract
For exploration of chemical and biological systems, the combined quantum mechanics and molecular mechanics (QM/MM) and machine learning (ML) models have been developed recently to achieve high accuracy and efficiency for molecular dynamics (MD) simulations. Despite its success on reaction free energy calculations, how to identify new configurations on insufficiently sampled regions during MD and how to update the current ML models with the growing database on the fly are both very important but still challenging. In this article, we apply the QM/MM ML method to solvation free energy calculations and address these two challenges. We employ three approaches to detect new data points and introduce the gradient boosting algorithm to reoptimize efficiently the ML model during ML-based MD sampling. The solvation free energy calculations on several typical organic molecules demonstrate that our developed method provides a systematic, robust, and efficient way to explore new chemistry using ML-based QM/MM MD simulations.
Affiliation(s)
- Pan Zhang
- Department of Chemistry, Duke University, Durham, North Carolina 27708, United States
| | - Lin Shen
- Department of Chemistry, Duke University, Durham, North Carolina 27708, United States
| | - Weitao Yang
- Department of Chemistry and Department of Physics, Duke University, Durham, NC 27708, United States
- Key Laboratory of Theoretical Chemistry of Environment, Ministry of Education, School of Chemistry and Environment, South China Normal University, Guangzhou 510006, P.R. China
|
16
|
Rupp M, von Lilienfeld OA, Burke K. Guest Editorial: Special Topic on Data-Enabled Theoretical Chemistry. J Chem Phys 2018; 148:241401. [DOI: 10.1063/1.5043213] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Indexed: 01/17/2023] Open
Affiliation(s)
- Matthias Rupp
- Fritz Haber Institute of the Max Planck Society, Faradayweg 4-6, 14195 Berlin, Germany
| | - O. Anatole von Lilienfeld
- Department of Chemistry, Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials, University of Basel, 4056 Basel, Switzerland
| | - Kieron Burke
- Departments of Chemistry and Physics, University of California, Irvine, California 92697, USA
|
17
|
Curtis F, Li X, Rose T, Vázquez-Mayagoitia Á, Bhattacharya S, Ghiringhelli LM, Marom N. GAtor: A First-Principles Genetic Algorithm for Molecular Crystal Structure Prediction. J Chem Theory Comput 2018; 14:2246-2264. [PMID: 29481740 DOI: 10.1021/acs.jctc.7b01152] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Indexed: 11/29/2022]
Abstract
We present the implementation of GAtor, a massively parallel, first-principles genetic algorithm (GA) for molecular crystal structure prediction. GAtor is written in Python and currently interfaces with the FHI-aims code to perform local optimizations and energy evaluations using dispersion-inclusive density functional theory (DFT). GAtor offers a variety of fitness evaluation, selection, crossover, and mutation schemes. Breeding operators designed specifically for molecular crystals provide a balance between exploration and exploitation. Evolutionary niching is implemented in GAtor by using machine learning to cluster the dynamically updated population by structural similarity and then employing a cluster-based fitness function. Evolutionary niching promotes uniform sampling of the potential energy surface by evolving several subpopulations, which helps overcome initial pool biases and selection biases (genetic drift). The various settings offered by GAtor increase the likelihood of locating numerous low-energy minima, including those located in disconnected, hard to reach regions of the potential energy landscape. The best structures generated are re-relaxed and re-ranked using a hierarchy of increasingly accurate DFT functionals and dispersion methods. GAtor is applied to a chemically diverse set of four past blind test targets, characterized by different types of intermolecular interactions. The experimentally observed structures and other low-energy structures are found for all four targets. In particular, for Target II, 5-cyano-3-hydroxythiophene, the top ranked putative crystal structure is a Z' = 2 structure with P1̅ symmetry and a scaffold packing motif, which has not been reported previously.
Affiliation(s)
- Farren Curtis
- Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Xiayue Li
- Google, Mountain View, California 94030, United States
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Timothy Rose
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Álvaro Vázquez-Mayagoitia
- Argonne Leadership Computing Facility, Argonne National Laboratory, Lemont, Illinois 60439, United States
| | - Saswata Bhattacharya
- Department of Physics, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
| | - Luca M Ghiringhelli
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
| | - Noa Marom
- Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
|
18
|
Curtis F, Rose T, Marom N. Evolutionary niching in the GAtor genetic algorithm for molecular crystal structure prediction. Faraday Discuss 2018; 211:61-77. [DOI: 10.1039/c8fd00067k] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/31/2023]
Abstract
The effects of evolutionary niching are investigated for the crystal structure prediction of 1,3-dibromo-2-chloro-5-fluorobenzene.
Affiliation(s)
- Farren Curtis
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, USA
- Department of Physics
| | - Timothy Rose
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, USA
| | - Noa Marom
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, USA
- Department of Physics
|