1. Dral PO. AI in computational chemistry through the lens of a decade-long journey. Chem Commun (Camb) 2024; 60:3240-3258. [PMID: 38444290 DOI: 10.1039/d4cc00010b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
This article gives a perspective on the progress of AI tools in computational chemistry through the lens of the author's decade-long contributions put in the wider context of the trends in this rapidly expanding field. This progress over the last decade is tremendous: while a decade ago we had a glimpse of what was to come through many proof-of-concept studies, now we witness the emergence of many AI-based computational chemistry tools that are mature enough to make faster and more accurate simulations increasingly routine. Such simulations in turn allow us to validate and even revise experimental results, deepen our understanding of the physicochemical processes in nature, and design better materials, devices, and drugs. The rapid introduction of powerful AI tools gives rise to unique challenges and opportunities that are discussed in this article too.
Affiliation(s)
- Pavlo O Dral
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
2. Lambor SM, Kasiraju S, Vlachos DG. CKineticsDB: An Extensible and FAIR Data Management Framework and Datahub for Multiscale Modeling in Heterogeneous Catalysis. J Chem Inf Model 2023; 63:4342-4354. [PMID: 37436913 DOI: 10.1021/acs.jcim.3c00123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/14/2023]
Abstract
A great advantage of computational research is its reproducibility and reusability. However, an enormous amount of computational research data in heterogeneous catalysis is barricaded due to logistical limitations. Sufficient provenance and characterization of data and computational environment, with uniform organization and easy accessibility, can allow the development of software tools for integration across the multiscale modeling workflow. Here, we develop the Chemical Kinetics Database, CKineticsDB, a state-of-the-art datahub for multiscale modeling, designed to be compliant with the FAIR guiding principles for scientific data management. CKineticsDB utilizes a MongoDB back-end for extensibility and adaptation to varying data formats, with a referencing-based data model to reduce redundancy in storage. We have developed a Python software program for data processing operations and with built-in features to extract data for common applications. CKineticsDB evaluates the incoming data for quality and uniformity, retains curated information from simulations, enables accurate regeneration of publication results, optimizes storage, and allows the selective retrieval of files based on domain-relevant catalyst and simulation parameters. CKineticsDB provides data from multiple scales of theory (ab initio calculations, thermochemistry, and microkinetic models) to accelerate the development of new reaction pathways, kinetic analysis of reaction mechanisms, and catalysis discovery, along with several data-driven applications.
Affiliation(s)
- Siddhant M Lambor
  - RAPID Manufacturing Institute, Delaware Energy Institute, University of Delaware, Newark, Delaware 19716, United States
- Sashank Kasiraju
  - RAPID Manufacturing Institute, Delaware Energy Institute, University of Delaware, Newark, Delaware 19716, United States
- Dionisios G Vlachos
  - RAPID Manufacturing Institute, Delaware Energy Institute, University of Delaware, Newark, Delaware 19716, United States
  - Department of Chemical and Biomolecular Engineering and Catalysis Center for Energy Innovation (CCEI), University of Delaware, Newark, Delaware 19716, United States
3. Li Y, Zhang R, Yan X, Fan K. Machine learning facilitating the rational design of nanozymes. J Mater Chem B 2023. [PMID: 37325942 DOI: 10.1039/d3tb00842h] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/17/2023]
Abstract
As a component substitute for natural enzymes, nanozymes have the advantages of easy synthesis, convenient modification, low cost, and high stability, and are widely used in many fields. However, their application is seriously restricted by the difficulty of rapidly creating high-performance nanozymes. The use of machine learning techniques to guide the rational design of nanozymes holds great promise to overcome this difficulty. In this review, we introduce the recent progress of machine learning in assisting the design of nanozymes. Particular attention is given to the successful strategies of machine learning in predicting the activity, selectivity, catalytic mechanisms, optimal structures and other features of nanozymes. The typical procedures and approaches for conducting machine learning in the study of nanozymes are also highlighted. Moreover, we discuss in detail the difficulties of machine learning methods in dealing with the redundant and chaotic nanozyme data and provide an outlook on the future application of machine learning in the nanozyme field. We hope that this review will serve as a useful handbook for researchers in related fields and promote the utilization of machine learning in nanozyme rational design and related topics.
Affiliation(s)
- Yucong Li
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100408, China
- Ruofei Zhang
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Xiyun Yan
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100408, China
  - Nanozyme Medical Center, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou 450052, China
- Kelong Fan
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100408, China
  - Nanozyme Medical Center, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou 450052, China
4. Xiouras C, Cameli F, Quilló GL, Kavousanakis ME, Vlachos DG, Stefanidis GD. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem Rev 2022; 122:13006-13042. [PMID: 35759465 DOI: 10.1021/acs.chemrev.2c00141] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
Abstract
Artificial intelligence and specifically machine learning applications are nowadays used in a variety of scientific applications and cutting-edge technologies, where they have a transformative impact. Such an assembly of statistical and linear algebra methods making use of large data sets is becoming more and more integrated into chemistry and crystallization research workflows. This review aims to present, for the first time, a holistic overview of machine learning and cheminformatics applications as a novel, powerful means to accelerate the discovery of new crystal structures, predict key properties of organic crystalline materials, simulate, understand, and control the dynamics of complex crystallization process systems, as well as contribute to high throughput automation of chemical process development involving crystalline materials. We critically review the advances in these new, rapidly emerging research areas, raising awareness in issues such as the bridging of machine learning models with first-principles mechanistic models, data set size, structure, and quality, as well as the selection of appropriate descriptors. At the same time, we propose future research at the interface of applied mathematics, chemistry, and crystallography. Overall, this review aims to increase the adoption of such methods and tools by chemists and scientists across industry and academia.
Affiliation(s)
- Christos Xiouras
  - Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium
- Fabio Cameli
  - Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
- Gustavo Lunardon Quilló
  - Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium
  - Chemical and BioProcess Technology and Control, Department of Chemical Engineering, Faculty of Engineering Technology, KU Leuven, Gebroeders de Smetstraat 1, 9000 Ghent, Belgium
- Mihail E Kavousanakis
  - School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
- Dionisios G Vlachos
  - Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
- Georgios D Stefanidis
  - School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
  - Laboratory for Chemical Technology, Ghent University; Tech Lane Ghent Science Park 125, B-9052 Ghent, Belgium
5. Zheng P, Yang W, Wu W, Isayev O, Dral PO. Toward Chemical Accuracy in Predicting Enthalpies of Formation with General-Purpose Data-Driven Methods. J Phys Chem Lett 2022; 13:3479-3491. [PMID: 35416675 DOI: 10.1021/acs.jpclett.2c00734] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 06/14/2023]
Abstract
Enthalpies of formation and reaction are important thermodynamic properties that have a crucial impact on the outcome of chemical transformations. Here we implement the calculation of enthalpies of formation with a general-purpose ANI-1ccx neural network atomistic potential. We demonstrate on a wide range of benchmark sets that both ANI-1ccx and our other general-purpose data-driven method AIQM1 approach the coveted chemical accuracy of 1 kcal/mol with the speed of semiempirical quantum mechanical methods (AIQM1) or faster (ANI-1ccx). It is remarkably achieved without specifically training the machine learning parts of ANI-1ccx or AIQM1 on formation enthalpies. Importantly, we show that these data-driven methods provide statistical means for uncertainty quantification of their predictions, which we use to detect and eliminate outliers and revise reference experimental data. Uncertainty quantification may also help in the systematic improvement of such data-driven methods.
Affiliation(s)
- Peikun Zheng
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Wudi Yang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Wei Wu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Olexandr Isayev
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Pavlo O Dral
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
6. Pernot P. The long road to calibrated prediction uncertainty in computational chemistry. J Chem Phys 2022; 156:114109. [DOI: 10.1063/5.0084302] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/23/2023]
Abstract
Uncertainty quantification (UQ) in computational chemistry (CC) is still in its infancy. Very few CC methods are designed to provide a confidence level on their predictions, and most users still rely improperly on the mean absolute error as an accuracy metric. The development of reliable UQ methods is essential, notably for CC to be used confidently in industrial processes. A review of the CC-UQ literature shows that there is no common standard procedure to report or validate prediction uncertainty. I consider here analysis tools using concepts (calibration and sharpness) developed in meteorology and machine learning for the validation of probabilistic forecasters. These tools are adapted to CC-UQ and applied to datasets of prediction uncertainties provided by composite methods, Bayesian ensembles methods, and machine learning and a posteriori statistical methods.
Affiliation(s)
- Pascal Pernot
  - Institut de Chimie Physique, UMR8000 CNRS, Université Paris-Saclay, 91405 Orsay, France