1
Kirschbaum T, von Seggern B, Dzubiella J, Bande A, Noé F. Machine Learning Frontier Orbital Energies of Nanodiamonds. J Chem Theory Comput 2023; 19:4461-4473. [PMID: 37053438 DOI: 10.1021/acs.jctc.2c01275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/15/2023]
Abstract
Nanodiamonds have a wide range of applications including catalysis, sensing, tribology, and biomedicine. To leverage nanodiamond design via machine learning, we introduce the new data set ND5k, consisting of 5089 diamondoid and nanodiamond structures and their frontier orbital energies. ND5k structures are optimized via tight-binding density functional theory (DFTB) and their frontier orbital energies are computed using density functional theory (DFT) with the PBE0 hybrid functional. From this data set we derive a qualitative design suggestion for nanodiamonds in photocatalysis. We also compare recent machine learning models for predicting frontier orbital energies of structures similar to those they were trained on (interpolation on ND5k), and we test their ability to extrapolate predictions to larger structures. For both the interpolation and the extrapolation task, we find the best performance using the equivariant message passing neural network PaiNN. The second-best results are achieved with a message passing neural network using a tailored set of atomic descriptors proposed here.
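The distinction between the interpolation and extrapolation tasks can be illustrated with a minimal sketch. It assumes each structure is a record with a hypothetical `n_atoms` field and that "larger structures" means those above a chosen atom-count cutoff; the cutoff, the field names, and the toy records are illustrative assumptions and are not taken from ND5k.

```python
import random

def split_by_size(records, max_train_atoms):
    """Separate structures into those within the training size range and larger ones."""
    in_range = [r for r in records if r["n_atoms"] <= max_train_atoms]
    larger = [r for r in records if r["n_atoms"] > max_train_atoms]
    return in_range, larger

def random_split(records, test_fraction, seed=0):
    """Shuffle a copy of the records and hold out a fraction as the interpolation test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy records (not real ND5k entries).
records = [{"name": f"s{i}", "n_atoms": n}
           for i, n in enumerate([26, 40, 58, 90, 150, 210])]

in_range, extrap_test = split_by_size(records, max_train_atoms=100)
train, interp_test = random_split(in_range, test_fraction=0.25)
```

Here `interp_test` contains structures from the same size range as `train`, while `extrap_test` contains only structures larger than anything seen during training.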
Affiliation(s)
- Thorren Kirschbaum
  - Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Börries von Seggern
  - Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
  - Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Arnimallee 22, 14195 Berlin, Germany
- Joachim Dzubiella
  - Institute of Physics, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 3, 79104 Freiburg im Breisgau, Germany
- Annika Bande
  - Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
  - Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178 Berlin, Germany
  - Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
2
Exploring Deep Learning for Metalloporphyrins: Databases, Molecular Representations, and Model Architectures. Catalysts 2022. [DOI: 10.3390/catal12111485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open
Abstract
Metalloporphyrins have been studied as biomimetic catalysts for more than 120 years, and the large amount of data accumulated over that time provides a solid foundation for deep learning to discover chemical trends and structure–function relationships. In this study, key components of deep learning of metalloporphyrins, including databases, molecular representations, and model architectures, were systematically investigated. A protocol to construct canonical SMILES for metalloporphyrins was proposed, which was then used to represent the two-dimensional structures of over 10,000 metalloporphyrins in an existing computational database. Subsequently, several state-of-the-art chemical deep learning models, including graph neural network-based models and natural language processing-based models, were employed to predict the energy gaps of metalloporphyrins. Two models showed satisfactory predictive performance (R² 0.94) with canonical SMILES as the only source of structural information. In addition, an unsupervised visualization algorithm was used to interpret the molecular features learned by the deep learning models.
3
Balraadjsing S, Peijnenburg WJGM, Vijver MG. Exploring the potential of in silico machine learning tools for the prediction of acute Daphnia magna nanotoxicity. Chemosphere 2022; 307:135930. [PMID: 35961453 DOI: 10.1016/j.chemosphere.2022.135930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2022] [Revised: 07/19/2022] [Accepted: 07/31/2022] [Indexed: 06/15/2023]
Abstract
Engineered nanomaterials (ENMs) are ubiquitous nowadays, finding application in different fields of technology and various consumer products. Virtually any chemical can be manipulated at the nanoscale to display unique characteristics, which makes such materials more appealing than their larger-sized counterparts. As the production and development of ENMs have increased considerably over time, so too have concerns regarding their adverse effects and environmental impacts. It is infeasible to assess the risks associated with every single ENM through in vivo or in vitro experiments. As an alternative, in silico methods can be employed to evaluate ENMs. To perform such an evaluation, we collected data from databases and literature to create classification models based on machine learning algorithms in accordance with the principles laid out by the OECD for the creation of QSARs. The aim was to investigate the performance of various machine learning algorithms in predicting a well-defined in vivo toxicity endpoint (Daphnia magna immobilization) and also to identify which features are important drivers of D. magna in vivo nanotoxicity. Results indicated highly comparable model performance between all algorithms and predictive performance exceeding ∼0.7 for all evaluated metrics (e.g. accuracy, sensitivity, specificity, balanced accuracy, Matthews correlation coefficient, area under the receiver operating characteristic curve). The random forest, artificial neural network, and k-nearest neighbor models displayed the best performance, but this was only marginally better than that of the other models. Furthermore, the variable importance analysis indicated that molecular descriptors and physicochemical properties were generally important within most models, while features related to the exposure conditions produced slightly conflicting results. Lastly, results also indicate that reliable and robust machine learning models can be generated for in vivo endpoints with smaller datasets.
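Most of the metrics listed in this abstract follow directly from the four counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the counts are made-up illustrative numbers, not results from the study, and the area under the ROC curve is omitted because it needs ranked scores rather than counts.

```python
import math

def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = _safe_div(tp, tp + fn)          # true positive rate
    specificity = _safe_div(tn, tn + fp)          # true negative rate
    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    balanced_accuracy = (sensitivity + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }

# Hypothetical counts for a toxic / non-toxic classifier.
metrics = binary_metrics(tp=40, fp=10, tn=35, fn=15)
```

Balanced accuracy and the Matthews correlation coefficient are less sensitive to class imbalance than plain accuracy, which is one reason such metrics tend to be reported together.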
Affiliation(s)
- Surendra Balraadjsing
  - Institute of Environmental Sciences (CML), Leiden University, PO Box 9518, 2300 RA, Leiden, the Netherlands
- Willie J G M Peijnenburg
  - Institute of Environmental Sciences (CML), Leiden University, PO Box 9518, 2300 RA, Leiden, the Netherlands
  - Centre for Safety of Substances and Products, National Institute of Public Health and the Environment (RIVM), PO Box 1, 3720 BA, Bilthoven, the Netherlands
- Martina G Vijver
  - Institute of Environmental Sciences (CML), Leiden University, PO Box 9518, 2300 RA, Leiden, the Netherlands
4
Storm FE, Folkmann LM, Hansen T, Mikkelsen KV. Machine learning the frontier orbital energies of SubPc based triads. J Mol Model 2022; 28:313. [PMID: 36098806 DOI: 10.1007/s00894-022-05262-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/07/2022] [Accepted: 08/05/2022] [Indexed: 11/24/2022]
Abstract
Organic photovoltaic devices are promising candidates for efficient energy harvesting from sunlight. Designing new dye molecules suitable for such devices is a challenging task restricted by the rapid increase of computational cost with system size. Solar cell material properties are closely related to the electronic structure of the dye, and an effective molecular orbital energy screening method for a family of dyes is therefore desired. In this work, a machine learning approach is used to sort through the chemical space of peripheral double-substituted boron-Subphthalocyanine dyes. A database of 12,102 PM6 optimized structures was built, and for each of the structures time-dependent density functional theory (LC-ωHPBE/6-31+G(d)) calculations were performed. We investigated the changes in the energies of the molecular orbitals related to reduction and oxidation of the compounds. With the Electrotopological-state index molecular representation, all the tested algorithms (Support Vector Machine, Random Forest Regression, Neural Network, and Simple Linear Regression) captured the calculated frontier orbital energies with a prediction root-mean-square error on the order of 0.05 eV. Finally, frontier orbital energies were predicted for more than 40,000 new structures by the trained Support Vector Machine algorithm. Compared to the parent boron-Subphthalocyanine structure, 237 and 132 functionalized dyes were predicted to have upshifted molecular orbital energies using the Electrotopological-state index and OneHot encoding feature vector, respectively. Out of 27 investigated donor and acceptor ligands, the acetamide and hydroxyl ligands gave rise to the desired increase in frontier molecular orbital energy.
Affiliation(s)
- Freja E Storm
  - Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100, Copenhagen, Denmark
- Linnea M Folkmann
  - Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100, Copenhagen, Denmark
- Thorsten Hansen
  - Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100, Copenhagen, Denmark
- Kurt V Mikkelsen
  - Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100, Copenhagen, Denmark
5
Wang Z, Sun Z, Yin H, Liu X, Wang J, Zhao H, Pang CH, Wu T, Li S, Yin Z, Yu XF. Data-Driven Materials Innovation and Applications. Advanced Materials (Deerfield Beach, Fla.) 2022; 34:e2104113. [PMID: 35451528 DOI: 10.1002/adma.202104113] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 05/30/2021] [Revised: 03/19/2022] [Indexed: 05/07/2023]
Abstract
Owing to rapid developments that have improved the accuracy and efficiency of both experimental and computational investigative methodologies, the massive amounts of data generated have led the field of materials science into the fourth paradigm of data-driven scientific research. This transition requires the development of authoritative and up-to-date frameworks for data-driven approaches for material innovation. A critical discussion on the current advances in the data-driven discovery of materials with a focus on frameworks, machine-learning algorithms, material-specific databases, descriptors, and targeted applications in the field of inorganic materials is presented. Frameworks for rationalizing data-driven material innovation are described, and a critical review of essential subdisciplines is presented, including: i) advanced data-intensive strategies and machine-learning algorithms; ii) material databases and related tools and platforms for data generation and management; iii) molecular descriptors commonly used in data-driven processes. Furthermore, an in-depth discussion on the broad applications of material innovation, such as energy conversion and storage, environmental decontamination, flexible electronics, optoelectronics, superconductors, metallic glasses, and magnetic materials, is provided. Finally, how these subdisciplines (with insights into the synergy of materials science, computational tools, and mathematics) support data-driven paradigms is outlined, and the opportunities and challenges in data-driven material innovation are highlighted.
Affiliation(s)
- Zhuo Wang
  - Materials Interfaces Center, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, P. R. China
  - Department of Chemical and Environmental Engineering, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
- Zhehao Sun
  - Research School of Chemistry, The Australian National University, ACT, 2601, Australia
- Hang Yin
  - Research School of Chemistry, The Australian National University, ACT, 2601, Australia
- Xinghui Liu
  - Department of Chemistry, Sungkyunkwan University (SKKU), 2066 Seoburo, Jangan-Gu, Suwon, 16419, Republic of Korea
- Jinlan Wang
  - School of Physics, Southeast University, Nanjing, 211189, P. R. China
- Haitao Zhao
  - Materials Interfaces Center, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, P. R. China
- Cheng Heng Pang
  - Department of Chemical and Environmental Engineering, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
  - Municipal Key Laboratory of Clean Energy Conversion Technologies, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
- Tao Wu
  - Key Laboratory for Carbonaceous Wastes Processing and Process Intensification Research of Zhejiang Province, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
  - New Materials Institute, University of Nottingham, Ningbo, China, Ningbo, 315100, P. R. China
- Shuzhou Li
  - School of Materials Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore
- Zongyou Yin
  - Research School of Chemistry, The Australian National University, ACT, 2601, Australia
- Xue-Feng Yu
  - Materials Interfaces Center, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, P. R. China
6
7
Nandy A, Duan C, Goffinet C, Kulik HJ. New Strategies for Direct Methane-to-Methanol Conversion from Active Learning Exploration of 16 Million Catalysts. JACS Au 2022; 2:1200-1213. [PMID: 35647589 PMCID: PMC9135396 DOI: 10.1021/jacsau.2c00176] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/16/2022] [Revised: 04/12/2022] [Accepted: 04/15/2022] [Indexed: 05/03/2023]
Abstract
Despite decades of effort, no earth-abundant homogeneous catalysts have been discovered that can selectively oxidize methane to methanol. We exploit active learning to simultaneously optimize methane activation and methanol release calculated with machine learning-accelerated density functional theory in a space of 16 M candidate catalysts including novel macrocycles. By constructing macrocycles from fragments inspired by synthesized compounds, we ensure synthetic realism in our computational search. Our large-scale search reveals that low-spin Fe(II) compounds paired with strong-field (e.g., P or S-coordinating) ligands have among the best energetic tradeoffs between hydrogen atom transfer (HAT) and methanol release. This observation contrasts with prior efforts that have focused on high-spin Fe(II) with weak-field ligands. By decoupling equatorial and axial ligand effects, we determine that negatively charged axial ligands are critical for more rapid release of methanol and that higher-valency metals [i.e., M(III) vs M(II)] are likely to be rate-limited by slow methanol release. With full characterization of barrier heights, we confirm that optimizing for HAT does not lead to large oxo formation barriers. Energetic span analysis reveals designs for an intermediate-spin Mn(II) catalyst and a low-spin Fe(II) catalyst that are predicted to have good turnover frequencies. Our active learning approach to optimize two distinct reaction energies with efficient global optimization is expected to be beneficial for the search of large catalyst spaces where no prior designs have been identified and where linear scaling relationships between reaction energies or barriers may be limited or unknown.
Affiliation(s)
- Aditya Nandy
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Chenru Duan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Conrad Goffinet
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Heather J. Kulik
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
8
Shi H, Jing W, Liu W, Li Y, Li Z, Qiao B, Zhao S, Xu Z, Song D. Key Factors Governing the External Quantum Efficiency of Thermally Activated Delayed Fluorescence Organic Light-Emitting Devices: Evidence from Machine Learning. ACS Omega 2022; 7:7893-7900. [PMID: 35284748 PMCID: PMC8908496 DOI: 10.1021/acsomega.1c06820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 02/14/2022] [Indexed: 06/14/2023]
Abstract
Thermally activated delayed fluorescence (TADF) materials enable organic light-emitting devices (OLEDs) to exhibit high external quantum efficiency (EQE), as they can fully utilize singlets and triplets. Despite the high theoretical limit in EQE of TADF OLEDs, the EQE values reported in the literature vary widely. Hence, it is critical to quantify the effects of the contributing factors on device EQE using data-driven approaches. Herein, we use machine learning (ML) algorithms to map the relationship between the material/device structural factors and the EQE. We established the dataset from a variety of experimental reports. Four algorithms are employed, among which the neural network performs best in predicting the EQE. The root-mean-square errors are 1.96% and 3.39% for the training and test sets, respectively. Based on the correlation and the feature importance studies, key factors governing the device EQE are screened out. These results provide essential guidance for material screening and experimental device optimization of TADF OLEDs.
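The root-mean-square error quoted for the training and test sets can be computed as in the following minimal sketch. The EQE values are made-up illustrative numbers in percent, not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical measured and predicted EQE values, in percent.
measured = [10.0, 15.0, 20.0, 25.0]
predicted = [12.0, 14.0, 19.0, 27.0]
error = rmse(measured, predicted)  # same unit as the inputs (percent EQE)
```

Because the RMSE carries the unit of the target, a gap between the training and test values (such as the one reported above) is commonly read as a rough measure of how much worse a model does on unseen devices.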
Affiliation(s)
- Haochen Shi
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
- Wenzhu Jing
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
- Wu Liu
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
- Yaoyao Li
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
- Zhaojun Li
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
- Bo Qiao
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
- Suling Zhao
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
- Zheng Xu
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
- Dandan Song
  - Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China
  - Institute of Optoelectronics Technology, Beijing Jiaotong University, Beijing 100044, China
9
Miyake Y, Saeki A. Machine Learning-Assisted Development of Organic Solar Cell Materials: Issues, Analyses, and Outlooks. J Phys Chem Lett 2021; 12:12391-12401. [PMID: 34939806 DOI: 10.1021/acs.jpclett.1c03526] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Nonfullerene, a small molecular electron acceptor, has substantially improved the power conversion efficiency of organic photovoltaics (OPVs). However, the large structural freedom of π-conjugated polymers and molecules makes it difficult to explore with limited resources. Machine learning, which is based on rapidly growing artificial intelligence technology, is a high-throughput method to accelerate the speed of material design and process optimization; however, it suffers from limitations in terms of prediction accuracy, interpretability, data collection, and available data (particularly, experimental data). This recognition motivates the present Perspective, which focuses on utilizing the experimental data set for ML to efficiently aid OPV research. This Perspective discusses the trends in ML-OPV publications, the NFA category, and the effects of data size and explanatory variables (fingerprints or Mordred descriptors) on the prediction accuracy and explainability, which broadens the scope of ML and would be useful for the development of next-generation solar cell materials.
Affiliation(s)
- Yuta Miyake
  - Department of Applied Chemistry, Graduate School of Engineering, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
- Akinori Saeki
  - Department of Applied Chemistry, Graduate School of Engineering, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
  - Innovative Catalysis Science Division, Institute for Open and Transdisciplinary Research Initiatives (ICS-OTRI), Osaka University, 1-1 Yamadaoka, Suita, Osaka 565-0871, Japan
10
Ovchenkova EN, Bichan NG, Gostev FE, Shelaev IV, Nadtochenko VA, Lomova TN. The donor-acceptor dyad based on high substituted fullero[70]pyrrolidine-coordinated manganese (III) phthalocyanine for photoinduced electron transfer. Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy 2021; 263:120166. [PMID: 34274635 DOI: 10.1016/j.saa.2021.120166] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2021] [Revised: 07/05/2021] [Accepted: 07/06/2021] [Indexed: 06/13/2023]
Abstract
Donor-acceptor dyads based on manganese porphyrins/phthalocyanines and fullerene derivatives with N-basicity centers have proved to be promising photoinduced electron-transfer systems for photovoltaic devices, biologically active compounds, and molecular magnetic materials. The macroheterocyclic chromophore characterized by rich UV-visible-near IR absorption is the basis for the applications above. The problem of the synthesis and the characterization of new effective dyads was solved in this work using the example of the self-organizing system consisting of (octakis-3,5-di-tert-butylphenoxy)phthalocyaninato)manganese(III) acetate, (AcO)MnPc(3,5-di-tBuPhO)8, 2',5-di(pyridin-2'-yl)-3,4-fullero[70]pyrrolidine, Py2C70, and toluene. The phthalocyanine-fullerene dyads in the molecular and cationic form (respectively (AcO)(Py2C70)MnPc(3,5-di-tBuPhO)8 and [(Py2C70)MnPc(3,5-di-tBuPhO)8]⁺(AcO)⁻) were observed and described using chemical kinetics/thermodynamics, UV-vis, IR, and ¹H NMR spectroscopy, and mass spectrometry methods. The 1:1 stoichiometry of both dyads was confirmed; the equilibrium and rate constant values, K = (4.86 ± 0.56) × 10⁴ L mol⁻¹ and k = (4.455 ± 3.37) × 10⁻⁵ s⁻¹, were observed for the formation of the molecular and cationic dyad, respectively. The study of the femtosecond transient absorption spectra of (AcO)MnPc(3,5-di-tBuPhO)8 and [(Py2C70)MnPc(3,5-di-tBuPhO)8]⁺AcO⁻ points to photoinduced electron transfer in the dyad, for which the lifetimes and the rate constants of charge separation (τCS, kCS) and charge recombination (τCR, kCR) were determined. The relationship between the dyad physicochemical parameters and the molecular structure is analyzed using previously published data.
Affiliation(s)
- E N Ovchenkova
  - G. A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, 1 Akademicheskaya Str., 153045 Ivanovo, Russian Federation
- N G Bichan
  - G. A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, 1 Akademicheskaya Str., 153045 Ivanovo, Russian Federation
- F E Gostev
  - N.N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Kosygina st., 4, Moscow, Russia
- I V Shelaev
  - N.N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Kosygina st., 4, Moscow, Russia
- V A Nadtochenko
  - N.N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Kosygina st., 4, Moscow, Russia
- T N Lomova
  - G. A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, 1 Akademicheskaya Str., 153045 Ivanovo, Russian Federation
11
Duan C, Liu F, Nandy A, Kulik HJ. Putting Density Functional Theory to the Test in Machine-Learning-Accelerated Materials Discovery. J Phys Chem Lett 2021; 12:4628-4637. [PMID: 33973793 DOI: 10.1021/acs.jpclett.1c00631] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 05/20/2023]
Abstract
Accelerated discovery with machine learning (ML) has begun to provide the advances in efficiency needed to overcome the combinatorial challenge of computational materials design. Nevertheless, ML-accelerated discovery both inherits the biases of training data derived from density functional theory (DFT) and leads to many attempted calculations that are doomed to fail. Many compelling functional materials and catalytic processes involve strained chemical bonds, open-shell radicals and diradicals, or metal-organic bonds to open-shell transition-metal centers. Although promising targets, these materials present unique challenges for electronic structure methods and combinatorial challenges for their discovery. In this Perspective, we describe the advances needed in accuracy, efficiency, and approach beyond what is typical in conventional DFT-based ML workflows. These challenges have begun to be addressed through ML models trained to predict the results of multiple methods or the differences between them, enabling quantitative sensitivity analysis. For DFT to be trusted for a given data point in a high-throughput screen, it must pass a series of tests. ML models that predict the likelihood of calculation success and detect the presence of strong correlation will enable rapid diagnoses and adaptation strategies. These "decision engines" represent the first steps toward autonomous workflows that avoid the need for expert determination of the robustness of DFT-based materials discoveries.
Affiliation(s)
- Chenru Duan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Fang Liu
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Aditya Nandy
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Heather J Kulik
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
12
Wang CI, Joanito I, Lan CF, Hsu CP. Artificial neural networks for predicting charge transfer coupling. J Chem Phys 2020; 153:214113. [PMID: 33291923 DOI: 10.1063/5.0023697] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 12/22/2022] Open
Abstract
Quantum chemistry calculations have been very useful in providing many key detailed properties and enhancing our understanding of molecular systems. However, such calculations, especially with ab initio models, can be time-consuming. For example, in the prediction of charge-transfer properties, it is often necessary to work with an ensemble of different thermally populated structures. A possible alternative to such calculations is to use a machine-learning-based approach. In this work, we show that the general prediction of electronic coupling, a property that is very sensitive to intermolecular degrees of freedom, can be obtained with artificial neural networks, with improved performance as compared to the popular kernel ridge regression method. We propose strategies for optimizing the learning rate and batch size, improving model performance, and further evaluating models to ensure that the physical signatures of charge-transfer coupling are well reproduced. We also address the effect of feature representation as well as statistical insights obtained from the loss function and the data structure. Our results pave the way for designing a general strategy for training such neural-network models for accurate prediction.
Affiliation(s)
- Chun-I Wang
  - Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan
- Chang-Feng Lan
  - Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan
- Chao-Ping Hsu
  - Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan
13
Eckhoff M, Lausch KN, Blöchl PE, Behler J. Predicting oxidation and spin states by high-dimensional neural networks: Applications to lithium manganese oxide spinels. J Chem Phys 2020; 153:164107. [PMID: 33138439 DOI: 10.1063/5.0021452] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 02/03/2023] Open
Abstract
Lithium ion batteries often contain transition metal oxides such as LiₓMn₂O₄ (0 ≤ x ≤ 2). Depending on the Li content, different ratios of Mn(III) to Mn(IV) ions are present. In combination with electron hopping, the Jahn-Teller distortions of the Mn(III)O₆ octahedra can give rise to complex phenomena such as structural transitions and conductance. While for small model systems oxidation and spin states can be determined using density functional theory (DFT), the investigation of dynamical phenomena by DFT is too demanding. Previously, we have shown that a high-dimensional neural network potential can extend molecular dynamics (MD) simulations of LiₓMn₂O₄ to nanosecond time scales, but these simulations did not provide information about the electronic structure. Here, we extend the use of neural networks to the prediction of atomic oxidation and spin states. The resulting high-dimensional neural network is able to predict the spins of the Mn ions with an error of only 0.03 ℏ. We find that the Mn e_g electrons are correctly conserved and that the number of Jahn-Teller distorted Mn(III)O₆ octahedra is predicted precisely for different Li loadings. A charge ordering transition is observed between 280 K and 300 K, which matches resistivity measurements. Moreover, the activation energy of the electron hopping conduction above the phase transition is predicted to be 0.18 eV, deviating only 0.02 eV from experiment. This work demonstrates that machine learning is able to provide an accurate representation of both the geometric and the electronic structure dynamics of LiₓMn₂O₄ on time and length scales that are not accessible by ab initio MD.
Affiliation(s)
- Marco Eckhoff
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
| | - Knut Nikolas Lausch
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
| | - Peter E Blöchl
- Technische Universität Clausthal, Institut für Theoretische Physik, Leibnizstraße 10, 38678 Clausthal-Zellerfeld, Germany
| | - Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
|
14
|
Chen MS, Zuehlsdorff TJ, Morawietz T, Isborn CM, Markland TE. Exploiting Machine Learning to Efficiently Predict Multidimensional Optical Spectra in Complex Environments. J Phys Chem Lett 2020; 11:7559-7568. [PMID: 32808797 DOI: 10.1021/acs.jpclett.0c02168] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 05/26/2023]
Abstract
The excited-state dynamics of chromophores in complex environments determine a range of vital biological and energy capture processes. Time-resolved, multidimensional optical spectroscopies provide a key tool to investigate these processes. Although theory has the potential to decode these spectra in terms of the electronic and atomistic dynamics, the need for large numbers of excited-state electronic structure calculations severely limits first-principles predictions of multidimensional optical spectra for chromophores in the condensed phase. Here, we leverage the locality of chromophore excitations to develop machine learning models to predict the excited-state energy gap of chromophores in complex environments for efficiently constructing linear and multidimensional optical spectra. By analyzing the performance of these models, which span a hierarchy of physical approximations, across a range of chromophore-environment interaction strengths, we provide strategies for the construction of machine learning models that greatly accelerate the calculation of multidimensional optical spectra from first principles.
Affiliation(s)
- Michael S Chen
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
| | - Tim J Zuehlsdorff
- Chemistry and Chemical Biology, University of California Merced, Merced, California 95343, United States
| | - Tobias Morawietz
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
| | - Christine M Isborn
- Chemistry and Chemical Biology, University of California Merced, Merced, California 95343, United States
| | - Thomas E Markland
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
|
15
|
Heinen S, Schwilk M, von Rudorff GF, von Lilienfeld OA. Machine learning the computational cost of quantum chemistry. Mach Learn Sci Technol 2020. [DOI: 10.1088/2632-2153/ab6ac4] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022]
|
16
|
Li Z, Achenie LEK, Xin H. An Adaptive Machine Learning Strategy for Accelerating Discovery of Perovskite Electrocatalysts. ACS Catal 2020. [DOI: 10.1021/acscatal.9b05248] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Indexed: 12/14/2022]
Affiliation(s)
- Zheng Li
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
| | - Luke E. K. Achenie
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
| | - Hongliang Xin
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
|
17
|
Abstract
Recently, machine learning (ML) has established itself in various worldwide benchmarking competitions in computational biology, including Critical Assessment of Structure Prediction (CASP) and Drug Design Data Resource (D3R) Grand Challenges. However, the intricate structural complexity and high ML dimensionality of biomolecular datasets obstruct the efficient application of ML algorithms in the field. In addition to data and algorithm, an efficient ML machinery for biomolecular predictions must include structural representation as an indispensable component. Mathematical representations that simplify the biomolecular structural complexity and reduce ML dimensionality have emerged as a prime winner in D3R Grand Challenges. This review is devoted to the recent advances in developing low-dimensional and scalable mathematical representations of biomolecules in our laboratory. We discuss three classes of mathematical approaches, including algebraic topology, differential geometry, and graph theory. We elucidate how the physical and biological challenges have guided the evolution and development of these mathematical apparatuses for massive and diverse biomolecular data. We focus the performance analysis on protein-ligand binding predictions in this review although these methods have had tremendous success in many other applications, such as protein classification, virtual screening, and the predictions of solubility, solvation free energies, toxicity, partition coefficients, protein folding stability changes upon mutation, etc.
Affiliation(s)
- Duc Duy Nguyen
- Department of Mathematics, Michigan State University, MI 48824, USA.
| | - Zixuan Cang
- Department of Mathematics, Michigan State University, MI 48824, USA.
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA; and Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA.
|
18
|
Gao H, Jia M, Chen S, Zhang X, Tan X. Efficient photocatalysts of a tetraphenylporphyrin/P25 hybrid for visible-light photoreduction of CO2. New J Chem 2020. [DOI: 10.1039/d0nj03351k] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/22/2023]
Abstract
A highly efficient TPP/P25 hybrid for the photoreduction of CO2 was developed and prepared via weak interactions between TPP and P25. The optimized TPP/P25 hybrid shows excellent activity for CO2 reduction. TPP loading has an important influence on the CO2 reduction performance.
Affiliation(s)
- Hongyi Gao
- School of Materials Science and Engineering, University of Science and Technology Beijing, Beijing 100083, P. R. China
| | - Mengyi Jia
- School of Materials Science and Engineering, University of Science and Technology Beijing, Beijing 100083, P. R. China
| | - Siyuan Chen
- School of Materials Science and Engineering, University of Science and Technology Beijing, Beijing 100083, P. R. China
| | - Xiaowei Zhang
- Institute of Advanced Materials, Beijing Normal University, Beijing 100875, P. R. China
| | - Xi Tan
- Guangdong Institute of New Materials, Guangzhou 510650, P. R. China
|
19
|
An Y, Deshmukh SA. Machine learning approach for accurate backmapping of coarse-grained models to all-atom models. Chem Commun (Camb) 2020; 56:9312-9315. [PMID: 32667366 DOI: 10.1039/d0cc02651d] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/24/2022]
Abstract
Four different machine learning (ML) regression models (artificial neural network, k-nearest neighbors, Gaussian process regression, and random forest) were built to backmap coarse-grained models to all-atom models.
Affiliation(s)
- Yaxin An
- Department of Chemical Engineering, Virginia Tech, Blacksburg, USA
|
20
|
Lu Z, Yadav S, Singh CV. Predicting aggregation energy for single atom bimetallic catalysts on clean and O* adsorbed surfaces through machine learning models. Catal Sci Technol 2020. [DOI: 10.1039/c9cy02070e] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/21/2022]
Abstract
Machine learning models are successfully developed for simultaneous prediction of stability and adsorption energy at single-atom bimetallic sites.
Affiliation(s)
- Zhuole Lu
- Department of Materials Science and Engineering, University of Toronto, Toronto, Canada
| | - Shwetank Yadav
- Department of Materials Science and Engineering, University of Toronto, Toronto, Canada
| | - Chandra Veer Singh
- Department of Materials Science and Engineering, University of Toronto, Toronto, Canada
- Department of Mechanical and Industrial Engineering
|
21
|
Wang CI, Braza MKE, Claudio GC, Nellas RB, Hsu CP. Machine Learning for Predicting Electron Transfer Coupling. J Phys Chem A 2019; 123:7792-7802. [PMID: 31429287 DOI: 10.1021/acs.jpca.9b04256] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 11/29/2022]
Abstract
Electron transfer coupling is a critical factor in determining electron transfer rates. This coupling strength can be sensitive to details in molecular geometries, especially intermolecular configurations. Thus, studying charge-transport behavior with a full first-principles approach demands a large amount of computational resources for quantum chemistry (QC) calculations. To address this issue, we developed a machine learning (ML) approach to evaluate electronic coupling. A prototypical ML model for an ethylene system was built by kernel ridge regression with the Coulomb matrix representation. Since the performance of ML models is highly dependent on how they are built, we systematically investigated the generality of the ML models and the choice of features and target labels. The best ML model, trained with 40 000 samples, achieved a mean absolute error of 3.5 meV and greater than 98% accuracy in predicting phases. The distance and orientation dependence of electronic coupling was successfully captured. By bypassing the QC calculation, the ML model reduced the computational cost by a factor of 10 to 10^4. With the help of ML, reliable charge transport models and mechanisms can be further developed.
Affiliation(s)
- Chun-I Wang
- Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan
| | - Mac Kevin E Braza
- Institute of Chemistry, College of Science, University of the Philippines Diliman, Quezon City 1101, Philippines
| | - Gil C Claudio
- Institute of Chemistry, College of Science, University of the Philippines Diliman, Quezon City 1101, Philippines
| | - Ricky B Nellas
- Institute of Chemistry, College of Science, University of the Philippines Diliman, Quezon City 1101, Philippines
| | - Chao-Ping Hsu
- Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan
|
22
|
Nandy A, Zhu J, Janet JP, Duan C, Getman RB, Kulik HJ. Machine Learning Accelerates the Discovery of Design Rules and Exceptions in Stable Metal–Oxo Intermediate Formation. ACS Catal 2019. [DOI: 10.1021/acscatal.9b02165] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Indexed: 12/19/2022]
Affiliation(s)
| | - Jiazhou Zhu
- Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, South Carolina 29634, United States
| | | | | | - Rachel B. Getman
- Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, South Carolina 29634, United States
|
23
|
Back S, Tran K, Ulissi ZW. Toward a Design of Active Oxygen Evolution Catalysts: Insights from Automated Density Functional Theory Calculations and Machine Learning. ACS Catal 2019. [DOI: 10.1021/acscatal.9b02416] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Indexed: 11/28/2022]
Affiliation(s)
- Seoin Back
- Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
| | - Kevin Tran
- Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
| | - Zachary W. Ulissi
- Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
|
24
|
An Y, Singh S, Bejagam KK, Deshmukh SA. Development of an Accurate Coarse-Grained Model of Poly(acrylic acid) in Explicit Solvents. Macromolecules 2019. [DOI: 10.1021/acs.macromol.9b00615] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 01/07/2023]
Affiliation(s)
- Yaxin An
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
| | | | - Karteek K. Bejagam
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Sanket A. Deshmukh
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
|
25
|
Singh SK, Bejagam KK, An Y, Deshmukh SA. Machine-Learning Based Stacked Ensemble Model for Accurate Analysis of Molecular Dynamics Simulations. J Phys Chem A 2019; 123:5190-5198. [DOI: 10.1021/acs.jpca.9b03420] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/17/2022]
Affiliation(s)
| | - Karteek K. Bejagam
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Yaxin An
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Sanket A. Deshmukh
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
|
26
|
|
27
|
Nandy A, Duan C, Janet JP, Gugler S, Kulik HJ. Strategies and Software for Machine Learning Accelerated Discovery in Transition Metal Chemistry. Ind Eng Chem Res 2018. [DOI: 10.1021/acs.iecr.8b04015] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Indexed: 12/18/2022]
Affiliation(s)
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Stefan Gugler
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Laboratorium für Physikalische Chemie, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|