1
Srinivasan K, Puliyanda A, Prasad V. Identification of Reaction Network Hypotheses for Complex Feedstocks from Spectroscopic Measurements with Minimal Human Intervention. J Phys Chem A 2024; 128:4714-4729. [PMID: 38836378 DOI: 10.1021/acs.jpca.4c01592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2024]
Abstract
In this work, we detail an automated reaction network hypothesis generation protocol for processes involving complex feedstocks where information about the species and reactions involved is unknown. Our methodology is process agnostic and can be utilized in any reactive process with spectroscopic measurements that provide information on the evolution of the components in the mixture. We decompose the mixture spectra to obtain spectroscopic signatures of the individual components and use a 1-D convolutional neural network to automatically identify functional groups indicated by them. We employ atom-atom mapping to automatically recover reaction rules that are applied on candidate molecules identified from chemistry databases through fingerprint similarity. The method is tested on synthetic data and on spectroscopic measurements of lab-scale batch hydrothermal liquefaction (HTL) of biomass to determine the accuracy of prediction across datasets of varying complexities. Our methodology is able to identify reaction network hypotheses containing reaction networks close to the ground truth in the case of synthetic data, and we are also able to recover candidate molecules and reaction networks close to the ones reported in the previous literature studies for biomass pyrolysis.
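The candidate-selection step mentioned in the abstract (picking database molecules by fingerprint similarity) can be illustrated with a minimal sketch. It assumes that fingerprints are represented as sets of "on" bits and that Tanimoto similarity is the metric; the abstract names neither, and the functional-group labels and candidate names below are hypothetical toy data, not taken from the paper.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def rank_candidates(query: frozenset, database: dict) -> list:
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, bits)) for name, bits in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: each "bit" is a functional-group label that a
# spectral classifier might have reported for one resolved component.
query = frozenset({"O-H", "C=O", "C-O"})
database = {
    "candidate_acid": frozenset({"O-H", "C=O", "C-O"}),
    "candidate_alcohol": frozenset({"O-H", "C-O"}),
    "candidate_alkene": frozenset({"C=C"}),
}

ranking = rank_candidates(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the bit sets would come from a cheminformatics toolkit rather than hand-written labels; the ranking logic stays the same.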
Affiliation(s)
- Karthik Srinivasan
- Department of Chemical and Materials Engineering, Donadeo Innovation Centre for Engineering, 9211, 116st NW, Edmonton T6G 1H9, AB, Canada
- Anjana Puliyanda
- Department of Chemical and Materials Engineering, Donadeo Innovation Centre for Engineering, 9211, 116st NW, Edmonton T6G 1H9, AB, Canada
- Vinay Prasad
- Department of Chemical and Materials Engineering, Donadeo Innovation Centre for Engineering, 9211, 116st NW, Edmonton T6G 1H9, AB, Canada
2
Kim S, Woo J, Kim WY. Diffusion-based generative AI for exploring transition states from 2D molecular graphs. Nat Commun 2024; 15:341. [PMID: 38184661 PMCID: PMC10771475 DOI: 10.1038/s41467-023-44629-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 12/21/2023] [Indexed: 01/08/2024] Open
Abstract
The exploration of transition state (TS) geometries is crucial for elucidating chemical reaction mechanisms and modeling their kinetics. Recently, machine learning (ML) models have shown remarkable performance in predicting TS geometries. However, they require 3D conformations of reactants and products, often with appropriate orientations, as input, which demands substantial effort and computational cost. Here, we propose a generative approach based on the stochastic diffusion method, namely TSDiff, for predicting TS geometries from 2D molecular graphs alone. TSDiff outperforms existing ML models that use 3D geometries in terms of both accuracy and efficiency. Moreover, it enables sampling of various TS conformations, because it learns the distribution of TS geometries for diverse reactions during training. Thus, TSDiff finds more favorable reaction pathways with lower barrier heights than those in the reference database. These results demonstrate that TSDiff has promising potential for efficient and reliable TS exploration.
Affiliation(s)
- Seonghwan Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- Jeheon Woo
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- Woo Youn Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- AI Institute, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
3
Ramos-Sánchez P, Harvey JN, Gámez JA. An automated method for graph-based chemical space exploration and transition state finding. J Comput Chem 2022; 44:27-42. [PMID: 36239971 DOI: 10.1002/jcc.27011] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/01/2022] [Revised: 07/28/2022] [Accepted: 09/05/2022] [Indexed: 12/24/2022]
Abstract
Algorithms that automatically explore chemical space have been limited to chemical systems with a small number of atoms, owing to the expensive quantum calculations involved and the large number of possible reaction pathways. The method described here presents a novel solution to the problem of chemical exploration by generating reaction networks with heuristics based on chemical theory. First, a first version of the reaction network is determined through molecular graph transformations acting upon functional groups of the reacting species. Only transformations that break two chemical bonds and form two new ones are considered, leading to a significant performance enhancement compared to previously presented algorithms. Second, energy barriers for this reaction network are estimated through quantum chemical calculations by a growing string method, which can also identify non-octet species missed during the previous step and further define the reaction network. The proposed algorithm has been successfully applied to five different chemical reactions, in all cases identifying the most important reaction pathways.
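The "break two bonds, form two new ones" restriction described in the abstract can be sketched on a bare connectivity graph. This is a simplified illustration under stated assumptions, not the authors' implementation: bonds are unordered atom pairs, bond orders and valence checks are ignored, and the functional-group filtering mentioned in the abstract is omitted. The helper names `bond` and `b2f2_products` are hypothetical.

```python
from itertools import combinations


def bond(a: str, b: str) -> frozenset:
    """An undirected bond between two atom labels."""
    return frozenset((a, b))


def b2f2_products(bonds) -> set:
    """Enumerate bond sets reachable by breaking two bonds and forming two new ones."""
    bonds = frozenset(bonds)
    products = set()
    for b1, b2 in combinations(bonds, 2):
        if b1 & b2:
            continue  # the two broken bonds must not share an atom
        a, b = sorted(b1)
        c, d = sorted(b2)
        # Two ways to reconnect four atoms after breaking a-b and c-d.
        for new1, new2 in ((bond(a, c), bond(b, d)), (bond(a, d), bond(b, c))):
            if new1 in bonds or new2 in bonds:
                continue  # a "new" bond must not already exist
            products.add((bonds - {b1, b2}) | {new1, new2})
    return products


# Toy example: H2 + I2, with atoms labelled H1, H2, I1, I2.
reactants = {bond("H1", "H2"), bond("I1", "I2")}
products = b2f2_products(reactants)
for p in products:
    print(sorted(tuple(sorted(b)) for b in p))
```

Both products correspond to two HI molecules and differ only in atom labelling; a real workflow would collapse such duplicates by graph isomorphism.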
Affiliation(s)
- Pablo Ramos-Sánchez
- Digital R&D, Covestro Deutschland AG, Leverkusen, Germany; Department of Chemistry, KU Leuven, Leuven, Belgium
- José A Gámez
- Digital R&D, Covestro Deutschland AG, Leverkusen, Germany
4
Kim J, Gu GH, Noh J, Kim S, Gim S, Choi J, Jung Y. Predicting potentially hazardous chemical reactions using an explainable neural network. Chem Sci 2021; 12:11028-11037. [PMID: 34522300 PMCID: PMC8386654 DOI: 10.1039/d1sc01049b] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/22/2021] [Accepted: 07/12/2021] [Indexed: 11/21/2022] Open
Abstract
Predicting potentially dangerous chemical reactions is a critical task for laboratory safety. However, a traditional experimental investigation of reaction conditions for possible hazardous or explosive byproducts entails substantial time and cost, for which machine learning prediction could accelerate the process and help detailed experimental investigations. Several machine learning models have been developed which allow the prediction of major chemical reaction products with reasonable accuracy. However, these methods may not present sufficiently high accuracy for the prediction of hazardous products which particularly requires a low false negative result for laboratory safety in order not to miss any dangerous reactions. In this work, we propose an explainable artificial intelligence model that can predict the formation of hazardous reaction products in a binary classification fashion. The reactant molecules are transformed into substructure-encoded fingerprints and then fed into a convolutional neural network to make the binary decision of the chemical reaction. The proposed model shows a false negative rate of 0.09, which can be compared with 0.47-0.66 using the existing main product prediction models. To provide explanations for what substructures of the given reactant molecules are important to make a decision for target hazardous product formation, we apply an input attribution method, layer-wise relevance propagation, which computes the contributions of individual inputs per input data. The computed attributions indeed match some of the existing chemical intuitions and mechanisms, and also offer a way to analyze possible data-imbalance issues of the current predictions based on relatively small positive datasets. We expect that the proposed hazardous product prediction model will be complementary to existing main product prediction models and experimental investigations.
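The false negative rate quoted in the abstract is the fraction of truly hazardous cases that a classifier misses, FN / (FN + TP). The sketch below computes it for hypothetical toy labels (1 = hazardous product formed, 0 = not); the data are illustrative only and do not reproduce the reported 0.09.

```python
def false_negative_rate(y_true, y_pred) -> float:
    """Fraction of actual positives that were predicted negative: FN / (FN + TP)."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    if fn + tp == 0:
        raise ValueError("no positive examples, false negative rate is undefined")
    return fn / (fn + tp)


# Hypothetical toy labels: four hazardous reactions, one of which is missed.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

fnr = false_negative_rate(y_true, y_pred)
print(f"false negative rate = {fnr:.2f}")
```

The false positive in the last position does not affect the result, which is why this metric suits a safety setting where missed hazards matter more than false alarms.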
Affiliation(s)
- Juhwan Kim
- Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST) Daejeon 34141 Republic of Korea
- Geun Ho Gu
- Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST) Daejeon 34141 Republic of Korea
- Juhwan Noh
- Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST) Daejeon 34141 Republic of Korea
- Seongun Kim
- Graduate School of Artificial Intelligence KAIST Daejeon: 291 Daehak-ro, N24, Yuseong-gu Daejeon 34141 Republic of Korea
- Suji Gim
- Environment & Safety Research Center, Samsung Electronics Co. 1, Samsungjeonja-ro Hwasung-si Gyeonggi-do Republic of Korea
- Jaesik Choi
- Graduate School of Artificial Intelligence KAIST Daejeon: 291 Daehak-ro, N24, Yuseong-gu Daejeon 34141 Republic of Korea
- Yousung Jung
- Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST) Daejeon 34141 Republic of Korea
5
Martínez-Núñez E, Barnes GL, Glowacki DR, Kopec S, Peláez D, Rodríguez A, Rodríguez-Fernández R, Shannon RJ, Stewart JJP, Tahoces PG, Vazquez SA. AutoMeKin2021: An open-source program for automated reaction discovery. J Comput Chem 2021; 42:2036-2048. [PMID: 34387374 DOI: 10.1002/jcc.26734] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 05/07/2021] [Revised: 07/16/2021] [Accepted: 07/27/2021] [Indexed: 01/10/2023]
Abstract
AutoMeKin2021 is an updated version of tsscds2018, a program for the automated discovery of reaction mechanisms (J. Comput. Chem. 2018, 39, 1922). This release features a number of new capabilities: rare-event molecular dynamics simulations to enhance reaction discovery, extension of the original search algorithm to study van der Waals complexes, use of chemical knowledge, a new search algorithm based on bond-order time series analysis, statistics of the chemical reaction networks, a web application to submit jobs, and other features. The source code, manual, installation instructions and the website link are available at: https://rxnkin.usc.es/index.php/AutoMeKin.
Affiliation(s)
- Emilio Martínez-Núñez
- Department of Physical Chemistry, University of Santiago de Compostela, Santiago de Compostela, Spain
- George L Barnes
- Department of Chemistry and Biochemistry, Siena College, Loudonville, New York, USA
- David R Glowacki
- Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, UK
- Sabine Kopec
- Institut de Sciences Moléculaires d'Orsay, UMR 8214, Université Paris-Sud - Université Paris-Saclay, Orsay, France
- Daniel Peláez
- Institut de Sciences Moléculaires d'Orsay, UMR 8214, Université Paris-Sud - Université Paris-Saclay, Orsay, France
- Aurelio Rodríguez
- Galicia Supercomputing Center (CESGA), Santiago de Compostela, Spain
- Robin J Shannon
- Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, UK
- Pablo G Tahoces
- Department of Electronics and Computer Science, University of Santiago de Compostela, Santiago de Compostela, Spain
- Saulo A Vazquez
- Department of Physical Chemistry, University of Santiago de Compostela, Santiago de Compostela, Spain
6
Blau SM, Patel HD, Spotte-Smith EWC, Xie X, Dwaraknath S, Persson KA. A chemically consistent graph architecture for massive reaction networks applied to solid-electrolyte interphase formation. Chem Sci 2021; 12:4931-4939. [PMID: 34163740 PMCID: PMC8179555 DOI: 10.1039/d0sc05647b] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/13/2020] [Accepted: 02/23/2021] [Indexed: 01/09/2023] Open
Abstract
Modeling reactivity with chemical reaction networks could yield fundamental mechanistic understanding that would expedite the development of processes and technologies for energy storage, medicine, catalysis, and more. Thus far, reaction networks have been limited in size by chemically inconsistent graph representations of multi-reactant reactions (e.g. A + B → C) that cannot enforce stoichiometric constraints, precluding the use of optimized shortest-path algorithms. Here, we report a chemically consistent graph architecture that overcomes these limitations using a novel multi-reactant representation and iterative cost-solving procedure. Our approach enables the identification of all low-cost pathways to desired products in massive reaction networks containing reactions of any stoichiometry, allowing for the investigation of vastly more complex systems than previously possible. Leveraging our architecture, we construct the first ever electrochemical reaction network from first-principles thermodynamic calculations to describe the formation of the Li-ion solid electrolyte interphase (SEI), which is critical for passivation of the negative electrode. Using this network comprised of nearly 6000 species and 4.5 million reactions, we interrogate the formation of a key SEI component, lithium ethylene dicarbonate. We automatically identify previously proposed mechanisms as well as multiple novel pathways containing counter-intuitive reactions that have not, to our knowledge, been reported in the literature. We envision that our framework and data-driven methodology will facilitate efforts to engineer the composition-related properties of the SEI - or of any complex chemical process - through selective control of reactivity.
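The idea of finding low-cost pathways in a reaction network can be illustrated with a plain shortest-path search. The sketch below is a deliberate simplification and not the architecture reported in the abstract: it assumes every reaction has a single reactant and a single product, with a non-negative cost on each edge, so standard Dijkstra applies. Handling multi-reactant reactions (A + B → C) is precisely what the paper's representation and iterative cost-solving procedure address, and that part is not reproduced here. Species names and costs are hypothetical.

```python
import heapq


def lowest_cost_path(graph: dict, start: str, goal: str):
    """Dijkstra search; returns (total_cost, path) or None if goal is unreachable.

    graph maps a species to {product_species: reaction_cost}, costs >= 0.
    """
    queue = [(0.0, start, [start])]
    settled = {}  # species -> lowest cost at which it was expanded
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled and settled[node] <= cost:
            continue  # already expanded via a path that is at least as cheap
        settled[node] = cost
        for nxt, step in graph.get(node, {}).items():
            heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return None


# Hypothetical toy network: the direct route A -> C costs more than A -> B -> C.
network = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"C": 1.5},
    "C": {},
}

result = lowest_cost_path(network, "A", "C")
print(result)
```

With single-reactant edges the cost of reaching a species depends only on one predecessor; once a reaction needs two reactants, the cost of the second reactant must also be known, which is why a plain search like this one does not carry over directly.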
Affiliation(s)
- Samuel M Blau
- Energy Technologies Area, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
- Hetal D Patel
- Department of Materials Science and Engineering, University of California Berkeley CA 94720 USA
- Materials Science Division, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
- Evan Walter Clark Spotte-Smith
- Department of Materials Science and Engineering, University of California Berkeley CA 94720 USA
- Materials Science Division, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
- Xiaowei Xie
- Materials Science Division, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
- College of Chemistry, University of California Berkeley CA 94720 USA
- Shyam Dwaraknath
- Materials Science Division, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
- Kristin A Persson
- Department of Materials Science and Engineering, University of California Berkeley CA 94720 USA
- Molecular Foundry, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
7
Robertson C, Habershon S. Simple position and orientation preconditioning scheme for minimum energy path calculations. J Comput Chem 2021; 42:761-770. [PMID: 33617652 DOI: 10.1002/jcc.26495] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/24/2020] [Revised: 01/16/2021] [Accepted: 01/22/2021] [Indexed: 11/08/2022]
Abstract
Minimum-energy path (MEP) calculations, such as those typified by the nudged elastic band method, require input of reactant and product molecular configurations at initialization. In the case of reactions involving more than one molecule, generating initial reactant and product configurations requires careful consideration of the relative positions and orientations of the reactive molecules, in order to ensure that the resulting MEP calculation proceeds without converging on an alternative reaction path and without requiring excessive numbers of optimization iterations. As such, this initial system set-up is most commonly performed "by hand," with an expert user arranging reactive molecules in space to ensure that the following MEP calculation runs smoothly. In this Article, we introduce a simple preconditioning scheme which replaces this labor-intensive, human-knowledge-based step with an automated deterministic computational scheme. In our approach, initial reactant and product configurations are generated such that steric hindrance between reactive molecules is minimized in both configurations, while simultaneously requiring minimal structural differences between the reactants and products. The method is demonstrated using a benchmark test-set of >3400 organic molecular reactions, where the reactant/product configurations generated using our approach compare very well with initial configurations generated on an ad hoc basis.
Affiliation(s)
- Christopher Robertson
- Department of Chemistry and Centre for Scientific Computing, University of Warwick, Coventry, UK
- Scott Habershon
- Department of Chemistry and Centre for Scientific Computing, University of Warwick, Coventry, UK
8
Kim JW, Kim J, Kim J, Chwae J, Kang S, Jeon SO, Son WJ, Choi H, Choi B, Kim S, Kim WY. Holistic Approach to the Mechanism Study of Thermal Degradation of Organic Light-Emitting Diode Materials. J Phys Chem A 2020; 124:9589-9596. [DOI: 10.1021/acs.jpca.0c07766] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/31/2022]
Affiliation(s)
- Jin Woo Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Korea
- Joonghyuk Kim
- Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Suwon 16678, Korea
- Jaewook Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Korea
- Jun Chwae
- Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Suwon 16678, Korea
- Sungwoo Kang
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Korea
- Soon Ok Jeon
- Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Suwon 16678, Korea
- Won-Joon Son
- Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Suwon 16678, Korea
- Hyeonho Choi
- Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Suwon 16678, Korea
- Byoungki Choi
- Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Suwon 16678, Korea
- Sunghan Kim
- Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Suwon 16678, Korea
- Woo Youn Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Korea
9
Lee K, Kim JW, Kim WY. Efficient Construction of a Chemical Reaction Network Guided By a Monte Carlo Tree Search. ChemSystemsChem 2020. [DOI: 10.1002/syst.201900057] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/05/2022]
Affiliation(s)
- Kyunghoon Lee
- Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST) 291 Daehak-ro, Yuseong-gu Daejeon 305-701 Korea
- Jin Woo Kim
- Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST) 291 Daehak-ro, Yuseong-gu Daejeon 305-701 Korea
- Woo Youn Kim
- Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST) 291 Daehak-ro, Yuseong-gu Daejeon 305-701 Korea
- KI for Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST) 291 Daehak-ro, Yuseong-gu Daejeon 305-701 Korea
10
Lee JU, Kim Y, Kim WY, Oh HB. Graph theory-based reaction pathway searches and DFT calculations for the mechanism studies of free radical-initiated peptide sequencing mass spectrometry (FRIPS MS): a model gas-phase reaction of GGR tri-peptide. Phys Chem Chem Phys 2020; 22:5057-5069. [PMID: 32073000 DOI: 10.1039/c9cp05433b] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Abstract
Graph theory-based reaction pathway searches (ACE-Reaction program) and density functional theory calculations were performed to shed light on the mechanisms for the production of [an + H]+, xn+, yn+, zn+, and [yn + 2H]+ fragments formed in free radical-initiated peptide sequencing (FRIPS) mass spectrometry measurements of a small model system of glycine-glycine-arginine (GGR). In particular, the graph theory-based searches, which are rarely applied to gas-phase reaction studies, allowed us to investigate reaction mechanisms in an exhaustive manner without resorting to chemical intuition. As expected, radical-driven reaction pathways were favorable over charge-driven reaction pathways in terms of kinetics and thermodynamics. Charge- and radical-driven pathways for the formation of [yn + 2H]+ fragments were carefully compared, and it was revealed that the [yn + 2H]+ fragments observed in our FRIPS MS spectra originated from the radical-driven pathway, which is in contrast to the general expectation. The acquired understanding of the FRIPS fragmentation mechanism is expected to aid in the interpretation of FRIPS MS spectra. It should be emphasized that graph theory-based searches are powerful and effective methods for studying reaction mechanisms, including gas-phase reactions in mass spectrometry.
Affiliation(s)
- Jae-Ung Lee
- Department of Chemistry, Sogang University, Seoul 04107, Republic of Korea
11
Jara‐Toro RA, Pino GA, Glowacki DR, Shannon RJ, Martínez‐Núñez E. Enhancing Automated Reaction Discovery with Boxed Molecular Dynamics in Energy Space. ChemSystemsChem 2019. [DOI: 10.1002/syst.201900024] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 01/14/2023]
Affiliation(s)
- Rafael A. Jara‐Toro
- INIFIQC (CONICET-UNC) Dpto. De Fisicoquímica-Facultad de Ciencias Químicas-Centro Láser de Ciencias Moleculares, Universidad de Córdoba Ciudad Universitaria X50000HUA Córdoba Argentina
- Gustavo A. Pino
- INIFIQC (CONICET-UNC) Dpto. De Fisicoquímica-Facultad de Ciencias Químicas-Centro Láser de Ciencias Moleculares, Universidad de Córdoba Ciudad Universitaria X50000HUA Córdoba Argentina
- David R. Glowacki
- Centre for Computational Chemistry School of Chemistry, University of Bristol Cantock's Close Bristol BS8 1TS UK
- Robin J. Shannon
- Centre for Computational Chemistry School of Chemistry, University of Bristol Cantock's Close Bristol BS8 1TS UK
- Emilio Martínez‐Núñez
- Departamento de Química Física, Facultade de Química, Universidade de Santiago de Compostela 15782 Santiago de Compostela Spain