1
|
Zhang C, Arun A, Lapkin AA. Completing and Balancing Database Excerpted Chemical Reactions with a Hybrid Mechanistic-Machine Learning Approach. ACS Omega 2024; 9:18385-18399. [PMID: 38680356 PMCID: PMC11044172 DOI: 10.1021/acsomega.4c00262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 03/31/2024] [Accepted: 04/03/2024] [Indexed: 05/01/2024]
Abstract
Computer-aided synthesis planning (CASP) development of reaction routes requires an understanding of complete reaction structures. However, most reactions in the current databases are missing reaction coparticipants. Although reaction prediction and atom mapping tools can predict major reaction participants and trace atom rearrangements in reactions, they fail to identify the missing molecules to complete reactions. This is because these approaches are data-driven models trained on the current reaction databases, which comprise incomplete reactions. In this work, a workflow was developed to tackle the reaction completion challenge. This includes a heuristic-based method to identify balanced reactions from reaction databases and complete some imbalanced reactions by adding candidate molecules. A machine learning masked language model (MLM) was trained to learn from simplified molecular input line entry system (SMILES) sentences of these completed reactions. The model predicted missing molecules for the incomplete reactions, a workflow analogous to predicting missing words in sentences. The model is promising for the prediction of small- and middle-sized missing molecules in incomplete reaction records. The workflow combining both the heuristic and machine learning methods completed more than half of the entire reaction space.
Affiliation(s)
- Chonghuan Zhang
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Adarsh Arun
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
  - Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd., 1 CREATE Way, CREATE Tower #05-05, Singapore 138602, Singapore
  - Chemical Data Intelligence (CDI) Pte., Ltd., 9 Raffles Place #26-01, Republic Plaza, Singapore 048619, Singapore
- Alexei A. Lapkin
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
  - Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd., 1 CREATE Way, CREATE Tower #05-05, Singapore 138602, Singapore
  - Chemical Data Intelligence (CDI) Pte., Ltd., 9 Raffles Place #26-01, Republic Plaza, Singapore 048619, Singapore
|
2
|
Pham TT, Guo Z, Li B, Lapkin AA, Yan N. Synthesis of Pyrrole-2-Carboxylic Acid from Cellulose- and Chitin-Based Feedstocks Discovered by the Automated Route Search. ChemSusChem 2024; 17:e202300538. [PMID: 37792551 DOI: 10.1002/cssc.202300538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 10/02/2023] [Accepted: 10/04/2023] [Indexed: 10/06/2023]
Abstract
The shift towards sustainable feedstocks for platform chemicals requires new routes to access functional molecules that contain heteroatoms, but there are limited bio-derived feedstocks that lead to heteroatoms in platform chemicals. Combining renewable molecules of different origins could be a solution to optimize the use of atoms from renewable sources. However, the lack of retrosynthetic tools makes it challenging to examine the extensive reaction networks of various platform molecules focusing on multiple bio-based feedstocks. In this study, a protocol was developed to identify potential transformation pathways that allow for the use of feedstocks from different origins. By analyzing existing knowledge on chemical reactions in large databases, several promising synthetic routes were shortlisted, with the reaction of D-glucosamine and pyruvic acid being the most interesting to make pyrrole-2-carboxylic acid (PCA). The optimized synthetic conditions resulted in 50 % yield of PCA, with insights gained from temperature variant NMR studies. The use of substrates obtained from two different bio-feedstock bases, namely cellulose and chitin, allowed for the establishment of a PCA-based chemical space.
Affiliation(s)
- Thuy Trang Pham
  - Department of Chemical and Biomolecular Engineering, National University of Singapore, 4 Engineering Drive 4, 117585, Singapore City, Singapore
- Zhen Guo
  - Cambridge Centre for Advanced Research and Education in Singapore (CARES Ltd), 1 CREATE Way, #05-05 Create Tower, 138602, Singapore City, Singapore
  - Chemical Data Intelligence (CDI) Pte Ltd, Robinson Road #02-00, 068898, Singapore City, Singapore
- Bing Li
  - Department of Chemical and Biomolecular Engineering, National University of Singapore, 4 Engineering Drive 4, 117585, Singapore City, Singapore
- Alexei A Lapkin
  - Cambridge Centre for Advanced Research and Education in Singapore (CARES Ltd), 1 CREATE Way, #05-05 Create Tower, 138602, Singapore City, Singapore
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, CB3 0AS, UK
- Ning Yan
  - Department of Chemical and Biomolecular Engineering, National University of Singapore, 4 Engineering Drive 4, 117585, Singapore City, Singapore
|
3
|
Huang X, Gu KM, Guo CM, Cheng XL. Dissociation cross sections and rates in O2 + N collisions: molecular dynamics simulations combined with machine learning. Phys Chem Chem Phys 2023; 25:29475-29485. [PMID: 37888773 DOI: 10.1039/d3cp04044e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]
Abstract
The collision-induced dissociation reaction of O2 (v, j) + N, a fundamental process in nonequilibrium air flows around reentry vehicles, has been studied systematically by applying molecular dynamics simulations on the ²A′, ⁴A′ and ⁶A′ potential energy surfaces of NO2 over a wide temperature range. In particular, we have directly investigated the role of the ⁶A′ surface in this process and discussed the applicability of the simplified approximate rate models proposed by Esposito et al. and Andrienko et al. based on the lowest two surfaces. The present work indicates that the state-selected dissociation of O2 + N is dominated by the ⁶A′ surface for all but the low-lying O2 states. Furthermore, a complete database of rovibrationally detailed cross sections and rate coefficients is a prerequisite for modeling the relevant nonequilibrium air flows in spacecraft reentry. Here, a combination of the quasi-classical trajectory (QCT) method and a neural network (NN) is proposed to predict all state-selected dissociation cross sections and to construct dissociation parameter sets. All NN-based models established in this work accurately reproduce the results calculated from QCT simulations over a wide range of rovibrational quantum numbers, with R² > 0.99. Compared with explicit QCT simulations, the computational cost of predicting cross sections and rates with the NN models is significantly reduced. Finally, thermal equilibrium rate coefficients computed from the NN models agree remarkably well with the available theoretical and experimental results over the whole temperature range explored.
Affiliation(s)
- Xia Huang
  - Institute of Atomic and Molecular Physics, Sichuan University, Chengdu 610065, China
- Kun-Ming Gu
  - Institute of Atomic and Molecular Physics, Sichuan University, Chengdu 610065, China
- Chang-Min Guo
  - Institute of Atomic and Molecular Physics, Sichuan University, Chengdu 610065, China
- Xin-Lu Cheng
  - Institute of Atomic and Molecular Physics, Sichuan University, Chengdu 610065, China
  - Key Laboratory of High Energy Density Physics and Technology of Ministry of Education, Sichuan University, Chengdu 610065, China
|
4
|
Vogel G, Schulze Balhorn L, Schweidtmann AM. Learning from flowsheets: A generative transformer model for autocompletion of flowsheets. Comput Chem Eng 2023. [DOI: 10.1016/j.compchemeng.2023.108162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
|
5
|
Jablonka KM, Charalambous C, Sanchez Fernandez E, Wiechers G, Monteiro J, Moser P, Smit B, Garcia S. Machine learning for industrial processes: Forecasting amine emissions from a carbon capture plant. Science Advances 2023; 9:eadc9576. [PMID: 36598993 PMCID: PMC9812371 DOI: 10.1126/sciadv.adc9576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2022] [Accepted: 11/23/2022] [Indexed: 06/17/2023]
Abstract
One of the main environmental impacts of amine-based carbon capture processes is the emission of the solvent into the atmosphere. To understand how these emissions are affected by the intermittent operation of a power plant, we performed stress tests on a plant operating with a mixture of two amines, 2-amino-2-methyl-1-propanol and piperazine (CESAR1). To forecast the emissions and model the impact of interventions, we developed a machine learning model. Our model showed that some interventions have opposite effects on the emissions of the components of the solvent. Thus, mitigation strategies required for capture plants operating on a single component solvent (e.g., monoethanolamine) need to be reconsidered if operated using a mixture of amines. Amine emissions from a solvent-based carbon capture plant are an example of a process that is too complex to be described by conventional process models. We, therefore, expect that our approach can be more generally applied.
Affiliation(s)
- Kevin Maik Jablonka
  - Laboratory of Molecular Simulation (LSMO), École Polytechnique Fédérale de Lausanne (EPFL), Sion, Switzerland
- Charithea Charalambous
  - The Research Centre for Carbon Solutions (RCCS), School of Engineering and Physical Sciences, Heriot-Watt University, EH14 4AS Edinburgh, UK
- Peter Moser
  - RWE Power AG, Ernestinenstraße 60, 45141 Essen, Germany
- Berend Smit
  - Laboratory of Molecular Simulation (LSMO), École Polytechnique Fédérale de Lausanne (EPFL), Sion, Switzerland
- Susana Garcia
  - The Research Centre for Carbon Solutions (RCCS), School of Engineering and Physical Sciences, Heriot-Watt University, EH14 4AS Edinburgh, UK
|
6
|
Kondinski A, Bai J, Mosbach S, Akroyd J, Kraft M. Knowledge Engineering in Chemistry: From Expert Systems to Agents of Creation. Acc Chem Res 2022; 56:128-139. [PMID: 36516456 PMCID: PMC9850921 DOI: 10.1021/acs.accounts.2c00617] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022]
Abstract
Passing knowledge from human to human is a natural process that has continued since the beginning of humankind. Over the past few decades, we have witnessed that knowledge is no longer passed only between humans but also from humans to machines. The latter form of knowledge transfer represents a cornerstone in artificial intelligence (AI) and lays the foundation for knowledge engineering (KE). In order to pass knowledge to machines, humans need to structure, formalize, and make knowledge machine-readable. Subsequently, humans also need to develop software that emulates their decision-making process. In order to engineer chemical knowledge, chemists are often required to challenge their understanding of chemistry and thinking processes, which may help improve the structure of chemical knowledge.
Knowledge engineering in chemistry dates from the development of expert systems that emulated the thinking process of analytical and organic chemists. Since then, many different expert systems employing rather limited knowledge bases have been developed, solving problems in retrosynthesis, analytical chemistry, chemical risk assessment, etc. However, toward the end of the 20th century, the AI winters slowed down the development of expert systems for chemistry. At the same time, the increasing complexity of chemical research, alongside the limitations of the available computing tools, made it difficult for many chemistry expert systems to keep pace.
In the past two decades, the semantic web, the popularization of object-oriented programming, and the increase in computational power have revitalized knowledge engineering. Knowledge formalization through ontologies has become commonplace, triggering the subsequent development of knowledge graphs and cognitive software agents. These tools enable interoperability, the representation of more complex systems, inference capabilities, and the synthesis of new knowledge.
This Account introduces the history, the core principles of KE, and its applications within the broad realm of chemical research and engineering. In this regard, we first discuss how chemical knowledge is formalized and how a chemist's cognition can be emulated with the help of reasoning algorithms. Following this, we discuss various applications of knowledge graph and agent technology used to solve problems in chemistry related to molecular engineering, chemical mechanisms, multiscale modeling, automation of calculations and experiments, and chemist-machine interactions. These developments are discussed in the context of a universal and dynamic knowledge ecosystem, referred to as The World Avatar (TWA).
Affiliation(s)
- Aleksandar Kondinski
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Jiaru Bai
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Sebastian Mosbach
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
  - CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, 138602 Singapore
- Jethro Akroyd
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
  - CMCL Innovations, Sheraton House, Castle Park, Cambridge CB3 0AX, U.K.
- Markus Kraft
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
  - CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, 138602 Singapore
  - School of Chemical and Biomedical Engineering, Nanyang Technological University, 62 Nanyang Drive, 637459 Singapore
|
7
|
Seidenberg JR, Khan AA, Lapkin AA. Boosting autonomous process design and intensification with formalized domain knowledge. Comput Chem Eng 2022. [DOI: 10.1016/j.compchemeng.2022.108097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
|
8
|
Zhu LT, Chen XZ, Ouyang B, Yan WC, Lei H, Chen Z, Luo ZH. Review of Machine Learning for Hydrodynamics, Transport, and Reactions in Multiphase Flows and Reactors. Ind Eng Chem Res 2022. [DOI: 10.1021/acs.iecr.2c01036] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Affiliation(s)
- Li-Tao Zhu
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
- Xi-Zhong Chen
  - Department of Chemical and Biological Engineering, University of Sheffield, Sheffield, S1 3JD, U.K.
- Bo Ouyang
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
- Wei-Cheng Yan
  - School of Chemistry and Chemical Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China
- He Lei
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
- Zhe Chen
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
- Zheng-Hong Luo
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
|
9
|
|
10
|
Kondinski A, Rasmussen M, Mangelsen S, Pienack N, Simjanoski V, Näther C, Stares DL, Schalley CA, Bensch W. Composition-driven archetype dynamics in polyoxovanadates. Chem Sci 2022; 13:6397-6412. [PMID: 35733899 PMCID: PMC9159092 DOI: 10.1039/d2sc01004f] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/16/2022] [Accepted: 04/29/2022] [Indexed: 12/13/2022] Open
Abstract
Molecular metal oxides often adopt common structural frameworks (i.e. archetypes), many of them boasting impressive structural robustness and stability. However, the ability to adapt and to undergo transformations between different structural archetypes is a desirable material design feature offering applicability in different environments. Using a systems thinking approach that integrates synthetic, analytical and computational techniques, we explore the transformations governing the chemistry of polyoxovanadates (POVs) constructed of arsenate and vanadate building units. The water-soluble salt of the low-nuclearity polyanion [V6As8O26]⁴⁻ can be effectively used for the synthesis of the larger spherical (i.e. kegginoidal) mixed-valent [V12As8O40]⁴⁻ precipitate, while the novel [V10As12O40]⁸⁻ POVs, which have tubular cyclic structures, are another, readily soluble product. Surprisingly, in contrast to the common observation that high-nuclearity polyoxometalate (POM) clusters are fragmented to form smaller moieties in solution, the low-nuclearity [V6As8O26]⁴⁻ anion is transformed in situ into the higher-nuclearity cluster anions. The obtained products support a conceptually new model that is outlined in this article and that describes a continuous evolution between spherical and cyclic POV assemblies. This new model represents a milestone on the way to rational and designable POV self-assemblies. Systems-based elucidation of the polyoxovanadate speciation reveals that heterogroup substitution can transform spherical kegginoids into tubular architectures in a programmable manner.
Affiliation(s)
- Aleksandar Kondinski
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
- Maren Rasmussen
  - Institut für Anorganische Chemie, Christian-Albrechts-Universität zu Kiel, 24118 Kiel, Germany
- Sebastian Mangelsen
  - Institut für Anorganische Chemie, Christian-Albrechts-Universität zu Kiel, 24118 Kiel, Germany
- Nicole Pienack
  - Institut für Anorganische Chemie, Christian-Albrechts-Universität zu Kiel, 24118 Kiel, Germany
- Viktor Simjanoski
  - Primer affiliate of University of Chicago Master Program, Chicago, IL, USA
- Christian Näther
  - Institut für Anorganische Chemie, Christian-Albrechts-Universität zu Kiel, 24118 Kiel, Germany
- Daniel L Stares
  - Institut für Chemie und Biochemie der Freien Universität Berlin, Arnimallee 20, 14195 Berlin, Germany
- Christoph A Schalley
  - Institut für Chemie und Biochemie der Freien Universität Berlin, Arnimallee 20, 14195 Berlin, Germany
- Wolfgang Bensch
  - Institut für Anorganische Chemie, Christian-Albrechts-Universität zu Kiel, 24118 Kiel, Germany
|
11
|
Xing Y, Dong Y, Georgakis C, Zhuang Y, Zhang L, Du J, Meng Q. Automatic Data-driven Stoichiometry Identification and Kinetic Modeling Framework for Homogeneous Organic Reactions. AIChE J 2022. [DOI: 10.1002/aic.17713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- Yafeng Xing
  - School of Chemical Engineering, Institute of Chemical Process Systems Engineering, Dalian University of Technology, Dalian, Liaoning, China
- Yachao Dong
  - School of Chemical Engineering, Institute of Chemical Process Systems Engineering, Dalian University of Technology, Dalian, Liaoning, China
- Christos Georgakis
  - Chemical and Biological Engineering and Systems Research Institute, Tufts University, Medford, Massachusetts, USA
- Yu Zhuang
  - School of Chemical Engineering, Institute of Chemical Process Systems Engineering, Dalian University of Technology, Dalian, Liaoning, China
- Lei Zhang
  - School of Chemical Engineering, Institute of Chemical Process Systems Engineering, Dalian University of Technology, Dalian, Liaoning, China
- Jian Du
  - School of Chemical Engineering, Institute of Chemical Process Systems Engineering, Dalian University of Technology, Dalian, Liaoning, China
- Qingwei Meng
  - State Key Laboratory of Fine Chemicals, School of Pharmaceutical Science and Technology Department, Dalian University of Technology, Dalian, China
|
12
|
Stocker M, Heger T, Schweidtmann A, Ćwiek-Kupczyńska H, Penev L, Dojchinovski M, Willighagen E, Vidal ME, Turki H, Balliet D, Tiddi I, Kuhn T, Mietchen D, Karras O, Vogt L, Hellmann S, Jeschke J, Krajewski P, Auer S. SKG4EOSC - Scholarly Knowledge Graphs for EOSC: Establishing a backbone of knowledge graphs for FAIR Scholarly Information in EOSC. Research Ideas and Outcomes 2022. [DOI: 10.3897/rio.8.e83789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
In the age of advanced information systems powering fast-paced knowledge economies that face global societal challenges, it is no longer adequate to express scholarly information - an essential resource for modern economies - primarily as article narratives in document form. Despite being a well-established tradition in scholarly communication, PDF-based text publishing is hindering scientific progress as it buries scholarly information in non-machine-readable formats. The key objective of SKG4EOSC is to improve science productivity through the development and implementation of services for text and data conversion, and for the production, curation, and re-use of FAIR scholarly information. This will be achieved by (1) establishing the Open Research Knowledge Graph (ORKG, orkg.org), a service operated by the SKG4EOSC coordinator, as a Hub for access to FAIR scholarly information in the EOSC; (2) lifting numerous and heterogeneous domain-specific research infrastructures to the EOSC through the ORKG Hub's harmonized access facilities; and (3) leveraging the Hub to support cross-disciplinary research and policy decisions addressing societal challenges. SKG4EOSC will pilot the devised approaches and technologies in four research domains: biodiversity crisis, precision oncology, circular processes, and human cooperation. By aiming to improve machine-based use of scholarly information, SKG4EOSC addresses an important current and future need of researchers. It extends the application of the FAIR data principles to scholarly communication practices, and hence provides more comprehensive coverage of the entire research lifecycle. Through explicit, machine-actionable provenance links between FAIR scholarly information, primary data and contextual entities, it will substantially contribute to reproducibility, validation and trust in science. The resulting advanced machine support will catalyse new discoveries in basic research and solutions in key application areas.
|