1
Pasquini M, Stenta M. LinChemIn: Route Arithmetic - Operations on Digital Synthetic Routes. J Chem Inf Model 2024; 64:1765-1771. [PMID: 38480486 DOI: 10.1021/acs.jcim.3c01819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Computational tools are revolutionizing our understanding and prediction of chemical reactivity by combining traditional data analysis techniques with new predictive models. These tools extract additional value from the reaction data corpus, but to effectively convert this value into actionable knowledge, domain specialists need to interact easily with the computer-generated output. In this application note, we demonstrate the capabilities of the open-source Python toolkit LinChemIn, which simplifies the manipulation of reaction networks and provides advanced functionality for working with synthetic routes. LinChemIn ensures chemical consistency when merging, editing, mining, and analyzing reaction networks. Its flexible input interface can process routes from various sources, including predictive models and expert input. The toolkit also efficiently extracts individual routes from the combined synthetic tree, identifying alternative paths and reaction combinations. By reducing the operational barrier to accessing and analyzing synthetic routes from multiple sources, LinChemIn facilitates a constructive interplay between artificial intelligence and human expertise.
Affiliation(s)
- Marta Pasquini
  - Syngenta Crop Protection AG, Schaffhauserstrasse, 4332 Stein, AG, Switzerland
- Marco Stenta
  - Syngenta Crop Protection AG, Schaffhauserstrasse, 4332 Stein, AG, Switzerland
2
Pasquini M, Stenta M. LinChemIn: SynGraph - a data model and a toolkit to analyze and compare synthetic routes. J Cheminform 2023; 15:41. [PMID: 37005691 PMCID: PMC10067316 DOI: 10.1186/s13321-023-00714-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 03/20/2023] [Indexed: 04/04/2023] Open
Abstract
BACKGROUND The increasing amount of chemical reaction data makes traditional ways to navigate its corpus less effective, while the demand for novel approaches and instruments is rising. Recent data science and machine learning techniques support the development of new ways to extract value from the available reaction data. On the one side, Computer-Aided Synthesis Planning tools can predict synthetic routes in a model-driven approach; on the other side, experimental routes can be extracted from the Network of Organic Chemistry, in which reaction data are linked in a network. In this context, the need to combine, compare and analyze synthetic routes generated by different sources arises naturally. RESULTS Here we present LinChemIn, a python toolkit that allows chemoinformatics operations on synthetic routes and reaction networks. Wrapping some third-party packages for handling graph arithmetic and chemoinformatics and implementing new data models and functionalities, LinChemIn allows the interconversion between data formats and data models and enables route-level analysis and operations, including route comparison and descriptors calculation. Object-Oriented Design principles inspire the software architecture, and the modules are structured to maximize code reusability and support code testing and refactoring. The code structure should facilitate external contributions, thus encouraging open and collaborative software development. CONCLUSIONS The current version of LinChemIn allows users to combine synthetic routes generated from various tools and analyze them, and constitutes an open and extensible framework capable of incorporating contributions from the community and fostering scientific discussion. Our roadmap envisages the development of sophisticated metrics for routes evaluation, a multi-parameter scoring system, and the implementation of an entire "ecosystem" of functionalities operating on synthetic routes. 
LinChemIn is freely available at https://github.com/syngenta/linchemin.
Affiliation(s)
- Marta Pasquini
  - Syngenta Crop Protection AG, Schaffhauserstrasse, 4332 Stein, AG, Switzerland
- Marco Stenta
  - Syngenta Crop Protection AG, Schaffhauserstrasse, 4332 Stein, AG, Switzerland
3
Andronov M, Voinarovska V, Andronova N, Wand M, Clevert DA, Schmidhuber J. Reagent prediction with a molecular transformer improves reaction data quality. Chem Sci 2023; 14:3235-3246. [PMID: 36970100 PMCID: PMC10034139 DOI: 10.1039/d2sc06798f] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/09/2022] [Accepted: 02/12/2023] [Indexed: 03/05/2023] Open
Abstract
Automated synthesis planning is key for efficient generative chemistry. Since reactions of given reactants may yield different products depending on conditions such as the chemical context imposed by specific reagents, computer-aided synthesis planning should benefit from recommendations of reaction conditions. Traditional synthesis planning software, however, typically proposes reactions without specifying such conditions, relying on human organic chemists who know the conditions to carry out suggested reactions. In particular, reagent prediction for arbitrary reactions, a crucial aspect of condition recommendation, has been largely overlooked in cheminformatics until recently. Here we employ the Molecular Transformer, a state-of-the-art model for reaction prediction and single-step retrosynthesis, to tackle this problem. We train the model on the US patents dataset (USPTO) and test it on Reaxys to demonstrate its out-of-distribution generalization capabilities. Our reagent prediction model also improves the quality of product prediction: the Molecular Transformer is able to substitute the reagents in the noisy USPTO data with reagents that enable product prediction models to outperform those trained on plain USPTO. This makes it possible to improve upon the state-of-the-art in reaction product prediction on the USPTO MIT benchmark.
Affiliation(s)
- Mikhail Andronov
  - IDSIA, USI, SUPSI 6900 Lugano Switzerland
  - Machine Learning Research, Pfizer Worldwide Research Development and Medical Linkstr.10 Berlin Germany
- Varvara Voinarovska
  - Institute of Structural Biology, Molecular Targets and Therapeutics Center, Helmholtz Munich - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH) 85764 Neuherberg Germany
- Michael Wand
  - IDSIA, USI, SUPSI 6900 Lugano Switzerland
  - Institute for Digital Technologies for Personalized Healthcare, SUPSI 6900 Lugano Switzerland
- Djork-Arné Clevert
  - Machine Learning Research, Pfizer Worldwide Research Development and Medical Linkstr.10 Berlin Germany
4
Menon A, Pascazio L, Nurkowski D, Farazi F, Mosbach S, Akroyd J, Kraft M. OntoPESScan: An Ontology for Potential Energy Surface Scans. ACS OMEGA 2023; 8:2462-2475. [PMID: 36687109 PMCID: PMC9850739 DOI: 10.1021/acsomega.2c06948] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/28/2022] [Accepted: 11/30/2022] [Indexed: 06/17/2023]
Abstract
In this work, a new OntoPESScan ontology is developed for the semantic representation of one-dimensional potential energy surface (PES) scans, a central concept in computational chemistry. This ontology is developed in line with knowledge graph principles and The World Avatar (TWA) project. OntoPESScan is linked to other ontologies for chemistry in TWA, including OntoSpecies, which helps uniquely identify species along the PES and access their properties, and OntoCompChem, which allows the association of potential energy surfaces with quantum chemical calculations and the concepts used to derive them. A force-field fitting agent is also developed that makes use of the information in the OntoPESScan ontology to fit force fields to reactive surfaces of interest on the fly by making use of the empirical valence bond methodology. This agent is demonstrated to successfully parametrize two cases, namely, a PES scan on ethanol and a PES scan on a localized π-radical PAH hypothesized to play a role in soot formation during combustion. OntoPESScan is an extension to the capabilities of TWA and, in conjunction with potential further ontological support for molecular dynamics and reactions, will further progress toward an open, continuous, and self-growing knowledge graph for chemistry.
Affiliation(s)
- Angiras Menon
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Laura Pascazio
  - CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, Singapore 138602
- Daniel Nurkowski
  - CMCL Innovations, Sheraton House, Castle Park, Cambridge CB3 0AX, U.K.
- Feroz Farazi
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Sebastian Mosbach
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
  - CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, Singapore 138602
- Jethro Akroyd
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
  - CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, Singapore 138602
- Markus Kraft
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
  - CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, Singapore 138602
  - School of Chemical and Biomedical Engineering, Nanyang Technological University, 62 Nanyang Drive, Singapore 637459
  - The Alan Turing Institute, London NW1 2BD, United Kingdom
5
Hooe SL, Ellis GA, Medintz IL. Alternative design strategies to help build the enzymatic retrosynthesis toolbox. RSC Chem Biol 2022; 3:1301-1313. [PMID: 36349225 PMCID: PMC9627731 DOI: 10.1039/d2cb00096b] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/06/2022] [Accepted: 09/11/2022] [Indexed: 05/30/2024] Open
Abstract
Most of the complex molecules found in nature still cannot be synthesized by current organic chemistry methods. Given the number of enzymes that exist in nature and the incredible potential of directed evolution, the field of synthetic biology contains perhaps all the necessary building blocks to bring about the realization of applied enzymatic retrosynthesis. Current thinking anticipates that enzymatic retrosynthesis will be implemented using conventional cell-based synthetic biology approaches where requisite native, heterologous, designer, and evolved enzymes making up a given multi-enzyme pathway are hosted by chassis organisms to carry out designer synthesis. In this perspective, we suggest that such an effort should not be limited by solely exploiting living cells and enzyme evolution and describe some useful yet less intensive complementary approaches that may prove especially productive in this grand scheme. By decoupling reactions from the environment of a living cell, a significantly larger portion of potential synthetic chemical space becomes available for exploration; most of this area is currently unavailable to cell-based approaches due to toxicity issues. In contrast, in a cell-free reaction a variety of classical enzymatic approaches can be exploited to improve performance and explore and understand a given enzyme's substrate specificity and catalytic profile towards non-natural substrates. We expect these studies will reveal unique enzymatic capabilities that are not accessible in living cells.
Affiliation(s)
- Shelby L Hooe
  - Center for Bio/Molecular Science and Engineering Code 6900, U.S. Naval Research Laboratory Washington DC 20375 USA
  - National Research Council Washington DC 20001 USA
- Gregory A Ellis
  - Center for Bio/Molecular Science and Engineering Code 6900, U.S. Naval Research Laboratory Washington DC 20375 USA
- Igor L Medintz
  - Center for Bio/Molecular Science and Engineering Code 6900, U.S. Naval Research Laboratory Washington DC 20375 USA
6
Garay-Ruiz D, Álvarez-Moreno M, Bo C, Martínez-Núñez E. New Tools for Taming Complex Reaction Networks: The Unimolecular Decomposition of Indole Revisited. ACS PHYSICAL CHEMISTRY AU 2022; 2:225-236. [PMID: 36855573 PMCID: PMC9718323 DOI: 10.1021/acsphyschemau.1c00051] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 11/28/2022]
Abstract
The level of detail attained in the computational description of reaction mechanisms can be vastly improved through tools for automated chemical space exploration, particularly for systems of small to medium size. Under this approach, the unimolecular decomposition landscape for indole was explored through the automated reaction mechanism discovery program AutoMeKin. Nevertheless, the sheer complexity of the obtained mechanisms might be a hindrance regarding their chemical interpretation. In this spirit, the new Python library amk-tools has been designed to read and manipulate complex reaction networks, greatly simplifying their overall analysis. The package provides interactive dashboards featuring visualizations of the network, the three-dimensional (3D) molecular structures and vibrational normal modes of all chemical species, and the corresponding energy profiles for selected pathways. The combination of the joined mechanism generation and postprocessing workflow with the rich chemistry of indole decomposition enabled us to find new details of the reaction (obtained at the CCSD(T)/aug-cc-pVTZ//M06-2X/MG3S level of theory) that were not reported before: (i) 16 pathways leading to the formation of HCN and NH3 (via amino radical); (ii) a barrierless reaction between methylene radical and phenyl isocyanide, which might be an operative mechanism under the conditions of the interstellar medium; and (iii) reaction channels leading to both hydrogen cyanide and hydrogen isocyanide, of potential astrochemical interest as the computed HNC/HCN ratios greatly exceed the calculated equilibrium value at very low temperatures. The reported reaction networks can be very valuable to supplement databases of kinetic data, which is of remarkable interest for pyrolysis and astrochemical studies.
Affiliation(s)
- Diego Garay-Ruiz
  - Institute of Chemical Research of Catalonia (ICIQ), Barcelona Institute of Science & Technology (BIST), Avinguda Països Catalans, 16, 43007 Tarragona, Spain
  - Departament de Química Física i Inorgànica, Universitat Rovira i Virgili (URV), Marcel·lí Domingo s/n, 43007 Tarragona, Spain
- Moises Álvarez-Moreno
  - Institute of Chemical Research of Catalonia (ICIQ), Barcelona Institute of Science & Technology (BIST), Avinguda Països Catalans, 16, 43007 Tarragona, Spain
- Carles Bo
  - Institute of Chemical Research of Catalonia (ICIQ), Barcelona Institute of Science & Technology (BIST), Avinguda Països Catalans, 16, 43007 Tarragona, Spain
  - Departament de Química Física i Inorgànica, Universitat Rovira i Virgili (URV), Marcel·lí Domingo s/n, 43007 Tarragona, Spain
- Emilio Martínez-Núñez
  - Departamento de Química Física, Facultade de Química, Universidade de Santiago de Compostela, 15782 Santiago de Compostela, Spain
7
Szymkuć S, Badowski T, Grzybowski BA. Is Organic Chemistry Really Growing Exponentially? Angew Chem Int Ed Engl 2021. [DOI: 10.1002/ange.202111540] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022]
Affiliation(s)
- Sara Szymkuć
  - Institute of Organic Chemistry Polish Academy of Sciences Ul. Kasprzaka 44/52 01-224 Warsaw Poland
  - Allchemy, Inc. Highland IN USA
- Tomasz Badowski
  - Institute of Organic Chemistry Polish Academy of Sciences Ul. Kasprzaka 44/52 01-224 Warsaw Poland
  - Allchemy, Inc. Highland IN USA
- Bartosz A. Grzybowski
  - Institute of Organic Chemistry Polish Academy of Sciences Ul. Kasprzaka 44/52 01-224 Warsaw Poland
  - Allchemy, Inc. Highland IN USA
  - IBS Center for Soft and Living Matter and Department of Chemistry UNIST 50, UNIST-gil, Eonyang-eup, Ulju-gun Ulsan South Korea
8
Szymkuć S, Badowski T, Grzybowski BA. Is Organic Chemistry Really Growing Exponentially? Angew Chem Int Ed Engl 2021; 60:26226-26232. [PMID: 34558168 DOI: 10.1002/anie.202111540] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/25/2021] [Indexed: 11/05/2022]
Abstract
In terms of molecules and specific reaction examples, organic chemistry features impressive, exponential growth. However, new reaction classes/types that fuel this growth are being discovered at a much slower and only linear (or even sublinear) rate. The proportion of newly discovered reaction types to all reactions being performed keeps decreasing, suggesting that synthetic chemistry is becoming more reliant on reusing well-known methods. The newly discovered chemistries are more complex than those of decades ago and allow for the rapid construction of complex scaffolds in fewer steps. We study these and other trends as a function of time, reaction-type popularity, and complexity, based on an algorithm that extracts generalized reaction class templates. These analyses are useful in the context of computer-assisted synthesis, machine learning (to estimate the numbers of models with sufficient reaction statistics), and identifying erroneous entries in reaction databases.
Affiliation(s)
- Sara Szymkuć
  - Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224, Warsaw, Poland
  - Allchemy, Inc., Highland, IN, USA
- Tomasz Badowski
  - Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224, Warsaw, Poland
  - Allchemy, Inc., Highland, IN, USA
- Bartosz A Grzybowski
  - Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224, Warsaw, Poland
  - Allchemy, Inc., Highland, IN, USA
  - IBS Center for Soft and Living Matter and Department of Chemistry, UNIST, 50, UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan, South Korea
9
Weber JM, Guo Z, Zhang C, Schweidtmann AM, Lapkin AA. Chemical data intelligence for sustainable chemistry. Chem Soc Rev 2021; 50:12013-12036. [PMID: 34520507 DOI: 10.1039/d1cs00477h] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 02/05/2023]
Abstract
This study highlights new opportunities for optimal reaction route selection from large chemical databases brought about by the rapid digitalisation of chemical data. The chemical industry requires a transformation towards more sustainable practices, eliminating its dependencies on fossil fuels and limiting its impact on the environment. However, identifying more sustainable process alternatives is, at present, a cumbersome, manual, iterative process, based on chemical intuition and modelling. We give a perspective on methods for automated discovery and assessment of competitive sustainable reaction routes based on renewable or waste feedstocks. Three key areas of transition are outlined and reviewed based on their state-of-the-art as well as bottlenecks: (i) data, (ii) evaluation metrics, and (iii) decision-making. We elucidate their synergies and interfaces since only together these areas can bring about the most benefit. The field of chemical data intelligence offers the opportunity to identify the inherently more sustainable reaction pathways and to identify opportunities for a circular chemical economy. Our review shows that at present the field of data brings about most bottlenecks, such as data completion and data linkage, but also offers the principal opportunity for advancement.
Affiliation(s)
- Jana M Weber
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
  - Chemical Data Intelligence (CDI) Pte Ltd, Robinson Road, #02-00, 068898, Singapore
- Zhen Guo
  - Chemical Data Intelligence (CDI) Pte Ltd, Robinson Road, #02-00, 068898, Singapore
  - Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd. 1 CREATE Way, CREATE Tower #05-05, 138602, Singapore
- Chonghuan Zhang
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
- Artur M Schweidtmann
  - Department of Chemical Engineering, Delft University of Technology, Van der Maasweg 9, Delft 2629 HZ, The Netherlands
- Alexei A Lapkin
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
  - Chemical Data Intelligence (CDI) Pte Ltd, Robinson Road, #02-00, 068898, Singapore
  - Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd. 1 CREATE Way, CREATE Tower #05-05, 138602, Singapore
10
Schweidtmann AM, Esche E, Fischer A, Kloft M, Repke J, Sager S, Mitsos A. Machine Learning in Chemical Engineering: A Perspective. CHEM-ING-TECH 2021. [DOI: 10.1002/cite.202100083] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 01/12/2023]
Affiliation(s)
- Artur M. Schweidtmann
  - Delft University of Technology Department of Chemical Engineering Van der Maasweg 9 2629 HZ Delft The Netherlands
  - RWTH Aachen University Aachener Verfahrenstechnik Forckenbeckstr. 51 52074 Aachen Germany
- Erik Esche
  - Technische Universität Berlin Fachgebiet Dynamik und Betrieb technischer Anlagen Straße des 17. Juni 135 10623 Berlin Germany
- Asja Fischer
  - Ruhr-Universität Bochum Department of Mathematics Universitätsstraße 150 44801 Bochum Germany
- Marius Kloft
  - Technische Universität Kaiserslautern Department of Computer Science Erwin-Schrödinger-Straße 52 67663 Kaiserslautern Germany
- Jens‐Uwe Repke
  - Technische Universität Berlin Fachgebiet Dynamik und Betrieb technischer Anlagen Straße des 17. Juni 135 10623 Berlin Germany
- Sebastian Sager
  - Otto-von-Guericke-Universität Magdeburg Department of Mathematics Universitätsplatz 2 39106 Magdeburg Germany
- Alexander Mitsos
  - RWTH Aachen University Aachener Verfahrenstechnik Forckenbeckstr. 51 52074 Aachen Germany
  - JARA Center for Simulation and Data Science (CSD) Aachen Germany
  - Forschungszentrum Jülich Institute for Energy and Climate Research IEK-10 Energy Systems Engineering Wilhelm-Johnen-Straße 52428 Jülich Germany
11
Martínez-Núñez E, Barnes GL, Glowacki DR, Kopec S, Peláez D, Rodríguez A, Rodríguez-Fernández R, Shannon RJ, Stewart JJP, Tahoces PG, Vazquez SA. AutoMeKin2021: An open-source program for automated reaction discovery. J Comput Chem 2021; 42:2036-2048. [PMID: 34387374 DOI: 10.1002/jcc.26734] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 05/07/2021] [Revised: 07/16/2021] [Accepted: 07/27/2021] [Indexed: 01/10/2023]
Abstract
AutoMeKin2021 is an updated version of tsscds2018, a program for the automated discovery of reaction mechanisms (J. Comput. Chem. 2018, 39, 1922). This release features a number of new capabilities: rare-event molecular dynamics simulations to enhance reaction discovery, extension of the original search algorithm to study van der Waals complexes, use of chemical knowledge, a new search algorithm based on bond-order time series analysis, statistics of the chemical reaction networks, a web application to submit jobs, and other features. The source code, manual, installation instructions and the website link are available at: https://rxnkin.usc.es/index.php/AutoMeKin.
Affiliation(s)
- Emilio Martínez-Núñez
  - Department of Physical Chemistry, University of Santiago de Compostela, Santiago de Compostela, Spain
- George L Barnes
  - Department of Chemistry and Biochemistry, Siena College, Loudonville, New York, USA
- David R Glowacki
  - Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, UK
- Sabine Kopec
  - Institut de Sciences Moléculaires d'Orsay, UMR 8214, Université Paris-Sud - Université Paris-Saclay, Orsay, France
- Daniel Peláez
  - Institut de Sciences Moléculaires d'Orsay, UMR 8214, Université Paris-Sud - Université Paris-Saclay, Orsay, France
- Aurelio Rodríguez
  - Galicia Supercomputing Center (CESGA), Santiago de Compostela, Spain
- Robin J Shannon
  - Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, UK
- Pablo G Tahoces
  - Department of Electronics and Computer Science, University of Santiago de Compostela, Santiago de Compostela, Spain
- Saulo A Vazquez
  - Department of Physical Chemistry, University of Santiago de Compostela, Santiago de Compostela, Spain
12
Thakkar A, Johansson S, Jorner K, Buttar D, Reymond JL, Engkvist O. Artificial intelligence and automation in computer aided synthesis planning. REACT CHEM ENG 2021. [DOI: 10.1039/d0re00340a] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/13/2022]
Abstract
In this perspective we deal with questions pertaining to the development of synthesis planning technologies over the course of recent years.
Affiliation(s)
- Amol Thakkar
  - Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg
- Kjell Jorner
  - Early Chemical Development, Pharmaceutical Sciences, R&D, AstraZeneca, Macclesfield
- David Buttar
  - Early Chemical Development, Pharmaceutical Sciences, R&D, AstraZeneca, Macclesfield
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, 3012 Bern, Switzerland
- Ola Engkvist
  - Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg
13
Tran QP, Adam ZR, Fahrenbach AC. Prebiotic Reaction Networks in Water. Life (Basel) 2020; 10:E352. [PMID: 33339192 PMCID: PMC7765580 DOI: 10.3390/life10120352] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/30/2020] [Revised: 12/05/2020] [Accepted: 12/06/2020] [Indexed: 02/07/2023] Open
Abstract
A prevailing strategy in origins of life studies is to explore how chemistry constrained by hypothetical prebiotic conditions could have led to molecules and system level processes proposed to be important for life's beginnings. This strategy has yielded model prebiotic reaction networks that elucidate pathways by which relevant compounds can be generated, in some cases, autocatalytically. These prebiotic reaction networks provide a rich platform for further understanding and development of emergent "life-like" behaviours. In this review, recent advances in experimental and analytical procedures associated with classical prebiotic reaction networks, like formose and Miller-Urey, as well as more recent ones are highlighted. Instead of polymeric networks, i.e., those based on nucleic acids or peptides, the focus is on small molecules. The future of prebiotic chemistry lies in better understanding the genuine complexity that can result from reaction networks and the construction of a centralised database of reactions useful for predicting potential network evolution is emphasised.
Affiliation(s)
- Zachary R. Adam
  - Department of Planetary Sciences, University of Arizona, Tucson, AZ 85721, USA
14
Stocker S, Csányi G, Reuter K, Margraf JT. Machine learning in chemical reaction space. Nat Commun 2020; 11:5505. [PMID: 33127879 PMCID: PMC7603480 DOI: 10.1038/s41467-020-19267-x] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 05/13/2020] [Accepted: 10/01/2020] [Indexed: 12/29/2022] Open
Abstract
Chemical compound space refers to the vast set of all possible chemical compounds, estimated to contain 10^60 molecules. While intractable as a whole, modern machine learning (ML) is increasingly capable of accurately predicting molecular properties in important subsets. Here, we therefore engage in the ML-driven study of even larger reaction space. Central to chemistry as a science of transformations, this space contains all possible chemical reactions. As an important basis for 'reactive' ML, we establish a first-principles database (Rad-6) containing closed and open-shell organic molecules, along with an associated database of chemical reaction energies (Rad-6-RE). We show that the special topology of reaction spaces, with central hub molecules involved in multiple reactions, requires a modification of existing compound space ML-concepts. Showcased by the application to methane combustion, we demonstrate that the learned reaction energies offer a non-empirical route to rationally extract reduced reaction networks for detailed microkinetic analyses.
Affiliation(s)
- Sina Stocker
  - Chair of Theoretical Chemistry and Catalysis Research Center, Technische Universität München, Garching, Germany
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, UK
- Karsten Reuter
  - Chair of Theoretical Chemistry and Catalysis Research Center, Technische Universität München, Garching, Germany
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, Berlin, Germany
- Johannes T Margraf
  - Chair of Theoretical Chemistry and Catalysis Research Center, Technische Universität München, Garching, Germany
15
Johansson S, Thakkar A, Kogej T, Bjerrum E, Genheden S, Bastys T, Kannas C, Schliep A, Chen H, Engkvist O. AI-assisted synthesis prediction. DRUG DISCOVERY TODAY. TECHNOLOGIES 2020; 32-33:65-72. [PMID: 33386096 DOI: 10.1016/j.ddtec.2020.06.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/16/2020] [Revised: 06/01/2020] [Accepted: 06/10/2020] [Indexed: 11/25/2022]
Abstract
Application of AI technologies in synthesis prediction has developed very rapidly in recent years. We attempt here to give a comprehensive summary of the latest advances in retrosynthesis planning, forward synthesis prediction, and quantum chemistry-based reaction prediction models. Besides an introduction to the AI/ML models for addressing various synthesis-related problems, the sources of the reaction datasets used in model building are also covered. In addition to the predictive models, robotics-based high-throughput experimentation technology will be another crucial factor for conducting synthesis in an automated fashion. Some state-of-the-art high-throughput experimentation practices carried out in the pharmaceutical industry are highlighted in this chapter to give the reader a sense of how future chemistry will be conducted to make compounds faster and cheaper.
Affiliation(s)
- Simon Johansson
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Sweden; Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden.
| | - Amol Thakkar
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Sweden; Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Thierry Kogej
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Sweden
| | - Esben Bjerrum
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Sweden
| | - Samuel Genheden
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Sweden
| | - Tomas Bastys
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Sweden
| | - Christos Kannas
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Sweden
| | - Alexander Schliep
- Department of Computer Science and Engineering, University of Gothenburg, Gothenburg, Sweden
| | - Hongming Chen
- Centre of Chemistry and Chemical Biology, Guangzhou Regenerative Medicine and Health - Guangdong Laboratory, Guangzhou 510530, China
| | - Ola Engkvist
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Sweden
| |
|
16
|
|
17
|
Jara‐Toro RA, Pino GA, Glowacki DR, Shannon RJ, Martínez‐Núñez E. Enhancing Automated Reaction Discovery with Boxed Molecular Dynamics in Energy Space. CHEMSYSTEMSCHEM 2019. [DOI: 10.1002/syst.201900024] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 01/14/2023]
Affiliation(s)
- Rafael A. Jara‐Toro
- INIFIQC (CONICET-UNC), Dpto. de Fisicoquímica, Facultad de Ciencias Químicas, Centro Láser de Ciencias Moleculares, Universidad de Córdoba, Ciudad Universitaria, X50000HUA Córdoba, Argentina
| | - Gustavo A. Pino
- INIFIQC (CONICET-UNC), Dpto. de Fisicoquímica, Facultad de Ciencias Químicas, Centro Láser de Ciencias Moleculares, Universidad de Córdoba, Ciudad Universitaria, X50000HUA Córdoba, Argentina
| | - David R. Glowacki
- Centre for Computational Chemistry, School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
| | - Robin J. Shannon
- Centre for Computational Chemistry, School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
| | - Emilio Martínez‐Núñez
- Departamento de Química Física, Facultade de Química, Universidade de Santiago de Compostela, 15782 Santiago de Compostela, Spain
| |
|
18
|
Lin GM, Warden-Rothman R, Voigt CA. Retrosynthetic design of metabolic pathways to chemicals not found in nature. CURRENT OPINION IN SYSTEMS BIOLOGY 2019. [DOI: 10.1016/j.coisb.2019.04.004] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Indexed: 12/23/2022]
|
19
|
Weber JM, Lió P, Lapkin AA. Identification of strategic molecules for future circular supply chains using large reaction networks. REACT CHEM ENG 2019. [DOI: 10.1039/c9re00213h] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/12/2023]
Abstract
Networks of chemical reactions represent relationships between molecules within chemical supply chains and promise to enhance planning of multi-step synthesis routes from bio-renewable feedstocks.
Affiliation(s)
- Jana Marie Weber
- Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Cambridge CB3 0AS, UK
| | - Pietro Lió
- Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, UK
| | - Alexei A. Lapkin
- Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Cambridge CB3 0AS, UK
| |
|
20
|
Abstract
The Internet of Things (IoT), Industry 4.0, and the digitalization of business processes offer new opportunities and business models for the process industry, including education and training.
Affiliation(s)
- Norbert Kockmann
- Laboratory of Equipment Design, Department of Biochemical and Chemical Engineering, Dortmund, Germany
| |
|
21
|
Kim H, Smith HB, Mathis C, Raymond J, Walker SI. Universal scaling across biochemical networks on Earth. SCIENCE ADVANCES 2019; 5:eaau0149. [PMID: 30746442 PMCID: PMC6357746 DOI: 10.1126/sciadv.aau0149] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 05/01/2018] [Accepted: 12/06/2018] [Indexed: 06/09/2023]
Abstract
The application of network science to biology has advanced our understanding of the metabolism of individual organisms and the organization of ecosystems but has scarcely been applied to life at a planetary scale. To characterize planetary-scale biochemistry, we constructed biochemical networks using a global database of 28,146 annotated genomes and metagenomes and 8658 cataloged biochemical reactions. We uncover scaling laws governing biochemical diversity and network structure shared across levels of organization from individuals to ecosystems, to the biosphere as a whole. Comparing real biochemical reaction networks to random reaction networks reveals that the observed biological scaling is not a product of chemistry alone but instead emerges due to the particular structure of selected reactions commonly participating in living processes. We show that the topology of biochemical networks for the three domains of life is quantitatively distinguishable, with >80% accuracy in predicting evolutionary domain based on biochemical network size and average topology. Together, our results point to a deeper level of organization in biochemical networks than what has been understood so far.
Affiliation(s)
- Hyunju Kim
- Beyond Center for Fundamental Concepts in Science, Arizona State University, Tempe, AZ, USA
- School of Earth and Space Exploration, Arizona State University, Tempe, AZ, USA
| | - Harrison B. Smith
- School of Earth and Space Exploration, Arizona State University, Tempe, AZ, USA
| | - Cole Mathis
- Beyond Center for Fundamental Concepts in Science, Arizona State University, Tempe, AZ, USA
- Department of Physics, Arizona State University, Tempe, AZ, USA
| | - Jason Raymond
- School of Earth and Space Exploration, Arizona State University, Tempe, AZ, USA
| | - Sara I. Walker
- Beyond Center for Fundamental Concepts in Science, Arizona State University, Tempe, AZ, USA
- School of Earth and Space Exploration, Arizona State University, Tempe, AZ, USA
- ASU-SFI Center for Biosocial Complex Systems, Tempe, AZ, USA
- Blue Marble Space Institute of Science, Seattle, WA, USA
| |
|