1
Raghavan P, Rago AJ, Verma P, Hassan MM, Goshu GM, Dombrowski AW, Pandey A, Coley CW, Wang Y. Incorporating Synthetic Accessibility in Drug Design: Predicting Reaction Yields of Suzuki Cross-Couplings by Leveraging AbbVie's 15-Year Parallel Library Data Set. J Am Chem Soc 2024; 146:15070-15084. [PMID: 38768950 PMCID: PMC11157529 DOI: 10.1021/jacs.4c00098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 04/24/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024]
Abstract
Despite the increased use of computational tools to supplement medicinal chemists' expertise and intuition in drug design, predicting synthetic yields in medicinal chemistry endeavors remains an unsolved challenge. Existing design workflows could profoundly benefit from reaction yield prediction, as precious material waste could be reduced, and a greater number of relevant compounds could be delivered to advance the design, make, test, analyze (DMTA) cycle. In this work, we detail the evaluation of AbbVie's medicinal chemistry library data set to build machine learning models for the prediction of Suzuki coupling reaction yields. The combination of density functional theory (DFT)-derived features and Morgan fingerprints was identified to perform better than one-hot encoded baseline modeling, furnishing encouraging results. Overall, we observe modest generalization to unseen reactant structures within the 15-year retrospective library data set. Additionally, we compare predictions made by the model to those made by expert medicinal chemists, finding that the model can often predict both reaction success and reaction yields with greater accuracy. Finally, we demonstrate the application of this approach to suggest structurally and electronically similar building blocks to replace those predicted or observed to be unsuccessful prior to or after synthesis, respectively. The yield prediction model was used to select similar monomers predicted to have higher yields, resulting in greater synthesis efficiency of relevant drug-like molecules.
Affiliation(s)
- Priyanka Raghavan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, Massachusetts 02139, United States
- Alexander J. Rago
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Pritha Verma
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Majdi M. Hassan
  - RAIDERS Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Gashaw M. Goshu
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Amanda W. Dombrowski
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Abhishek Pandey
  - RAIDERS Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Connor W. Coley
  - Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, Massachusetts 02139, United States
- Ying Wang
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
2
Strieth-Kalthoff F, Szymkuć S, Molga K, Aspuru-Guzik A, Glorius F, Grzybowski BA. Artificial Intelligence for Retrosynthetic Planning Needs Both Data and Expert Knowledge. J Am Chem Soc 2024. [PMID: 38598363 DOI: 10.1021/jacs.4c00338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
Rapid advancements in artificial intelligence (AI) have enabled breakthroughs across many scientific disciplines. In organic chemistry, the challenge of planning complex multistep chemical syntheses should conceptually be well-suited for AI. Yet, the development of AI synthesis planners trained solely on reaction-example-data has stagnated and is not on par with the performance of "hybrid" algorithms combining AI with expert knowledge. This Perspective examines possible causes of these shortcomings, extending beyond the established reasoning of insufficient quantities of reaction data. Drawing attention to the intricacies and data biases that are specific to the domain of synthetic chemistry, we advocate augmenting the unique capabilities of AI with the knowledge base and the reasoning strategies of domain experts. By actively involving synthetic chemists, who are the end users of any synthesis planning software, into the development process, we envision to bridge the gap between computer algorithms and the intricate nature of chemical synthesis.
Affiliation(s)
- Felix Strieth-Kalthoff
  - University of Toronto, Department of Chemistry and Department of Computer Science, 80 St. George St., Toronto, Ontario M5S 3H6, Canada
  - University of Toronto, Department of Computer Science, 10 King's College Road, Toronto, Ontario M5S 3G4, Canada
- Sara Szymkuć
  - Allchemy, 2145 45th Street #201, Highland, Indiana 46322, United States
  - Institute of Organic Chemistry, Polish Academy of Sciences, ul. Kasprzaka 44/52, Warsaw 01-224, Poland
- Karol Molga
  - Allchemy, 2145 45th Street #201, Highland, Indiana 46322, United States
  - Institute of Organic Chemistry, Polish Academy of Sciences, ul. Kasprzaka 44/52, Warsaw 01-224, Poland
- Alán Aspuru-Guzik
  - University of Toronto, Department of Chemistry and Department of Computer Science, 80 St. George St., Toronto, Ontario M5S 3H6, Canada
  - University of Toronto, Department of Computer Science, 10 King's College Road, Toronto, Ontario M5S 3G4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave., Toronto, Ontario M5G 1M1, Canada
  - University of Toronto, Department of Chemical Engineering and Applied Chemistry, 200 College St., Toronto, Ontario M5S 3E5, Canada
  - University of Toronto, Department of Materials Science and Engineering, 184 College St., Toronto, Ontario M5S 3E4, Canada
- Frank Glorius
  - Universität Münster, Organisch-Chemisches Institut, Corrensstr. 36, 48149 Münster, Germany
- Bartosz A Grzybowski
  - Institute of Organic Chemistry, Polish Academy of Sciences, ul. Kasprzaka 44/52, Warsaw 01-224, Poland
  - IBS Center for Algorithmic and Robotized Synthesis, CARS, UNIST 50, UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan 689-798, South Korea
  - Department of Chemistry, UNIST, 50, UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan 689-798, South Korea
3
Dolfus U, Briem H, Gutermuth T, Rarey M. Full Modification Control over Retrosynthetic Routes for Guided Optimization of Lead Structures. J Chem Inf Model 2023; 63:6587-6597. [PMID: 37910814 DOI: 10.1021/acs.jcim.3c01155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023]
Abstract
Synthesizability is essential for compounds designed in silico. Regardless, synthetic accessibility is often considered only as an afterthought in the design and optimization process. In addition, the trend with modern computer-aided drug design methods is going toward full automation and away from the possibility of incorporating user knowledge. With this work, we present the second major release of our software tool, Synthesia, for synthesis-aware lead structure modification, where the user's expertise is now fully utilized. A provided retrosynthetic route is used as a pathway to guide structural modifications that introduce desired structural changes in the target compound. Moreover, the approach allows the user to define the exact position or component in the retrosynthetic route, which should be modified, further integrating the user's expert knowledge. This paper describes the functionality of Synthesia, its basic concepts, and several application scenarios ranging from simple examples to a comparison of the effects of the different exchange functions to an analysis of a set of bioisosteric linker structures, highlighting potential synthetically feasible replacements.
Affiliation(s)
- Uschi Dolfus
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Hans Briem
  - Bayer AG, Research & Development, Pharmaceuticals, Computational Molecular Design Berlin, Building S110, 711, 13342 Berlin, Germany
- Torben Gutermuth
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Matthias Rarey
  - Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
4
Li B, Su S, Zhu C, Lin J, Hu X, Su L, Yu Z, Liao K, Chen H. A deep learning framework for accurate reaction prediction and its application on high-throughput experimentation data. J Cheminform 2023; 15:72. [PMID: 37568183 PMCID: PMC10422736 DOI: 10.1186/s13321-023-00732-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 06/30/2023] [Indexed: 08/13/2023] Open
Abstract
In recent years, artificial intelligence (AI) has begun to bring revolutionary changes to chemical synthesis. However, the lack of suitable ways of representing chemical reactions and the scarcity of reaction data have limited the wider application of AI to reaction prediction. Here, we introduce a novel reaction representation, GraphRXN, for reaction prediction. It utilizes a universal graph-based neural network framework to encode chemical reactions by directly taking two-dimensional reaction structures as inputs. The GraphRXN model was evaluated on three publicly available chemical reaction datasets and gave on-par or superior results compared with other baseline models. To further evaluate the effectiveness of GraphRXN, wet-lab experiments were carried out for the purpose of generating reaction data. A GraphRXN model was then built on high-throughput experimentation data, and a decent accuracy (R² of 0.712) was obtained on our in-house data. This highlights that the GraphRXN model can be deployed in an integrated workflow which combines robotics and AI technologies for forward reaction prediction.
Affiliation(s)
- Baiqing Li
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
- Shimin Su
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
- Chan Zhu
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
- Jie Lin
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
- Xinyue Hu
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
- Lebin Su
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
- Zhunzhun Yu
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
- Kuangbiao Liao
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
- Hongming Chen
  - Guangzhou Laboratory, Guangzhou, 510005, Guangdong, China
5
Mahjour B, Zhang R, Shen Y, McGrath A, Zhao R, Mohamed OG, Lin Y, Zhang Z, Douthwaite JL, Tripathi A, Cernak T. Rapid planning and analysis of high-throughput experiment arrays for reaction discovery. Nat Commun 2023; 14:3924. [PMID: 37400469 DOI: 10.1038/s41467-023-39531-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/22/2023] [Accepted: 06/13/2023] [Indexed: 07/05/2023] Open
Abstract
High-throughput experimentation (HTE) is an increasingly important tool in reaction discovery. While the hardware for running HTE in the chemical laboratory has evolved significantly in recent years, there remains a need for software solutions to navigate data-rich experiments. Here we have developed phactor™, a software that facilitates the performance and analysis of HTE in a chemical laboratory. phactor™ allows experimentalists to rapidly design arrays of chemical reactions or direct-to-biology experiments in 24, 96, 384, or 1,536 wellplates. Users can access online reagent data, such as a chemical inventory, to virtually populate wells with experiments and produce instructions to perform the reaction array manually, or with the assistance of a liquid handling robot. After completion of the reaction array, analytical results can be uploaded for facile evaluation, and to guide the next series of experiments. All chemical data, metadata, and results are stored in machine-readable formats that are readily translatable to various software. We also demonstrate the use of phactor™ in the discovery of several chemistries, including the identification of a low micromolar inhibitor of the SARS-CoV-2 main protease. Furthermore, phactor™ has been made available for free academic use in 24- and 96-well formats via an online interface.
Affiliation(s)
- Babak Mahjour
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Rui Zhang
  - Department of Chemistry, University of Michigan, Ann Arbor, MI, USA
- Yuning Shen
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Andrew McGrath
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Ruheng Zhao
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Osama G Mohamed
  - Natural Products Discovery Core, Life Sciences Institute, University of Michigan, Ann Arbor, MI, USA
- Yingfu Lin
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Zirong Zhang
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- James L Douthwaite
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Ashootosh Tripathi
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
  - Natural Products Discovery Core, Life Sciences Institute, University of Michigan, Ann Arbor, MI, USA
- Tim Cernak
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
  - Department of Chemistry, University of Michigan, Ann Arbor, MI, USA
6
Saebi M, Nan B, Herr JE, Wahlers J, Guo Z, Zurański AM, Kogej T, Norrby PO, Doyle AG, Chawla NV, Wiest O. On the use of real-world datasets for reaction yield prediction. Chem Sci 2023; 14:4997-5005. [PMID: 37206399 PMCID: PMC10189898 DOI: 10.1039/d2sc06041h] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 11/01/2022] [Accepted: 03/09/2023] [Indexed: 09/30/2023] Open
Abstract
The lack of publicly available, large, and unbiased datasets is a key bottleneck for the application of machine learning (ML) methods in synthetic chemistry. Data from electronic laboratory notebooks (ELNs) could provide less biased, large datasets, but no such datasets have been made publicly available. The first real-world dataset from the ELNs of a large pharmaceutical company is disclosed and its relationship to high-throughput experimentation (HTE) datasets is described. For chemical yield predictions, a key task in chemical synthesis, an attributed graph neural network (AGNN) performs as well as or better than the best previous models on two HTE datasets for the Suzuki-Miyaura and Buchwald-Hartwig reactions. However, training the AGNN on an ELN dataset does not lead to a predictive model. The implications of using ELN data for training ML-based models are discussed in the context of yield predictions.
Affiliation(s)
- Mandana Saebi
  - Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, USA
- Bozhao Nan
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
- John E Herr
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
- Jessica Wahlers
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
- Zhichun Guo
  - Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, USA
- Andrzej M Zurański
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
- Thierry Kogej
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, SE-431 83 Mölndal, Gothenburg, Sweden
- Per-Ola Norrby
  - Data Science and Modelling, Pharmaceutical Sciences, R&D, AstraZeneca, Pepparedsleden 1, SE-431 83 Mölndal, Gothenburg, Sweden
- Abigail G Doyle
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, USA
- Nitesh V Chawla
  - Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, USA
- Olaf Wiest
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
7
A Review on Artificial Intelligence Enabled Design, Synthesis, and Process Optimization of Chemical Products for Industry 4.0. Processes (Basel) 2023. [DOI: 10.3390/pr11020330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/20/2023] Open
Abstract
With the development of Industry 4.0, artificial intelligence (AI) is gaining increasing attention for its performance in solving particularly complex problems in industrial chemistry and chemical engineering. Therefore, this review provides an overview of the application of AI techniques, in particular machine learning, in chemical design, synthesis, and process optimization over the past years. In this review, the focus is on the application of AI for structure-function relationship analysis, synthetic route planning, and automated synthesis. Finally, we discuss the challenges and future of AI in making chemical products.
8
Kang PL, Shi YF, Shang C, Liu ZP. Artificial intelligence pathway search to resolve catalytic glycerol hydrogenolysis selectivity. Chem Sci 2022; 13:8148-8160. [PMID: 35919423 PMCID: PMC9278456 DOI: 10.1039/d2sc02107b] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/13/2022] [Accepted: 06/20/2022] [Indexed: 11/29/2022] Open
Abstract
The complex interaction between molecules and catalyst surfaces leads to great difficulties in understanding and predicting the activity and selectivity in heterogeneous catalysis. Here we develop an end-to-end artificial intelligence framework for the activity prediction of heterogeneous catalytic systems (AI-Cat method), which takes simple inputs from names of molecules and metal catalysts and outputs the reaction energy profile from the input molecule to low energy pathway products. The AI-Cat method combines two neural network models, one for predicting reaction patterns and the other for providing the reaction barrier and energy, with a Monte Carlo tree search to resolve the low energy pathways in a reaction network. We then apply AI-Cat to resolve the reaction network of glycerol hydrogenolysis on Cu surfaces, which is a typical selective C-O bond activation system and of key significance for biomass-derived polyol utilization. We show that glycerol hydrogenolysis features a huge reaction network of relevant candidates, containing 420 reaction intermediates and 2467 elementary reactions. Among them, the surface-mediated enol-keto tautomeric resonance is a key step to facilitate the primary C-OH bond breaking and thus selects 1,2-propanediol as the major product on Cu catalysts. 1,3-Propanediol can only be produced under strong acidic conditions and high surface H coverage by following a hydrogenation-dehydration pathway. AI-Cat further discovers six low-energy reaction patterns for C-O bond activation on metals that is of general significance to polyol catalysis. Our results demonstrate that the reaction prediction for complex heterogeneous catalysis is now feasible with AI-based atomic simulation and a Monte Carlo tree search.
Affiliation(s)
- Pei-Lin Kang
  - Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Yun-Fei Shi
  - Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Cheng Shang
  - Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
  - Shanghai Qi Zhi Institution, Shanghai 200030, China
- Zhi-Pan Liu
  - Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
  - Shanghai Qi Zhi Institution, Shanghai 200030, China
  - Key Laboratory of Synthetic and Self-Assembly Chemistry for Organic Functional Molecules, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
9
Urbina F, Lowden CT, Culberson JC, Ekins S. MegaSyn: Integrating Generative Molecular Design, Automated Analog Designer, and Synthetic Viability Prediction. ACS OMEGA 2022; 7:18699-18713. [PMID: 35694522 PMCID: PMC9178760 DOI: 10.1021/acsomega.2c01404] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/08/2022] [Accepted: 05/11/2022] [Indexed: 05/04/2023]
Abstract
Generative machine learning models have become widely adopted in drug discovery and other fields to produce new molecules and explore molecular space, with the goal of discovering novel compounds with optimized properties. These generative models are frequently combined with transfer learning or scoring of the physicochemical properties to steer generative design, yet often, they are not capable of addressing a wide variety of potential problems, as well as converge into similar molecular space when combined with a scoring function for the desired properties. In addition, these generated compounds may not be synthetically feasible, reducing their capabilities and limiting their usefulness in real-world scenarios. Here, we introduce a suite of automated tools called MegaSyn representing three components: a new hill-climb algorithm, which makes use of SMILES-based recurrent neural network (RNN) generative models, analog generation software, and retrosynthetic analysis coupled with fragment analysis to score molecules for their synthetic feasibility. We show that by deconstructing the targeted molecules and focusing on substructures, combined with an ensemble of generative models, MegaSyn generally performs well for the specific tasks of generating new scaffolds as well as targeted analogs, which are likely synthesizable and druglike. We now describe the development, benchmarking, and testing of this suite of tools and propose how they might be used to optimize molecules or prioritize promising lead compounds using these RNN examples provided by multiple test case examples.
Affiliation(s)
- Fabio Urbina
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Christopher T. Lowden
  - Workflow Informatics Corporation, 9316 Bramden Court, Wake Forest, North Carolina 27587, United States
- J. Christopher Culberson
  - Workflow Informatics Corporation, 9316 Bramden Court, Wake Forest, North Carolina 27587, United States
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
10
Wigh DS, Goodman JM, Lapkin AA. A review of molecular representation in the age of machine learning. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1603] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023]
Affiliation(s)
- Daniel S. Wigh
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK
- Alexei A. Lapkin
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK
11
Green biomanufacturing promoted by automatic retrobiosynthesis planning and computational enzyme design. Chin J Chem Eng 2022. [DOI: 10.1016/j.cjche.2021.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
12
Batchu SP, Hernandez Blazquez B, Malhotra A, Fang H, Ierapetritou M, Vlachos D. Accelerating Manufacturing for Biomass Conversion via Integrated Process and Bench Digitalization: A Perspective. REACT CHEM ENG 2022. [DOI: 10.1039/d1re00560j] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
Abstract
We present a perspective for accelerating biomass manufacturing via digitalization. We summarize the challenges for manufacturing and identify areas where digitalization can help. A profound potential in using lignocellulosic biomass...
13
Wang Z, Zhang W, Liu B. Computational Analysis of Synthetic Planning: Past and Future. CHINESE J CHEM 2021. [DOI: 10.1002/cjoc.202100273] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Affiliation(s)
- Zhuang Wang
  - Key Laboratory of Green Chemistry & Technology of Ministry of Education, College of Chemistry, Sichuan University, 29 Wangjiang Rd., Chengdu, Sichuan 610064, China
  - Center for Molecular Discovery, Department of Chemistry, Boston University, 590 Commonwealth Ave., Boston, Massachusetts 02215, United States
  - Current Address: One Amgen Center Dr., Amgen Inc., Thousand Oaks, California 91320, United States
- Wenhan Zhang
  - Key Laboratory of Green Chemistry & Technology of Ministry of Education, College of Chemistry, Sichuan University, 29 Wangjiang Rd., Chengdu, Sichuan 610064, China
  - Center for Molecular Discovery, Department of Chemistry, Boston University, 590 Commonwealth Ave., Boston, Massachusetts 02215, United States
  - Current Address: One Amgen Center Dr., Amgen Inc., Thousand Oaks, California 91320, United States
- Bo Liu
  - Key Laboratory of Green Chemistry & Technology of Ministry of Education, College of Chemistry, Sichuan University, 29 Wangjiang Rd., Chengdu, Sichuan 610064, China
  - Center for Molecular Discovery, Department of Chemistry, Boston University, 590 Commonwealth Ave., Boston, Massachusetts 02215, United States
  - Current Address: One Amgen Center Dr., Amgen Inc., Thousand Oaks, California 91320, United States
14
Machine learning modelling of chemical reaction characteristics: yesterday, today, tomorrow. MENDELEEV COMMUNICATIONS 2021. [DOI: 10.1016/j.mencom.2021.11.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
15
Jiménez-Luna J, Grisoni F, Weskamp N, Schneider G. Artificial intelligence in drug discovery: recent advances and future perspectives. Expert Opin Drug Discov 2021; 16:949-959. [PMID: 33779453 DOI: 10.1080/17460441.2021.1909567] [Citation(s) in RCA: 97] [Impact Index Per Article: 32.3] [Indexed: 02/06/2023]
Abstract
Introduction: Artificial intelligence (AI) has inspired computer-aided drug discovery. The widespread adoption of machine learning, in particular deep learning, in multiple scientific disciplines, and the advances in computing hardware and software, among other factors, continue to fuel this development. Much of the initial skepticism regarding applications of AI in pharmaceutical discovery has started to vanish, consequently benefitting medicinal chemistry. Areas covered: The current status of AI in chemoinformatics is reviewed. The topics discussed herein include quantitative structure-activity/property relationship and structure-based modeling, de novo molecular design, and chemical synthesis prediction. Advantages and limitations of current deep learning applications are highlighted, together with a perspective on next-generation AI for drug discovery. Expert opinion: Deep learning-based approaches have only begun to address some fundamental problems in drug discovery. Certain methodological advances, such as message-passing models, spatial-symmetry-preserving networks, hybrid de novo design, and other innovative machine learning paradigms, will likely become commonplace and help address some of the most challenging questions. Open data sharing and model development will play a central role in the advancement of drug discovery with AI.
Affiliation(s)
- José Jiménez-Luna
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Francesca Grisoni
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Nils Weskamp
  - Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an Der Riss, Germany
- Gisbert Schneider
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
16
|
|
17
Daley SK, Cordell GA. Natural Products, the Fourth Industrial Revolution, and the Quintuple Helix. Nat Prod Commun 2021. [DOI: 10.1177/1934578x211003029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/17/2023] Open
Abstract
The profound interconnectedness of the sciences and technologies embodied in the Fourth Industrial Revolution is discussed in terms of the global role of natural products, and how that interplays with the development of sustainable and climate-conscious practices of cyberecoethnopharmacolomics within the Quintuple Helix for the promotion of a healthier planet and society.
Affiliation(s)
- Geoffrey A. Cordell
  - Natural Products Inc., Evanston, IL, USA
  - Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, FL, USA
18
Hasic H, Ishida T. Single-Step Retrosynthesis Prediction Based on the Identification of Potential Disconnection Sites Using Molecular Substructure Fingerprints. J Chem Inf Model 2021; 61:641-652. [PMID: 33534997 DOI: 10.1021/acs.jcim.0c01100] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/08/2023]
Abstract
The proper application of retrosynthesis to identify possible transformations for a given target compound requires a lot of chemistry knowledge and experience. However, because the complexity of this technique scales together with the complexity of the target, efficient application on compounds with intricate molecular structures becomes almost impossible for human chemists. The idea of using computers in such situations has existed for a long time, but the accuracy was not sufficient for practical applications. Nevertheless, with the steady improvement of machine learning and artificial intelligence in recent years, computer-assisted retrosynthesis has been gaining research attention again. Because of the overall lack of chemical reaction data, the main challenge for the recent retrosynthesis methods is low exploration ability during the analysis of target and intermediate compounds. The main goal of this research is to develop a novel, template-free approach to address this issue. Only individual molecular substructures of the target are used to determine potential disconnection sites, without relying on additional information such as chemical reaction class. The model for the identification of potential disconnection sites is trained on novel molecular substructure fingerprint representations. For each of the disconnections suggested using the model, a simple structural similarity-based reactant retrieval and scoring method is applied, and the suggestions are completed. This method achieves 47.2% top-1 accuracy for the single-step retrosynthesis task on the processed United States Patent Office dataset. Furthermore, if the predicted reaction class is used to narrow down the reactant candidate search space, the performance is improved to 61.4% top-1 accuracy.
Affiliation(s)
- Haris Hasic: Department of Computer Science, School of Computing, Tokyo Institute of Technology, W8-85, 2-12-1, Ookayama, Meguro 152-8552, Tokyo, Japan
- Takashi Ishida: Department of Computer Science, School of Computing, Tokyo Institute of Technology, W8-85, 2-12-1, Ookayama, Meguro 152-8552, Tokyo, Japan
19
Kim E, Lee D, Kwon Y, Park MS, Choi YS. Valid, Plausible, and Diverse Retrosynthesis Using Tied Two-Way Transformers with Latent Variables. J Chem Inf Model 2021; 61:123-133. [PMID: 33410697 DOI: 10.1021/acs.jcim.0c01074] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
Abstract
Retrosynthesis is an essential task in organic chemistry for identifying the synthesis pathways of newly discovered materials, and with the recent advances in deep learning, there have been growing attempts to solve the retrosynthesis problem through transformer models, which are the state-of-the-art in neural machine translation, by converting the problem into a machine translation problem. However, the pure transformer provides unsatisfactory results that lack grammatical validity, chemical plausibility, and diversity in reactant candidates. In this study, we develop tied two-way transformers with latent modeling to solve those problems using cycle consistency checks, parameter sharing, and multinomial latent variables. Experimental results obtained using public and in-house datasets demonstrate that the proposed model improves the retrosynthesis accuracy, grammatical error, and diversity, and qualitative evaluation results verify its ability to suggest valid and plausible results.
Affiliation(s)
- Eunji Kim: Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea
- Dongseon Lee: Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea
- Youngchun Kwon: Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea
- Min Sik Park: Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea
- Youn-Suk Choi: Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon 16678, Republic of Korea
20
Decherchi S, Cavalli A. Thermodynamics and Kinetics of Drug-Target Binding by Molecular Simulation. Chem Rev 2020; 120:12788-12833. [PMID: 33006893 PMCID: PMC8011912 DOI: 10.1021/acs.chemrev.0c00534] [Citation(s) in RCA: 127] [Impact Index Per Article: 31.8] [Received: 05/28/2020] [Indexed: 12/19/2022]
Abstract
Computational studies play an increasingly important role in chemistry and biophysics, mainly thanks to improvements in hardware and algorithms. In drug discovery and development, computational studies can reduce the costs and risks of bringing a new medicine to market. Computational simulations are mainly used to optimize promising new compounds by estimating their binding affinity to proteins. This is challenging due to the complexity of the simulated system. To assess the present and future value of simulation for drug discovery, we review key applications of advanced methods for sampling complex free-energy landscapes at near nonergodicity conditions and for estimating the rate coefficients of very slow processes of pharmacological interest. We outline the statistical mechanics and computational background behind this research, including methods such as steered molecular dynamics and metadynamics. We review recent applications to pharmacology and drug discovery and discuss possible guidelines for the practitioner. Recent trends in machine learning are also briefly discussed. Thanks to the rapid development of methods for characterizing and quantifying rare events, simulation's role in drug discovery is likely to expand, making it a valuable complement to experimental and clinical approaches.
Affiliation(s)
- Sergio Decherchi: Computational and Chemical Biology, Fondazione Istituto Italiano di Tecnologia, 16163 Genoa, Italy
- Andrea Cavalli: Computational and Chemical Biology, Fondazione Istituto Italiano di Tecnologia, 16163 Genoa, Italy; Department of Pharmacy and Biotechnology, University of Bologna, 40126 Bologna, Italy
21
Plehiers PP, Coley CW, Gao H, Vermeire FH, Dobbelaere MR, Stevens CV, Van Geem KM, Green WH. Artificial Intelligence for Computer-Aided Synthesis In Flow: Analysis and Selection of Reaction Components. Front Chem Eng 2020. [DOI: 10.3389/fceng.2020.00005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/20/2022] Open
22
Coley CW, Eyke NS, Jensen KF. Autonomous Discovery in the Chemical Sciences Part I: Progress. Angew Chem Int Ed Engl 2020; 59:22858-22893. [DOI: 10.1002/anie.201909987] [Citation(s) in RCA: 100] [Impact Index Per Article: 25.0] [Received: 08/06/2019] [Indexed: 01/05/2023]
Affiliation(s)
- Connor W. Coley: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalie S. Eyke: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klavs F. Jensen: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
23
Coley CW, Eyke NS, Jensen KF. Autonome Entdeckung in den chemischen Wissenschaften, Teil I: Fortschritt. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.201909987] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/14/2022]
Affiliation(s)
- Connor W. Coley: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalie S. Eyke: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klavs F. Jensen: Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
24
Nicolaou CA, Watson IA, LeMasters M, Masquelin T, Wang J. Context Aware Data-Driven Retrosynthetic Analysis. J Chem Inf Model 2020; 60:2728-2738. [DOI: 10.1021/acs.jcim.9b01141] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022]
Affiliation(s)
- Christos A. Nicolaou: Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana 46285, United States
- Ian A. Watson: Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana 46285, United States
- Mark LeMasters: Research Chemistry IT, Eli Lilly and Company, Indianapolis, Indiana 46285, United States
- Thierry Masquelin: Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana 46285, United States
- Jibo Wang: Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana 46285, United States
25
Duan H, Wang L, Zhang C, Guo L, Li J. Retrosynthesis with attention-based NMT model and chemical analysis of "wrong" predictions. RSC Adv 2020; 10:1371-1378. [PMID: 35494683 PMCID: PMC9047528 DOI: 10.1039/c9ra08535a] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/18/2019] [Accepted: 12/25/2019] [Indexed: 01/04/2023] Open
Abstract
We consider retrosynthesis to be a machine translation problem. Accordingly, we apply an attention-based and completely data-driven model named Tensor2Tensor to a data set comprising approximately 50 000 diverse reactions extracted from the United States patent literature. The model significantly outperforms the seq2seq model (37.4%), with top-1 accuracy reaching 54.1%. We also offer a novel insight into the causes of grammatically invalid SMILES, and conduct a test in which experienced chemists select and analyze the "wrong" predictions that may be chemically plausible but differ from the ground truth. The effectiveness of our model is found to be underestimated and the "true" top-1 accuracy reaches as high as 64.6%.
Affiliation(s)
- Hongliang Duan: Artificial Intelligent Aided Drug Discovery Lab, College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. of China
- Ling Wang: Artificial Intelligent Aided Drug Discovery Lab, College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. of China
- Chengyun Zhang: Artificial Intelligent Aided Drug Discovery Lab, College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. of China
- Lin Guo: Department of Pharmacy, The Affiliated Hospital of Xuzhou Medical University, Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu 221000, P. R. of China
- Jianjun Li: Artificial Intelligent Aided Drug Discovery Lab, College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, P. R. of China
26
Thakkar A, Kogej T, Reymond JL, Engkvist O, Bjerrum EJ. Datasets and their influence on the development of computer assisted synthesis planning tools in the pharmaceutical domain. Chem Sci 2020; 11:154-168. [PMID: 32110367 PMCID: PMC7012039 DOI: 10.1039/c9sc04944d] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 10/01/2019] [Accepted: 11/05/2019] [Indexed: 12/19/2022] Open
Abstract
Computer Assisted Synthesis Planning (CASP) has gained considerable interest as of late. Herein we investigate a template-based retrosynthetic planning tool, trained on a variety of datasets consisting of up to 17.5 million reactions. We demonstrate that models trained on datasets such as internal Electronic Laboratory Notebooks (ELN), and the publicly available United States Patent Office (USPTO) extracts, are sufficient for the prediction of full synthetic routes to compounds of interest in medicinal chemistry. As such we have assessed the models on 1731 compounds from 41 virtual libraries for which experimental results were known. Furthermore, we show that accuracy is a misleading metric for assessment of the policy network, and propose that the number of successfully applied templates, in conjunction with the overall ability to generate full synthetic routes be examined instead. To this end we found that the specificity of the templates comes at the cost of generalizability, and overall model performance. This is supplemented by a comparison of the underlying datasets and their corresponding models.
Affiliation(s)
- Amol Thakkar: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden; Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
- Thierry Kogej: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Jean-Louis Reymond: Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
- Ola Engkvist: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
27
Zheng S, Rao J, Zhang Z, Xu J, Yang Y. Predicting Retrosynthetic Reactions Using Self-Corrected Transformer Neural Networks. J Chem Inf Model 2019; 60:47-55. [PMID: 31825611 DOI: 10.1021/acs.jcim.9b00949] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Indexed: 11/28/2022]
Abstract
Synthesis planning is the process of recursively decomposing target molecules into available precursors. Computer-aided retrosynthesis can potentially assist chemists in designing synthetic routes; however, at present, it is cumbersome and cannot provide satisfactory results. In this study, we have developed a template-free self-corrected retrosynthesis predictor (SCROP) to predict retrosynthesis using transformer neural networks. In the method, the retrosynthesis planning was converted to a machine translation problem from the products to molecular linear notations of the reactants. By coupling with a neural network-based syntax corrector, our method achieved an accuracy of 59.0% on a standard benchmark data set, which outperformed other deep learning methods by >21% and template-based methods by >6%. More importantly, our method was 1.7 times more accurate than other state-of-the-art methods for compounds not appearing in the training set.
Affiliation(s)
- Shuangjia Zheng: Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-sen University, 132 East Circle at University City, Guangzhou 510006, China; School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China
- Jiahua Rao: School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China
- Zhongyue Zhang: School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China
- Jun Xu: Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-sen University, 132 East Circle at University City, Guangzhou 510006, China; School of Computer Science & Technology, Wuyi University, 99 Yingbin Road, Jiangmen 529020, China
- Yuedong Yang: School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China; Key Laboratory of Machine Intelligence and Advanced Computing, Sun Yat-sen University, Ministry of Education, Guangzhou 510000, China
28
Badowski T, Gajewska EP, Molga K, Grzybowski BA. Synergy Between Expert and Machine-Learning Approaches Allows for Improved Retrosynthetic Planning. Angew Chem Int Ed Engl 2019. [DOI: 10.1002/ange.201912083] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/07/2022]
Affiliation(s)
- Tomasz Badowski: Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224 Warsaw, Poland
- Ewa P. Gajewska: Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224 Warsaw, Poland
- Karol Molga: Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224 Warsaw, Poland
- Bartosz A. Grzybowski: Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224 Warsaw, Poland; IBS Center for Soft and Living Matter and Department of Chemistry, UNIST, 50 UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan, South Korea
29
Badowski T, Gajewska EP, Molga K, Grzybowski BA. Synergy Between Expert and Machine-Learning Approaches Allows for Improved Retrosynthetic Planning. Angew Chem Int Ed Engl 2019; 59:725-730. [DOI: 10.1002/anie.201912083] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 09/21/2019] [Indexed: 12/27/2022]
Affiliation(s)
- Tomasz Badowski: Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224 Warsaw, Poland
- Ewa P. Gajewska: Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224 Warsaw, Poland
- Karol Molga: Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224 Warsaw, Poland
- Bartosz A. Grzybowski: Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, 01-224 Warsaw, Poland; IBS Center for Soft and Living Matter and Department of Chemistry, UNIST, 50 UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan, South Korea
30
David L, Arús-Pous J, Karlsson J, Engkvist O, Bjerrum EJ, Kogej T, Kriegl JM, Beck B, Chen H. Applications of Deep-Learning in Exploiting Large-Scale and Heterogeneous Compound Data in Industrial Pharmaceutical Research. Front Pharmacol 2019; 10:1303. [PMID: 31749705 PMCID: PMC6848277 DOI: 10.3389/fphar.2019.01303] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 08/07/2019] [Accepted: 10/14/2019] [Indexed: 12/21/2022] Open
Abstract
In recent years, the development of high-throughput screening (HTS) technologies and their establishment in an industrialized environment have given scientists the possibility to test millions of molecules and profile them against a multitude of biological targets in a short period of time, generating data in a much faster pace and with a higher quality than before. Besides the structure activity data from traditional bioassays, more complex assays such as transcriptomics profiling or imaging have also been established as routine profiling experiments thanks to the advancement of Next Generation Sequencing or automated microscopy technologies. In industrial pharmaceutical research, these technologies are typically established in conjunction with automated platforms in order to enable efficient handling of screening collections of thousands to millions of compounds. To exploit the ever-growing amount of data that are generated by these approaches, computational techniques are constantly evolving. In this regard, artificial intelligence technologies such as deep learning and machine learning methods play a key role in cheminformatics and bio-image analytics fields to address activity prediction, scaffold hopping, de novo molecule design, reaction/retrosynthesis predictions, or high content screening analysis. Herein we summarize the current state of analyzing large-scale compound data in industrial pharmaceutical research and describe the impact it has had on the drug discovery process over the last two decades, with a specific focus on deep-learning technologies.
Affiliation(s)
- Laurianne David: Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden; Department of Life Science Informatics, B-IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Josep Arús-Pous: Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden; Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
- Johan Karlsson: Quantitative Biology, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Ola Engkvist: Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Esben Jannik Bjerrum: Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Thierry Kogej: Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Jan M. Kriegl: Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany
- Bernd Beck: Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany
- Hongming Chen: Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden; Chemistry and Chemical Biology Centre, Guangzhou Regenerative Medicine and Health – Guangdong Laboratory, Guangzhou, China
31
Ghiandoni GM, Bodkin MJ, Chen B, Hristozov D, Wallace JEA, Webster J, Gillet VJ. Development and Application of a Data-Driven Reaction Classification Model: Comparison of an Electronic Lab Notebook and Medicinal Chemistry Literature. J Chem Inf Model 2019; 59:4167-4187. [PMID: 31529948 DOI: 10.1021/acs.jcim.9b00537] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/18/2022]
Abstract
Reaction classification has often been considered an important task for many different applications, and has traditionally been accomplished using hand-coded rule-based approaches. However, the availability of large collections of reactions enables data-driven approaches to be developed. We present the development and validation of a 336-class machine learning-based classification model integrated within a Conformal Prediction (CP) framework to associate reaction class predictions with confidence estimations. We also propose a data-driven approach for "dynamic" reaction fingerprinting to maximize the effectiveness of reaction encoding, as well as developing a novel reaction classification system that organizes labels into four hierarchical levels (SHREC: Sheffield Hierarchical REaction Classification). We show that the performance of the CP augmented model can be improved by defining confidence thresholds to detect predictions that are less likely to be false. For example, the external validation of the model reports 95% of predictions as correct by filtering out less than 15% of the uncertain classifications. The application of the model is demonstrated by classifying two reaction data sets: one extracted from an industrial ELN and the other from the medicinal chemistry literature. We show how confidence estimations and class compositions across different levels of information can be used to gain immediate insights on the nature of reaction collections and hidden relationships between reaction classes.
Affiliation(s)
- Gian Marco Ghiandoni: Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom
- Michael J Bodkin: Evotec (U.K.) Ltd., 114 Innovation Drive, Milton Park, Abingdon OX14 4RZ, United Kingdom
- Beining Chen: Chemistry Department, University of Sheffield, Dainton Building, Brook Hill, Sheffield S3 7HF, United Kingdom
- Dimitar Hristozov: Evotec (U.K.) Ltd., 114 Innovation Drive, Milton Park, Abingdon OX14 4RZ, United Kingdom
- James E A Wallace: Evotec (U.K.) Ltd., 114 Innovation Drive, Milton Park, Abingdon OX14 4RZ, United Kingdom
- James Webster: Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom
- Valerie J Gillet: Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom
32
Konze KD, Bos PH, Dahlgren MK, Leswing K, Tubert-Brohman I, Bortolato A, Robbason B, Abel R, Bhat S. Reaction-Based Enumeration, Active Learning, and Free Energy Calculations To Rapidly Explore Synthetically Tractable Chemical Space and Optimize Potency of Cyclin-Dependent Kinase 2 Inhibitors. J Chem Inf Model 2019; 59:3782-3793. [PMID: 31404495 DOI: 10.1021/acs.jcim.9b00367] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Indexed: 12/19/2022]
Abstract
The hit-to-lead and lead optimization processes usually involve the design, synthesis, and profiling of thousands of analogs prior to clinical candidate nomination. A hit finding campaign may begin with a virtual screen that explores millions of compounds, if not more. However, this scale of computational profiling is not frequently performed in the hit-to-lead or lead optimization phases of drug discovery. This is likely due to the lack of appropriate computational tools to generate synthetically tractable lead-like compounds in silico, and a lack of computational methods to accurately profile compounds prospectively on a large scale. Recent advances in computational power and methods provide the ability to profile much larger libraries of ligands than previously possible. Herein, we report a new computational technique, referred to as "PathFinder", that uses retrosynthetic analysis followed by combinatorial synthesis to generate novel compounds in synthetically accessible chemical space. In this work, the integration of PathFinder-driven compound generation, cloud-based FEP simulations, and active learning are used to rapidly optimize R-groups, and generate new cores for inhibitors of cyclin-dependent kinase 2 (CDK2). Using this approach, we explored >300 000 ideas, performed >5000 FEP simulations, and identified >100 ligands with a predicted IC50 < 100 nM, including four unique cores. To our knowledge, this is the largest set of FEP calculations disclosed in the literature to date. The rapid turnaround time, and scale of chemical exploration, suggests that this is a useful approach to accelerate the discovery of novel chemical matter in drug discovery campaigns.
Affiliation(s)
- Kyle D Konze: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
- Pieter H Bos: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
- Markus K Dahlgren: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
- Karl Leswing: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
- Ivan Tubert-Brohman: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
- Andrea Bortolato: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
- Braxton Robbason: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
- Robert Abel: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
- Sathesh Bhat: Schrödinger Inc., 120 West 45th Street, 17th floor, New York, New York 10036, United States
33
Coley CW, Thomas DA, Lummiss JAM, Jaworski JN, Breen CP, Schultz V, Hart T, Fishman JS, Rogers L, Gao H, Hicklin RW, Plehiers PP, Byington J, Piotti JS, Green WH, Hart AJ, Jamison TF, Jensen KF. A robotic platform for flow synthesis of organic compounds informed by AI planning. Science 2019; 365(6453):eaax1566. [DOI: 10.1126/science.aax1566] [Citation(s) in RCA: 338] [Impact Index Per Article: 67.6] [Received: 03/01/2019] [Accepted: 07/02/2019] [Indexed: 12/18/2022]
Abstract
The synthesis of complex organic molecules requires several stages, from ideation to execution, that require time and effort investment from expert chemists. Here, we report a step toward a paradigm of chemical synthesis that relieves chemists from routine tasks, combining artificial intelligence–driven synthesis planning and a robotically controlled experimental platform. Synthetic routes are proposed through generalization of millions of published chemical reactions and validated in silico to maximize their likelihood of success. Additional implementation details are determined by expert chemists and recorded in reusable recipe files, which are executed by a modular continuous-flow platform that is automatically reconfigured by a robotic arm to set up the required unit operations and carry out the reaction. This strategy for computer-augmented chemical synthesis is demonstrated for 15 drug or drug-like substances.
34
Schreck JS, Coley CW, Bishop KJM. Learning Retrosynthetic Planning through Simulated Experience. ACS Cent Sci 2019; 5:970-981. [PMID: 31263756 PMCID: PMC6598174 DOI: 10.1021/acscentsci.9b00055] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Received: 01/18/2019] [Indexed: 05/11/2023]
Abstract
The problem of retrosynthetic planning can be framed as a one-player game, in which the chemist (or a computer program) works backward from a molecular target to simpler starting materials through a series of choices regarding which reactions to perform. This game is challenging as the combinatorial space of possible choices is astronomical, and the value of each choice remains uncertain until the synthesis plan is completed and its cost evaluated. Here, we address this search problem using deep reinforcement learning to identify policies that make (near) optimal reaction choices during each step of retrosynthetic planning according to a user-defined cost metric. Using a simulated experience, we train a neural network to estimate the expected synthesis cost or value of any given molecule based on a representation of its molecular structure. We show that learned policies based on this value network can outperform a heuristic approach that favors symmetric disconnections when synthesizing unfamiliar molecules from available starting materials using the fewest number of reactions. We discuss how the learned policies described here can be incorporated into existing synthesis planning tools and how they can be adapted to changes in the synthesis cost objective or material availability.
Collapse
Affiliation(s)
- John S. Schreck
- Department of Chemical Engineering, Columbia University, New York, New York 10027, United States
| | - Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Kyle J. M. Bishop
- Department of Chemical Engineering, Columbia University, New York, New York 10027, United States
| |
|
35
|
Coley CW, Green WH, Jensen KF. RDChiral: An RDKit Wrapper for Handling Stereochemistry in Retrosynthetic Template Extraction and Application. J Chem Inf Model 2019; 59:2529-2537. [DOI: 10.1021/acs.jcim.9b00286] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Indexed: 11/28/2022]
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - William H. Green
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Klavs F. Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
36
|
Friederich P, Fediai A, Kaiser S, Konrad M, Jung N, Wenzel W. Toward Design of Novel Materials for Organic Electronics. Adv Mater 2019; 31:e1808256. [PMID: 31012166 DOI: 10.1002/adma.201808256] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 12/22/2018] [Indexed: 06/09/2023]
Abstract
Materials for organic electronics are presently used in prominent applications, such as displays in mobile devices, while being intensely researched for other purposes, such as organic photovoltaics, large-area devices, and thin-film transistors. Many of the challenges to improve and optimize these applications are material related and there is a nearly infinite chemical space that needs to be explored to identify the most suitable material candidates. Established experimental approaches struggle with the size and complexity of this chemical space. Herein, the development of simulation methods is addressed, with a particular emphasis on predictive multiscale protocols, to complement experimental research in the identification of novel materials and illustrate the potential of these methods with a few prominent recent applications. Finally, the potential of machine learning and methods based on artificial intelligence is discussed to further accelerate the search for new materials.
Affiliation(s)
- Pascal Friederich
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
- Department of Chemistry, University of Toronto, 80 St. George Street, M5S 3H6, Toronto, Ontario, Canada
| | - Artem Fediai
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
| | - Simon Kaiser
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
| | - Manuel Konrad
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
| | - Nicole Jung
- Institute of Organic Chemistry (IOC), Karlsruhe Institute of Technology (KIT), Fritz-Haber-Weg 6, 76131, Karlsruhe, Germany
| | - Wolfgang Wenzel
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
| |
|
37
|
Coley CW, Jin W, Rogers L, Jamison TF, Jaakkola TS, Green WH, Barzilay R, Jensen KF. A graph-convolutional neural network model for the prediction of chemical reactivity. Chem Sci 2019; 10:370-377. [PMID: 30746086 PMCID: PMC6335848 DOI: 10.1039/c8sc04228d] [Citation(s) in RCA: 289] [Impact Index Per Article: 57.8] [Received: 09/22/2018] [Accepted: 11/23/2018] [Indexed: 12/19/2022]
Abstract
We present a supervised learning approach to predict the products of organic reactions given their reactants, reagents, and solvent(s). The prediction task is factored into two stages comparable to manual expert approaches: considering possible sites of reactivity and evaluating their relative likelihoods. By training on hundreds of thousands of reaction precedents covering a broad range of reaction types from the patent literature, the neural model makes informed predictions of chemical reactivity. The model predicts the major product correctly over 85% of the time requiring around 100 ms per example, a significantly higher accuracy than achieved by previous machine learning approaches, and performs on par with expert chemists with years of formal training. We gain additional insight into predictions via the design of the neural model, revealing an understanding of chemistry qualitatively consistent with manual approaches.
Affiliation(s)
- Connor W Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Wengong Jin
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Luke Rogers
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Timothy F Jamison
- Department of Chemistry, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Tommi S Jaakkola
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - William H Green
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Regina Barzilay
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Klavs F Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| |
|
38
|
Baylon JL, Cilfone NA, Gulcher JR, Chittenden TW. Enhancing Retrosynthetic Reaction Prediction with Deep Learning Using Multiscale Reaction Classification. J Chem Inf Model 2019; 59:673-688. [DOI: 10.1021/acs.jcim.8b00801] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Indexed: 01/18/2023]
Affiliation(s)
- Javier L. Baylon
- Computational Statistics and Bioinformatics Group, Advanced Artificial Intelligence Research Laboratory, WuXi NextCODE, Cambridge, Massachusetts 02142, United States
- Complex Biological Systems Alliance, Medford, Massachusetts 02155, United States
| | - Nicholas A. Cilfone
- Computational Statistics and Bioinformatics Group, Advanced Artificial Intelligence Research Laboratory, WuXi NextCODE, Cambridge, Massachusetts 02142, United States
- Complex Biological Systems Alliance, Medford, Massachusetts 02155, United States
| | - Jeffrey R. Gulcher
- Computational Statistics and Bioinformatics Group, Advanced Artificial Intelligence Research Laboratory, WuXi NextCODE, Cambridge, Massachusetts 02142, United States
- Cancer Genetics Group, WuXi NextCODE, Cambridge, Massachusetts 02142, United States
| | - Thomas W. Chittenden
- Complex Biological Systems Alliance, Medford, Massachusetts 02155, United States
- Computational Statistics and Bioinformatics Group, Advanced Artificial Intelligence Research Laboratory, WuXi NextCODE, Cambridge, Massachusetts 02142, United States
- Division of Genetics and Genomics, Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts 02215, United States
| |
|
39
|
Watson IA, Wang J, Nicolaou CA. A retrosynthetic analysis algorithm implementation. J Cheminform 2019; 11:1. [PMID: 30604073 PMCID: PMC6689887 DOI: 10.1186/s13321-018-0323-6] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 09/15/2018] [Accepted: 12/20/2018] [Indexed: 11/30/2022]
Abstract
The need for synthetic route design arises frequently in discovery-oriented chemistry organizations. While finding solutions to this problem has traditionally been the domain of human experts, several computational approaches, aided by algorithmic advances and the availability of large reaction collections, have recently been reported. Herein we present our own implementation of a retrosynthetic analysis method and demonstrate its capabilities in an attempt to identify synthetic routes for a collection of approved drugs. Our results indicate that the method, leveraging reaction transformation rules learned from a large patent reaction dataset, can identify multiple theoretically feasible synthetic routes and thus support research chemists' everyday efforts.
Affiliation(s)
- Ian A Watson
- Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, 46285, USA
| | - Jibo Wang
- Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, 46285, USA
| | - Christos A Nicolaou
- Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, 46285, USA.
| |
|
40
|
Planning chemical syntheses with deep neural networks and symbolic AI. Nature 2018; 555:604-610. [PMID: 29595767 DOI: 10.1038/nature25978] [Citation(s) in RCA: 756] [Impact Index Per Article: 126.0] [Received: 08/14/2017] [Accepted: 01/31/2018] [Indexed: 11/09/2022]
Abstract
To plan the syntheses of small organic molecules, chemists use retrosynthesis, a problem-solving technique in which target molecules are recursively transformed into increasingly simpler precursors. Computer-aided retrosynthesis would be a valuable tool but at present it is slow and provides results of unsatisfactory quality. Here we use Monte Carlo tree search and symbolic artificial intelligence (AI) to discover retrosynthetic routes. We combined Monte Carlo tree search with an expansion policy network that guides the search, and a filter network to pre-select the most promising retrosynthetic steps. These deep neural networks were trained on essentially all reactions ever published in organic chemistry. Our system solves for almost twice as many molecules, thirty times faster than the traditional computer-aided search method, which is based on extracted rules and hand-designed heuristics. In a double-blind AB test, chemists on average considered our computer-generated routes to be equivalent to reported literature routes.
|
41
|
Sivakumar TV, Bhaduri A, Duvvuru Muni RR, Park JH, Kim TY. SimCAL: a flexible tool to compute biochemical reaction similarity. BMC Bioinformatics 2018; 19:254. [PMID: 29969981 PMCID: PMC6029250 DOI: 10.1186/s12859-018-2248-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/18/2017] [Accepted: 06/14/2018] [Indexed: 11/29/2022]
Abstract
Background: Computation of reaction similarity is a prerequisite for several bioinformatics applications, including enzyme identification for specific biochemical reactions, enzyme classification, and mining for specific inhibitors. Reaction similarity is often assessed at one of two levels: (i) comparison across all the constituent substrates and products of a reaction (reaction-level similarity), or (ii) comparison at the transformation center with various degrees of neighborhood (transformation-level similarity). Existing reaction similarity computation tools are designed for specific applications and use different features and similarity measures. A single system integrating these diverse features enables comparison of the impact of different molecular properties on similarity score computation. Results: To address these requirements, we present SimCAL, an integrated system to calculate reaction similarity with novel features and the capability to perform comparative assessment. SimCAL provides reaction similarity computation at both the whole-reaction level and the transformation level. Novel physicochemical features such as stereochemistry, mass, volume, and charge are included in computing the reaction fingerprint. Users can choose from four different fingerprint types and nine molecular similarity measures. Further, a comparative assessment of these features is also enabled. The performance of SimCAL was assessed on 3,688,122 reaction pairs with Enzyme Commission (EC) numbers from MetaCyc and achieved an area under the curve (AUC) of > 0.9. In addition, SimCAL results showed strong correlation with the state-of-the-art EC-BLAST and molecular-signature-based reaction similarity methods. Conclusions: SimCAL is developed in Java and is available as a standalone tool with an intuitive, user-friendly graphical interface and also as a console application. With its customizable feature selection and similarity calculations, it is expected to cater to a wide audience interested in studying and analyzing biochemical reactions and metabolic networks.
Affiliation(s)
- Anirban Bhaduri
- Bioinformatics Lab, Samsung Advanced Institute of Technology, Bangalore, 560037, India
| | - Jin Hwan Park
- Biomaterials Lab, Materials Center, Samsung Advanced Institute of Technology, Gyeonggi-do, 443803, South Korea
| | - Tae Yong Kim
- Biomaterials Lab, Materials Center, Samsung Advanced Institute of Technology, Gyeonggi-do, 443803, South Korea.
| |
|
42
|
Abstract
Computer-aided synthesis planning (CASP) is focused on the goal of accelerating the process by which chemists decide how to synthesize small molecule compounds. The ideal CASP program would take a molecular structure as input and output a sorted list of detailed reaction schemes that each connect that target to purchasable starting materials via a series of chemically feasible reaction steps. Early work in this field relied on expert-crafted reaction rules and heuristics to describe possible retrosynthetic disconnections and selectivity rules but suffered from incompleteness, infeasible suggestions, and human bias. With the relatively recent availability of large reaction corpora (such as the United States Patent and Trademark Office (USPTO), Reaxys, and SciFinder databases), consisting of millions of tabulated reaction examples, it is now possible to construct and validate purely data-driven approaches to synthesis planning. As a result, synthesis planning has been opened to machine learning techniques, and the field is advancing rapidly. In this Account, we focus on two critical aspects of CASP and recent machine learning approaches to both challenges. First, we discuss the problem of retrosynthetic planning, which requires a recommender system to propose synthetic disconnections starting from a target molecule. We describe how the search strategy, necessary to overcome the exponential growth of the search space with increasing number of reaction steps, can be assisted through a learned synthetic complexity metric. We also describe how the recursive expansion can be performed by a straightforward nearest neighbor model that makes clever use of reaction data to generate high quality retrosynthetic disconnections. Second, we discuss the problem of anticipating the products of chemical reactions, which can be used to validate proposed reactions in a computer-generated synthesis plan (i.e., reduce false positives) to increase the likelihood of experimental success. 
While we introduce this task in the context of reaction validation, its utility extends to the prediction of side products and impurities, among other applications. We describe neural network-based approaches that we and others have developed for this forward prediction task that can be trained on previously published experimental data. Machine learning and artificial intelligence have revolutionized a number of disciplines, not limited to image recognition, dictation, translation, content recommendation, advertising, and autonomous driving. While there is a rich history of using machine learning for structure-activity models in chemistry, it is only now that it is being successfully applied more broadly to organic synthesis and synthesis design. As reported in this Account, machine learning is rapidly transforming CASP, but there are several remaining challenges and opportunities, many pertaining to the availability and standardization of both data and evaluation metrics, which must be addressed by the community at large.
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - William H. Green
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Klavs F. Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| |
|
43
|
Engkvist O, Norrby PO, Selmi N, Lam YH, Peng Z, Sherer EC, Amberg W, Erhard T, Smyth LA. Computational prediction of chemical reactions: current status and outlook. Drug Discov Today 2018; 23:1203-1218. [PMID: 29510217 DOI: 10.1016/j.drudis.2018.02.014] [Citation(s) in RCA: 98] [Impact Index Per Article: 16.3] [Received: 11/16/2017] [Revised: 01/31/2018] [Accepted: 02/26/2018] [Indexed: 01/05/2023]
Abstract
Over the past few decades, various computational methods have become increasingly important for discovering and developing novel drugs. Computational prediction of chemical reactions is a key part of an efficient drug discovery process. In this review, we discuss important parts of this field, with a focus on utilizing reaction data to build predictive models, the existing programs for synthesis prediction, and usage of quantum mechanics and molecular mechanics (QM/MM) to explore chemical reactions. We also outline potential future developments with an emphasis on pre-competitive collaboration opportunities.
Affiliation(s)
- Ola Engkvist
- Discovery Sciences, Innovative Medicines and Early Development Biotech Unit, AstraZeneca R&D Gothenburg, SE-43183 Mölndal, Sweden.
| | - Per-Ola Norrby
- Pharmaceutical Sciences, Innovative Medicines and Early Development Biotech Unit, AstraZeneca R&D Gothenburg, SE-43183 Mölndal, Sweden
| | - Nidhal Selmi
- Discovery Sciences, Innovative Medicines and Early Development Biotech Unit, AstraZeneca R&D Gothenburg, SE-43183 Mölndal, Sweden
| | - Yu-Hong Lam
- Modeling and Informatics, MRL, Merck & Co., Rahway, NJ 07065, USA
| | - Zhengwei Peng
- Modeling and Informatics, MRL, Merck & Co., Rahway, NJ 07065, USA
| | - Edward C Sherer
- Modeling and Informatics, MRL, Merck & Co., Rahway, NJ 07065, USA
| | - Willi Amberg
- AbbVie Deutschland GmbH & Co. KG, Neuroscience Discovery, Medicinal Chemistry, Knollstrasse, 67061 Ludwigshafen, Germany
| | - Thomas Erhard
- AbbVie Deutschland GmbH & Co. KG, Neuroscience Discovery, Medicinal Chemistry, Knollstrasse, 67061 Ludwigshafen, Germany
| | - Lynette A Smyth
- AbbVie Deutschland GmbH & Co. KG, Neuroscience Discovery, Medicinal Chemistry, Knollstrasse, 67061 Ludwigshafen, Germany
| |
|
44
|
Coley C, Rogers L, Green WH, Jensen KF. Computer-Assisted Retrosynthesis Based on Molecular Similarity. ACS Cent Sci 2017; 3:1237-1245. [PMID: 29296663 PMCID: PMC5746854 DOI: 10.1021/acscentsci.7b00355] [Citation(s) in RCA: 149] [Impact Index Per Article: 21.3] [Received: 08/04/2017] [Indexed: 05/05/2023]
Abstract
We demonstrate molecular similarity to be a surprisingly effective metric for proposing and ranking one-step retrosynthetic disconnections based on analogy to precedent reactions. The developed approach mimics the retrosynthetic strategy defined implicitly by a corpus of known reactions without the need to encode any chemical knowledge. Using 40 000 reactions from the patent literature as a knowledge base, the recorded reactants are among the top 10 proposed precursors in 74.1% of 5000 test reactions, providing strong quantitative support for our methodology. Extension of the one-step strategy to multistep pathway planning is demonstrated and discussed for two exemplary drug products.
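Ranking precedents by molecular similarity, as this abstract describes, might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: fingerprints are stood in for by plain Python sets of made-up feature labels, the precedent names are hypothetical, and only the target-to-precedent-product Tanimoto similarity is used for scoring.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_precedents(target_fp, precedents, top_n=10):
    """Return up to `top_n` (similarity, name) pairs, most similar first."""
    scored = [(tanimoto(target_fp, fp), name) for name, fp in precedents.items()]
    # Sort by descending similarity, breaking ties alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_n]


# Hypothetical precedent products, described by toy feature sets.
precedents = {
    "amide_coupling": {"C=O", "N-H", "aryl"},
    "suzuki": {"aryl", "aryl-aryl"},
    "ester_hydrolysis": {"C=O", "O-H"},
}
target = {"C=O", "N-H", "aryl", "aryl-aryl"}

for score, name in rank_precedents(target, precedents):
    print(f"{name}: {score:.2f}")
```

With real structures, the sets would typically be replaced by bit-vector fingerprints from a cheminformatics toolkit, and the disconnection of each top-ranked precedent would be applied to the target to propose precursors.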
|
45
|
Tremouilhac P, Nguyen A, Huang YC, Kotov S, Lütjohann DS, Hübsch F, Jung N, Bräse S. Chemotion ELN: an Open Source electronic lab notebook for chemists in academia. J Cheminform 2017; 9:54. [PMID: 29086216 PMCID: PMC5612905 DOI: 10.1186/s13321-017-0240-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 07/03/2017] [Accepted: 09/18/2017] [Indexed: 11/18/2022]
Abstract
The development of an electronic lab notebook (ELN) for researchers working in the field of chemical sciences is presented. The web-based application is available as Open Source software that offers modern solutions for chemical researchers. The Chemotion ELN is equipped with the basic functionalities necessary for the acquisition and processing of chemical data, in particular for working with molecular structures and calculations based on molecular properties. The ELN supports planning, description, storage, and management for the routine work of organic chemists. It also provides tools for communicating and sharing the recorded research data among colleagues. Meeting the requirements of a state-of-the-art research infrastructure, the ELN allows the search for molecules and reactions not only within the user’s data but also in conventional external sources as provided by SciFinder and PubChem. The presented development makes allowance for the growing dependency of scientific activity on the availability of digital information by providing Open Source instruments to record and reuse research data. The current version of the ELN has been in use for over half a year in our chemistry research group; it serves as a common infrastructure for chemistry research and enables chemistry researchers to build their own databases of digital information as a prerequisite for the detailed, systematic investigation and evaluation of chemical reactions and mechanisms.
Affiliation(s)
- Pierre Tremouilhac
- Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
| | - An Nguyen
- Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
| | - Yu-Chieh Huang
- Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
| | - Serhii Kotov
- Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
| | - Dominic Sebastian Lütjohann
- Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
- Cubuslab GmbH, Lange Straße 2, 76199, Karlsruhe, Germany
| | - Florian Hübsch
- Ninja-Concept GmbH, Haid-und-Neu-Straße 18, 76131, Karlsruhe, Germany
| | - Nicole Jung
- Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
- Institute of Organic Chemistry, Karlsruhe Institute of Technology, Fritz-Haber-Weg 6, 76131, Karlsruhe, Germany
| | - Stefan Bräse
- Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
- Institute of Organic Chemistry, Karlsruhe Institute of Technology, Fritz-Haber-Weg 6, 76131, Karlsruhe, Germany
| |
|
46
|
Coley C, Barzilay R, Jaakkola TS, Green WH, Jensen KF. Prediction of Organic Reaction Outcomes Using Machine Learning. ACS Cent Sci 2017; 3:434-443. [PMID: 28573205 PMCID: PMC5445544 DOI: 10.1021/acscentsci.7b00064] [Citation(s) in RCA: 346] [Impact Index Per Article: 49.4] [Received: 02/03/2017] [Indexed: 05/18/2023]
Abstract
Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions is that proposed reaction steps often fail when attempted in the laboratory, despite initially seeming viable. The true measure of success for any synthesis program is whether the predicted outcome matches what is observed experimentally. We report a model framework for anticipating reaction outcomes that combines the traditional use of reaction templates with the flexibility in pattern recognition afforded by neural networks. Using 15 000 experimental reaction records from granted United States patents, a model is trained to select the major (recorded) product by ranking a self-generated list of candidates where one candidate is known to be the major product. Candidate reactions are represented using a unique edit-based representation that emphasizes the fundamental transformation from reactants to products, rather than the constituent molecules' overall structures. In a 5-fold cross-validation, the trained model assigns the major product rank 1 in 71.8% of cases, rank ≤3 in 86.7% of cases, and rank ≤5 in 90.8% of cases.
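The rank-based figures quoted in this abstract (rank 1, rank ≤3, rank ≤5) are top-k accuracies. A minimal sketch of how such a metric is computed is given below; the list of ranks is invented for illustration and is not data from the paper.

```python
def top_k_accuracy(ranks, k):
    """Fraction of test cases whose true product is ranked within the top k.

    `ranks` holds the 1-based rank the model assigned to the recorded
    product in each test reaction.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for ten test reactions.
ranks = [1, 1, 2, 4, 1, 7, 3, 1, 5, 1]

for k in (1, 3, 5):
    print(f"top-{k}: {top_k_accuracy(ranks, k):.1%}")
```

Because every case counted at rank ≤3 is also counted at rank ≤5, top-k accuracy can only stay flat or rise as k grows, which matches the 71.8% / 86.7% / 90.8% progression reported above.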
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Regina Barzilay
- Department of Chemical Engineering and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Tommi S. Jaakkola
- Department of Chemical Engineering and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - William H. Green
- Department of Chemical Engineering and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Klavs F. Jensen
- Department of Chemical Engineering and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| |
|
47
|
Segler MHS, Waller MP. Neural-Symbolic Machine Learning for Retrosynthesis and Reaction Prediction. Chemistry 2017; 23:5966-5971. [PMID: 28134452 DOI: 10.1002/chem.201605499] [Citation(s) in RCA: 234] [Impact Index Per Article: 33.4] [Received: 11/08/2016] [Indexed: 12/24/2022]
Abstract
Reaction prediction and retrosynthesis are the cornerstones of organic chemistry. Rule-based expert systems have been the most widespread approach to computationally solve these two related challenges to date. However, reaction rules often fail because they ignore the molecular context, which leads to reactivity conflicts. Herein, we report that deep neural networks can learn to resolve reactivity conflicts and to prioritize the most suitable transformation rules. We show that by training our model on 3.5 million reactions taken from the collective published knowledge of the entire discipline of chemistry, our model exhibits a top10-accuracy of 95 % in retrosynthesis and 97 % for reaction prediction on a validation set of almost 1 million reactions.
Affiliation(s)
- Marwin H S Segler
- Organisch-Chemisches Institut and Center for Multiscale Theory and Computation, Westfälische Wilhelms-Universität Münster, Corrensstr. 40, 48149, Münster, Germany
| | - Mark P Waller
- Organisch-Chemisches Institut and Center for Multiscale Theory and Computation, Westfälische Wilhelms-Universität Münster, Corrensstr. 40, 48149, Münster, Germany
- Department of Physics and International Center for Quantum and Molecular Structures, Shanghai University, Shangda Road 99, 200444, Shanghai, China
| |
|
48
|
Segler MHS, Waller MP. Modelling Chemical Reasoning to Predict and Invent Reactions. Chemistry 2017; 23:6118-6128. [DOI: 10.1002/chem.201604556] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Received: 09/26/2016] [Indexed: 01/30/2023]
Affiliation(s)
- Marwin H. S. Segler
- Institute of Organic Chemistry and Center for Multiscale Theory and Computation; Westfälische Wilhelms-Universität Münster; Corrensstraße 40 48149 Münster Germany
| | - Mark P. Waller
- Institute of Organic Chemistry and Center for Multiscale Theory and Computation; Westfälische Wilhelms-Universität Münster; Corrensstraße 40 48149 Münster Germany
- Department of Physics and International Centre for Quantum and Molecular Structures; Shanghai University; Shangda Road 99 200444 Shanghai P.R. China
| |
|
49
|
Szymkuć S, Gajewska EP, Klucznik T, Molga K, Dittwald P, Startek M, Bajczyk M, Grzybowski BA. Computer-Assisted Synthetic Planning: The End of the Beginning. Angew Chem Int Ed Engl 2016; 55:5904-37. [PMID: 27062365 DOI: 10.1002/anie.201506101] [Citation(s) in RCA: 310] [Impact Index Per Article: 38.8] [Received: 07/03/2015] [Revised: 09/14/2015] [Indexed: 11/07/2022]
Abstract
Exactly half a century has passed since the launch of the first documented research project (1965, Dendral) on computer-assisted organic synthesis. Many more programs were created in the 1970s and 1980s but the enthusiasm of these pioneering days had largely dissipated by the 2000s, and the challenge of teaching the computer how to plan organic syntheses earned itself the reputation of a "mission impossible". This is quite curious given that, in the meantime, computers have "learned" many other skills that had been considered exclusive domains of human intellect and creativity; for example, machines can nowadays play chess better than human world champions and they can compose classical music pleasant to the human ear. Although there have been no similar feats in organic synthesis, this Review argues that to concede defeat would be premature. Indeed, bringing together the combination of modern computational power and algorithms from graph/network theory, chemical rules (with full stereo- and regiochemistry) coded in appropriate formats, and the elements of quantum mechanics, the machine can finally be "taught" how to plan syntheses of non-trivial organic molecules in a matter of seconds to minutes. The Review begins with an overview of some basic theoretical concepts essential for the big-data analysis of chemical syntheses. It progresses to the problem of optimizing pathways involving known reactions. It culminates with discussion of algorithms that allow for a completely de novo and fully automated design of syntheses leading to relatively complex targets, including those that have not been made before. Of course, there are still things to be improved, but computers are finally becoming relevant and helpful to the practice of organic-synthetic planning.
Paraphrasing Churchill's famous words after the Allies' first major victory over the Axis forces in Africa, it is not the end, it is not even the beginning of the end, but it is the end of the beginning for computer-assisted synthesis planning. The machine is here to stay.
Affiliation(s)
- Sara Szymkuć
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Ewa P Gajewska
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Tomasz Klucznik
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Karol Molga
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Piotr Dittwald
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Michał Startek
  - Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Banacha 2, 02-097 Warszawa, Poland
- Michał Bajczyk
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Bartosz A Grzybowski
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
  - Center for Soft and Living Matter of Korea's Institute for Basic Science (IBS), Department of Chemistry, Ulsan National Institute of Science and Technology, 50, UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan, South Korea
|
50
|
Szymkuć S, Gajewska EP, Klucznik T, Molga K, Dittwald P, Startek M, Bajczyk M, Grzybowski BA. Computergestützte Syntheseplanung: Das Ende vom Anfang [Computer-Assisted Synthetic Planning: The End of the Beginning]. Angew Chem Int Ed Engl 2016. [DOI: 10.1002/ange.201506101] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 12/29/2022]
Affiliation(s)
- Sara Szymkuć
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Ewa P. Gajewska
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Tomasz Klucznik
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Karol Molga
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Piotr Dittwald
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Michał Startek
  - Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Banacha 2, 02-097 Warszawa, Poland
- Michał Bajczyk
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
- Bartosz A. Grzybowski
  - Institute of Organic Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, Warsaw, 02-224, Poland
  - Center for Soft and Living Matter of Korea's Institute for Basic Science (IBS), Department of Chemistry, Ulsan National Institute of Science and Technology, 50, UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan, South Korea
|