1
Xu L, Zhu J, Shen X, Chai J, Shi L, Wu B, Li W, Ma D. 6-Hydroxy Picolinohydrazides Promoted Cu(I)-Catalyzed Hydroxylation Reaction in Water: Machine-Learning Accelerated Ligands Design and Reaction Optimization. Angew Chem Int Ed Engl 2024; 63:e202412552. [PMID: 39189301 DOI: 10.1002/anie.202412552] [Received: 07/03/2024] [Revised: 08/19/2024] [Accepted: 08/25/2024] [Indexed: 08/28/2024]
Abstract
Hydroxylated (hetero)arenes are privileged motifs in natural products, materials, and small-molecule pharmaceuticals, and serve as versatile intermediates in synthetic organic chemistry. Herein, we report an efficient Cu(I)/6-hydroxy picolinohydrazide-catalyzed hydroxylation reaction of (hetero)aryl halides (Br, Cl) in water. Establishing machine learning (ML) models effectively accelerated the design of ligands and the optimization of reaction conditions. N-(1,3-Dimethyl-9H-carbazol-9-yl)-6-hydroxypicolinamide (L32, 6-HPA-DMCA) demonstrated high efficiency for (hetero)aryl bromides, promoting hydroxylation reactions with a minimal catalyst loading of 0.01 mol % (100 ppm) at 80 °C to reach 10000 TON; for substrates containing sensitive functional groups, the catalyst loading needs to be increased to 3.0 mol % under near-room temperature conditions. N-(2,7-Di-tert-butyl-9H-carbazol-9-yl)-6-hydroxypicolinamide (L42, 6-HPA-DTBCA) displayed superior reaction activity for chloride substrates, enabling hydroxylation reactions at 100 °C with 2-3 mol % catalyst loading. These represent the state of the art for both the lowest catalyst loading and the lowest temperature in copper-catalyzed hydroxylation reactions. Furthermore, this method features a sustainable and environmentally friendly solvent system, accommodates a wide range of substrates, and shows potential for developing robust and scalable synthesis processes for key pharmaceutical intermediates.
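The loading and turnover figures in this abstract are tied together by simple arithmetic: the turnover number (TON) is moles of product per mole of catalyst, so at full conversion it equals 100 divided by the loading in mol %. The Python sketch below illustrates that relationship only. The function names `max_ton` and `mol_percent_to_ppm` are hypothetical, and treating conversion as equal to yield is an assumption, not something the abstract states.

```python
def max_ton(loading_mol_percent: float, conversion: float = 1.0) -> float:
    """Upper bound on the turnover number for a given catalyst loading.

    TON = mol product / mol catalyst. With the loading expressed in mol %
    relative to the substrate, this is (conversion * 100) / loading.
    Assumption: conversion equals isolated yield.
    """
    if loading_mol_percent <= 0:
        raise ValueError("catalyst loading must be positive")
    return conversion * 100.0 / loading_mol_percent


def mol_percent_to_ppm(loading_mol_percent: float) -> float:
    """Convert a loading in mol % to molar parts per million (1 % = 10^4 ppm)."""
    return loading_mol_percent * 1e4


if __name__ == "__main__":
    # Loadings quoted in the abstract: 0.01 mol % (bromides, 80 °C),
    # 3.0 mol % (sensitive substrates), 2-3 mol % (chlorides).
    for loading in (0.01, 2.0, 3.0):
        print(f"{loading} mol % = {mol_percent_to_ppm(loading):.0f} ppm, "
              f"TON at full conversion = {max_ton(loading):.0f}")
```

Under these assumptions the 0.01 mol % case gives 100 ppm and a ceiling of 10000 turnovers, matching the figures quoted in the abstract.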
Affiliation(s)
- Lanting Xu
  - State Key Laboratory of Chemical Biology, Shanghai Institute of Organic Chemistry, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 345 Lingling Lu, Shanghai, 200032, China
- Jiazhou Zhu
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Xiaodong Shen
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Jiashuang Chai
  - Chang-Kung Chuang Institute, School of Chemistry and Molecular Engineering, East China Normal University, 500 Dongchuang Lu, Shanghai, 200062, China
- Lei Shi
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Bin Wu
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Wei Li
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Dawei Ma
  - State Key Laboratory of Chemical Biology, Shanghai Institute of Organic Chemistry, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 345 Lingling Lu, Shanghai, 200032, China
2
Kaspersetz L, Englert B, Krah F, Martinez EC, Neubauer P, Cruz Bournazou MN. Management of experimental workflows in robotic cultivation platforms. SLAS Technol 2024; 29:100214. [PMID: 39486480 DOI: 10.1016/j.slast.2024.100214] [Received: 05/06/2024] [Revised: 09/13/2024] [Accepted: 10/22/2024] [Indexed: 11/04/2024]
Abstract
In the last decades, robotic cultivation facilities combined with automated execution of workflows have drastically increased the speed of research in biotechnology. In this work, we present the design and deployment of a digital infrastructure for robotic cultivation platforms. We implement a Workflow Management System, using Directed Acyclic Graphs, based on the open-source platform Apache Airflow to increase traceability and the automated execution of experiments. We demonstrate the integration and automation of experimental workflows in a laboratory environment with a heterogeneous device landscape including liquid handling stations, parallel cultivation systems, and mobile robots. The feasibility of our approach is assessed in parallel E. coli fed-batch cultivations with glucose oscillations in which different elastin-like proteins are produced. We show that the use of workflow management systems in robotic cultivation platforms increases automation, robustness and traceability of experimental data.
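The core idea of a DAG-based workflow manager is that each task declares its upstream dependencies and the scheduler derives a valid execution order from them. The sketch below shows this using only Python's standard-library `graphlib` (Python 3.9+). It is not Apache Airflow code and not the authors' implementation, and the task names are hypothetical placeholders loosely modeled on the devices named in the abstract.

```python
from graphlib import TopologicalSorter

# Hypothetical cultivation workflow: each task maps to the set of tasks
# that must finish before it can start.
workflow = {
    "prepare_media": set(),
    "inoculate": {"prepare_media"},
    "run_fed_batch": {"inoculate"},
    "sample_transfer": {"run_fed_batch"},
    "at_line_analysis": {"sample_transfer"},
    "store_results": {"at_line_analysis", "run_fed_batch"},
}


def execution_order(dag):
    """Return one valid task order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)

if __name__ == "__main__":
    for step, task in enumerate(order, start=1):
        print(step, task)
```

Because the graph is acyclic, every task runs only after all of its dependencies, which is the property that makes the recorded execution traceable. A workflow manager such as Airflow adds scheduling, retries, and logging on top of this ordering.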
Affiliation(s)
- Lucas Kaspersetz
  - Technische Universität Berlin, Institute of Biotechnology, Chair of Bioprocess Engineering, Ackerstr. 76, 13355, Berlin, Germany.
- Britta Englert
  - Technische Universität Berlin, Institute of Biotechnology, Chair of Bioprocess Engineering, Ackerstr. 76, 13355, Berlin, Germany
- Fabian Krah
  - BTC Business Technology Consulting AG, Escherweg 5, 26121, Oldenburg, Germany
- Ernesto C Martinez
  - Technische Universität Berlin, Institute of Biotechnology, Chair of Bioprocess Engineering, Ackerstr. 76, 13355, Berlin, Germany; INGAR, (CONICET - UTN), Avellaneda 3657, Santa Fe, Argentina
- Peter Neubauer
  - Technische Universität Berlin, Institute of Biotechnology, Chair of Bioprocess Engineering, Ackerstr. 76, 13355, Berlin, Germany
- M Nicolas Cruz Bournazou
  - Technische Universität Berlin, Institute of Biotechnology, Chair of Bioprocess Engineering, Ackerstr. 76, 13355, Berlin, Germany.
3
Augustine L, Wang Y, Adelman SL, Batista ER, Kozimor SA, Perez D, Schrier J, Yang P. Advancing Rare-Earth (4f) and Actinide (5f) Separation through Machine Learning and Automated High-Throughput Experiments. ACS Sustain Chem Eng 2024; 12:16692-16699. [PMID: 39545104 PMCID: PMC11558677 DOI: 10.1021/acssuschemeng.4c06166] [Received: 07/26/2024] [Revised: 10/14/2024] [Accepted: 10/14/2024] [Indexed: 11/17/2024]
Abstract
Identifying improved and sustainable alternatives to "classic" separation techniques is an active research field due to its potential widespread impact in fundamental and applied chemistry. As basic purification methodologies, like liquid-liquid extraction, undergo continuous refinement by chemists and engineers, identifying new conditions that outperform existing techniques can be difficult. A major contributor to this challenging problem is the need to explore a vast experimental space to identify the precise conditions that optimize the separation procedure. The advent of artificial intelligence and the advancement of robotic technologies offer the potential to shift the traditional design paradigm. Toward that end, we applied a combination of Bayesian Optimization and high-throughput robotic experiments to the liquid-liquid extraction of thorium (Th⁴⁺) and demonstrated that this approach speeds up discovery and significantly accelerates the optimization process. Using Bayesian Optimization as a guide, our automated instrument carried out a total of 339 distribution ratio measurements, corresponding to 113 unique conditions, and identified the optimal experimental conditions while reducing the experimental effort by an estimated 74% compared to a traditional full screening approach. This time and cost saving is particularly significant for radioactive materials, as it not only is more economical and sustainable but also minimizes human exposure to radioactivity.
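The counts quoted in this abstract can be related with a few lines of arithmetic. In the Python sketch below, the replicate count follows directly from the two reported numbers. The size of the full screen is not stated in the abstract, so `implied_full_screen` is only what the 74% figure would imply under the assumption that effort is counted in unique conditions.

```python
# Figures reported in the abstract.
measurements = 339        # total distribution ratio measurements
unique_conditions = 113   # distinct experimental conditions
reduction = 0.74          # estimated saving versus a full screen

# Average number of measurements per condition.
replicates = measurements / unique_conditions

# Assumption (not stated in the abstract): the 74% saving is counted in
# unique conditions, so the 113 conditions are the remaining 26%.
implied_full_screen = unique_conditions / (1.0 - reduction)

if __name__ == "__main__":
    print(f"replicates per condition: {replicates:.1f}")
    print(f"implied full-screen size: about {implied_full_screen:.0f} conditions")
```

If effort were instead counted in individual measurements or instrument time, the implied baseline would differ, so the second number is best read as an order-of-magnitude illustration.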
Affiliation(s)
- Logan J. Augustine
  - Theoretical Division, Los Alamos National Lab, Los Alamos, New Mexico 87545, United States
- Yufei Wang
  - Chemistry Division, Los Alamos National Lab, Los Alamos, New Mexico 87545, United States
- Sara L. Adelman
  - Chemistry Division, Los Alamos National Lab, Los Alamos, New Mexico 87545, United States
- Enrique R. Batista
  - Theoretical Division, Los Alamos National Lab, Los Alamos, New Mexico 87545, United States
- Stosh A. Kozimor
  - Chemistry Division, Los Alamos National Lab, Los Alamos, New Mexico 87545, United States
- Danny Perez
  - Theoretical Division, Los Alamos National Lab, Los Alamos, New Mexico 87545, United States
- Joshua Schrier
  - Department of Chemistry & Biochemistry, Fordham University, The Bronx, New York 10458, United States
- Ping Yang
  - Theoretical Division, Los Alamos National Lab, Los Alamos, New Mexico 87545, United States
4
Bannigan P, Hickman RJ, Aspuru‐Guzik A, Allen C. The Dawn of a New Pharmaceutical Epoch: Can AI and Robotics Reshape Drug Formulation? Adv Healthc Mater 2024; 13:e2401312. [PMID: 39155417 PMCID: PMC11582498 DOI: 10.1002/adhm.202401312] [Received: 04/09/2024] [Revised: 07/21/2024] [Indexed: 08/20/2024]
Abstract
Over the last four decades, pharmaceutical companies' expenditures on research and development have increased 51-fold. During this same time, clinical success rates for new drugs have remained unchanged at about 10 percent, predominantly due to lack of efficacy and/or safety concerns. This persistent problem underscores the need to innovate across the entire drug development process, particularly in drug formulation, which is often deprioritized and under-resourced.
Affiliation(s)
- Pauric Bannigan
  - Intrepid Labs Inc., MaRS Centre, West Tower, 661 University Avenue, Suite 1300, Toronto, ON M5G 0B7, Canada
- Riley J. Hickman
  - Intrepid Labs Inc., MaRS Centre, West Tower, 661 University Avenue, Suite 1300, Toronto, ON M5G 0B7, Canada
- Alán Aspuru‐Guzik
  - Intrepid Labs Inc., MaRS Centre, West Tower, 661 University Avenue, Suite 1300, Toronto, ON M5G 0B7, Canada
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada
  - Acceleration Consortium, University of Toronto, Toronto, ON M5S 3H6, Canada
  - Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada
- Christine Allen
  - Intrepid Labs Inc., MaRS Centre, West Tower, 661 University Avenue, Suite 1300, Toronto, ON M5G 0B7, Canada
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada
  - Acceleration Consortium, University of Toronto, Toronto, ON M5S 3H6, Canada
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
5
Regnier M, Vega C, Ioannou DI, Noël T. Enhancing electrochemical reactions in organic synthesis: the impact of flow chemistry. Chem Soc Rev 2024; 53:10741-10760. [PMID: 39297689 DOI: 10.1039/d4cs00539b] [Indexed: 10/29/2024]
Abstract
Utilizing electrons directly offers significant potential for advancing organic synthesis by facilitating novel reactivity and enhancing selectivity under mild conditions. As a result, an increasing number of organic chemists are exploring electrosynthesis. However, the efficacy of electrochemical transformations depends critically on the design of the electrochemical cell. Batch cells often suffer from limitations such as large inter-electrode distances and poor mass transfer, making flow cells a promising alternative. Implementing flow cells, however, requires a foundational understanding of microreactor technology. In this review, we briefly outline the applications of flow electrosynthesis before providing a comprehensive examination of existing flow reactor technologies. Our goal is to equip organic chemists with the insights needed to tailor their electrochemical flow cells to meet specific reactivity requirements effectively. We also highlight the application of reactor designs in scaling up electrochemical processes and integrating high-throughput experimentation and automation. These advancements not only enhance the potential of flow electrosynthesis for the synthetic community but also hold promise for both academia and industry.
Affiliation(s)
- Morgan Regnier
  - Flow Chemistry Group, Van't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, SciencePark 904, 1098XH, Amsterdam, The Netherlands.
- Clara Vega
  - Flow Chemistry Group, Van't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, SciencePark 904, 1098XH, Amsterdam, The Netherlands.
- Dimitris I Ioannou
  - Flow Chemistry Group, Van't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, SciencePark 904, 1098XH, Amsterdam, The Netherlands.
- Timothy Noël
  - Flow Chemistry Group, Van't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, SciencePark 904, 1098XH, Amsterdam, The Netherlands.
6
Cheng AH, Ser CT, Skreta M, Guzmán-Cordero A, Thiede L, Burger A, Aldossary A, Leong SX, Pablo-García S, Strieth-Kalthoff F, Aspuru-Guzik A. Spiers Memorial Lecture: How to do impactful research in artificial intelligence for chemistry and materials science. Faraday Discuss 2024. [PMID: 39400305 DOI: 10.1039/d4fd00153b] [Indexed: 10/15/2024]
Abstract
Machine learning has been pervasively touching many fields of science. Chemistry and materials science are no exception. While machine learning has been making a great impact, it is still not reaching its full potential or maturity. In this perspective, we first outline current applications across a diversity of problems in chemistry. Then, we discuss how machine learning researchers view and approach problems in the field. Finally, we provide our considerations for maximizing impact when researching machine learning for chemistry.
Affiliation(s)
- Austin H Cheng
  - Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
- Cher Tian Ser
  - Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
- Marta Skreta
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
- Andrés Guzmán-Cordero
  - Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
  - Tinbergen Institute, University of Amsterdam, Amsterdam, Netherlands
- Luca Thiede
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
- Andreas Burger
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
- Shi Xuan Leong
  - Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
  - School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, Singapore 63737, Singapore
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
  - Acceleration Consortium, Toronto, Ontario M5G 1X6, Canada
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Canada
  - Department of Materials Science and Engineering, University of Toronto, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Canada
7
Gormley AJ. Machine learning in drug delivery. J Control Release 2024; 373:23-30. [PMID: 38909704 PMCID: PMC11384327 DOI: 10.1016/j.jconrel.2024.06.045] [Received: 05/01/2024] [Revised: 06/17/2024] [Accepted: 06/19/2024] [Indexed: 06/25/2024]
Abstract
For decades, drug delivery scientists have been performing trial-and-error experimentation to manually sample parameter spaces and optimize release profiles through rational design. To enable this approach, scientists spend much of their career learning nuanced drug-material interactions that drive system behavior. In relatively simple systems, rational design criteria allow us to fine-tune release profiles and enable efficacious therapies. However, as materials and drugs become increasingly sophisticated and their interactions have non-linear and compounding effects, the field is suffering from the Curse of Dimensionality, which prevents us from comprehending complex structure-function relationships. In the past, we have embraced this complexity by implementing high-throughput screens to increase the probability of finding ideal compositions. However, this brute force method was inefficient and led many to abandon these fishing expeditions. Fortunately, methods in data science including artificial intelligence / machine learning (AI/ML) are providing ideal analytical tools to model this complex data and ascertain quantitative structure-function relationships. In this Oration, I speak to the potential value of data science in drug delivery with particular focus on polymeric delivery systems. Here, I do not suggest that AI/ML will simply replace mechanistic understanding of complex systems. Rather, I propose that AI/ML should be yet another useful tool in the lab to navigate complex parameter spaces. The recent hype around AI/ML is breathtaking and potentially overinflated, but these methods are poised to revolutionize how we perform science. Therefore, I encourage readers to consider adopting these skills and applying data science methods to their own problems. If done successfully, I believe we will all realize a paradigm shift in our approach to drug delivery.
Affiliation(s)
- Adam J Gormley
  - Associate Professor, Department of Biomedical Engineering, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States.
8
Tibo A, He J, Janet JP, Nittinger E, Engkvist O. Exhaustive local chemical space exploration using a transformer model. Nat Commun 2024; 15:7315. [PMID: 39183239 PMCID: PMC11345417 DOI: 10.1038/s41467-024-51672-4] [Received: 10/25/2023] [Accepted: 08/12/2024] [Indexed: 08/27/2024]
Abstract
How many near-neighbors does a molecule have? This fundamental question in chemistry is crucial for molecular optimization problems under the similarity principle assumption. Generative models can sample molecules from a vast chemical space but lack explicit knowledge about molecular similarity. Therefore, these models need guidance from reinforcement learning to sample a relevant similar chemical space. However, they still lack a mechanism to measure the coverage of a specific region of the chemical space. To overcome these limitations, a source-target molecular transformer model, regularized via a similarity kernel function, is proposed. Trained on a large dataset of ≥200 billion molecular pairs, the model enforces a direct relationship between generating a target molecule and its similarity to a source molecule. Results indicate that the regularization term significantly improves the correlation between generation probability and molecular similarity, enabling exhaustive exploration of molecule near-neighborhoods.
Affiliation(s)
- Alessandro Tibo
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden.
- Jiazhen He
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Jon Paul Janet
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Eva Nittinger
  - Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D AstraZeneca, Gothenburg, Sweden
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
  - Data Science and AI, Computer Science and Engineering, Chalmers, Gothenburg, Sweden
9
Sun Y, Zhao Y, Xie X, Li H, Feng W. Printed polymer platform empowering machine-assisted chemical synthesis in stacked droplets. Nat Commun 2024; 15:6759. [PMID: 39117641 PMCID: PMC11310347 DOI: 10.1038/s41467-024-50768-1] [Received: 10/26/2023] [Accepted: 07/19/2024] [Indexed: 08/10/2024]
Abstract
Efficiently exploring organic molecules through multi-step processes demands a transition from conventional laboratory synthesis to automated systems. Existing platforms for machine-assisted synthetic workflows compatible with multiple liquid phases require substantial engineering investments for setup, thereby hindering quick customization and throughput increases. Here we present a droplet-based chip that facilitates the self-organization of various liquid phases into stacked layers for conducting chemical transformations. The chip's precision polymer printing capability, enabled by digital micromirror device (DMD)-maskless photolithography and dual post-chemical modifications, allows it to create customized, sub-10 µm featured patterns to confine diverse liquids, regardless of density, within each droplet. The robustness and open design of surface-templated liquid layers enable machine-assisted droplet manipulation, synchronous reaction triggering, local oscillation, and real-time monitoring of individual layers. We propose that, with further integration of a machine operation line and self-learning, this droplet-based platform holds the potential to become a valuable addition to the chemical process toolkit, operating autonomously and with high throughput.
Affiliation(s)
- Yingxue Sun
  - College of Polymer Science and Engineering, Sichuan University, Chengdu, China
- Yuanyi Zhao
  - College of Polymer Science and Engineering, Sichuan University, Chengdu, China
- Xinjian Xie
  - College of Polymer Science and Engineering, Sichuan University, Chengdu, China
- Hongjiao Li
  - College of Chemical Engineering, Sichuan University, Chengdu, China
- Wenqian Feng
  - College of Polymer Science and Engineering, Sichuan University, Chengdu, China.
  - Department State Key Laboratory of Polymer Materials Engineering, Sichuan University, Chengdu, China.
10
Mervin L, Voronov A, Kabeshov M, Engkvist O. QSARtuna: An Automated QSAR Modeling Platform for Molecular Property Prediction in Drug Design. J Chem Inf Model 2024; 64:5365-5374. [PMID: 38950185 DOI: 10.1021/acs.jcim.4c00457] [Indexed: 07/03/2024]
Abstract
Machine-learning (ML) and deep-learning (DL) approaches are increasingly deployed within the design-make-test-analyze (DMTA) drug design cycle to predict molecular properties of interest for small molecules. Despite this uptake, there are only a few automated packages to aid their development and deployment that also support uncertainty estimation, model explainability, and other key aspects of model usage. This represents a key unmet need within the field, and the large number of molecular representations and algorithms (and associated parameters) means it is nontrivial to robustly optimize, evaluate, reproduce, and deploy models. Here, we present QSARtuna, a molecular property prediction modeling pipeline, written in Python and utilizing the Optuna, Scikit-learn, RDKit, and ChemProp packages, which enables the efficient and automated comparison between molecular representations and machine learning models. The platform was developed with the increasingly important aspects of model uncertainty quantification and explainability considered by design. We provide details of our framework and illustrative examples to demonstrate the capability of the software when applied to simple molecular property prediction, reaction/reactivity prediction, and DNA encoded library enrichment classification. We hope that the release of QSARtuna will further spur innovation in automatic ML modeling and provide a platform for education of best practices in molecular property modeling. The code for the QSARtuna framework is made freely available via GitHub.
Affiliation(s)
- Lewis Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge CB2 0AA, United Kingdom
- Alexey Voronov
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
- Mikhail Kabeshov
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
  - Department of Computer Science and Engineering, University of Gothenburg, Chalmers University of Technology, Gothenburg 412 96, Sweden
11
Bao Z, Yung F, Hickman RJ, Aspuru-Guzik A, Bannigan P, Allen C. Data-driven development of an oral lipid-based nanoparticle formulation of a hydrophobic drug. Drug Deliv Transl Res 2024; 14:1872-1887. [PMID: 38158474 DOI: 10.1007/s13346-023-01491-9] [Accepted: 11/28/2023] [Indexed: 01/03/2024]
Abstract
Due to its cost-effectiveness, convenience, and high patient adherence, oral drug administration normally remains the preferred approach. Yet, the effective delivery of hydrophobic drugs via the oral route is often hindered by their limited water solubility and first-pass metabolism. To mitigate these challenges, advanced delivery systems such as solid lipid nanoparticles (SLNs) and nanostructured lipid carriers (NLCs) have been developed to encapsulate hydrophobic drugs and enhance their bioavailability. However, traditional design methodologies for these complex formulations often present intricate challenges because they are restricted to a relatively narrow design space. Here, we present a data-driven approach for the accelerated design of SLNs/NLCs encapsulating a model hydrophobic drug, cannabidiol, that combines experimental automation and machine learning. A small subset of formulations, comprising 10% of all formulations in the design space, was prepared in-house, leveraging miniaturized experimental automation to improve throughput and decrease the quantity of drug and materials required. Machine learning models were then trained on the data generated from these formulations and used to predict properties of all SLNs/NLCs within this design space (i.e., 1215 formulations). Notably, formulations predicted to be high-performers via this approach were confirmed to significantly enhance the solubility of the drug by up to 3000-fold and to prevent degradation of the drug. Moreover, the high-performance formulations significantly enhanced the oral bioavailability of the drug compared to both its free form and an over-the-counter version. Furthermore, this bioavailability matched that of a formulation equivalent in composition to the FDA-approved product, Epidiolex®.
Affiliation(s)
- Zeqing Bao
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, M5S 3M2, Canada
- Fion Yung
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, M5S 3M2, Canada
- Riley J Hickman
  - Department of Chemistry, University of Toronto, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, M5S 1M1, Canada
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, M5S 1M1, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Toronto, ON, M5S 1M1, Canada
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
  - Department of Materials Science & Engineering, University of Toronto, Toronto, ON, M5S 3E4, Canada
  - CIFAR Artificial Intelligence Research Chair, Vector Institute, Toronto, ON, M5S 1M1, Canada
  - Acceleration Consortium, Toronto, ON, M5S 3H6, Canada
- Pauric Bannigan
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, M5S 3M2, Canada.
- Christine Allen
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, M5S 3M2, Canada.
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada.
  - Acceleration Consortium, Toronto, ON, M5S 3H6, Canada.
12
Young YA, Nguyen HTH, Nguyen HD, Ganguly T, Nguyen YH, Do LH. A ratiometric substrate for rapid evaluation of transfer hydrogenation efficiency in solution. Dalton Trans 2024; 53:8887-8892. [PMID: 38757518 PMCID: PMC11160331 DOI: 10.1039/d4dt00891j] [Indexed: 05/18/2024]
Abstract
A cyclometalated iridium(III) complex bearing a self-immolative quinolinium moiety was developed as a ratiometric substrate for transfer hydrogenation studies. This photoluminescent probe allowed the rapid screening of a variety of Ir catalysts using a microplate reader, offering a convenient method to assess activity using a minimum amount of catalyst sample.
Affiliation(s)
- Yen-An Young
  - Department of Chemistry, University of Houston, 4800 Calhoun Road, Houston, TX 77204, USA.
- Huong T H Nguyen
  - Department of Chemistry, University of Houston, 4800 Calhoun Road, Houston, TX 77204, USA.
- Hieu D Nguyen
  - Department of Chemistry, University of Houston, 4800 Calhoun Road, Houston, TX 77204, USA.
- Tuhin Ganguly
  - Department of Chemistry, University of Houston, 4800 Calhoun Road, Houston, TX 77204, USA.
- Yennie H Nguyen
  - Department of Chemistry, University of Houston, 4800 Calhoun Road, Houston, TX 77204, USA.
- Loi H Do
  - Department of Chemistry, University of Houston, 4800 Calhoun Road, Houston, TX 77204, USA.
13
Snapp KL, Verdier B, Gongora AE, Silverman S, Adesiji AD, Morgan EF, Lawton TJ, Whiting E, Brown KA. Superlative mechanical energy absorbing efficiency discovered through self-driving lab-human partnership. Nat Commun 2024; 15:4290. [PMID: 38773093 PMCID: PMC11109101 DOI: 10.1038/s41467-024-48534-4] [Received: 11/23/2023] [Accepted: 04/30/2024] [Indexed: 05/23/2024]
Abstract
Energy absorbing efficiency is a key determinant of a structure's ability to provide mechanical protection and is defined by the amount of energy that can be absorbed prior to stresses increasing to a level that damages the system to be protected. Here, we explore the energy absorbing efficiency of additively manufactured polymer structures by using a self-driving lab (SDL) to perform >25,000 physical experiments on generalized cylindrical shells. We use a human-SDL collaborative approach where experiments are selected from trillions of candidates in an 11-dimensional parameter space using Bayesian optimization and then automatically performed while the human team monitors progress to periodically modify aspects of the system. The result of this human-SDL campaign is the discovery of a structure with a 75.2% energy absorbing efficiency and a library of experimental data that reveals transferable principles for designing tough structures.
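The definition in the abstract above (energy absorbed before the stress reaches a damaging level) can be illustrated with a short calculation. The sketch below is an assumption-laden illustration, not the authors' procedure: the function name `energy_absorbing_efficiency`, the trapezoidal integration, and the normalization by an ideal absorber that holds the stress limit over unit strain are all choices made here for clarity.

```python
def energy_absorbing_efficiency(strain, stress, stress_limit):
    """Energy absorbed before stress first exceeds stress_limit,
    divided by the energy an ideal absorber would take up by holding
    stress_limit over a full unit of strain (assumed normalization)."""
    absorbed = 0.0
    for i in range(1, len(strain)):
        if stress[i] > stress_limit:
            break  # stop at the first sample above the damage threshold
        # trapezoidal area under the stress-strain curve for this segment
        absorbed += 0.5 * (stress[i] + stress[i - 1]) * (strain[i] - strain[i - 1])
    return absorbed / stress_limit


# Synthetic curves, for illustration only.
ideal = energy_absorbing_efficiency([0.0, 0.5, 1.0], [2.0, 2.0, 2.0], 2.0)
ramp = energy_absorbing_efficiency([0.0, 0.5, 1.0], [0.0, 1.0, 3.0], 1.0)
```

Under this normalization a perfectly flat stress plateau at the limit scores 1.0, and any curve that spikes early scores lower.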
Affiliation(s)
- Kelsey L Snapp
- Department of Mechanical Engineering, Boston University, Boston, MA, USA
| | - Benjamin Verdier
- Department of Computer Science, Boston University, Boston, MA, USA
| | - Aldair E Gongora
- Department of Mechanical Engineering, Boston University, Boston, MA, USA
| | - Samuel Silverman
- Department of Computer Science, Boston University, Boston, MA, USA
| | - Adedire D Adesiji
- Department of Mechanical Engineering, Boston University, Boston, MA, USA
| | - Elise F Morgan
- Department of Mechanical Engineering, Boston University, Boston, MA, USA
- Division of Materials Science & Engineering, Boston University, Boston, MA, USA
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
| | - Timothy J Lawton
- Soldier Protection Directorate, US Army Combat Capabilities Development Command Soldier Center, Natick, MA, USA
| | - Emily Whiting
- Department of Computer Science, Boston University, Boston, MA, USA
| | - Keith A Brown
- Department of Mechanical Engineering, Boston University, Boston, MA, USA.
- Division of Materials Science & Engineering, Boston University, Boston, MA, USA.
- Physics Department, Boston University, Boston, MA, USA.
| |
|
14
|
Christensen M, Xu Y, Kwan EE, Di Maso MJ, Ji Y, Reibarkh M, Sun AC, Liaw A, Fier PS, Grosser S, Hein JE. Dynamic sampling in autonomous process optimization. Chem Sci 2024; 15:7160-7169. [PMID: 38756794 PMCID: PMC11095507 DOI: 10.1039/d3sc06884f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 04/10/2024] [Indexed: 05/18/2024] Open
Abstract
Autonomous process optimization (APO) is a technology that has recently found utility in a multitude of process optimization challenges. In contrast to most APO examples in microflow reactor systems, we recently presented a system capable of optimization in high-throughput batch reactor systems. The drawback of APO in a high-throughput batch reactor system is the reliance on reaction sampling at a predetermined static timepoint rather than a dynamic endpoint. Static timepoint sampling can lead to the inconsistent capture of the process performance under each process parameter permutation. This is important because critical process behaviors such as rate acceleration accompanied by decomposition could be missed entirely. To address this drawback, we implemented a dynamic reaction endpoint determination strategy to capture the product purity once the process stream stabilized. We accomplished this through the incorporation of a real-time plateau detection algorithm into the APO workflow to measure and report the product purity at the dynamically determined reaction endpoint. We then applied this strategy to the autonomous optimization of a photobromination reaction towards the synthesis of a pharmaceutically relevant intermediate. In doing so, we not only uncovered process conditions to access the desired monohalogenation product in 85 UPLC area % purity with minimal decomposition risk, but also measured the effect of each parameter on the process performance. Our results highlight the advantage of incorporating dynamic sampling in APO workflows to drive optimization toward a stable and high-performing process.
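The abstract above does not specify the plateau detection algorithm, so the following is only a minimal sketch of one way a dynamic endpoint could be found: declare a plateau once the last few purity readings span less than a tolerance. The function name `detect_plateau`, the `window` and `tolerance` parameters, and the purity trace are hypothetical and synthetic.

```python
from typing import Optional, Sequence


def detect_plateau(values: Sequence[float], window: int = 3,
                   tolerance: float = 0.5) -> Optional[int]:
    """Return the index of the first sample at which the trailing
    `window` samples differ by no more than `tolerance`, or None if
    the signal never stabilizes."""
    for end in range(window, len(values) + 1):
        recent = values[end - window:end]
        if max(recent) - min(recent) <= tolerance:
            return end - 1  # index of the latest sample in the window
    return None


# Synthetic purity trace (area %), for illustration only.
purity = [5.0, 30.0, 55.0, 70.0, 74.0, 74.6, 74.8, 74.9]
endpoint = detect_plateau(purity, window=3, tolerance=0.5)
```

In a real-time setting the same check would run each time a new sample arrives, and the purity at `endpoint` would be the value reported to the optimizer instead of a reading at a fixed timepoint.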
Affiliation(s)
- Melodie Christensen
- Department of Chemistry, University of British Columbia Vancouver British Columbia V6T 1Z1 Canada
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Yuting Xu
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Eugene E Kwan
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Michael J Di Maso
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Yining Ji
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Mikhail Reibarkh
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Alexandra C Sun
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Andy Liaw
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Patrick S Fier
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Shane Grosser
- Department of Process Research and Development, Merck & Co., Inc Rahway NJ 07065 USA
| | - Jason E Hein
- Department of Chemistry, University of British Columbia Vancouver British Columbia V6T 1Z1 Canada
- Acceleration Consortium, University of Toronto Toronto ON Canada
- Department of Chemistry, University of Bergen Bergen Norway
| |
|
15
|
Wigh D, Arrowsmith J, Pomberger A, Felton KC, Lapkin AA. ORDerly: Data Sets and Benchmarks for Chemical Reaction Data. J Chem Inf Model 2024; 64:3790-3798. [PMID: 38648077 PMCID: PMC11094788 DOI: 10.1021/acs.jcim.4c00292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 04/03/2024] [Accepted: 04/04/2024] [Indexed: 04/25/2024]
Abstract
Machine learning has the potential to provide tremendous value to life sciences by providing models that aid in the discovery of new molecules and reduce the time for new products to come to market. Chemical reactions play a significant role in these fields, but there is a lack of high-quality open-source chemical reaction data sets for training machine learning models. Herein, we present ORDerly, an open-source Python package for the customizable and reproducible preparation of reaction data stored in accordance with the increasingly popular Open Reaction Database (ORD) schema. We use ORDerly to clean United States patent data stored in ORD and generate data sets for forward prediction, retrosynthesis, as well as the first benchmark for reaction condition prediction. We train neural networks on data sets generated with ORDerly for condition prediction and show that data sets missing key cleaning steps can lead to silently overinflated performance metrics. Additionally, we train transformers for forward and retrosynthesis prediction and demonstrate how non-patent data can be used to evaluate model generalization. By providing a customizable open-source solution for cleaning and preparing large chemical reaction data, ORDerly is poised to push forward the boundaries of machine learning applications in chemistry.
Affiliation(s)
- Daniel S. Wigh
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge CB3 0AS, U.K.
| | - Joe Arrowsmith
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge CB3 0AS, U.K.
| | - Alexander Pomberger
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge CB3 0AS, U.K.
| | - Kobi C. Felton
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge CB3 0AS, U.K.
| | - Alexei A. Lapkin
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge CB3 0AS, U.K.
| |
|
16
|
Kench T, Rahardjo A, Terrones GG, Bellamkonda A, Maher TE, Storch M, Kulik HJ, Vilar R. A Semi-Automated, High-Throughput Approach for the Synthesis and Identification of Highly Photo-Cytotoxic Iridium Complexes. Angew Chem Int Ed Engl 2024; 63:e202401808. [PMID: 38404222 DOI: 10.1002/anie.202401808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 02/27/2024]
Abstract
The discovery of new compounds with pharmacological properties is usually a lengthy, laborious and expensive process. Thus, there is increasing interest in developing workflows that allow for the rapid synthesis and evaluation of libraries of compounds with the aim of identifying leads for further drug development. Herein, we apply combinatorial synthesis to build a library of 90 iridium(III) complexes (81 of which are new) over two synthesise-and-test cycles, with the aim of identifying potential agents for photodynamic therapy. We demonstrate the power of this approach by identifying highly active complexes that are well-tolerated in the dark but display very low nM phototoxicity against cancer cells. To build a detailed structure-activity relationship for this class of compounds we have used density functional theory (DFT) calculations to determine some key electronic parameters and study correlations with the experimental data. Finally, we present an optimised semi-automated synthesise-and-test protocol to obtain multiplex data within 72 hours.
Affiliation(s)
- Timothy Kench
- Department of Chemistry, Imperial College London, White City Campus, W12 0BZ, London, UK
| | - Arielle Rahardjo
- Department of Chemistry, Imperial College London, White City Campus, W12 0BZ, London, UK
| | - Gianmarco G Terrones
- Department of Chemical Engineering, Massachusetts Institute of Technology, 02139, Cambridge, MA, USA
| | | | - Thomas E Maher
- Department of Chemistry, Imperial College London, White City Campus, W12 0BZ, London, UK
- Institute of Chemical Biology, Imperial College London, White City Campus, W12 0BZ, London, UK
| | - Marko Storch
- Department of Infectious Disease, Imperial College London, South Kensington Campus, SW7 2AZ, London, UK
- London Biofoundry, Imperial College Translation and Innovation Hub, W12 0BZ, London, UK
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, 02139, Cambridge, MA, USA
- Department of Chemistry, Massachusetts Institute of Technology, 02139, Cambridge, MA, USA
| | - Ramon Vilar
- Department of Chemistry, Imperial College London, White City Campus, W12 0BZ, London, UK
- Institute of Chemical Biology, Imperial College London, White City Campus, W12 0BZ, London, UK
| |
|
17
|
Yu L, Zhang W, Nie Z, Duan J, Chen S. Machine learning guided tuning charge distribution by composition in MOFs for oxygen evolution reaction. RSC Adv 2024; 14:9032-9037. [PMID: 38500624 PMCID: PMC10945371 DOI: 10.1039/d3ra08873a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 02/25/2024] [Indexed: 03/20/2024] Open
Abstract
Traditional design/optimization of metal-organic frameworks (MOFs) is time-consuming and labor-intensive. In this study, we utilize machine learning (ML) to accelerate the synthesis of MOFs. We have built a library of over 900 MOFs with different metal salts, solvent ratios, reaction durations and temperatures, and utilize zeta potentials as target variables for ML training. Four ML models were trained on the collected dataset and their convergence performance was assessed; the Random Forest Regression (RFR) and Gradient Boosting Regression (GBR) models show strong correlation and accurate predictions. We then predicted two kinds of MOFs from the RFR and GBR models. Remarkably, the experimental data of the synthesized MOFs closely matched the predicted results, and these MOFs exhibited excellent electrocatalytic performance for oxygen evolution. This study has general implications for the use of machine learning in accelerating the synthesis of MOFs for diverse applications.
Affiliation(s)
- Licheng Yu
- Key Laboratory for Soft Chemistry and Functional Materials (Ministry of Education), School of Chemistry and Chemical Engineering, School of Energy and Power Engineering, Nanjing University of Science and Technology Nanjing 210094 China
| | - Wenwen Zhang
- Key Laboratory for Soft Chemistry and Functional Materials (Ministry of Education), School of Chemistry and Chemical Engineering, School of Energy and Power Engineering, Nanjing University of Science and Technology Nanjing 210094 China
| | - Zhihao Nie
- Key Laboratory for Soft Chemistry and Functional Materials (Ministry of Education), School of Chemistry and Chemical Engineering, School of Energy and Power Engineering, Nanjing University of Science and Technology Nanjing 210094 China
| | - Jingjing Duan
- Key Laboratory for Soft Chemistry and Functional Materials (Ministry of Education), School of Chemistry and Chemical Engineering, School of Energy and Power Engineering, Nanjing University of Science and Technology Nanjing 210094 China
| | - Sheng Chen
- Key Laboratory for Soft Chemistry and Functional Materials (Ministry of Education), School of Chemistry and Chemical Engineering, School of Energy and Power Engineering, Nanjing University of Science and Technology Nanjing 210094 China
| |
|
18
|
Bai J, Mosbach S, Taylor CJ, Karan D, Lee KF, Rihm SD, Akroyd J, Lapkin AA, Kraft M. A dynamic knowledge graph approach to distributed self-driving laboratories. Nat Commun 2024; 15:462. [PMID: 38263405 PMCID: PMC10805810 DOI: 10.1038/s41467-023-44599-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 12/21/2023] [Indexed: 01/25/2024] Open
Abstract
The ability to integrate resources and share knowledge across organisations empowers scientists to expedite the scientific discovery process. This is especially crucial in addressing emerging global challenges that require global solutions. In this work, we develop an architecture for distributed self-driving laboratories within The World Avatar project, which seeks to create an all-encompassing digital twin based on a dynamic knowledge graph. We employ ontologies to capture data and material flows in design-make-test-analyse cycles, utilising autonomous agents as executable knowledge components to carry out the experimentation workflow. Data provenance is recorded to ensure its findability, accessibility, interoperability, and reusability. We demonstrate the practical application of our framework by linking two robots in Cambridge and Singapore for a collaborative closed-loop optimisation for a pharmaceutically-relevant aldol condensation reaction in real-time. The knowledge graph autonomously evolves toward the scientist's research goals, with the two robots effectively generating a Pareto front for cost-yield optimisation in three days.
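To make the notion of a cost-yield Pareto front concrete, here is a minimal sketch that filters a set of experiments down to the non-dominated ones, assuming cost is to be minimized and yield maximized. The function name `pareto_front` and the data points are hypothetical; this is not the optimization algorithm used in the work above.

```python
def pareto_front(points):
    """Return the non-dominated (cost, yield) pairs, sorted by cost.

    A point is dominated if another point is at least as cheap and at
    least as high-yielding, and strictly better in one of the two."""
    front = []
    for cost, yld in points:
        dominated = any(
            (c <= cost and y >= yld) and (c < cost or y > yld)
            for c, y in points
        )
        if not dominated:
            front.append((cost, yld))
    return sorted(set(front))


# Synthetic (cost, yield %) pairs, for illustration only.
experiments = [(1.0, 40.0), (2.0, 60.0), (2.5, 55.0), (3.0, 80.0), (3.0, 70.0)]
front = pareto_front(experiments)
```

Each point on the front represents a trade-off that cannot be improved in cost without losing yield, or in yield without raising cost.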
Affiliation(s)
- Jiaru Bai
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, UK
| | - Sebastian Mosbach
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, UK
- Cambridge Centre for Advanced Research and Education in Singapore (CARES), 1 Create Way, CREATE Tower, #05-05, Singapore, 138602, Singapore
| | - Connor J Taylor
- Astex Pharmaceuticals, 436 Cambridge Science Park Milton Road, Cambridge, CB4 0QA, UK
- Innovation Centre in Digital Molecular Technologies, Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
- Faculty of Engineering, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
| | - Dogancan Karan
- Cambridge Centre for Advanced Research and Education in Singapore (CARES), 1 Create Way, CREATE Tower, #05-05, Singapore, 138602, Singapore
| | - Kok Foong Lee
- CMCL Innovations, Sheraton House, Cambridge, CB3 0AX, UK
| | - Simon D Rihm
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, UK
- Cambridge Centre for Advanced Research and Education in Singapore (CARES), 1 Create Way, CREATE Tower, #05-05, Singapore, 138602, Singapore
| | - Jethro Akroyd
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, UK
- Cambridge Centre for Advanced Research and Education in Singapore (CARES), 1 Create Way, CREATE Tower, #05-05, Singapore, 138602, Singapore
| | - Alexei A Lapkin
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, UK
- Cambridge Centre for Advanced Research and Education in Singapore (CARES), 1 Create Way, CREATE Tower, #05-05, Singapore, 138602, Singapore
- Innovation Centre in Digital Molecular Technologies, Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Markus Kraft
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, UK.
- Cambridge Centre for Advanced Research and Education in Singapore (CARES), 1 Create Way, CREATE Tower, #05-05, Singapore, 138602, Singapore.
- School of Chemical and Biomedical Engineering, Nanyang Technological University, 62 Nanyang Drive, 637459, Singapore, Singapore.
- The Alan Turing Institute, London, NW1 2DB, UK.
| |
|
19
|
Sadeghi S, Bateni F, Kim T, Son DY, Bennett JA, Orouji N, Punati VS, Stark C, Cerra TD, Awad R, Delgado-Licona F, Xu J, Mukhin N, Dickerson H, Reyes KG, Abolhasani M. Autonomous nanomanufacturing of lead-free metal halide perovskite nanocrystals using a self-driving fluidic lab. Nanoscale 2024; 16:580-591. [PMID: 38116636 DOI: 10.1039/d3nr05034c] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 12/21/2023]
Abstract
Lead-based metal halide perovskite (MHP) nanocrystals (NCs) have emerged as a promising class of semiconducting nanomaterials for a wide range of optoelectronic and photoelectronic applications. However, the intrinsic lead toxicity of MHP NCs has significantly hampered their large-scale device applications. Copper-based MHP NCs with composition-tunable optical properties have emerged as a prominent lead-free MHP NC candidate. However, comprehensive synthesis space exploration, development, and synthesis science studies of copper-based MHP NCs have been limited by the manual nature of flask-based synthesis and characterization methods. In this study, we present an autonomous approach for the development of lead-free MHP NCs via seamless integration of a modular microfluidic platform with machine learning-assisted NC synthesis modeling and experiment selection to establish a self-driving fluidic lab for accelerated NC synthesis science studies. For the first time, a successful and reproducible in-flow synthesis of Cs3Cu2I5 NCs is presented. Autonomous experimentation is then employed for rapid in-flow synthesis science studies of Cs3Cu2I5 NCs. The autonomously generated experimental NC synthesis dataset is then utilized for fast-tracked synthetic route optimization of high-performing Cs3Cu2I5 NCs.
Affiliation(s)
- Sina Sadeghi
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Fazel Bateni
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Taekhoon Kim
- Synthesis Technical Unit, Material Research Center, Samsung Advanced Institute of Technology, SEC, 130, Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, Republic of Korea
| | - Dae Yong Son
- Synthesis Technical Unit, Material Research Center, Samsung Advanced Institute of Technology, SEC, 130, Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, Republic of Korea
| | - Jeffrey A Bennett
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Negin Orouji
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Venkat S Punati
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Christine Stark
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Teagan D Cerra
- Department of Physics, Weber State University, Ogden, UT 84408, USA
| | - Rami Awad
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Fernando Delgado-Licona
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Jinge Xu
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Nikolai Mukhin
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Hannah Dickerson
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| | - Kristofer G Reyes
- Department of Materials Design and Innovation, University at Buffalo, Buffalo, NY 14260, USA
| | - Milad Abolhasani
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
| |
|
20
|
Zhao Q, Anstine DM, Isayev O, Savoie BM. Δ² machine learning for reaction property prediction. Chem Sci 2023; 14:13392-13401. [PMID: 38033903 PMCID: PMC10686042 DOI: 10.1039/d3sc02408c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 07/11/2023] [Indexed: 12/02/2023] Open
Abstract
The emergence of Δ-learning models, whereby machine learning (ML) is used to predict a correction to a low-level energy calculation, provides a versatile route to accelerate high-level energy evaluations at a given geometry. However, Δ-learning models are inapplicable to reaction properties like heats of reaction and activation energies that require both a high-level geometry and energy evaluation. Here, a Δ²-learning model is introduced that can predict high-level activation energies based on low-level critical-point geometries. The Δ² model uses an atom-wise featurization typical of contemporary ML interatomic potentials (MLIPs) and is trained on a dataset of ∼167 000 reactions, using the GFN2-xTB energy and critical-point geometry as a low-level input and the B3LYP-D3/TZVP energy calculated at the B3LYP-D3/TZVP critical point as a high-level target. The excellent performance of the Δ² model on unseen reactions demonstrates the surprising ease with which the model implicitly learns the geometric deviations between the low-level and high-level geometries that condition the activation energy prediction. The transferability of the Δ² model is validated on several external testing sets where it shows near chemical accuracy, illustrating the benefits of combining ML models with readily available physical-based information from semi-empirical quantum chemistry calculations. Fine-tuning of the Δ² model on a small number of Gaussian-4 calculations produced a 35% accuracy improvement over DFT activation energy predictions while retaining xTB-level cost. The Δ² model approach proves to be an efficient strategy for accelerating chemical reaction characterization with minimal sacrifice in prediction accuracy.
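The basic Δ-learning idea described above, predicting a correction that is added to a cheap low-level energy, can be shown with a deliberately simple toy. The sketch below fits the correction as a straight line in a single made-up descriptor `x` by ordinary least squares; the function names, the descriptor, and the numbers are all hypothetical. The model in the abstract uses an atom-wise featurization and a far larger dataset, which this toy does not attempt to reproduce.

```python
def fit_delta_model(x, e_low, e_high):
    """Fit delta = a * x + b by least squares, where
    delta = e_high - e_low is the correction to the low-level energy."""
    deltas = [h - l for h, l in zip(e_high, e_low)]
    n = len(x)
    mean_x = sum(x) / n
    mean_d = sum(deltas) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, deltas))
    a = sxy / sxx
    b = mean_d - a * mean_x
    return a, b


def predict_high(x_new, e_low_new, a, b):
    """High-level estimate = low-level energy + learned correction."""
    return e_low_new + a * x_new + b


# Synthetic training data, for illustration only.
a, b = fit_delta_model([0.0, 1.0, 2.0], [10.0, 12.0, 14.0], [11.0, 15.0, 19.0])
estimate = predict_high(3.0, 16.0, a, b)
```

The design point is that the model only has to learn the (usually smoother) difference between the two levels of theory, not the full energy.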
Affiliation(s)
- Qiyuan Zhao
- Davidson School of Chemical Engineering, Purdue University West Lafayette IN 47906 USA
| | - Dylan M Anstine
- Department of Chemistry, Carnegie Mellon University Pittsburgh PA 15213 USA
| | - Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University Pittsburgh PA 15213 USA
| | - Brett M Savoie
- Davidson School of Chemical Engineering, Purdue University West Lafayette IN 47906 USA
| |
|
21
|
Millan R, Bello-Jurado E, Moliner M, Boronat M, Gomez-Bombarelli R. Effect of Framework Composition and NH3 on the Diffusion of Cu+ in Cu-CHA Catalysts Predicted by Machine-Learning Accelerated Molecular Dynamics. ACS Central Science 2023; 9:2044-2056. [PMID: 38033797 PMCID: PMC10683499 DOI: 10.1021/acscentsci.3c00870] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/13/2023] [Indexed: 12/02/2023]
Abstract
Cu-exchanged zeolites rely on mobile solvated Cu+ cations for their catalytic activity, but the role of the framework composition in transport is not fully understood. Ab initio molecular dynamics simulations can provide quantitative atomistic insight but are too computationally expensive to explore large length and time scales or diverse compositions. We report a machine-learning interatomic potential that accurately reproduces ab initio results and effectively generalizes to allow multinanosecond simulations of large supercells and diverse chemical compositions. Biased and unbiased simulations of [Cu(NH3)2]+ mobility show that aluminum pairing in eight-membered rings accelerates local hopping and demonstrate that increased NH3 concentration enhances long-range diffusion. The probability of finding two [Cu(NH3)2]+ complexes in the same cage, which is key for SCR-NOx reaction, increases with Cu content and Al content but does not correlate with the long-range mobility of Cu+. Supporting experimental evidence was obtained from reactivity tests of Cu-CHA catalysts with a controlled chemical composition.
Affiliation(s)
- Reisel Millan
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Instituto de Tecnología Química, Universitat Politècnica de València-Consejo Superior de Investigaciones Científicas, Avenida de los Naranjos s/n, 46022 Valencia, Spain
| | - Estefanía Bello-Jurado
- Instituto de Tecnología Química, Universitat Politècnica de València-Consejo Superior de Investigaciones Científicas, Avenida de los Naranjos s/n, 46022 Valencia, Spain
| | - Manuel Moliner
- Instituto de Tecnología Química, Universitat Politècnica de València-Consejo Superior de Investigaciones Científicas, Avenida de los Naranjos s/n, 46022 Valencia, Spain
| | - Mercedes Boronat
- Instituto de Tecnología Química, Universitat Politècnica de València-Consejo Superior de Investigaciones Científicas, Avenida de los Naranjos s/n, 46022 Valencia, Spain
| | - Rafael Gomez-Bombarelli
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
22
|
Ha T, Lee D, Kwon Y, Park MS, Lee S, Jang J, Choi B, Jeon H, Kim J, Choi H, Seo HT, Choi W, Hong W, Park YJ, Jang J, Cho J, Kim B, Kwon H, Kim G, Oh WS, Kim JW, Choi J, Min M, Jeon A, Jung Y, Kim E, Lee H, Choi YS. AI-driven robotic chemist for autonomous synthesis of organic molecules. Science Advances 2023; 9:eadj0461. [PMID: 37910607 PMCID: PMC10619927 DOI: 10.1126/sciadv.adj0461] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/03/2023] [Accepted: 09/27/2023] [Indexed: 11/03/2023]
Abstract
The automation of organic compound synthesis is pivotal for expediting the development of such compounds. In addition, development efficiency can be enhanced by incorporating autonomous functions alongside automation. To achieve this, we developed an autonomous synthesis robot that harnesses the power of artificial intelligence (AI) and robotic technology to establish optimal synthetic recipes. Given a target molecule, our AI initially plans synthetic pathways and defines reaction conditions. It then iteratively refines these plans using feedback from the experimental robot, gradually optimizing the recipe. The system performance was validated by successfully determining synthetic recipes for three organic compounds, yielding conversion rates that outperform existing references. Notably, this autonomous system is designed around batch reactors, making it accessible and valuable to chemists in standard laboratory settings, thereby streamlining research endeavors.
Affiliation(s)
- Taesin Ha
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Dongseon Lee
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Youngchun Kwon
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Min Sik Park
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Sangyoon Lee
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Jaejun Jang
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Byungkwon Choi
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Hyunjeong Jeon
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Jeonghun Kim
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Hyundo Choi
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Hyung-Tae Seo
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Department of Mechanical Engineering, Kyonggi University, 154-42, Gwanggyosan-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16227, Republic of Korea
| | - Wonje Choi
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Wooram Hong
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
| | - Young Jin Park
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- School of Mechanical Engineering, Gyeongsang National University, 501, Jinju-daero, Jinju-si, Gyeongsangnam-do, Republic of Korea
- Junwon Jang
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Joonkee Cho
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Bosung Kim
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Hyukju Kwon
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Gahee Kim
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Won Seok Oh
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Jin Woo Kim
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Joonhyuk Choi
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Minsik Min
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Aram Jeon
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Yongsik Jung
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- Eunji Kim
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- School of Business Administration, Chung-Ang University, 135, Seodal-ro, Dongjak-gu, Seoul 06973, Republic of Korea
- Hyosug Lee
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
- College of Information and Communication Engineering, Sungkyunkwan University (SKKU), 2066, Seobu-ro, Jangan-gu, Suwon-si, Gyeonggi-do 16419, Republic of Korea
- Youn-Suk Choi
- Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., 130 Samsung-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do 16678, Republic of Korea
23
Bao Z, Bufton J, Hickman RJ, Aspuru-Guzik A, Bannigan P, Allen C. Revolutionizing drug formulation development: The increasing impact of machine learning. Adv Drug Deliv Rev 2023; 202:115108. [PMID: 37774977 DOI: 10.1016/j.addr.2023.115108] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 05/04/2023] [Revised: 09/24/2023] [Accepted: 09/25/2023] [Indexed: 10/01/2023]
Abstract
Over the past few years, the adoption of machine learning (ML) techniques has rapidly expanded across many fields of research including formulation science. At the same time, the use of lipid nanoparticles to enable the successful delivery of mRNA vaccines in the recent COVID-19 pandemic demonstrated the impact of formulation science. Yet, the design of advanced pharmaceutical formulations is non-trivial and primarily relies on costly and time-consuming wet-lab experimentation. In 2021, our group published a review article focused on the use of ML as a means to accelerate drug formulation development. Since then, the field has witnessed significant growth and progress, reflected by an increasing number of studies published in this area. This updated review summarizes the current state of ML directed drug formulation development, introduces advanced ML techniques that have been implemented in formulation design and shares the progress on making self-driving laboratories a reality. Furthermore, this review highlights several future applications of ML yet to be fully exploited to advance drug formulation research and development.
Affiliation(s)
- Zeqing Bao
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
- Jack Bufton
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
- Riley J Hickman
- Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada; Vector Institute for Artificial Intelligence, Toronto, ON M5S 1M1, Canada
- Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada; Vector Institute for Artificial Intelligence, Toronto, ON M5S 1M1, Canada; Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Toronto, ON M5S 1M1, Canada; Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada; Department of Materials Science & Engineering, University of Toronto, Toronto, ON M5S 3E4, Canada; CIFAR Artificial Intelligence Research Chair, Vector Institute, Toronto, ON M5S 1M1, Canada; Acceleration Consortium, Toronto, ON M5S 3H6, Canada
- Pauric Bannigan
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada.
- Christine Allen
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada; Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada; Acceleration Consortium, Toronto, ON M5S 3H6, Canada.
24
Ida T, Kojima H, Hori Y. Predicting and analyzing organic reaction pathways by combining machine learning and reaction network approaches. Chem Commun (Camb) 2023; 59:12439-12442. [PMID: 37773321 DOI: 10.1039/d3cc03890d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2023]
Abstract
A learning model is proposed that predicts both products and reaction pathways by combining machine learning and reaction network approaches. Trained on 50 fundamental organic reactions, the model predicted the products and pathways of 35 test reactions with a top-5 accuracy of 68.6%. The model identified the key fragment structures of the intermediates, and these could be classified into several basic reaction rules of organic chemistry, such as the Markovnikov rule.
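As a reading aid for the top-5 accuracy figure, the following is a minimal sketch of how a top-k accuracy is typically computed. The function name `top_k_accuracy` and the toy data are hypothetical and are not taken from the paper. The closing check only illustrates that 68.6% of 35 test reactions is arithmetically consistent with 24 correct cases; the abstract does not state that count, so it is an inference.

```python
def top_k_accuracy(ranked_predictions, targets, k=5):
    """Fraction of cases whose true answer is among the first k ranked candidates."""
    if len(ranked_predictions) != len(targets):
        raise ValueError("ranked_predictions and targets must have equal length")
    if not targets:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for ranked, target in zip(ranked_predictions, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical toy data: three cases, each with a ranked candidate list.
ranked = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"]]
truth = ["B", "X", "G"]
toy_accuracy = top_k_accuracy(ranked, truth, k=2)

# Consistency check on the reported figure (an inference, not a stated result):
# the nearest whole number of hits out of 35 that rounds to 68.6%.
implied_hits = round(0.686 * 35)
implied_accuracy = implied_hits / 35
print(implied_hits, f"{implied_accuracy:.1%}")
```

A ranked list counts as a hit whenever the true answer appears anywhere in its first k entries, so top-5 accuracy is always at least as high as top-1 accuracy on the same predictions.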
Affiliation(s)
- Tomonori Ida
- Division of Material Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kanazawa 920-1192, Japan.
- Honoka Kojima
- Division of Material Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kanazawa 920-1192, Japan.
- Yuta Hori
- Center for Computational Sciences, University of Tsukuba, Tsukuba 305-8577, Japan
25
Ateia M, Sigmund G, Bentel MJ, Washington JW, Lai A, Merrill NH, Wang Z. Integrated data-driven cross-disciplinary framework to prevent chemical water pollution. One Earth 2023; 6:10.1016/j.oneear.2023.07.001. [PMID: 38264630 PMCID: PMC10802893 DOI: 10.1016/j.oneear.2023.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2024]
Abstract
Access to a clean and healthy environment is a human right and a prerequisite for maintaining a sustainable ecosystem. Experts across domains along the chemical life cycle have traditionally operated in isolation, leading to limited connectivity between upstream chemical innovation and downstream development of water-treatment technologies. This fragmented and historically reactive approach to managing emerging contaminants has resulted in significant externalized societal costs. Herein, we propose an integrated data-driven framework to foster proactive action across domains to effectively address chemical water pollution. Implementing this integrated framework will not only enhance the capabilities of experts in their respective fields but also create opportunities for novel approaches that yield co-benefits across multiple domains. To successfully operationalize the integrated framework, several concerted efforts are warranted, including adopting open and FAIR (findable, accessible, interoperable, and reusable) data practices, developing common knowledge bases/platforms, and staying vigilant against new substance "properties" of concern.
Affiliation(s)
- Mohamed Ateia
- United States Environmental Protection Agency, Center for Environmental Solutions & Emergency Response, Cincinnati, OH 45220, USA
- Department of Chemical and Biomolecular Engineering, Rice University, Houston, TX, USA
- Gabriel Sigmund
- Environmental Geosciences, Centre for Microbiology and Environmental Systems Science, University of Vienna, Josef-Holaubeck-Platz 2, 1090 Vienna, Austria
- Environmental Technology, Wageningen University & Research, P.O. Box 17, 6700 AA Wageningen, the Netherlands
- Michael J. Bentel
- Department of Environmental Engineering and Earth Sciences, Clemson University, Clemson, SC 29634, USA
- John W. Washington
- United States Environmental Protection Agency, Center for Environmental Measurement and Modeling, Athens, GA 30605, USA
- Adelene Lai
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University, 07743 Jena, Germany
- Nathaniel H. Merrill
- United States Environmental Protection Agency, Center for Environmental Measurement and Modeling, Narragansett, RI, USA
- Zhanyun Wang
- Empa – Swiss Federal Laboratories for Materials Science and Technology, Technology and Society Laboratory, 9014 St. Gallen, Switzerland
26
Zhang XE, Liu C, Dai J, Yuan Y, Gao C, Feng Y, Wu B, Wei P, You C, Wang X, Si T. Enabling technology and core theory of synthetic biology. Sci China Life Sci 2023; 66:1742-1785. [PMID: 36753021 PMCID: PMC9907219 DOI: 10.1007/s11427-022-2214-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/08/2022] [Accepted: 10/04/2022] [Indexed: 02/09/2023]
Abstract
Synthetic biology provides a new paradigm for life science research ("build to learn") and opens the future journey of biotechnology ("build to use"). Here, we discuss advances of various principles and technologies in the mainstream of the enabling technology of synthetic biology, including synthesis and assembly of a genome, DNA storage, gene editing, molecular evolution and de novo design of function proteins, cell and gene circuit engineering, cell-free synthetic biology, artificial intelligence (AI)-aided synthetic biology, as well as biofoundries. We also introduce the concept of quantitative synthetic biology, which is guiding synthetic biology towards increased accuracy and predictability or the real rational design. We conclude that synthetic biology will establish its disciplinary system with the iterative development of enabling technologies and the maturity of the core theory.
Affiliation(s)
- Xian-En Zhang
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- Chenli Liu
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Junbiao Dai
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Yingjin Yuan
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China.
- Caixia Gao
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.
- Yan Feng
- State Key Laboratory of Microbial Metabolism, Shanghai Jiao Tong University, Shanghai, 200240, China.
- Bian Wu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China.
- Ping Wei
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Chun You
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China.
- Xiaowo Wang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, 100084, China.
- Tong Si
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
27
Xu R, Meisner J, Chang AM, Thompson KC, Martínez TJ. First principles reaction discovery: from the Schrodinger equation to experimental prediction for methane pyrolysis. Chem Sci 2023; 14:7447-7464. [PMID: 37449065 PMCID: PMC10337770 DOI: 10.1039/d3sc01202f] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/06/2023] [Accepted: 06/02/2023] [Indexed: 07/18/2023]
Abstract
Our recent success in exploiting graphical processing units (GPUs) to accelerate quantum chemistry computations led to the development of the ab initio nanoreactor, a computational framework for automatic reaction discovery and kinetic model construction. In this work, we apply the ab initio nanoreactor to methane pyrolysis, from automatic reaction discovery to path refinement and kinetic modeling. Elementary reactions occurring during methane pyrolysis are revealed using GPU-accelerated ab initio molecular dynamics simulations. Subsequently, these reaction paths are refined at a higher level of theory with optimized reactant, product, and transition state geometries. Reaction rate coefficients are calculated by transition state theory based on the optimized reaction paths. The discovered reactions lead to a kinetic model with 53 species and 134 reactions, which is validated against experimental data and simulations using literature kinetic models. We highlight the advantage of leveraging local brute force and Monte Carlo sensitivity analysis approaches for efficient identification of important reactions. Both sensitivity approaches can further improve the accuracy of the methane pyrolysis kinetic model. The results in this work demonstrate the power of the ab initio nanoreactor framework for computationally affordable systematic reaction discovery and accurate kinetic modeling.
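The abstract says that rate coefficients come from transition state theory but gives no formula. As an illustration only, the sketch below evaluates the conventional Eyring expression, k = κ (k_B T / h) exp(−ΔG‡ / RT), for a unimolecular step. The function name `eyring_rate`, the transmission coefficient default, and the example barrier and temperatures are assumptions made for this sketch; the paper may use a different TST variant or additional corrections.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K (exact SI value)
H = 6.62607015e-34    # Planck constant, J*s (exact SI value)
R = 8.314462618       # molar gas constant, J/(mol*K)


def eyring_rate(delta_g_act_kj_mol, temperature_k, kappa=1.0):
    """Conventional TST (Eyring) rate coefficient in 1/s for a unimolecular step.

    delta_g_act_kj_mol: Gibbs free energy of activation in kJ/mol.
    temperature_k: absolute temperature in K.
    kappa: transmission coefficient (1.0 means no recrossing or tunneling correction).
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    prefactor = kappa * K_B * temperature_k / H
    exponent = -delta_g_act_kj_mol * 1000.0 / (R * temperature_k)
    return prefactor * math.exp(exponent)


# Hypothetical barrier of 100 kJ/mol evaluated at two illustrative temperatures.
for temperature in (1000.0, 1500.0):
    print(temperature, eyring_rate(100.0, temperature))
```

The exponential dependence on the barrier is why refining reaction paths at a higher level of theory matters: at fixed temperature, a small error in ΔG‡ changes the rate coefficient multiplicatively.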
Affiliation(s)
- Rui Xu
- Department of Chemistry, The PULSE Institute, Stanford University Stanford CA 94305 USA
- SLAC National Accelerator Laboratory 2575 Sand Hill Road Menlo Park CA 94025 USA
- Jan Meisner
- Department of Chemistry, The PULSE Institute, Stanford University Stanford CA 94305 USA
- SLAC National Accelerator Laboratory 2575 Sand Hill Road Menlo Park CA 94025 USA
- Alexander M Chang
- Department of Chemistry, The PULSE Institute, Stanford University Stanford CA 94305 USA
- SLAC National Accelerator Laboratory 2575 Sand Hill Road Menlo Park CA 94025 USA
- Keiran C Thompson
- Department of Chemistry, The PULSE Institute, Stanford University Stanford CA 94305 USA
- SLAC National Accelerator Laboratory 2575 Sand Hill Road Menlo Park CA 94025 USA
- Todd J Martínez
- Department of Chemistry, The PULSE Institute, Stanford University Stanford CA 94305 USA
- SLAC National Accelerator Laboratory 2575 Sand Hill Road Menlo Park CA 94025 USA
28
Wang X, Huang Y, Xie X, Liu Y, Huo Z, Lin M, Xin H, Tong R. Bayesian-optimization-assisted discovery of stereoselective aluminum complexes for ring-opening polymerization of racemic lactide. Nat Commun 2023; 14:3647. [PMID: 37339991 DOI: 10.1038/s41467-023-39405-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/28/2022] [Accepted: 06/12/2023] [Indexed: 06/22/2023]
Abstract
Stereoselective ring-opening polymerization catalysts are used to produce degradable stereoregular poly(lactic acids) with thermal and mechanical properties that are superior to those of atactic polymers. However, the process of discovering highly stereoselective catalysts is still largely empirical. We aim to develop an integrated computational and experimental framework for efficient, predictive catalyst selection and optimization. As a proof of principle, we have developed a Bayesian optimization workflow on a subset of literature results for stereoselective lactide ring-opening polymerization, and using the algorithm, we identify multiple new Al complexes that catalyze either isoselective or heteroselective polymerization. In addition, feature attribution analysis uncovers mechanistically meaningful ligand descriptors, such as percent buried volume (%Vbur) and the highest occupied molecular orbital energy (EHOMO), that can access quantitative and predictive models for catalyst development.
Affiliation(s)
- Xiaoqian Wang
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, 635 Prices Fork Road, Blacksburg, VA, 24061, USA
- Yang Huang
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, 635 Prices Fork Road, Blacksburg, VA, 24061, USA
- Xiaoyu Xie
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, 635 Prices Fork Road, Blacksburg, VA, 24061, USA
- Yan Liu
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, 635 Prices Fork Road, Blacksburg, VA, 24061, USA
- Ziyu Huo
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, 635 Prices Fork Road, Blacksburg, VA, 24061, USA
- Maverick Lin
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, 635 Prices Fork Road, Blacksburg, VA, 24061, USA
- Hongliang Xin
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, 635 Prices Fork Road, Blacksburg, VA, 24061, USA.
- Rong Tong
- Department of Chemical Engineering, Virginia Polytechnic Institute and State University, 635 Prices Fork Road, Blacksburg, VA, 24061, USA.
29
Smith PT, Ye Z, Pietryga J, Huang J, Wahl CB, Hedlund Orbeck JK, Mirkin CA. Molecular Thin Films Enable the Synthesis and Screening of Nanoparticle Megalibraries Containing Millions of Catalysts. J Am Chem Soc 2023. [PMID: 37311072 DOI: 10.1021/jacs.3c03910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Megalibraries are centimeter-scale chips containing millions of materials synthesized in parallel using scanning probe lithography. As such, they stand to accelerate how materials are discovered for applications spanning catalysis, optics, and more. However, a long-standing challenge is the availability of substrates compatible with megalibrary synthesis, which limits the structural and functional design space that can be explored. To address this challenge, thermally removable polystyrene films were developed as universal substrate coatings that decouple lithography-enabled nanoparticle synthesis from the underlying substrate chemistry, thus providing consistent lithography parameters on diverse substrates. Multi-spray inking of the scanning probe arrays with polymer solutions containing metal salts allows patterning of >56 million nanoreactors designed to vary in composition and size. These are subsequently converted to inorganic nanoparticles via reductive thermal annealing, which also removes the polystyrene to deposit the megalibrary. Megalibraries with mono-, bi-, and trimetallic materials were synthesized, and nanoparticle size was controlled between 5 and 35 nm by modulating the lithography speed. Importantly, the polystyrene coating can be used on conventional substrates like Si/SiOx, as well as substrates typically more difficult to pattern on, such as glassy carbon, diamond, TiO2, BN, W, or SiC. Finally, high-throughput materials discovery is performed in the context of photocatalytic degradation of organic pollutants using Au-Pd-Cu nanoparticle megalibraries on TiO2 substrates with 2,250,000 unique composition/size combinations. The megalibrary was screened within 1 h by developing fluorescent thin-film coatings on top of the megalibrary as proxies for catalytic turnover, revealing Au0.53Pd0.38Cu0.09-TiO2 as the most active photocatalyst composition.
Affiliation(s)
- Peter T Smith
- Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States
- International Institute for Nanotechnology, Evanston, Illinois 60208, United States
- Zihao Ye
- Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States
- International Institute for Nanotechnology, Evanston, Illinois 60208, United States
- Jacob Pietryga
- International Institute for Nanotechnology, Evanston, Illinois 60208, United States
- Department of Materials Science and Engineering, Northwestern University, Evanston, Illinois 60208, United States
- Jin Huang
- Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States
- International Institute for Nanotechnology, Evanston, Illinois 60208, United States
- Carolin B Wahl
- International Institute for Nanotechnology, Evanston, Illinois 60208, United States
- Department of Materials Science and Engineering, Northwestern University, Evanston, Illinois 60208, United States
- Jenny K Hedlund Orbeck
- Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States
- International Institute for Nanotechnology, Evanston, Illinois 60208, United States
- Chad A Mirkin
- Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States
- International Institute for Nanotechnology, Evanston, Illinois 60208, United States
- Department of Materials Science and Engineering, Northwestern University, Evanston, Illinois 60208, United States
30
Janet JP, Mervin L, Engkvist O. Artificial intelligence in molecular de novo design: Integration with experiment. Curr Opin Struct Biol 2023; 80:102575. [PMID: 36966692 DOI: 10.1016/j.sbi.2023.102575] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/23/2022] [Revised: 02/09/2023] [Accepted: 02/18/2023] [Indexed: 06/04/2023]
Abstract
In this mini review, we capture the latest progress of applying artificial intelligence (AI) techniques based on deep learning architectures to molecular de novo design with a focus on integration with experimental validation. We will cover the progress and experimental validation of novel generative algorithms, the validation of QSAR models and how AI-based molecular de novo design is starting to become connected with chemistry automation. While progress has been made in the last few years, it is still early days. The experimental validations conducted thus far should be considered proof-of-principle, providing confidence that the field is moving in the right direction.
Affiliation(s)
- Jon Paul Janet
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Lewis Mervin
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK.
- Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
31
Capaldo L, Wen Z, Noël T. A field guide to flow chemistry for synthetic organic chemists. Chem Sci 2023; 14:4230-4247. [PMID: 37123197 PMCID: PMC10132167 DOI: 10.1039/d3sc00992k] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Received: 02/22/2023] [Accepted: 03/15/2023] [Indexed: 03/17/2023]
Abstract
Flow chemistry has unlocked a world of possibilities for the synthetic community, but the idea that it is a mysterious "black box" needs to go. In this review, we show that several of the benefits of microreactor technology can be exploited to push the boundaries in organic synthesis and to unleash unique reactivity and selectivity. By "lifting the veil" on some of the governing principles behind the observed trends, we hope that this review will serve as a useful field guide for those interested in diving into flow chemistry.
Affiliation(s)
- Luca Capaldo
- Flow Chemistry Group, Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam 1098 XH Amsterdam The Netherlands
- Zhenghui Wen
- Flow Chemistry Group, Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam 1098 XH Amsterdam The Netherlands
- Timothy Noël
- Flow Chemistry Group, Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam 1098 XH Amsterdam The Netherlands
32
Anstine D, Isayev O. Generative Models as an Emerging Paradigm in the Chemical Sciences. J Am Chem Soc 2023; 145:8736-8750. [PMID: 37052978 PMCID: PMC10141264 DOI: 10.1021/jacs.2c13467] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 12/17/2022] [Indexed: 04/14/2023]
Abstract
Traditional computational approaches to design chemical species are limited by the need to compute properties for a vast number of candidates, e.g., by discriminative modeling. Therefore, inverse design methods aim to start from the desired property and optimize a corresponding chemical structure. From a machine learning viewpoint, the inverse design problem can be addressed through so-called generative modeling. Mathematically, discriminative models are defined by learning the probability distribution function of properties given the molecular or material structure. In contrast, a generative model seeks to exploit the joint probability of a chemical species with target characteristics. The overarching idea of generative modeling is to implement a system that produces novel compounds that are expected to have a desired set of chemical features, effectively sidestepping issues found in the forward design process. In this contribution, we overview and critically analyze popular generative algorithms like generative adversarial networks, variational autoencoders, flow, and diffusion models. We highlight key differences between each of the models, provide insights into recent success stories, and discuss outstanding challenges for realizing generative modeling discovered solutions in chemical applications.
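The distinction drawn in this abstract can be written compactly. The notation below is an assumption introduced here for illustration (x for a molecular or material structure, y for its properties, y* for the target property values); the abstract itself gives no symbols.

```latex
% Discriminative (forward) model: properties given structure
p(y \mid x)

% Generative model: joint distribution over structure and properties
p(x, y) = p(x \mid y)\, p(y) = p(y \mid x)\, p(x)

% Inverse design: sample structures conditioned on the target properties y^{*}
x \sim p(x \mid y = y^{*}) = \frac{p(x, y^{*})}{p(y^{*})} \propto p(x, y^{*})
```

Read this way, forward design evaluates the first expression for each of many candidate structures, whereas inverse design draws candidates directly from the conditional distribution in the last line.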
Affiliation(s)
- Dylan M. Anstine
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Olexandr Isayev
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
33
Liu L, Corma A. Bimetallic Sites for Catalysis: From Binuclear Metal Sites to Bimetallic Nanoclusters and Nanoparticles. Chem Rev 2023; 123:4855-4933. [PMID: 36971499 PMCID: PMC10141355 DOI: 10.1021/acs.chemrev.2c00733] [Citation(s) in RCA: 81] [Impact Index Per Article: 40.5] [Received: 10/19/2022] [Indexed: 03/29/2023]
Abstract
Heterogeneous bimetallic catalysts have broad applications in industrial processes, but achieving a fundamental understanding on the nature of the active sites in bimetallic catalysts at the atomic and molecular level is very challenging due to the structural complexity of the bimetallic catalysts. Comparing the structural features and the catalytic performances of different bimetallic entities will favor the formation of a unified understanding of the structure-reactivity relationships in heterogeneous bimetallic catalysts and thereby facilitate the upgrading of the current bimetallic catalysts. In this review, we will discuss the geometric and electronic structures of three representative types of bimetallic catalysts (bimetallic binuclear sites, bimetallic nanoclusters, and nanoparticles) and then summarize the synthesis methodologies and characterization techniques for different bimetallic entities, with emphasis on the recent progress made in the past decade. The catalytic applications of supported bimetallic binuclear sites, bimetallic nanoclusters, and nanoparticles for a series of important reactions are discussed. Finally, we will discuss the future research directions of catalysis based on supported bimetallic catalysts and, more generally, the prospective developments of heterogeneous catalysis in both fundamental research and practical applications.
Affiliation(s)
- Lichen Liu
- Department of Chemistry, Tsinghua University, Beijing 100084, China
- Avelino Corma
- Instituto de Tecnología Química, Universitat Politècnica de València–Consejo Superior de Investigaciones Científicas (UPV-CSIC), Avenida de los Naranjos s/n, Valencia 46022, Spain
34
Ge L, Ke Y, Li X. Machine learning integrated photocatalysis: progress and challenges. Chem Commun (Camb) 2023; 59:5795-5806. [PMID: 37093605 DOI: 10.1039/d3cc00989k] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 04/25/2023]
Abstract
Discovering efficient photocatalysts has long been the goal of photocatalysis, which has traditionally been driven by serendipitous or trial-and-error strategies. Recent developments in photocatalysis integrated with machine learning techniques promise to accelerate the discovery of photocatalysts, but also face significant challenges. In this review, advances in machine learning integrated photocatalysis are first presented from the perspective of three main photocatalytic processes: light harvesting, charge generation and separation, and surface redox reactions. Next, we review progress in using machine learning to understand complex photoactivity-structure relationships and to identify the factors governing activity. A future photocatalysis paradigm is then provided with the integration of artificial intelligence, robots and automation. Lastly, we discuss the current challenges in machine learning integrated photocatalysis. This review aims to provide a systematic overview and guidelines to the broad scientific community interested in photocatalysis and artificial intelligence for solar fuel synthesis.
Affiliation(s)
- Luyao Ge
- Key Laboratory of the Ministry of Education for Advanced Catalysis Materials, Zhejiang Key Laboratory for Reactive Chemistry on Solid Surfaces, Zhejiang Normal University, Jinhua 321004, China.
| | - Yuanzhen Ke
- Key Laboratory of the Ministry of Education for Advanced Catalysis Materials, Zhejiang Key Laboratory for Reactive Chemistry on Solid Surfaces, Zhejiang Normal University, Jinhua 321004, China.
| | - Xiaobo Li
- Key Laboratory of the Ministry of Education for Advanced Catalysis Materials, Zhejiang Key Laboratory for Reactive Chemistry on Solid Surfaces, Zhejiang Normal University, Jinhua 321004, China.
| |
|
35
|
Hickman RJ, Bannigan P, Bao Z, Aspuru-Guzik A, Allen C. Self-driving laboratories: A paradigm shift in nanomedicine development. MATTER 2023; 6:1071-1081. [PMID: 37020832 PMCID: PMC9993483 DOI: 10.1016/j.matt.2023.02.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/19/2023]
Abstract
Nanomedicines have transformed promising therapeutic agents into clinically approved medicines with optimal safety and efficacy profiles. This is exemplified by the mRNA vaccines against COVID-19, which were made possible by lipid nanoparticle technology. Despite the success of nanomedicines to date, their design remains far from trivial, in part due to the complexity associated with their preclinical development. Herein, we propose a nanomedicine materials acceleration platform (NanoMAP) to streamline the preclinical development of these formulations. NanoMAP combines high-throughput experimentation with state-of-the-art advances in artificial intelligence (including active learning and few-shot learning) as well as a web-based application for data sharing. The deployment of NanoMAP requires interdisciplinary collaboration between leading figures in drug delivery and artificial intelligence to enable this data-driven design approach. The proposed approach will not only expedite the development of next-generation nanomedicines but also encourage participation of the pharmaceutical science community in a large data curation initiative.
Affiliation(s)
- Riley J Hickman
- Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada
- Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON M5S 1M1, Canada
| | - Pauric Bannigan
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
| | - Zeqing Bao
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
| | - Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada
- Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON M5S 1M1, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Toronto, ON M5S 1M1, Canada
- Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada
- Department of Materials Science & Engineering, University of Toronto, Toronto, ON M5S 3E4, Canada
- CIFAR Artificial Intelligence Research Chair, Vector Institute, Toronto, ON M5S 1M1, Canada
| | - Christine Allen
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
| |
|
36
|
Volk AA, Epps RW, Yonemoto DT, Masters BS, Castellano FN, Reyes KG, Abolhasani M. AlphaFlow: autonomous discovery and optimization of multi-step chemistry using a self-driven fluidic lab guided by reinforcement learning. Nat Commun 2023; 14:1403. [PMID: 36918561 PMCID: PMC10015005 DOI: 10.1038/s41467-023-37139-y] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 09/28/2022] [Accepted: 03/02/2023] [Indexed: 03/16/2023] Open
Abstract
Closed-loop, autonomous experimentation enables accelerated and material-efficient exploration of large reaction spaces without the need for user intervention. However, autonomous exploration of advanced materials with complex, multi-step processes and data sparse environments remains a challenge. In this work, we present AlphaFlow, a self-driven fluidic lab capable of autonomous discovery of complex multi-step chemistries. AlphaFlow uses reinforcement learning integrated with a modular microdroplet reactor capable of performing reaction steps with variable sequence, phase separation, washing, and continuous in-situ spectral monitoring. To demonstrate the power of reinforcement learning toward high dimensionality multi-step chemistries, we use AlphaFlow to discover and optimize synthetic routes for shell-growth of core-shell semiconductor nanoparticles, inspired by colloidal atomic layer deposition (cALD). Without prior knowledge of conventional cALD parameters, AlphaFlow successfully identified and optimized a novel multi-step reaction route, with up to 40 parameters, that outperformed conventional sequences. Through this work, we demonstrate the capabilities of closed-loop, reinforcement learning-guided systems in exploring and solving challenges in multi-step nanoparticle syntheses, while relying solely on in-house generated data from a miniaturized microfluidic platform. Further application of AlphaFlow in multi-step chemistries beyond cALD can lead to accelerated fundamental knowledge generation as well as synthetic route discoveries and optimization.
Affiliation(s)
- Amanda A Volk
- Department of Chemical and Biomolecular Engineering, North Carolina State University, 911 Partners Way, Raleigh, NC, 27695-7905, USA
| | - Robert W Epps
- Department of Chemical and Biomolecular Engineering, North Carolina State University, 911 Partners Way, Raleigh, NC, 27695-7905, USA
| | - Daniel T Yonemoto
- Department of Chemistry, North Carolina State University, Raleigh, NC, 27695-8204, USA
| | - Benjamin S Masters
- Department of Chemistry, North Carolina State University, Raleigh, NC, 27695-8204, USA
| | - Felix N Castellano
- Department of Chemistry, North Carolina State University, Raleigh, NC, 27695-8204, USA
| | - Kristofer G Reyes
- Department of Materials Design and Innovation, University at Buffalo, Buffalo, NY, 14260, USA
| | - Milad Abolhasani
- Department of Chemical and Biomolecular Engineering, North Carolina State University, 911 Partners Way, Raleigh, NC, 27695-7905, USA.
| |
|
37
|
Noto N, Yada A, Yanai T, Saito S. Machine-Learning Classification for the Prediction of Catalytic Activity of Organic Photosensitizers in the Nickel(II)-Salt-Induced Synthesis of Phenols. Angew Chem Int Ed Engl 2023; 62:e202219107. [PMID: 36645619 DOI: 10.1002/anie.202219107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2022] [Revised: 01/15/2023] [Accepted: 01/16/2023] [Indexed: 01/17/2023]
Abstract
Catalytic systems using a small amount of organic photosensitizer for the activation of an inorganic (on-demand ligand-free) nickel(II) salt represent a cost-effective method for cross-coupling reactions, while C(sp2)-O bond formation remains less developed. Herein, we report a strategy for the synthesis of phenols with a nickel(II) salt and an organic photosensitizer, which was identified via an investigation into the catalytic activity of 60 organic photosensitizers consisting of various electron donor and acceptor moieties. To examine the effect of multiple intractable parameters on the catalytic activity of photosensitizers, machine-learning (ML) models were developed, wherein we embedded descriptors representing their physical and structural properties, which were obtained from DFT calculations and RDKit, respectively. The study clarified that integrating both DFT- and RDKit-derived descriptors in ML models balances higher "precision" and "recall" across a wide range of search space relative to using only one of the two descriptor sets.
Affiliation(s)
- Naoki Noto
- Integrated Research Consortium on Chemical Sciences (IRCCS), Nagoya University, Nagoya, Aichi, 464-8602, Japan
| | - Akira Yada
- Interdisciplinary Research Center for Catalytic Chemistry, National Institute of Advanced Industrial Science and Technology (AIST), 1-1-1 Higashi, Tsukuba, Ibaraki, 305-8565, Japan
| | - Takeshi Yanai
- Institute of Transformative Bio-Molecules (WPI-ITbM) and Graduate School of Science, Nagoya University, Nagoya, Aichi, 464-8602, Japan
| | - Susumu Saito
- Integrated Research Consortium on Chemical Sciences (IRCCS) and Graduate School of Science, Nagoya University, Nagoya, Aichi, 464-8602, Japan
| |
|
38
|
Haas C, Lübbesmeyer M, Jin EH, McDonald MA, Koscher BA, Guimond N, Di Rocco L, Kayser H, Leweke S, Niedenführ S, Nicholls R, Greeves E, Barber DM, Hillenbrand J, Volpin G, Jensen KF. Open-Source Chromatographic Data Analysis for Reaction Optimization and Screening. ACS CENTRAL SCIENCE 2023; 9:307-317. [PMID: 36844498 PMCID: PMC9951288 DOI: 10.1021/acscentsci.2c01042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Indexed: 06/18/2023]
Abstract
Automation and digitalization solutions in the field of small molecule synthesis face new challenges for chemical reaction analysis, especially in the field of high-performance liquid chromatography (HPLC). Chromatographic data remains locked in vendors' hardware and software components, limiting their potential in automated workflows and data science applications. In this work, we present an open-source Python project called MOCCA for the analysis of HPLC-DAD (photodiode array detector) raw data. MOCCA provides a comprehensive set of data analysis features, including an automated peak deconvolution routine of known signals, even if overlapped with signals of unexpected impurities or side products. We highlight the broad applicability of MOCCA in four studies: (i) a simulation study to validate MOCCA's data analysis features; (ii) a reaction kinetics study on a Knoevenagel condensation reaction demonstrating MOCCA's peak deconvolution feature; (iii) a closed-loop optimization study for the alkylation of 2-pyridone without human control during data analysis; (iv) a well plate screening of categorical reaction parameters for a novel palladium-catalyzed cyanation of aryl halides employing O-protected cyanohydrins. By publishing MOCCA as a Python package with this work, we envision an open-source community project for chromatographic data analysis with the potential of further advancing its scope and capabilities.
Affiliation(s)
- Christian P. Haas
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Research and Development, Small Molecules Technologies, Bayer AG, Crop Science Division, Industriepark Höchst, 65926 Frankfurt am Main, Germany
| | - Maximilian Lübbesmeyer
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Research and Development, Small Molecules Technologies, Bayer AG, Crop Science Division, Industriepark Höchst, 65926 Frankfurt am Main, Germany
| | - Edward H. Jin
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Matthew A. McDonald
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Brent A. Koscher
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Nicolas Guimond
- Research and Development, Small Molecules Technologies, Bayer AG, Crop Science Division, Alfred-Nobel-Straße 50, 40789 Monheim am Rhein, Germany
| | - Laura Di Rocco
- Chemical & Pharmaceutical Development, Bayer AG, Pharmaceuticals Division, Müllerstraße 178, 13353 Berlin, Germany
| | - Henning Kayser
- Research and Development, Small Molecules Technologies, Bayer AG, Crop Science Division, Alfred-Nobel-Straße 50, 40789 Monheim am Rhein, Germany
| | - Samuel Leweke
- Applied Mathematics, Bayer AG, Enabling Functions Division, Kaiser-Wilhelm-Allee 1, 51368 Leverkusen, Germany
| | - Sebastian Niedenführ
- Research and Development, Computational Life Science, Bayer AG, Crop Science Division, Alfred-Nobel-Straße 50, 40789 Monheim am Rhein, Germany
| | - Rachel Nicholls
- Research and Development, Computational Life Science, Bayer AG, Crop Science Division, Alfred-Nobel-Straße 50, 40789 Monheim am Rhein, Germany
| | - Emily Greeves
- Research and Development, Small Molecules Technologies, Bayer AG, Crop Science Division, Industriepark Höchst, 65926 Frankfurt am Main, Germany
| | - David M. Barber
- Research and Development, Weed Control Chemistry, Bayer AG, Crop Science Division, Industriepark Höchst, 65926 Frankfurt am Main, Germany
| | - Julius Hillenbrand
- Chemical & Pharmaceutical Development, Bayer AG, Pharmaceuticals Division, Friedrich-Ebert-Straße 475, 42117 Wuppertal, Germany
| | - Giulio Volpin
- Research and Development, Small Molecules Technologies, Bayer AG, Crop Science Division, Industriepark Höchst, 65926 Frankfurt am Main, Germany
| | - Klavs F. Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| |
|
39
|
Zhang S, Xu L, Li S, Oliveira JCA, Li X, Ackermann L, Hong X. Bridging Chemical Knowledge and Machine Learning for Performance Prediction of Organic Synthesis. Chemistry 2023; 29:e202202834. [PMID: 36206170 PMCID: PMC10099903 DOI: 10.1002/chem.202202834] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/12/2022] [Indexed: 11/29/2022]
Abstract
Recent years have witnessed a boom of machine learning (ML) applications in chemistry, which reveals the potential of data-driven prediction of synthesis performance. Digitalization and ML modelling are the key strategies to fully exploit the unique potential within the synergistic interplay between experimental data and the robust prediction of performance and selectivity. A series of exciting studies have demonstrated the importance of chemical knowledge implementation in ML, which improves the model's capability for making predictions that are challenging and often go beyond the abilities of human beings. This Minireview summarizes the cutting-edge embedding techniques and model designs in synthetic performance prediction reported up to June 2022, elaborating how chemical knowledge can be incorporated into machine learning. By merging organic synthesis tactics and chemical informatics, we hope this Review can provide a guide map and encourage chemists to revisit the digitalization and computerization of organic chemistry principles.
Affiliation(s)
- Shuo‐Qing Zhang
- Center of Chemistry for Frontier Technologies, Department of Chemistry, State Key Laboratory of Clean Energy Utilization, Zhejiang University, 38 Zheda Road, Hangzhou 310027, P. R. China
| | - Li‐Cheng Xu
- Center of Chemistry for Frontier Technologies, Department of Chemistry, State Key Laboratory of Clean Energy Utilization, Zhejiang University, 38 Zheda Road, Hangzhou 310027, P. R. China
| | - Shu‐Wen Li
- Center of Chemistry for Frontier Technologies, Department of Chemistry, State Key Laboratory of Clean Energy Utilization, Zhejiang University, 38 Zheda Road, Hangzhou 310027, P. R. China
| | - João C. A. Oliveira
- Institut für Organische und Biomolekulare Chemie, Wöhler Research Institute for Sustainable Chemistry (WISCh), Georg-August-Universität, Tammannstraße 2, 37077 Göttingen, Germany
| | - Xin Li
- Center of Chemistry for Frontier Technologies, Department of Chemistry, State Key Laboratory of Clean Energy Utilization, Zhejiang University, 38 Zheda Road, Hangzhou 310027, P. R. China
| | - Lutz Ackermann
- Institut für Organische und Biomolekulare Chemie, Wöhler Research Institute for Sustainable Chemistry (WISCh), Georg-August-Universität, Tammannstraße 2, 37077 Göttingen, Germany
| | - Xin Hong
- Center of Chemistry for Frontier Technologies, Department of Chemistry, State Key Laboratory of Clean Energy Utilization, Zhejiang University, 38 Zheda Road, Hangzhou 310027, P. R. China
- Beijing National Laboratory for Molecular Sciences, Zhongguancun North First Street No. 2, Beijing 100190, P. R. China
- Key Laboratory of Precise Synthesis of Functional Molecules of Zhejiang Province, School of Science, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang Province, P. R. China
| |
|
40
|
Sato E, Tachiwaki G, Fujii M, Mitsudo K, Washio T, Takizawa S, Suga S. Electrochemical Carbon-Ferrier Rearrangement Using a Microflow Reactor and Machine Learning-Assisted Exploration of Suitable Conditions. Org Process Res Dev 2023. [DOI: 10.1021/acs.oprd.2c00267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Affiliation(s)
- Eisuke Sato
- Faculty of Natural Science and Technology, Okayama University, 3-1-1 Tsushima-naka, Kita-ku, Okayama 700-8530, Japan
| | - Gaku Tachiwaki
- Faculty of Natural Science and Technology, Okayama University, 3-1-1 Tsushima-naka, Kita-ku, Okayama 700-8530, Japan
| | - Mayu Fujii
- Faculty of Natural Science and Technology, Okayama University, 3-1-1 Tsushima-naka, Kita-ku, Okayama 700-8530, Japan
| | - Koichi Mitsudo
- Faculty of Natural Science and Technology, Okayama University, 3-1-1 Tsushima-naka, Kita-ku, Okayama 700-8530, Japan
| | - Takashi Washio
- Department of Reasoning for Intelligence, SANKEN, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
- Artificial Intelligence Research Center, SANKEN, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
| | - Shinobu Takizawa
- Department of Reasoning for Intelligence, SANKEN, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
- Department of Synthetic Organic Chemistry, SANKEN, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
| | - Seiji Suga
- Faculty of Natural Science and Technology, Okayama University, 3-1-1 Tsushima-naka, Kita-ku, Okayama 700-8530, Japan
| |
|
41
|
A Review on Artificial Intelligence Enabled Design, Synthesis, and Process Optimization of Chemical Products for Industry 4.0. Processes (Basel) 2023. [DOI: 10.3390/pr11020330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/20/2023] Open
Abstract
With the development of Industry 4.0, artificial intelligence (AI) is gaining increasing attention for its performance in solving particularly complex problems in industrial chemistry and chemical engineering. Therefore, this review provides an overview of the application of AI techniques, in particular machine learning, in chemical design, synthesis, and process optimization over the past years. In this review, the focus is on the application of AI for structure-function relationship analysis, synthetic route planning, and automated synthesis. Finally, we discuss the challenges and future of AI in making chemical products.
|
42
|
Kowalski D, MacGregor CM, Long DL, Bell NL, Cronin L. Automated Library Generation and Serendipity Quantification Enables Diverse Discovery in Coordination Chemistry. J Am Chem Soc 2023; 145:2332-2341. [PMID: 36649125 PMCID: PMC9896557 DOI: 10.1021/jacs.2c11066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Library generation experiments are a key part of the discovery of new materials, methods, and models in chemistry, but the question of how to generate high quality libraries to enable discovery is nontrivial. Herein, we use coordination chemistry to demonstrate the automation of many of the workflows used for library generation in automated hardware including the Chemputer. First, we explore the target-oriented synthesis of three influential coordination complexes, to validate key synthetic operations in our system; second, the generation of focused libraries in chemical and process space; and third, the development of a new workflow for prospecting library formation. This involved Bayesian optimization using a Gaussian process as surrogate model combined with a metric for novelty (or serendipity) quantification based on mass spectrometry data. In this way, we show directed exploration of a process space toward those areas with rarer observations and build a picture of the diversity in product distributions present across the space. We show that this effectively "engineers" serendipity into our search through the unexpected appearance of acetic anhydride, formed in situ, and solvent degradation products as ligands in an isolable series of three Co(III) anhydride complexes.
|
43
|
Tu Z, Stuyver T, Coley CW. Predictive chemistry: machine learning for reaction deployment, reaction development, and reaction discovery. Chem Sci 2023; 14:226-244. [PMID: 36743887 PMCID: PMC9811563 DOI: 10.1039/d2sc05089g] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 09/12/2022] [Accepted: 11/25/2022] [Indexed: 11/29/2022] Open
Abstract
The field of predictive chemistry relates to the development of models able to describe how molecules interact and react. It encompasses the long-standing task of computer-aided retrosynthesis, but is far more reaching and ambitious in its goals. In this review, we summarize several areas where predictive chemistry models hold the potential to accelerate the deployment, development, and discovery of organic reactions and advance synthetic chemistry.
Affiliation(s)
- Zhengkai Tu
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge MA 02139 USA
| | - Thijs Stuyver
- Department of Chemical Engineering, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge MA 02139 USA
| | - Connor W Coley
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge MA 02139 USA
- Department of Chemical Engineering, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge MA 02139 USA
| |
|
44
|
Duan C, Nandy A, Meyer R, Arunachalam N, Kulik HJ. A transferable recommender approach for selecting the best density functional approximations in chemical discovery. NATURE COMPUTATIONAL SCIENCE 2023; 3:38-47. [PMID: 38177951 DOI: 10.1038/s43588-022-00384-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/07/2022] [Accepted: 11/23/2022] [Indexed: 01/06/2024]
Abstract
Approximate density functional theory has become indispensable owing to its balanced cost-accuracy trade-off, including in large-scale screening. To date, however, no density functional approximation (DFA) with universal accuracy has been identified, leading to uncertainty in the quality of data generated from density functional theory. With electron density fitting and Δ-learning, we build a DFA recommender that selects the DFA with the lowest expected error with respect to the gold standard (but cost-prohibitive) coupled cluster theory in a system-specific manner. We demonstrate this recommender approach on the evaluation of vertical spin splitting energies of transition metal complexes. Our recommender predicts top-performing DFAs and yields excellent accuracy (about 2 kcal/mol) for chemical discovery, outperforming both individual Δ-learning models and the best conventional single-functional approach from a set of 48 DFAs. By demonstrating transferability to diverse synthesized compounds, our recommender potentially addresses the accuracy versus scope dilemma broadly encountered in computational chemistry.
Affiliation(s)
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Ralf Meyer
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Naveen Arunachalam
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA.
| |
|
45
|
Rakhimbekova A, Lopukhov A, Klyachko N, Kabanov A, Madzhidov TI, Tropsha A. Efficient design of peptide-binding polymers using active learning approaches. J Control Release 2023; 353:903-914. [PMID: 36402234 DOI: 10.1016/j.jconrel.2022.11.023] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/01/2022] [Revised: 10/21/2022] [Accepted: 11/13/2022] [Indexed: 12/23/2022]
Abstract
Active learning (AL) has become a subject of active recent research both in industry and academia as an efficient approach for rapid design and discovery of novel chemicals, materials, and polymers. Herein, we have assessed the applicability of AL for the discovery of polymeric micelle formulations for poorly soluble drugs. We were motivated by the key advantages of this approach making it a desirable strategy for rational design of drug delivery systems due to its ability to (i) employ relatively small datasets for model development, (ii) iterate between model development and model assessment using small external datasets that can be either generated in focused experimental studies or formed from subsets of the initial training data, and (iii) progressively evolve models towards increasingly more reliable predictions and the identification of novel chemicals with the desired properties. In this study, we compared various AL protocols for their effectiveness in finding biologically active molecules using synthetic datasets. We have investigated the dependency of AL performance on the size of the initial training set, the relative complexity of the task, and the choice of the initial training dataset. We found that AL techniques as applied to regression modeling offer no benefits over random search, while AL used for classification tasks performs better than models built for randomly selected training sets but is still quite far from perfect. Finally, the best performing AL approach was employed to discover and experimentally validate novel binding polymers for a case study of asialoglycoprotein receptor (ASGPR).
Affiliation(s)
- Assima Rakhimbekova
- A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russia
| | - Anton Lopukhov
- Laboratory of Chemical Design of Bionanomaterials, Faculty of Chemistry, M.V. Lomonosov Moscow State University, Moscow, Russia
| | - Natalia Klyachko
- Laboratory of Chemical Design of Bionanomaterials, Faculty of Chemistry, M.V. Lomonosov Moscow State University, Moscow, Russia
| | - Alexander Kabanov
- Laboratory of Chemical Design of Bionanomaterials, Faculty of Chemistry, M.V. Lomonosov Moscow State University, Moscow, Russia; Center for Nanotechnology in Drug Delivery, Division of Pharmacoengineering and Molecular Pharmaceutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, NC, USA
| | - Timur I Madzhidov
- A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russia
| | - Alexander Tropsha
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC 27599, USA.
| |
|
46
|
Lansford JL, Barnes BC, Rice BM, Jensen KF. Building Chemical Property Models for Energetic Materials from Small Datasets Using a Transfer Learning Approach. J Chem Inf Model 2022; 62:5397-5410. [PMID: 36240441 DOI: 10.1021/acs.jcim.2c00841] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]
Abstract
For many experimentally measured chemical properties that cannot be directly computed from first-principles, the existing physics-based models do not extrapolate well to out-of-sample molecules, and experimental datasets themselves are too small for traditional machine learning (ML) approaches. To overcome these limitations, we apply a transfer learning approach, whereby we simultaneously train a multi-target regression model on a small number of molecules with experimentally measured values and a large number of molecules with related computed properties. We demonstrate this methodology on predicting the experimentally measured impact sensitivity of energetic crystals, finding that both characteristics of the computed dataset and model architecture are important to prediction accuracy of the small experimental dataset. Our directed-message passing neural network (D-MPNN) ML model using transfer learning outperforms direct-ML and physics-based models on a diverse test set, and the new methods described here are widely applicable to modeling many other structure-property relationships.
Affiliation(s)
- Joshua L Lansford
- U.S. Army Combat Capabilities Development Command (DEVCOM) Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- Department of Chemical Engineering, MIT, Cambridge, Massachusetts 02139, United States
| | - Brian C Barnes
- U.S. Army Combat Capabilities Development Command (DEVCOM) Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
| | - Betsy M Rice
- U.S. Army Combat Capabilities Development Command (DEVCOM) Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
| | - Klavs F Jensen
- Department of Chemical Engineering, MIT, Cambridge, Massachusetts 02139, United States
| |
|
47
|
Shirasawa R, Takemura I, Hattori S, Nagata Y. A semi-automated material exploration scheme to predict the solubilities of tetraphenylporphyrin derivatives. Commun Chem 2022; 5:158. [PMID: 36697881 PMCID: PMC9814751 DOI: 10.1038/s42004-022-00770-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 11/04/2022] [Indexed: 11/24/2022] Open
Abstract
Acceleration of material discovery has been tackled by informatics and laboratory automation. Here we show a semi-automated material exploration scheme to model the solubility of tetraphenylporphyrin derivatives. The scheme involved the following steps: definition of a practical chemical search space, prioritization of molecules in the space using an extended algorithm for submodular function maximization without requiring biased variable selection or pre-existing data, synthesis and automated measurement, and machine-learning model estimation. The optimal evaluation order selected using the algorithm covered several similar molecules (32% of all targeted molecules, whereas that obtained by random sampling and uncertainty sampling was ~7% and ~4%, respectively) with a small number of evaluations (10 molecules: 0.13% of all targeted molecules). The derived binary classification models predicted 'good solvents' with an accuracy >0.8. Overall, we confirmed the effectiveness of the proposed semi-automated scheme in early-stage material search projects for accelerating a wider range of material research.
Affiliation(s)
- Raku Shirasawa
- Advanced Research Laboratory, R&D Center, Sony Group Corporation, Atsugi Tec. 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa, 243-0014, Japan
- Ichiro Takemura
- Tokyo Laboratory 26, R&D Center, Sony Group Corporation, Atsugi Tec. 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa, 243-0014, Japan
- Shinnosuke Hattori
- Advanced Research Laboratory, R&D Center, Sony Group Corporation, Atsugi Tec. 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa, 243-0014, Japan
- Yuuya Nagata
- Institute for Chemical Reaction Design and Discovery, Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan
|
48
|
Le Pogam P, Papon N, Beniddir MA, Courdavault V. Computer-Assisted Design of Sustainable Syntheses of Pharmaceuticals and Agrochemicals from Industrial Wastes. ChemSusChem 2022; 15:e202201125. [PMID: 35894947 DOI: 10.1002/cssc.202201125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 07/24/2022] [Indexed: 06/15/2023]
Abstract
Computer-based strategies vastly enhanced the field of analytical chemistry. The impact of data-driven technologies in shaping organic chemistry strategies long remained comparatively elusive but various tools recently emerged to computationally plan multistep organic syntheses. A recent study elegantly takes benefit of an in-house library of chemical reactions enriched with various metadata to provide numerous, reliable and realistic organic chemistry workflows to structurally-varied drugs of interest, from locally available industrial by-products. The retrieval of the different synthetic pathways and a scoring based on different features, especially comprising sustainability considerations, are also proposed.
Affiliation(s)
- Pierre Le Pogam
- Équipe Chimie des Substances Naturelles, BioCIS, Université Paris-Saclay, CNRS, 92290, Châtenay-Malabry, France
- Nicolas Papon
- Univ Angers, Univ Brest, IRF, SFR ICAT, F-49000, Angers, France
- Mehdi A Beniddir
- Équipe Chimie des Substances Naturelles, BioCIS, Université Paris-Saclay, CNRS, 92290, Châtenay-Malabry, France
- Vincent Courdavault
- Biomolécules et Biotechnologies Végétales, BBV, EA2106, Université de Tours, 37200, Tours, France
|
49
|
Moradi S, Kundu S, Saidaminov MI. High-Throughput Synthesis of Thin Films for the Discovery of Energy Materials: A Perspective. ACS Materials Au 2022; 2:516-524. [PMID: 36124002 PMCID: PMC9479136 DOI: 10.1021/acsmaterialsau.2c00028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Thin films are an integral part of many electronic and optoelectronic devices. They also provide an excellent platform for material characterization. Therefore, strategies for the fabrication of thin films are constantly developed and have significantly benefited from the advent of high-throughput synthesis (HTS) platforms. This perspective summarizes recent advances in HTS of thin films from experimentalists’ point of view. The work analyzes general strategies of HTS and then discusses their use in developing new energy materials for applications that rely on thin films, such as solar cells, light-emitting diodes, batteries, superconductors, and thermoelectrics. The perspective also summarizes some key challenges and opportunities in the HTS of thin films.
Affiliation(s)
- Shahram Moradi
- Department of Electrical & Computer Engineering, University of Victoria, 3800 Finnerty Road, Victoria, British Columbia V8P 5C2, Canada
- Soumya Kundu
- Department of Chemistry, University of Victoria, 3800 Finnerty Road, Victoria, British Columbia V8P 5C2, Canada
- Makhsud I. Saidaminov
- Department of Electrical & Computer Engineering, University of Victoria, 3800 Finnerty Road, Victoria, British Columbia V8P 5C2, Canada
- Department of Chemistry, University of Victoria, 3800 Finnerty Road, Victoria, British Columbia V8P 5C2, Canada
- Centre for Advanced Materials and Related Technologies (CAMTEC), University of Victoria, 3800 Finnerty Road, Victoria, British Columbia V8P 5C2, Canada
|
50
|
Spenke F, Hartke B. Graph-based Automated Macro-Molecule Assembly. J Chem Inf Model 2022; 62:3714-3723. [PMID: 35938711 DOI: 10.1021/acs.jcim.2c00609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
We present a general molecular framework assembly algorithm that takes a largely arbitrary molecular fragment database and a user-supplied target template graph as input. Automatic assembly of molecular fragments from the database, following a prescribed, user-supplied set of connection rules, then turns the template graph into an actual, chemically reasonable molecular framework. Assembly capabilities of our algorithm are tested by producing several abstract, closed-loop shapes. To indicate a few of many possible application areas we demonstrate a host-guest complex and a road toward catalysis. Postassembly substituent exchange can be used to produce electric fields of desired values at desired points inside the framework or at its surface as a stepping stone toward rationally designed, artificial heterogeneous catalysts.
Affiliation(s)
- Florian Spenke
- Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstrasse 40, Kiel 24098, Germany
- Bernd Hartke
- Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstrasse 40, Kiel 24098, Germany
|