1. Amar Y, Schweidtmann AM, Deutsch P, Cao L, Lapkin A. Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis. Chem Sci 2019; 10:6697-6706. [PMID: 31367324 PMCID: PMC6625492 DOI: 10.1039/c9sc01844a] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 04/15/2019] [Accepted: 05/28/2019] [Indexed: 12/19/2022] Open
Abstract
Rational solvent selection remains a significant challenge in process development. Here we describe a hybrid mechanistic-machine learning approach, geared towards automated process development workflow. A library of 459 solvents was used, for which 12 conventional molecular descriptors, two reaction-specific descriptors, and additional descriptors based on screening charge density, were calculated. Gaussian process surrogate models were trained on experimental data from a Rh(CO)2(acac)/Josiphos catalysed asymmetric hydrogenation of a chiral α-β unsaturated γ-lactam. With two simultaneous objectives - high conversion and high diastereomeric excess - the multi-objective algorithm, trained on the initial dataset of 25 solvents, has identified solvents leading to better reaction outcomes. In addition to being a powerful design of experiments (DoE) methodology, the resulting Gaussian process surrogate model for conversion is, in statistical terms, predictive, with a cross-validation correlation coefficient of 0.84. After identifying promising solvents, the composition of solvent mixtures and optimal reaction temperature were found using a black-box Bayesian optimisation. We then demonstrated the application of a new genetic programming approach to select an appropriate machine learning model for a specific physical system, which should allow the transition of the overall process development workflow into the future robotic laboratories.
Article type: research-article
2. Rall D, Menne D, Schweidtmann AM, Kamp J, von Kolzenberg L, Mitsos A, Wessling M. Rational design of ion separation membranes. J Memb Sci 2019. [DOI: 10.1016/j.memsci.2018.10.013] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Indexed: 11/30/2022]
3. Schweidtmann AM, Huster WR, Lüthje JT, Mitsos A. Deterministic global process optimization: Accurate (single-species) properties via artificial neural networks. Comput Chem Eng 2019. [DOI: 10.1016/j.compchemeng.2018.10.007] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 10/28/2022]
4. Lee U, Burre J, Caspari A, Kleinekorte J, Schweidtmann AM, Mitsos A. Techno-economic Optimization of a Green-Field Post-Combustion CO2 Capture Process Using Superstructure and Rate-Based Models. Ind Eng Chem Res 2016. [DOI: 10.1021/acs.iecr.6b01668] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 10/20/2022]
5. Rall D, Schweidtmann AM, Aumeier BM, Kamp J, Karwe J, Ostendorf K, Mitsos A, Wessling M. Simultaneous rational design of ion separation membranes and processes. J Memb Sci 2020. [DOI: 10.1016/j.memsci.2020.117860] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/30/2022]
6. Schweidtmann AM, Esche E, Fischer A, Kloft M, Repke J, Sager S, Mitsos A. Machine Learning in Chemical Engineering: A Perspective. Chem Ing Tech 2021. [DOI: 10.1002/cite.202100083] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 01/12/2023]
7. Huster WR, Schweidtmann AM, Lüthje JT, Mitsos A. Deterministic global superstructure-based optimization of an organic Rankine cycle. Comput Chem Eng 2020. [DOI: 10.1016/j.compchemeng.2020.106996] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 01/11/2023]
8. Helmdach D, Yaseneva P, Heer PK, Schweidtmann AM, Lapkin AA. A Multiobjective Optimization Including Results of Life Cycle Assessment in Developing Biorenewables-Based Processes. ChemSusChem 2017; 10:3632-3643. [PMID: 28714562 DOI: 10.1002/cssc.201700927] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 05/25/2017] [Revised: 07/13/2017] [Indexed: 06/07/2023]
Abstract
A decision support tool has been developed that uses global multiobjective optimization based on 1) the environmental impacts, evaluated within the framework of full life cycle assessment; and 2) process costs, evaluated by using rigorous process models. This approach is particularly useful in developing biorenewable-based energy solutions and chemicals manufacturing, for which multiple criteria must be evaluated and optimization-based decision-making processes are particularly attractive. The framework is demonstrated by using a case study of the conversion of terpenes derived from biowaste feedstocks into reactive intermediates. A two-step chemical conversion/separation sequence was implemented as a rigorous process model and combined with a life cycle model. A life cycle inventory for crude sulfate turpentine was developed, as well as a conceptual process of its separation into pure terpene feedstocks. The performed single- and multiobjective optimizations demonstrate the functionality of the optimization-based process development and illustrate the approach. The most significant advance is the ability to perform multiobjective global optimization, resulting in identification of a region of Pareto-optimal solutions.
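As a rough illustration of what "Pareto-optimal" means for the two objectives named in this abstract (environmental impact and process cost, both minimised), the following minimal sketch filters a finite list of candidate designs down to its non-dominated subset. It is not the authors' global optimization method. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (environmental impact, process cost) pairs in arbitrary units.
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 4.0), (6.0, 5.0), (7.0, 2.0)]
front = pareto_front(candidates)
print(front)  # [(1.0, 9.0), (2.0, 7.0), (4.0, 4.0), (7.0, 2.0)]
```

Here (3.0, 8.0) is dropped because (2.0, 7.0) is better in both objectives, and (6.0, 5.0) is dropped because (4.0, 4.0) is. The remaining points each trade one objective against the other. This pairwise filter scales quadratically with the number of candidates, which is acceptable for a small illustrative set.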
9. Schäfer P, Caspari A, Schweidtmann AM, Vaupel Y, Mhamdi A, Mitsos A. The Potential of Hybrid Mechanistic/Data‐Driven Approaches for Reduced Dynamic Modeling: Application to Distillation Columns. Chem Ing Tech 2020. [DOI: 10.1002/cite.202000048] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022]
10. Weber JM, Guo Z, Zhang C, Schweidtmann AM, Lapkin AA. Chemical data intelligence for sustainable chemistry. Chem Soc Rev 2021; 50:12013-12036. [PMID: 34520507 DOI: 10.1039/d1cs00477h] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 02/05/2023]
Abstract
This study highlights new opportunities for optimal reaction route selection from large chemical databases brought about by the rapid digitalisation of chemical data. The chemical industry requires a transformation towards more sustainable practices, eliminating its dependencies on fossil fuels and limiting its impact on the environment. However, identifying more sustainable process alternatives is, at present, a cumbersome, manual, iterative process, based on chemical intuition and modelling. We give a perspective on methods for automated discovery and assessment of competitive sustainable reaction routes based on renewable or waste feedstocks. Three key areas of transition are outlined and reviewed based on their state-of-the-art as well as bottlenecks: (i) data, (ii) evaluation metrics, and (iii) decision-making. We elucidate their synergies and interfaces since only together these areas can bring about the most benefit. The field of chemical data intelligence offers the opportunity to identify the inherently more sustainable reaction pathways and to identify opportunities for a circular chemical economy. Our review shows that at present the field of data brings about most bottlenecks, such as data completion and data linkage, but also offers the principal opportunity for advancement.
Article type: Review
11. König A, Siska M, Schweidtmann AM, Rittig JG, Viell J, Mitsos A, Dahmen M. Designing production-optimal alternative fuels for conventional, flexible-fuel, and ultra-high efficiency engines. Chem Eng Sci 2021. [DOI: 10.1016/j.ces.2021.116562] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/27/2022]
12. Schäfer P, Schweidtmann AM, Mitsos A. Nonlinear scheduling with time‐variable electricity prices using sensitivity‐based truncations of wavelet transforms. AIChE J 2020. [DOI: 10.1002/aic.16986] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/05/2023]
13. Jorayev P, Russo D, Tibbetts JD, Schweidtmann AM, Deutsch P, Bull SD, Lapkin AA. Multi-objective Bayesian optimisation of a two-step synthesis of p-cymene from crude sulphate turpentine. Chem Eng Sci 2022. [DOI: 10.1016/j.ces.2021.116938] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/03/2022]
14. Liebal UW, Köbbing S, Netze L, Schweidtmann AM, Mitsos A, Blank LM. Insight to Gene Expression From Promoter Libraries With the Machine Learning Workflow Exp2Ipynb. Front Bioinform 2021; 1:747428. [PMID: 36303772 PMCID: PMC9581000 DOI: 10.3389/fbinf.2021.747428] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/28/2021] [Accepted: 09/23/2021] [Indexed: 11/16/2022] Open
Abstract
Metabolic engineering relies on modifying gene expression to regulate protein concentrations and reaction activities. The gene expression is controlled by the promoter sequence, and sequence libraries are used to scan expression activities and to identify correlations between sequence and activity. We introduce a computational workflow called Exp2Ipynb to analyze promoter libraries maximizing information retrieval and promoter design with desired activity. We applied Exp2Ipynb to seven prokaryotic expression libraries to identify optimal experimental design principles. The workflow is open source, available as Jupyter Notebooks and covers the steps to 1) generate a statistical overview to sequence and activity, 2) train machine-learning algorithms, such as random forest, gradient boosting trees and support vector machines, for prediction and extraction of feature importance, 3) evaluate the performance of the estimator, and 4) to design new sequences with a desired activity using numerical optimization. The workflow can perform regression or classification on multiple promoter libraries, across species or reporter proteins. The most accurate predictions in the sample libraries were achieved when the promoters in the library were recognized by a single sigma factor and a unique reporter system. The prediction confidence mostly depends on sample size and sequence diversity, and we present a relationship to estimate their respective effects. The workflow can be adapted to process sequence libraries from other expression-related problems and increase insight to the growing application of high-throughput experiments, providing support for efficient strain engineering.
Article type: Review
15. Vogel G, Schulze Balhorn L, Schweidtmann AM. Learning from flowsheets: A generative transformer model for autocompletion of flowsheets. Comput Chem Eng 2023. [DOI: 10.1016/j.compchemeng.2023.108162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
16. Rittig JG, Ritzert M, Schweidtmann AM, Winkler S, Weber JM, Morsch P, Heufer KA, Grohe M, Mitsos A, Dahmen M. Graph Machine Learning for Design of High‐Octane Fuels. AIChE J 2022. [DOI: 10.1002/aic.17971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
17. Stops L, Leenhouts R, Gao Q, Schweidtmann AM. Flowsheet generation through hierarchical reinforcement learning and graph neural networks. AIChE J 2022. [DOI: 10.1002/aic.17938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
18. Stocker M, Heger T, Schweidtmann A, Ćwiek-Kupczyńska H, Penev L, Dojchinovski M, Willighagen E, Vidal ME, Turki H, Balliet D, Tiddi I, Kuhn T, Mietchen D, Karras O, Vogt L, Hellmann S, Jeschke J, Krajewski P, Auer S. SKG4EOSC - Scholarly Knowledge Graphs for EOSC: Establishing a backbone of knowledge graphs for FAIR Scholarly Information in EOSC. Res Ideas Outcomes 2022. [DOI: 10.3897/rio.8.e83789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
In the age of advanced information systems powering fast-paced knowledge economies that face global societal challenges, it is no longer adequate to express scholarly information - an essential resource for modern economies - primarily as article narratives in document form. Despite being a well-established tradition in scholarly communication, PDF-based text publishing is hindering scientific progress as it buries scholarly information into non-machine-readable formats. The key objective of SKG4EOSC is to improve science productivity through development and implementation of services for text and data conversion, and production, curation, and re-use of FAIR scholarly information. This will be achieved by (1) establishing the Open Research Knowledge Graph (ORKG, orkg.org), a service operated by the SKG4EOSC coordinator, as a Hub for access to FAIR scholarly information in the EOSC; (2) lifting to EOSC of numerous and heterogeneous domain-specific research infrastructures through the ORKG Hub’s harmonized access facilities; and (3) leverage the Hub to support cross-disciplinary research and policy decisions addressing societal challenges. SKG4EOSC will pilot the devised approaches and technologies in four research domains: biodiversity crisis, precision oncology, circular processes, and human cooperation. With the aim to improve machine-based scholarly information use, SKG4EOSC addresses an important current and future need of researchers. It extends the application of the FAIR data principles to scholarly communication practices, hence a more comprehensive coverage of the entire research lifecycle. Through explicit, machine actionable provenance links between FAIR scholarly information, primary data and contextual entities, it will substantially contribute to reproducibility, validation and trust in science. The resulting advanced machine support will catalyse new discoveries in basic research and solutions in key application areas.
19. Schulze Balhorn L, Weber JM, Buijsman S, Hildebrandt JR, Ziefle M, Schweidtmann AM. Empirical assessment of ChatGPT's answering capabilities in natural science and engineering. Sci Rep 2024; 14:4998. [PMID: 38424125 PMCID: PMC10904823 DOI: 10.1038/s41598-024-54936-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Accepted: 02/19/2024] [Indexed: 03/02/2024] Open
Abstract
ChatGPT is a powerful language model from OpenAI that is arguably able to comprehend and generate text. ChatGPT is expected to greatly impact society, research, and education. An essential step to understand ChatGPT's expected impact is to study its domain-specific answering capabilities. Here, we perform a systematic empirical assessment of its abilities to answer questions across the natural science and engineering domains. We collected 594 questions on natural science and engineering topics from 198 faculty members across five faculties at Delft University of Technology. After collecting the answers from ChatGPT, the participants assessed the quality of the answers using a systematic scheme. Our results show that the answers from ChatGPT are, on average, perceived as "mostly correct". Two major trends are that the rating of the ChatGPT answers significantly decreases (i) as the educational level of the question increases and (ii) as we evaluate skills beyond scientific knowledge, e.g., critical attitude.
Article type: research-article