1
Yang L, Guo Q, Zhang L. AI-assisted chemistry research: a comprehensive analysis of evolutionary paths and hotspots through knowledge graphs. Chem Commun (Camb) 2024; 60:6977-6987. [PMID: 38910536 DOI: 10.1039/d4cc01892c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/25/2024]
Abstract
Artificial intelligence (AI) offers transformative potential for chemical research through its ability to optimize reactions and processes, enhance energy efficiency, and reduce waste. AI-assisted chemical research ("AI + chem") has become a global hotspot. To better understand the current research status of "AI + chem", this study conducted a bibliometric investigation using CiteSpace. The Web of Science Core Collection was used to retrieve original articles related to "AI + chem" published from 2000 to 2024. The obtained data allowed for the visualization of the knowledge background, current research status, and latest knowledge structure of "AI + chem". The field has entered a stage of explosive growth, and the number of papers is expected to maintain long-term, high-speed growth. This article systematically analyzes the latest progress in "AI + chem" and objectively predicts future trends, including molecular design, reaction prediction, materials design, drug design, and quantum chemistry. The outcomes of this study will provide readers with a comprehensive understanding of the overall landscape of "AI + chem".
Affiliation(s)
- Lin Yang
  - School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Qingle Guo
  - School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Lijing Zhang
  - School of Chemistry, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
2
Ramos JRC, Pinto J, Poiares-Oliveira G, Peeters L, Dumas P, Oliveira R. Deep hybrid modeling of a HEK293 process: Combining long short-term memory networks with first principles equations. Biotechnol Bioeng 2024; 121:1554-1568. [PMID: 38343176 DOI: 10.1002/bit.28668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 12/22/2023] [Accepted: 01/22/2024] [Indexed: 04/14/2024]
Abstract
The combination of physical equations with deep learning is becoming a promising methodology for bioprocess digitalization. In this paper, we investigate for the first time the combination of long short-term memory (LSTM) networks with first principles equations in a hybrid workflow to describe human embryonic kidney 293 (HEK293) culture dynamics. Experimental data of 27 extracellular state variables in 20 fed-batch HEK293 cultures were collected in a parallel high-throughput 250 mL cultivation system in an industrial process development setting. The adaptive moment estimation method, with stochastic regularization and cross-validation, was employed for deep learning. A total of 784 hybrid models with varying deep neural network architectures, depths, layer sizes and node activation functions were compared. In most scenarios, hybrid LSTM models outperformed classical hybrid Feedforward Neural Network (FFNN) models in terms of training and testing error. Hybrid LSTM models also proved less sensitive to data resampling than FFNN hybrid models. On the downside, hybrid LSTM models are in general more complex (higher number of parameters) and have a higher computation cost than FFNN hybrid models. The hybrid model with the highest prediction accuracy consisted of an LSTM network with seven internal states connected in series with dynamic material balance equations. This hybrid model correctly predicted the dynamics of the 27 state variables (R² = 0.93 in the test data set), including biomass, key substrates, amino acids and metabolic by-products, for around 10 cultivation days.
Affiliation(s)
- João R C Ramos
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
- José Pinto
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
- Gil Poiares-Oliveira
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
- Rui Oliveira
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
3
Ramírez-Sanz JM, Maestro-Prieto JA, Arnaiz-González Á, Bustillo A. Semi-supervised learning for industrial fault detection and diagnosis: A systemic review. ISA Transactions 2023:S0019-0578(23)00434-2. [PMID: 37778919 DOI: 10.1016/j.isatra.2023.09.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 08/03/2023] [Accepted: 09/22/2023] [Indexed: 10/03/2023]
Abstract
The automation of Fault Detection and Diagnosis (FDD) is a central task for many industries today. A myriad of methods are in use, although the most recent leading contenders are data-driven approaches and especially Machine Learning (ML) methods. ML algorithms fall into two main categories: supervised and unsupervised methods, depending on whether or not the instances are labeled with the expected outputs. However, a new approach called Semi-Supervised Learning (SSL) has recently emerged that uses a few labeled instances together with other unlabeled instances for the training process. This new approach can significantly improve the accuracy of conventional ML models for industrial environments where labeled data are scarce. SSL has been tested as a promising solution over the past few years for several FDD problems, although there have been no systemic reviews of this sort of approach until the present review. In this study, an attempt to organize the existing literature on SSL for FDD using the taxonomy of van Engelen & Hoos is reported. The most and the least frequently used SSL algorithms are identified and considered in terms of different fault detection tasks and their most common dataset structure. Moreover, a set of best practices is proposed in the conclusions of this work for implementation under real industrial conditions, so as to avoid some of the most common faults.
Affiliation(s)
- Andrés Bustillo
  - Universidad de Burgos, Avda. Cantabria s/n, Burgos, 09006, Burgos, Spain
4
Pinto J, Ramos JRC, Costa RS, Rossell S, Dumas P, Oliveira R. Hybrid deep modeling of a CHO-K1 fed-batch process: combining first-principles with deep neural networks. Front Bioeng Biotechnol 2023; 11:1237963. [PMID: 37744245 PMCID: PMC10515724 DOI: 10.3389/fbioe.2023.1237963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Accepted: 08/22/2023] [Indexed: 09/26/2023] [Open Access]
Abstract
Introduction: Hybrid modeling combining First-Principles with machine learning is becoming a pivotal methodology for Biopharma 4.0 enactment. Chinese Hamster Ovary (CHO) cells, being the workhorse for industrial glycoprotein production, have been the object of several hybrid modeling studies. Most previous studies pursued a shallow hybrid modeling approach based on three-layered Feedforward Neural Networks (FFNNs) combined with macroscopic material balance equations. Only recently has the hybrid modeling field begun incorporating deep learning into its framework, with significant gains in descriptive and predictive power. Methods: This study compares, for the first time, deep and shallow hybrid modeling in a CHO process development context. Data of 24 fed-batch cultivations of a CHO-K1 cell line expressing a target glycoprotein, comprising 30 measured state variables over time, were used to compare both methodologies. Hybrid models with varying FFNN depths (3-5 layers) were systematically compared using two training methodologies. The classical training is based on the Levenberg-Marquardt algorithm, indirect sensitivity equations and cross-validation. The deep learning is based on the Adaptive Moment Estimation Method (ADAM), stochastic regularization and semidirect sensitivity equations. Results and conclusion: The results point to a systematic generalization improvement of deep hybrid models over shallow hybrid models. Overall, the training and testing errors decreased by 14.0% and 23.6%, respectively, when applying the deep methodology. The Central Processing Unit (CPU) time for training the deep hybrid model increased by 31.6%, mainly due to the higher FFNN complexity. The final deep hybrid model is shown to predict the dynamics of the 30 state variables within the error bounds in every test experiment. Notably, the deep hybrid model could predict the metabolic shifts in key metabolites (e.g., lactate, ammonium, glutamine and glutamate) in the test experiments. We expect deep hybrid modeling to accelerate the deployment of high-fidelity digital twins in the biopharma sector in the near future.
Affiliation(s)
- José Pinto
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
- João R. C. Ramos
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
- Rafael S. Costa
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
- Rui Oliveira
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
5
Jakab-Nácsa A, Garami A, Fiser B, Farkas L, Viskolcz B. Towards Machine Learning in Heterogeneous Catalysis-A Case Study of 2,4-Dinitrotoluene Hydrogenation. Int J Mol Sci 2023; 24:11461. [PMID: 37511224 PMCID: PMC10380742 DOI: 10.3390/ijms241411461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 06/22/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023] [Open Access]
Abstract
Multivariate data analysis is of great importance in catalysis research. The aim of the MIRA21 (MIskolc RAnking 21) model is to characterize heterogeneous catalysts with bias-free quantifiable data from 15 different variables to standardize catalyst characterization and provide an easy tool to compare, rank, and classify catalysts. The present work introduces and mathematically validates the MIRA21 model by identifying fundamentals affecting catalyst comparison and provides support for catalyst design. Literature data of 2,4-dinitrotoluene hydrogenation catalysts for toluene diamine synthesis were analyzed by using the descriptor system of MIRA21. In this study, exploratory data analysis (EDA) has been used to understand the relationships between individual variables such as catalyst performance, reaction conditions, catalyst compositions, and sustainable parameters. The results are applicable to catalyst design and open the way to the use of machine learning tools.
Affiliation(s)
- Alexandra Jakab-Nácsa
  - BorsodChem Ltd., Bolyai tér 1, H-3700 Kazincbarcika, Hungary
  - Institute of Chemistry, Faculty of Materials Science and Engineering, University of Miskolc, H-3515 Miskolc-Egyetemváros, Hungary
- Attila Garami
  - Institute of Energy, Ceramics and Polymer Technology, University of Miskolc, H-3515 Miskolc, Hungary
- Béla Fiser
  - Higher Education and Industrial Cooperation Centre, University of Miskolc, H-3515 Miskolc, Hungary
  - Ferenc Rakoczi II Transcarpathian Hungarian College of Higher Education, 90200 Beregszász, Transcarpathia, Ukraine
  - Department of Physical Chemistry, Faculty of Chemistry, University of Lodz, 90-236 Lodz, Poland
- László Farkas
  - BorsodChem Ltd., Bolyai tér 1, H-3700 Kazincbarcika, Hungary
  - Institute of Chemistry, Faculty of Materials Science and Engineering, University of Miskolc, H-3515 Miskolc-Egyetemváros, Hungary
- Béla Viskolcz
  - Institute of Chemistry, Faculty of Materials Science and Engineering, University of Miskolc, H-3515 Miskolc-Egyetemváros, Hungary
  - Higher Education and Industrial Cooperation Centre, University of Miskolc, H-3515 Miskolc, Hungary
6
Kondo M, Wathsala HDP, Ishikawa K, Yamashita D, Miyazaki T, Ohno Y, Sasai H, Washio T, Takizawa S. Bayesian Optimization-Assisted Screening to Identify Improved Reaction Conditions for Spiro-Dithiolane Synthesis. Molecules 2023; 28:5180. [PMID: 37446842 DOI: 10.3390/molecules28135180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 06/20/2023] [Accepted: 06/28/2023] [Indexed: 07/15/2023] [Open Access]
Abstract
Bayesian optimization (BO)-assisted screening was applied to identify improved reaction conditions toward a hundred-gram scale-up synthesis of 2,3,7,8-tetrathiaspiro[4.4]nonane (1), a key synthetic intermediate of 2,2-bis(mercaptomethyl)propane-1,3-dithiol [tetramercaptan pentaerythritol]. Starting from an initial training set (ITS) of six trials sampled by random screening for BO, suitable parameters were predicted (78% conversion yield of spiro-dithiolane 1) within seven experiments. Moreover, BO-assisted screening with the ITS selected by Latin hypercube sampling (LHS) further improved the yield of 1 to 89% within eight trials. The established conditions were confirmed to be satisfactory for a hundred-gram scale-up synthesis of 1.
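For readers unfamiliar with Latin hypercube sampling, the idea behind an LHS-selected initial training set can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function name `latin_hypercube`, the two variables (temperature and reagent equivalents), their ranges, and the set size of six are all assumptions made for the example.

```python
import random

def latin_hypercube(n_samples, bounds, seed=None):
    """Return n_samples points. Each dimension's range is split into
    n_samples equal strata, and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        # Draw one uniformly distributed value inside each stratum.
        column = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Hypothetical variables: temperature (deg C) and reagent equivalents.
initial_set = latin_hypercube(6, [(20.0, 80.0), (1.0, 3.0)], seed=0)
```

Compared with plain random screening, every variable is covered evenly across its range, which is one plausible reason a small LHS-selected set can give the optimizer a better starting point.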
Affiliation(s)
- Masaru Kondo
  - SANKEN, Osaka University, Ibaraki-shi 567-0047, Japan
  - Department of Materials Science and Engineering, Graduate School of Science and Engineering, Ibaraki University, Nakanarusawa-cho, Hitachi-shi 316-8511, Japan
- Daisuke Yamashita
  - Asahi Chemical Co., Ltd., Mitsuya-Minami, Yodogawa Ward, Osaka-shi 532-0035, Japan
- Takeshi Miyazaki
  - Asahi Chemical Co., Ltd., Mitsuya-Minami, Yodogawa Ward, Osaka-shi 532-0035, Japan
- Yoji Ohno
  - Asahi Chemical Co., Ltd., Mitsuya-Minami, Yodogawa Ward, Osaka-shi 532-0035, Japan
- Hiroaki Sasai
  - SANKEN, Osaka University, Ibaraki-shi 567-0047, Japan
  - Graduate School of Pharmaceutical Sciences, Osaka University, Suita-shi 565-0871, Japan
7
Helleckes LM, Hemmerich J, Wiechert W, von Lieres E, Grünberger A. Machine learning in bioprocess development: from promise to practice. Trends Biotechnol 2023; 41:817-835. [PMID: 36456404 DOI: 10.1016/j.tibtech.2022.10.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 08/30/2022] [Revised: 10/20/2022] [Accepted: 10/27/2022] [Indexed: 11/30/2022]
Abstract
Fostered by novel analytical techniques, digitalization, and automation, modern bioprocess development provides large amounts of heterogeneous experimental data containing valuable process information. In this context, data-driven methods like machine learning (ML) approaches have great potential to rationally explore large design spaces while exploiting experimental facilities most efficiently. Herein we demonstrate how ML methods have been applied so far in bioprocess development, especially in strain engineering and selection, bioprocess optimization, scale-up, monitoring, and control of bioprocesses. For each topic, we highlight successful application cases and current challenges, and point out domains that can potentially benefit from technology transfer and further progress in the field of ML.
Affiliation(s)
- Laura M Helleckes
  - Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
- Johannes Hemmerich
  - Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
- Wolfgang Wiechert
  - Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
- Eric von Lieres
  - Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
- Alexander Grünberger
  - Multiscale Bioengineering, Technical Faculty, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany; Center for Biotechnology (CeBiTec), Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany; Institute of Process Engineering in Life Sciences, Section III: Microsystems in Bioprocess Engineering, Karlsruhe Institute of Technology, Fritz-Haber-Weg 2, 76131 Karlsruhe, Germany
8
Saldaña M, Gálvez E, Navarra A, Toro N, Cisternas LA. Optimization of the SAG Grinding Process Using Statistical Analysis and Machine Learning: A Case Study of the Chilean Copper Mining Industry. Materials (Basel) 2023; 16:3220. [PMID: 37110055 PMCID: PMC10145634 DOI: 10.3390/ma16083220] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/07/2023] [Revised: 03/20/2023] [Accepted: 03/25/2023] [Indexed: 06/19/2023]
Abstract
Given the continuous increase in production costs, resource optimization has become more than a strategic objective: it is now imperative in the copper mining industry. In the search to improve the efficiency of resource use, the present work develops models of a semi-autogenous grinding (SAG) mill using statistical analysis and machine learning (ML) techniques (regression, decision trees, and artificial neural networks). The hypotheses studied aim to improve the process's productive indicators, such as production and energy consumption. The simulation of the digital model captures an increase in production of 4.42% as a function of mineral fragmentation, while there is potential to increase production by decreasing the mill rotational speed, which also decreases energy consumption by 7.62% for all linear age configurations. Considering the performance of machine learning in fitting complex models such as SAG grinding, the application of these tools in the mineral processing industry has the potential to increase the efficiency of these processes, either by improving production indicators or by reducing energy consumption. Finally, the incorporation of these techniques in the aggregate management of processes such as the Mine to Mill paradigm, or the development of models that consider the uncertainty of the explanatory variables, could further increase the performance of productive indicators at the industrial scale.
Affiliation(s)
- Manuel Saldaña
  - Faculty of Engineering and Architecture, Universidad Arturo Prat, Iquique 1110939, Chile
  - Departamento de Ingeniería Química y Procesos de Minerales, Universidad de Antofagasta, Antofagasta 1270300, Chile
- Edelmira Gálvez
  - Department of Metallurgical and Mining Engineering, Universidad Católica del Norte, Av. Angamos 0610, Antofagasta 1270709, Chile
- Alessandro Navarra
  - Department of Mining and Materials Engineering, McGill University, 3610 University Street, Montreal, QC H3A 0C5, Canada
- Norman Toro
  - Faculty of Engineering and Architecture, Universidad Arturo Prat, Iquique 1110939, Chile
- Luis A. Cisternas
  - Departamento de Ingeniería Química y Procesos de Minerales, Universidad de Antofagasta, Antofagasta 1270300, Chile
9
Rihm GB, Schueler M, Nentwich C, Esche E, Repke JU. Adaptation of Dynamic Data-Driven Models for Real-Time Applications: From Simulated to Real Batch Distillation Trajectories by Transfer Learning. Chem Ing Tech 2023. [DOI: 10.1002/cite.202200228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/01/2023]
10
Rizki Z, Ottens M. Model-based optimization approaches for pressure-driven membrane systems. Sep Purif Technol 2023. [DOI: 10.1016/j.seppur.2023.123682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/30/2023]
11
A Review on Artificial Intelligence Enabled Design, Synthesis, and Process Optimization of Chemical Products for Industry 4.0. Processes (Basel) 2023. [DOI: 10.3390/pr11020330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/20/2023] [Open Access]
Abstract
With the development of Industry 4.0, artificial intelligence (AI) is gaining increasing attention for its performance in solving particularly complex problems in industrial chemistry and chemical engineering. Therefore, this review provides an overview of the application of AI techniques, in particular machine learning, in chemical design, synthesis, and process optimization over the past years. In this review, the focus is on the application of AI for structure-function relationship analysis, synthetic route planning, and automated synthesis. Finally, we discuss the challenges and future of AI in making chemical products.
12
Khan N, Ammar Taqvi SA. Machine Learning an Intelligent Approach in Process Industries: A Perspective and Overview. ChemBioEng Reviews 2022. [DOI: 10.1002/cben.202200030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/31/2022]
Affiliation(s)
- Nadia Khan
  - NED University of Engineering & Technology, Polymer and Petrochemical Engineering Department, Karachi, Pakistan
- Syed Ali Ammar Taqvi
  - NED University of Engineering & Technology, Chemical Engineering Department, Karachi, Pakistan
13
Machine Learning with Gradient-Based Optimization of Nuclear Waste Vitrification with Uncertainties and Constraints. Processes (Basel) 2022. [DOI: 10.3390/pr10112365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] [Open Access]
Abstract
Gekko is an optimization suite in Python that solves optimization problems involving mixed-integer, nonlinear, and differential equations. The purpose of this study is to integrate common Machine Learning (ML) algorithms such as Gaussian Process Regression (GPR), support vector regression (SVR), and artificial neural network (ANN) models into Gekko to solve data-based optimization problems. Uncertainty quantification (UQ) is used alongside ML for better decision making. These methods include ensemble methods, model-specific methods, conformal predictions, and the delta method. An optimization problem involving nuclear waste vitrification is presented to demonstrate the benefit of ML in this field. ML models are compared against the current partial quadratic mixture (PQM) model in an optimization problem in Gekko. GPR with conformal uncertainty was chosen as the best substitute model, as it had a lower mean squared error of 0.0025 compared to 0.018 and more confidently predicted a higher waste loading of 37.5 wt% compared to 34 wt%. The example problem shows that these tools can be used in similar industry settings where easier use and better performance are needed over classical approaches. Future work with these tools includes expanding them with other regression models and UQ methods, and exploring other optimization problems or dynamic control.
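As a rough illustration of what a conformal prediction interval involves, the following sketch computes a split-conformal half-width from calibration residuals. It does not use Gekko and is not the study's implementation: the function name `conformal_halfwidth`, the calibration values, and the choice of `alpha` are all made up for the example, and the coverage guarantee assumes exchangeable data.

```python
import math

def conformal_halfwidth(calib_true, calib_pred, alpha=0.1):
    """Half-width q such that [pred - q, pred + q] covers a new target with
    probability at least 1 - alpha, assuming exchangeable data."""
    # Nonconformity scores: absolute residuals on a held-out calibration set.
    scores = sorted(abs(y - p) for y, p in zip(calib_true, calib_pred))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based rank of the quantile
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]

# Made-up calibration targets and model predictions, for illustration only.
y_true = [0.30, 0.32, 0.35, 0.31, 0.36, 0.33, 0.34, 0.29, 0.37]
y_pred = [0.31, 0.30, 0.36, 0.33, 0.35, 0.33, 0.31, 0.30, 0.36]
q = conformal_halfwidth(y_true, y_pred, alpha=0.2)
interval = (0.34 - q, 0.34 + q)  # interval around a hypothetical new prediction
```

The interval width depends only on held-out residuals, so the same wrapper can in principle be placed around any regression model, whether GPR, SVR, or an ANN.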
14
Mora-Mariano D, Flores-Tlacuahuac A. A machine learning approach for the surrogate modeling of uncertain distributed process engineering models. Chem Eng Res Des 2022. [DOI: 10.1016/j.cherd.2022.07.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]