1
Garcia-Escobar F, Taniike T, Takahashi K. MonteCat: A Basin-Hopping-Inspired Catalyst Descriptor Search Algorithm for Machine Learning Models. J Chem Inf Model 2024; 64:1512-1521. [PMID: 38385190 DOI: 10.1021/acs.jcim.3c01952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Proposing relevant catalyst descriptors that relate a catalyst's composition to its actual performance is an ongoing area of research in catalyst informatics, as it is a necessary step toward improving our understanding of the target reactions. Herein, a small descriptor-engineered data set containing 3289 descriptor variables and the performance of 200 catalysts for the oxidative coupling of methane (OCM) is analyzed, and a descriptor search algorithm based on the workflow of the Basin-hopping optimization methodology is proposed to select the descriptors that better fit a predictive model. The algorithm, which can be considered a wrapper method, consists of successively generating random modifications to the descriptor subset used in a regression model and adopting them depending on their effect on the model's score. The results are presented after testing on linear and Support Vector Regression models, with average cross-validation r² scores of 0.8268 and 0.6875, respectively.
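The search loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' MonteCat implementation: the function name `descriptor_search`, the single-descriptor toggle move, the Metropolis-style acceptance rule with a `temperature` parameter, and the `toy_score` stand-in for a cross-validated model score are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, FrozenSet, Tuple


def descriptor_search(
    n_descriptors: int,
    score_fn: Callable[[FrozenSet[int]], float],
    n_steps: int = 500,
    temperature: float = 0.01,
    seed: int = 0,
) -> Tuple[FrozenSet[int], float]:
    """Randomly modify a descriptor subset and keep changes based on the score.

    Assumption: higher scores are better, and worse subsets are accepted with
    probability exp(delta / temperature), as in a Metropolis criterion.
    """
    rng = random.Random(seed)
    current = frozenset([rng.randrange(n_descriptors)])
    current_score = score_fn(current)
    best, best_score = current, current_score

    for _ in range(n_steps):
        # Random modification: toggle one descriptor in or out of the subset,
        # never removing the last remaining descriptor.
        candidate = set(current)
        j = rng.randrange(n_descriptors)
        if j in candidate and len(candidate) > 1:
            candidate.remove(j)
        else:
            candidate.add(j)
        candidate = frozenset(candidate)

        candidate_score = score_fn(candidate)
        delta = candidate_score - current_score
        # Adopt the modification depending on its effect on the score.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score

    return best, best_score


# Hypothetical score: rewards three "relevant" descriptors and penalizes
# every other descriptor in the subset. A real wrapper method would return
# a cross-validated regression score here instead.
RELEVANT = frozenset({2, 5, 7})


def toy_score(subset: FrozenSet[int]) -> float:
    return len(subset & RELEVANT) - 0.2 * len(subset - RELEVANT)


best_subset, best_value = descriptor_search(n_descriptors=20, score_fn=toy_score)
print(sorted(best_subset), best_value)
```

Passing the score as a callable keeps the search independent of the regression model, which is the defining property of a wrapper approach: the same loop could score subsets with a linear model or a Support Vector Regression model.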
Affiliation(s)
- Toshiaki Taniike
- Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan
- Keisuke Takahashi
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo 060-8510, Japan
2
Taniike T, Fujiwara A, Nakanowatari S, García-Escobar F, Takahashi K. Automatic feature engineering for catalyst design using small data without prior knowledge of target catalysis. Commun Chem 2024; 7:11. [PMID: 38216711 PMCID: PMC10786848 DOI: 10.1038/s42004-023-01086-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 12/08/2023] [Indexed: 01/14/2024] Open
Abstract
The empirical aspect of descriptor design in catalyst informatics, particularly when confronted with limited data, necessitates adequate prior knowledge for delving into unknown territories, thus presenting a logical contradiction. This study introduces a technique for automatic feature engineering (AFE) that works on small catalyst datasets, without reliance on specific assumptions or pre-existing knowledge about the target catalysis when designing descriptors and building machine-learning models. This technique generates numerous features through mathematical operations on general physicochemical features of catalytic components and extracts relevant features for the desired catalysis, essentially screening numerous hypotheses on a machine. AFE yields reasonable regression results for three types of heterogeneous catalysis: oxidative coupling of methane (OCM), conversion of ethanol to butadiene, and three-way catalysis, where only the training set is swapped. Moreover, through the application of active learning that combines AFE and high-throughput experimentation for OCM, we successfully visualize the machine's process of acquiring precise recognition of the catalyst design. Thus, AFE is a versatile technique for data-driven catalysis research and a key step towards fully automated catalyst discoveries.
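The generate-then-extract idea in this abstract can be illustrated with a small sketch. It is not the AFE method itself: the helper names (`generate_features`, `pearson`, `select_top`), the particular set of mathematical operations, and the use of absolute Pearson correlation as the relevance criterion are assumptions chosen to keep the example short.

```python
import math
from itertools import combinations
from typing import Dict, List


def generate_features(base: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Expand base features with simple unary transforms and pairwise products."""
    out = dict(base)
    for name, col in base.items():
        out[f"{name}^2"] = [v * v for v in col]
        out[f"sqrt|{name}|"] = [math.sqrt(abs(v)) for v in col]
    for (a, col_a), (b, col_b) in combinations(base.items(), 2):
        out[f"{a}*{b}"] = [x * y for x, y in zip(col_a, col_b)]
    return out


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_top(features: Dict[str, List[float]], target: List[float], k: int) -> List[str]:
    """Rank generated features by |correlation| with the target and keep k."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical data: two base features and a target equal to their product.
base = {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [2.0, 1.0, 4.0, 3.0, 6.0]}
target = [x * y for x, y in zip(base["a"], base["b"])]

candidates = generate_features(base)
print(select_top(candidates, target, k=3))
```

Each generated column acts as one hypothesis about what drives the target, so ranking the columns amounts to screening those hypotheses against the data. A univariate correlation filter is only the simplest possible selector and is used here as a placeholder.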
Affiliation(s)
- Toshiaki Taniike
- Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan
- Aya Fujiwara
- Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan
- Sunao Nakanowatari
- Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan
- Keisuke Takahashi
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo, 060-0810, Japan
3
Takahashi K, Takahashi L. Toward the Golden Age of Materials Informatics: Perspective and Opportunities. J Phys Chem Lett 2023; 14:4726-4733. [PMID: 37172318 DOI: 10.1021/acs.jpclett.3c00648] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/14/2023]
Abstract
Materials informatics is reaching the transition point and is evolving from its early stages of adoption and development and moving toward its golden age. Here, the transformation of the early stage of materials informatics toward the next level of materials informatics is explored. In particular, it has become crucial to be able to manipulate materials synthesis data, materials properties data, and materials characterization data. Through the use of ontology, material design and understanding can be carried out simultaneously in a whitebox manner. Here, a perspective on the ultimate goal of materials informatics along with potential key components is discussed.
Affiliation(s)
- Keisuke Takahashi
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo 060-0810, Japan
- Lauren Takahashi
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo 060-0810, Japan
4
Rossi K. What do we talk about, when we talk about single-crystal termination-dependent selectivity of Cu electrocatalysts for CO2 reduction? A data-driven retrospective. Phys Chem Chem Phys 2023; 25:6867-6876. [PMID: 36799456 DOI: 10.1039/d2cp04576a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
We mine from the literature experimental data on the CO2 electrochemical reduction selectivity of Cu single-crystal surfaces. We then probe the accuracy of a machine learning model trained to predict faradaic efficiencies for 11 CO2 reduction reaction products as a function of the applied voltage at which the reaction takes place and the relative amounts of non-equivalent surface sites, distinguished according to their nominal coordination. A satisfactory model accuracy is found only when discriminating data according to their provenance. On one hand, this result points to a qualitative agreement across reported experimental CO2 reduction reaction trends for single-crystal surfaces with well-defined terminations. On the other, this finding hints at the presence of differences in nominally identical catalysts and/or CO2 reduction reaction measurements, which result in quantitative disagreement between experiments.
Affiliation(s)
- Kevin Rossi
- Institut des sciences et ingénierie chimiques, École Polytechnique Fédérale de Lausanne, 1950 Sion, Switzerland
5
Takahashi K, Ohyama J, Nishimura S, Fujima J, Takahashi L, Uno T, Taniike T. Catalysts informatics: paradigm shift towards data-driven catalyst design. Chem Commun (Camb) 2023; 59:2222-2238. [PMID: 36723221 DOI: 10.1039/d2cc05938j] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 01/19/2023]
Abstract
Designing catalysts is a challenging matter as catalysts are involved with various factors that impact synthesis, catalysts, reactor and reaction. In order to overcome these difficulties, catalysts informatics is proposed as an alternative way to design and understand catalysts. The underlying concept of catalysts informatics is to design the catalysts from trends and patterns found in catalysts data. Here, three key concepts are introduced: experimental catalysts database, knowledge extraction from catalyst data via data science, and a catalysts informatics platform. Methane oxidation is chosen as a prototype reaction for demonstrating various aspects of catalysts informatics. This work summarizes how catalysts informatics plays a role in catalyst design. The work covers big data generation via high throughput experiments, machine learning, catalysts network method, catalyst design from small data, catalysts informatics platform, and the future of catalysts informatics via ontology. Thus, the proposed catalysts informatics would help innovate how catalysts can be designed and understood.
Affiliation(s)
- Keisuke Takahashi
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo 060-0810, Japan
- Junya Ohyama
- Faculty of Advanced Science and Technology, Kumamoto University, 2-39-1 Kurokami, Chuo-ku, 860-8555, Japan
- Shun Nishimura
- Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan
- Jun Fujima
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo 060-0810, Japan
- Lauren Takahashi
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo 060-0810, Japan
- Takeaki Uno
- National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430, Japan
- Toshiaki Taniike
- Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan
6
Takahashi K, Takahashi L, Le SD, Kinoshita T, Nishimura S, Ohyama J. Synthesis of Heterogeneous Catalysts in Catalyst Informatics to Bridge Experiment and High-Throughput Calculation. J Am Chem Soc 2022; 144:15735-15744. [PMID: 35984913 DOI: 10.1021/jacs.2c06143] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
Abstract
The coupling of high-throughput calculations with catalyst informatics is proposed as an alternative way to design heterogeneous catalysts. High-throughput first-principles calculations for the oxidative coupling of methane (OCM) reaction are designed and performed where 1972 catalyst surface planes for the CH4 to CH3 reaction are calculated. Several catalysts for the OCM reaction are designed based on key elements that are unveiled via data visualization and network analysis. Among the designed catalysts, several active catalysts such as CoAg/TiO2, Mg/BaO, and Ti/BaO are found to result in high C2 yield. Results illustrate that designing catalysts using high-throughput calculations is achievable in principle if appropriate trends and patterns within the data generated via high-throughput calculations are identified. Thus, high-throughput calculations in combination with catalyst informatics offer a potential alternative method for catalyst design.
Affiliation(s)
- Keisuke Takahashi
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo 060-8510, Japan
- Lauren Takahashi
- Department of Chemistry, Hokkaido University, North 10, West 8, Sapporo 060-8510, Japan
- Son Dinh Le
- Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi 923-1292, Japan
- Takaaki Kinoshita
- Graduate School of Science and Technology, Kumamoto University, 2-39-1 Kurokami, Chuo-ku, Kumamoto 860-8555, Japan
- Shun Nishimura
- Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi 923-1292, Japan
- Junya Ohyama
- Faculty of Advanced Science and Technology, Kumamoto University, 2-39-1 Kurokami, Chuo-ku, Kumamoto 860-8555, Japan
7
Takimoto K, Takeuchi K, Ton NNT, Taniike T. Exploring stabilizer formulations for light-induced yellowing of polystyrene by high-throughput experimentation and machine learning. Polym Degrad Stab 2022. [DOI: 10.1016/j.polymdegradstab.2022.109967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
8
Sakaushi K, Watanabe A, Kumeda T, Shibuta Y. Fast-Decoding Algorithm for Electrode Processes at Electrified Interfaces by Mean-Field Kinetic Model and Bayesian Data Assimilation: An Active-Data-Mining Approach for the Efficient Search and Discovery of Electrocatalysts. ACS Appl Mater Interfaces 2022; 14:22889-22902. [PMID: 35135188 DOI: 10.1021/acsami.1c21038] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/14/2023]
Abstract
The microscopic origins of the activity and selectivity of electrocatalysts have been a long-lasting enigma since the 19th century. By applying an active-data-mining approach that employs a mean-field kinetic model and a statistical approach of Bayesian data assimilation, we demonstrate here a fast decoding to extract key properties in the kinetics of complicated electrode processes from current-potential profiles in experimental and literature data. As a proof of concept, kinetic parameters of the four-electron oxygen reduction reaction in 0.1 M HClO4 solution (ORR: O2 + 4e- + 4H+ → 2H2O) on various platinum-based single-crystal electrocatalysts are extracted from our own experiments and third-party literature to investigate the microscopic electrode processes. Furthermore, data assimilation of the mean-field ORR model and experimental data is performed based on Bayesian inference for the inductive estimation of kinetic parameters, which sheds light on the dynamic behavior of kinetic parameters with respect to overpotential. This work shows that a fast-decoding algorithm based on a mean-field kinetic model and Bayesian data assimilation is a promising data-driven approach to extract key microscopic features of complicated electrode processes and therefore will be an important method toward building up advanced human-machine collaborations for the efficient search and discovery of high-performance electrochemical materials.
Affiliation(s)
- Ken Sakaushi
- Center for Green Research on Energy and Environmental Materials, National Institute for Materials Science, 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan
- Aoi Watanabe
- Department of Materials Engineering, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Tomoaki Kumeda
- Center for Green Research on Energy and Environmental Materials, National Institute for Materials Science, 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan
- Yasushi Shibuta
- Department of Materials Engineering, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
9
Nishimura S, Le SD, Miyazato I, Fujima J, Taniike T, Ohyama J, Takahashi K. High-Throughput Screening and Literature Data Driven Machine Learning Assisting Investigation of Multi-component La2O3-based Catalysts for Oxidative Coupling of Methane. Catal Sci Technol 2022. [DOI: 10.1039/d1cy02206g] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
Multi-component La2O3-based catalysts for oxidative coupling of methane (OCM) were designed based on high-throughput screening (HTS) and literature datasets with multi-output machine learning (ML) approaches including random forest regression (RFR),...