Kayode G, Montemore MM. Latent Variable Machine Learning Framework for Catalysis: General Models, Transfer Learning, and Interpretability. JACS Au 2024; 4:80-91. [PMID: 38274257 PMCID: PMC10807004 DOI: 10.1021/jacsau.3c00419]
[Received: 07/28/2023] [Revised: 11/30/2023] [Accepted: 12/01/2023] [Indexed: 01/27/2024]
Abstract
Machine learning has been successfully applied in recent years to screen materials for a variety of applications. However, despite recent advances, most screening-based machine learning approaches are limited in generality and transferability, requiring new models to be created from scratch for each new application. This is particularly apparent in catalysis, where there are many possible intermediates and transition states of interest in addition to a large number of potential catalytic materials. In this work, we developed a new machine learning framework that is built on chemical principles and allows the creation of general, interpretable, reusable models. Our new architecture uses latent variables to create a set of submodels that each take on a relatively simple learning task, leading to higher data efficiency and promoting transfer learning. This architecture infuses fundamental chemical principles, such as the existence of elements as discrete entities. We show that this architecture allows for the creation of models that can be reused for many different applications, providing significant improvements in efficiency and convenience. For example, our architecture allows simultaneous prediction of adsorption energies for many adsorbates on a broad array of alloy surfaces with mean absolute errors (MAEs) around 0.20-0.25 eV. The integration of latent variables provides physical interpretability, as predictions can be explained in terms of the learned chemical environment as represented by the latent space. Further, these latent variables also serve as new feature representations, allowing efficient transfer learning. For example, new models with useful levels of accuracy can be created with fewer than 10 data points, including transfer learning to an experimental data set with an MAE less than 0.15 eV. Lastly, we show that our new machine learning architecture is general and robust enough to handle heterogeneous and multifidelity data sets, allowing researchers to leverage existing data sets to speed up screening using their own computational setup.