1. Chen X, Lu S, Chen Q, Zhou Q, Wang J. From bulk effective mass to 2D carrier mobility accurate prediction via adversarial transfer learning. Nat Commun 2024; 15:5391. [PMID: 38918387 PMCID: PMC11199574 DOI: 10.1038/s41467-024-49686-z] [Received: 07/18/2023] [Accepted: 06/10/2024] [Indexed: 06/27/2024]
Abstract
Data scarcity is one of the critical bottlenecks to utilizing machine learning in material discovery. Transfer learning can use existing big data to assist property prediction on small data sets, but the premise is that there must be a strong correlation between large and small data sets. To extend its applicability in scenarios with different properties and materials, here we develop a hybrid framework combining adversarial transfer learning and expert knowledge, which enables the direct prediction of carrier mobility of two-dimensional (2D) materials using the knowledge learned from bulk effective mass. Specifically, adversarial training ensures that only common knowledge between bulk and 2D materials is extracted while expert knowledge is incorporated to further improve the prediction accuracy and generalizability. Successfully, 2D carrier mobilities are predicted with the accuracy over 90% from only crystal structure, and 21 2D semiconductors with carrier mobilities far exceeding silicon and suitable bandgap are successfully screened out. This work enables transfer learning in simultaneous cross-property and cross-material scenarios, providing an effective tool to predict intricate material properties with limited data.
Affiliation(s)
- Xinyu Chen
  - Key Laboratory of Quantum Materials and Devices of Ministry of Education, School of Physics, Southeast University, Nanjing, China
- Shuaihua Lu
  - Key Laboratory of Quantum Materials and Devices of Ministry of Education, School of Physics, Southeast University, Nanjing, China
- Qian Chen
  - Key Laboratory of Quantum Materials and Devices of Ministry of Education, School of Physics, Southeast University, Nanjing, China
- Qionghua Zhou
  - Key Laboratory of Quantum Materials and Devices of Ministry of Education, School of Physics, Southeast University, Nanjing, China
  - Suzhou Laboratory, Suzhou, China
- Jinlan Wang
  - Key Laboratory of Quantum Materials and Devices of Ministry of Education, School of Physics, Southeast University, Nanjing, China
  - Suzhou Laboratory, Suzhou, China
2. Fu N, Wei L, Hu J. Physics-Guided Dual Self-Supervised Learning for Structure-Based Material Property Prediction. J Phys Chem Lett 2024:2841-2850. [PMID: 38442260 DOI: 10.1021/acs.jpclett.4c00100] [Indexed: 03/07/2024]
Abstract
Deep learning models have been widely used for high-performance material property prediction. However, training such models usually requires a large amount of labeled data, which are usually unavailable. Self-supervised learning (SSL) methods have been proposed to address this data scarcity issue. Herein, we present DSSL, a physics-guided dual SSL framework, for graph neural network-based material property prediction, which combines node masking-based generative SSL with atomic coordinate perturbation-based contrastive SSL strategies to capture local and global information about input crystals. Moreover, we achieve physics-guided pretraining by using the macroproperty (e.g., elasticity)-related microproperty prediction of atomic stiffness as an additional pretext task. We pretrain our DSSL model on the Materials Project database and fine-tune it with 10 material property data sets. The experimental results demonstrate that teaching neural networks some physics using the SSL strategy can afford ≤26.89% performance improvement compared to that of the baseline models.
Affiliation(s)
- Nihang Fu
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29201, United States
- Lai Wei
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29201, United States
- Jianjun Hu
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29201, United States
3. Garrison A, Heras-Domingo J, Kitchin JR, dos Passos Gomes G, Ulissi ZW, Blau SM. Applying Large Graph Neural Networks to Predict Transition Metal Complex Energies Using the tmQM_wB97MV Data Set. J Chem Inf Model 2023; 63:7642-7654. [PMID: 38049389 PMCID: PMC10751796 DOI: 10.1021/acs.jcim.3c01226] [Received: 08/03/2023] [Revised: 11/08/2023] [Accepted: 11/20/2023] [Indexed: 12/06/2023]
Abstract
Machine learning (ML) methods have shown promise for discovering novel catalysts but are often restricted to specific chemical domains. Generalizable ML models require large and diverse training data sets, which exist for heterogeneous catalysis but not for homogeneous catalysis. The tmQM data set, which contains properties of 86,665 transition metal complexes calculated at the TPSSh/def2-SVP level of density functional theory (DFT), provided a promising training data set for homogeneous catalyst systems. However, we find that ML models trained on tmQM consistently underpredict the energies of a chemically distinct subset of the data. To address this, we present the tmQM_wB97MV data set, which filters out several structures in tmQM found to be missing hydrogens and recomputes the energies of all other structures at the ωB97M-V/def2-SVPD level of DFT. ML models trained on tmQM_wB97MV show no pattern of consistently incorrect predictions and much lower errors than those trained on tmQM. The ML models tested on tmQM_wB97MV were, from best to worst, GemNet-T > PaiNN ≈ SpinConv > SchNet. Performance consistently improves when using only neutral structures instead of the entire data set. However, while models saturate with only neutral structures, more data continue to improve the models when including charged species, indicating the importance of accurately capturing a range of oxidation states in future data generation and model development. Furthermore, a fine-tuning approach in which weights were initialized from models trained on OC20 led to drastic improvements in model performance, indicating transferability between ML strategies of heterogeneous and homogeneous systems.
Affiliation(s)
- Aaron G. Garrison
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Javier Heras-Domingo
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- John R. Kitchin
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Gabriel dos Passos Gomes
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Wilton E. Scott Institute for Energy Innovation, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Zachary W. Ulissi
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Wilton E. Scott Institute for Energy Innovation, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Samuel M. Blau
  - Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
4. Pablo-García S, Morandi S, Vargas-Hernández RA, Jorner K, Ivković Ž, López N, Aspuru-Guzik A. Fast evaluation of the adsorption energy of organic molecules on metals via graph neural networks. Nat Comput Sci 2023; 3:433-442. [PMID: 38177837 PMCID: PMC10766545 DOI: 10.1038/s43588-023-00437-y] [Received: 10/09/2022] [Accepted: 03/23/2023] [Indexed: 01/06/2024]
Abstract
Modeling in heterogeneous catalysis requires the extensive evaluation of the energy of molecules adsorbed on surfaces. This is done via density functional theory but for large organic molecules it requires enormous computational time, compromising the viability of the approach. Here we present GAME-Net, a graph neural network to quickly evaluate the adsorption energy. GAME-Net is trained on a well-balanced chemically diverse dataset with C1-4 molecules with functional groups including N, O, S and C6-10 aromatic rings. The model yields a mean absolute error of 0.18 eV on the test set and is 6 orders of magnitude faster than density functional theory. Applied to biomass and plastics (up to 30 heteroatoms), adsorption energies are predicted with a mean absolute error of 0.016 eV per atom. The framework represents a tool for the fast screening of catalytic materials, particularly for systems that cannot be simulated by traditional methods.
Affiliation(s)
- Sergio Pablo-García
  - Institute of Chemical Research of Catalonia, The Barcelona Institute of Science and Technology, Tarragona, Spain
  - Department of Chemistry, University of Toronto, Lash Miller Chemical Laboratories, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Sandford Fleming Building, Toronto, Ontario, Canada
- Santiago Morandi
  - Institute of Chemical Research of Catalonia, The Barcelona Institute of Science and Technology, Tarragona, Spain
  - Department of Physical and Inorganic Chemistry, Universitat Rovira i Virgili, Tarragona, Spain
- Rodrigo A Vargas-Hernández
  - Department of Chemistry, University of Toronto, Lash Miller Chemical Laboratories, Toronto, Ontario, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Kjell Jorner
  - Department of Chemistry, University of Toronto, Lash Miller Chemical Laboratories, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Sandford Fleming Building, Toronto, Ontario, Canada
  - Department of Chemistry and Chemical Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Žarko Ivković
  - Institute of Chemical Research of Catalonia, The Barcelona Institute of Science and Technology, Tarragona, Spain
- Núria López
  - Institute of Chemical Research of Catalonia, The Barcelona Institute of Science and Technology, Tarragona, Spain
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, Lash Miller Chemical Laboratories, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Sandford Fleming Building, Toronto, Ontario, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
  - Department of Materials Science and Engineering, University of Toronto, Toronto, Ontario, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Toronto, Ontario, Canada
  - Acceleration Consortium, University of Toronto, Toronto, Ontario, Canada
5. Tran R, Lan J, Shuaibi M, Wood BM, Goyal S, Das A, Heras-Domingo J, Kolluru A, Rizvi A, Shoghi N, Sriram A, Therrien F, Abed J, Voznyy O, Sargent EH, Ulissi Z, Zitnick CL. The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts. ACS Catal 2023. [DOI: 10.1021/acscatal.2c05426] [Indexed: 02/18/2023]
Affiliation(s)
- Richard Tran
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15217, United States
- Janice Lan
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
- Muhammed Shuaibi
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15217, United States
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
- Brandon M. Wood
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
- Siddharth Goyal
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
- Abhishek Das
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
- Javier Heras-Domingo
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15217, United States
- Adeesh Kolluru
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15217, United States
- Ammar Rizvi
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
- Nima Shoghi
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
- Anuroop Sriram
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
- Félix Therrien
  - Department of Electrical and Computer Engineering, University of Toronto, 10 King’s College Road, Toronto, Ontario M5S 3G4, Canada
  - Department of Physical and Environmental Sciences, University of Toronto Scarborough, Scarborough, Ontario M1C 1A4, Canada
- Jehad Abed
  - Department of Electrical and Computer Engineering, University of Toronto, 10 King’s College Road, Toronto, Ontario M5S 3G4, Canada
  - Department of Materials Science and Engineering, University of Toronto, 10 King’s College Road, Toronto, Ontario M5S 3G4, Canada
- Oleksandr Voznyy
  - Department of Physical and Environmental Sciences, University of Toronto Scarborough, Scarborough, Ontario M1C 1A4, Canada
- Edward H. Sargent
  - Department of Electrical and Computer Engineering, University of Toronto, 10 King’s College Road, Toronto, Ontario M5S 3G4, Canada
- Zachary Ulissi
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15217, United States
  - Scott Institute for Energy Innovation, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- C. Lawrence Zitnick
  - Fundamental AI Research, Meta AI, Menlo Park, California 94025, United States
6. Ess DH, Jelfs KE, Kulik HJ. Chemical design by artificial intelligence. J Chem Phys 2022; 157:120401. [PMID: 36182437 DOI: 10.1063/5.0123281] [Indexed: 11/14/2022]
Affiliation(s)
- Daniel H Ess
  - Department of Chemistry and Biochemistry, Brigham Young University, Provo, Utah 84604, USA
- Kim E Jelfs
  - Department of Chemistry, Molecular Sciences Research Hub, 82 Wood Lane, White City Campus, Imperial College London, London W12 0BZ, United Kingdom
- Heather J Kulik
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
7. Kolluru A, Shuaibi M, Palizhati A, Shoghi N, Das A, Wood B, Zitnick CL, Kitchin JR, Ulissi ZW. Open Challenges in Developing Generalizable Large-Scale Machine-Learning Models for Catalyst Discovery. ACS Catal 2022. [DOI: 10.1021/acscatal.2c02291] [Indexed: 12/14/2022]
Affiliation(s)
- Adeesh Kolluru
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Muhammed Shuaibi
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Aini Palizhati
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Nima Shoghi
  - Fundamental AI Research at Meta AI, Menlo Park, California 94025, United States
- Abhishek Das
  - Fundamental AI Research at Meta AI, Menlo Park, California 94025, United States
- Brandon Wood
  - Fundamental AI Research at Meta AI, Menlo Park, California 94025, United States
- C. Lawrence Zitnick
  - Fundamental AI Research at Meta AI, Menlo Park, California 94025, United States
- John R. Kitchin
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Zachary W. Ulissi
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
8. Wander B, Broderick K, Ulissi ZW. Catlas: an automated framework for catalyst discovery demonstrated for direct syngas conversion. Catal Sci Technol 2022. [DOI: 10.1039/d2cy01267g] [Indexed: 11/21/2022]
Abstract
Catlas may be used with off-the-shelf pretrained models to explore large design spaces for catalyst discovery and has been used here to identify promising materials for the direct conversion of syngas to multi-carbon oxygenates.
Affiliation(s)
- Brook Wander
  - Department of Chemical Engineering, Carnegie Mellon University, USA
- Kirby Broderick
  - Department of Chemical Engineering, Carnegie Mellon University, USA
- Zachary W. Ulissi
  - Department of Chemical Engineering, Carnegie Mellon University, USA
  - Scott Institute for Energy Innovation, Carnegie Mellon University, USA