51
|
Machine learning to empower electrohydrodynamic processing. MATERIALS SCIENCE & ENGINEERING. C, MATERIALS FOR BIOLOGICAL APPLICATIONS 2022; 132:112553. [DOI: 10.1016/j.msec.2021.112553] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/08/2021] [Revised: 11/09/2021] [Accepted: 11/11/2021] [Indexed: 01/13/2023]
|
52
|
Li S, Liu Y, Chen D, Jiang Y, Nie Z, Pan F. Encoding the atomic structure for machine learning in materials science. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1558] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022]
Affiliation(s)
- Shunning Li
- School of Advanced Materials Peking University, Shenzhen Graduate School Shenzhen China
| | - Yuanji Liu
- School of Advanced Materials Peking University, Shenzhen Graduate School Shenzhen China
| | - Dong Chen
- School of Advanced Materials Peking University, Shenzhen Graduate School Shenzhen China
| | - Yi Jiang
- School of Advanced Materials Peking University, Shenzhen Graduate School Shenzhen China
| | - Zhiwei Nie
- School of Advanced Materials Peking University, Shenzhen Graduate School Shenzhen China
| | - Feng Pan
- School of Advanced Materials Peking University, Shenzhen Graduate School Shenzhen China
|
53
|
Schmidt J, Pettersson L, Verdozzi C, Botti S, Marques MAL. Crystal graph attention networks for the prediction of stable materials. SCIENCE ADVANCES 2021; 7:eabi7948. [PMID: 34860548 PMCID: PMC8641929 DOI: 10.1126/sciadv.abi7948] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/31/2021] [Accepted: 10/14/2021] [Indexed: 05/27/2023]
Abstract
Graph neural networks for crystal structures typically use the atomic positions and the atomic species as input. Unfortunately, this information is not available when predicting new materials, for which the precise geometrical information is unknown. We circumvent this problem by replacing the precise bond distances with embeddings of graph distances. This allows our networks to be applied directly in high-throughput studies based on both composition and crystal structure prototype without using relaxed structures as input. To train these networks, we curate a dataset of over 2 million density functional calculations of crystals with consistent calculation parameters. We apply the resulting model to the high-throughput search of 15 million tetragonal perovskites of composition ABCD2. As a result, we identify several thousand potentially stable compounds and demonstrate that transfer learning from the newly curated dataset reduces the required training data by 50%.
Affiliation(s)
- Jonathan Schmidt
- Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, 06120 Halle (Saale), Germany
| | - Love Pettersson
- Department of Physics, Lund University Box 118, 221 00 Lund, Sweden
| | - Claudio Verdozzi
- Department of Physics, Lund University Box 118, 221 00 Lund, Sweden
| | - Silvana Botti
- Institut für Festkörpertheorie und Optik and European Theoretical Spectroscopy Facility, Friedrich-Schiller-Universität Jena, D-07743 Jena, Germany
| | - Miguel A. L. Marques
- Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, 06120 Halle (Saale), Germany
|
54
|
Gupta V, Choudhary K, Tavazza F, Campbell C, Liao WK, Choudhary A, Agrawal A. Cross-property deep transfer learning framework for enhanced predictive analytics on small materials data. Nat Commun 2021; 12:6595. [PMID: 34782631 PMCID: PMC8594437 DOI: 10.1038/s41467-021-26921-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/17/2021] [Accepted: 10/28/2021] [Indexed: 11/30/2022] Open
Abstract
Artificial intelligence (AI) and machine learning (ML) have been increasingly used in materials science to build predictive models and accelerate discovery. For selected properties, availability of large databases has also facilitated application of deep learning (DL) and transfer learning (TL). However, unavailability of large datasets for a majority of properties prohibits widespread application of DL/TL. We present a cross-property deep-transfer-learning framework that leverages models trained on large datasets to build models on small datasets of different properties. We test the proposed framework on 39 computational and two experimental datasets and find that the TL models with only elemental fractions as input outperform ML/DL models trained from scratch even when they are allowed to use physical attributes as input, for 27/39 (≈ 69%) computational and both the experimental datasets. We believe that the proposed framework can be widely useful to tackle the small data challenge in applying AI/ML in materials science.
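The "elemental fractions" input mentioned in this abstract can be illustrated with a minimal Python sketch that normalizes a composition into per-element fractions. The function name and the example composition are hypothetical and are not taken from the paper; the authors' actual featurization may differ.

```python
def elemental_fractions(composition):
    """Convert element counts (e.g. {"Fe": 2, "O": 3}) into fractions that sum to 1."""
    total = sum(composition.values())
    if total <= 0:
        raise ValueError("composition must contain a positive total amount")
    # Each fraction is the element's count divided by the total count.
    return {element: count / total for element, count in composition.items()}


# Hypothetical example composition, for illustration only.
fractions = elemental_fractions({"Fe": 2, "O": 3})
print(fractions)
```

A vector of such fractions, ordered over a fixed list of elements, is one simple way to present a composition to a model without any structural information.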
Affiliation(s)
- Vishu Gupta
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA
| | - Kamal Choudhary
- Materials Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
- Theiss Research, La Jolla, CA, 92037, USA
| | - Francesca Tavazza
- Materials Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
| | - Carelyn Campbell
- Materials Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
| | - Wei-Keng Liao
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA
| | - Alok Choudhary
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA
| | - Ankit Agrawal
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA.
|
55
|
Huang L, Ling C. Leveraging Transfer Learning and Chemical Principles toward Interpretable Materials Properties. J Chem Inf Model 2021; 61:4200-4209. [PMID: 34435765 DOI: 10.1021/acs.jcim.1c00434] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/12/2022]
Abstract
Machine learning is emerging as a new paradigm to rationalize chemical properties for deepening our understanding of chemistry and providing instructive clues on better materials performance. While the complex architecture of machine learning contributes to unprecedented capability in this task, it prevents easy interpretation, leading to extensive criticisms on the lack of physical foundations for the black-box like models. Here, we demonstrate a transfer learning strategy that leverages fundamental principles of chemistry to offer adequate physical insights for the interpretation. Through interpreting the models for the formation energies of inorganic compounds, the proposed strategy revealed the deficiency of deep neural network in handling interelemental patterns and proved the more proper abstraction of recurrent neural network with attention mechanism, which led to predicting the elegant form of periodic table with high precision. The success demonstrates a new solution toward models with full transparency in materials informatics.
Affiliation(s)
- Liyuan Huang
- Toyota Research Institute of North America, 1555 Woodridge Avenue, Ann Arbor, Michigan United States, 48105
| | - Chen Ling
- Toyota Research Institute of North America, 1555 Woodridge Avenue, Ann Arbor, Michigan United States, 48105
|
56
|
Vasylenko A, Gamon J, Duff BB, Gusev VV, Daniels LM, Zanella M, Shin JF, Sharp PM, Morscher A, Chen R, Neale AR, Hardwick LJ, Claridge JB, Blanc F, Gaultois MW, Dyer MS, Rosseinsky MJ. Element selection for crystalline inorganic solid discovery guided by unsupervised machine learning of experimentally explored chemistry. Nat Commun 2021; 12:5561. [PMID: 34548485 PMCID: PMC8455628 DOI: 10.1038/s41467-021-25343-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/14/2021] [Accepted: 08/04/2021] [Indexed: 02/08/2023] Open
Abstract
The selection of the elements to combine delimits the possible outcomes of synthetic chemistry because it determines the range of compositions and structures, and thus properties, that can arise. For example, in the solid state, the elemental components of a phase field will determine the likelihood of finding a new crystalline material. Researchers make these choices based on their understanding of chemical structure and bonding. Extensive data are available on those element combinations that produce synthetically isolable materials, but it is difficult to assimilate the scale of this information to guide selection from the diversity of potential new chemistries. Here, we show that unsupervised machine learning captures the complex patterns of similarity between element combinations that afford reported crystalline inorganic materials. This model guides prioritisation of quaternary phase fields containing two anions for synthetic exploration to identify lithium solid electrolytes in a collaborative workflow that leads to the discovery of Li3.3SnS3.3Cl0.7. The interstitial site occupancy combination in this defect stuffed wurtzite enables a low-barrier ion transport pathway in hexagonal close-packing.
Affiliation(s)
| | - Jacinthe Gamon
- Department of Chemistry, University of Liverpool, Liverpool, UK
| | - Benjamin B Duff
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Stephenson Institute for Renewable Energy, University of Liverpool, Liverpool, UK
| | - Vladimir V Gusev
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
| | - Luke M Daniels
- Department of Chemistry, University of Liverpool, Liverpool, UK
| | - Marco Zanella
- Department of Chemistry, University of Liverpool, Liverpool, UK
| | - J Felix Shin
- Department of Chemistry, University of Liverpool, Liverpool, UK
| | - Paul M Sharp
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
| | | | - Ruiyong Chen
- Department of Chemistry, University of Liverpool, Liverpool, UK
| | - Alex R Neale
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Stephenson Institute for Renewable Energy, University of Liverpool, Liverpool, UK
| | - Laurence J Hardwick
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Stephenson Institute for Renewable Energy, University of Liverpool, Liverpool, UK
| | - John B Claridge
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
| | - Frédéric Blanc
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Stephenson Institute for Renewable Energy, University of Liverpool, Liverpool, UK
- Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
| | - Michael W Gaultois
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
| | - Matthew S Dyer
- Department of Chemistry, University of Liverpool, Liverpool, UK
- Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
| | - Matthew J Rosseinsky
- Department of Chemistry, University of Liverpool, Liverpool, UK.
- Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK.
|
57
|
Abstract
In materials science, crystal structures are the cornerstone in the structure–property paradigm. The description of crystal compounds may be ascribed to the number of different atomic chemical environments, which are related to the Wyckoff sites. Hence, a set of features related to the different atomic environments in a crystal compound can be constructed as input data for artificial neural networks (ANNs). In this article, we show the performance of a series of ANNs developed using crystal-site-based features. These ANNs were developed to classify compounds into halite, garnet, fluorite, hexagonal perovskite, ilmenite, layered perovskite, -o-tp- perovskite, perovskite, and spinel structures. Using crystal-site-based features, the ANNs were able to classify the crystal compounds with a 93.72% average precision. Furthermore, the ANNs were able to retrieve missing compounds with one of these archetypical structure types from a database. Finally, we showed that the developed ANNs were also suitable for a multitask learning paradigm, since the extracted information in the hidden layers linearly correlated with lattice parameters of the crystal structures.
|
58
|
Sifain AE, Lystrom L, Messerly RA, Smith JS, Nebgen B, Barros K, Tretiak S, Lubbers N, Gifford BJ. Predicting phosphorescence energies and inferring wavefunction localization with machine learning. Chem Sci 2021; 12:10207-10217. [PMID: 34447529 PMCID: PMC8336587 DOI: 10.1039/d1sc02136b] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/16/2021] [Accepted: 06/28/2021] [Indexed: 11/29/2022] Open
Abstract
Phosphorescence is commonly utilized for applications including light-emitting diodes and photovoltaics. Machine learning (ML) approaches trained on ab initio datasets of singlet-triplet energy gaps may expedite the discovery of phosphorescent compounds with the desired emission energies. However, we show that standard ML approaches for modeling potential energy surfaces inaccurately predict singlet-triplet energy gaps due to the failure to account for spatial localities of spin transitions. To solve this, we introduce localization layers in a neural network model that weight atomic contributions to the energy gap, thereby allowing the model to isolate the most determinative chemical environments. Trained on the singlet-triplet energy gaps of organic molecules, we apply our method to an out-of-sample test set of large phosphorescent compounds and demonstrate the substantial improvement that localization layers have on predicting their phosphorescence energies. Remarkably, the inferred localization weights have a strong relationship with the ab initio spin density of the singlet-triplet transition, and thus infer localities of the molecule that determine the spin transition, despite the fact that no direct electronic information was provided during training. The use of localization layers is expected to improve the modeling of many localized, non-extensive phenomena and could be implemented in any atom-centered neural network model.
Affiliation(s)
- Andrew E Sifain
- Theoretical Division, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Nonlinear Studies, Los Alamos National Laboratory Los Alamos NM USA 87545
| | - Levi Lystrom
- Theoretical Division, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Nonlinear Studies, Los Alamos National Laboratory Los Alamos NM USA 87545
| | - Richard A Messerly
- Theoretical Division, Los Alamos National Laboratory Los Alamos NM USA 87545
| | - Justin S Smith
- Theoretical Division, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Nonlinear Studies, Los Alamos National Laboratory Los Alamos NM USA 87545
| | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Integrated Nanotechnologies, Los Alamos National Laboratory Los Alamos NM USA 87545
| | - Kipton Barros
- Theoretical Division, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Nonlinear Studies, Los Alamos National Laboratory Los Alamos NM USA 87545
| | - Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Nonlinear Studies, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Integrated Nanotechnologies, Los Alamos National Laboratory Los Alamos NM USA 87545
| | - Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory Los Alamos NM USA 87545
| | - Brendan J Gifford
- Theoretical Division, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Nonlinear Studies, Los Alamos National Laboratory Los Alamos NM USA 87545
- Center for Integrated Nanotechnologies, Los Alamos National Laboratory Los Alamos NM USA 87545
|
59
|
Kulichenko M, Smith JS, Nebgen B, Li YW, Fedik N, Boldyrev AI, Lubbers N, Barros K, Tretiak S. The Rise of Neural Networks for Materials and Chemical Dynamics. J Phys Chem Lett 2021; 12:6227-6243. [PMID: 34196559 DOI: 10.1021/acs.jpclett.1c01357] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 06/13/2023]
Abstract
Machine learning (ML) is quickly becoming a premier tool for modeling chemical processes and materials. ML-based force fields, trained on large data sets of high-quality electron structure calculations, are particularly attractive due to their unique combination of computational efficiency and physical accuracy. This Perspective summarizes some recent advances in the development of neural network-based interatomic potentials. Designing high-quality training data sets is crucial to overall model accuracy. One strategy is active learning, in which new data are automatically collected for atomic configurations that produce large ML uncertainties. Another strategy is to use the highest levels of quantum theory possible. Transfer learning allows training to a data set of mixed fidelity. A model initially trained to a large data set of density functional theory calculations can be significantly improved by retraining to a relatively small data set of expensive coupled cluster theory calculations. These advances are exemplified by applications to molecules and materials.
Affiliation(s)
- Maksim Kulichenko
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322, United States
| | - Justin S Smith
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Ying Wai Li
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Nikita Fedik
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322, United States
| | - Alexander I Boldyrev
- Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322, United States
| | - Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
|
60
|
Singstock NR, Ortiz-Rodríguez JC, Perryman JT, Sutton C, Velázquez JM, Musgrave CB. Machine Learning Guided Synthesis of Multinary Chevrel Phase Chalcogenides. J Am Chem Soc 2021; 143:9113-9122. [PMID: 34107683 DOI: 10.1021/jacs.1c02971] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/15/2022]
Abstract
The Chevrel phase (CP) is a class of molybdenum chalcogenides that exhibit compelling properties for next-generation battery materials, electrocatalysts, and other energy applications. Despite their promise, CPs are underexplored, with only ∼100 compounds synthesized to date due to the challenge of identifying synthesizable phases. We present an interpretable machine-learned descriptor (Hδ) that rapidly and accurately estimates decomposition enthalpy (ΔHd) to assess CP stability. To develop Hδ, we first used density functional theory to compute ΔHd for 438 CP compositions. We then generated >560 000 descriptors with the new machine learning method SIFT, which provides an easy-to-use approach for developing accurate and interpretable chemical models. From a set of >200 000 compositions, we identified 48 501 CPs that Hδ predicts are synthesizable based on the criterion that ΔHd < 65 meV/atom, which was obtained as a statistical boundary from 67 experimentally synthesized CPs. The set of candidate CPs includes 2307 CP tellurides, an underexplored CP subset with a predicted preference for channel site occupation by cation intercalants that is rare among CPs. We successfully synthesized five of five novel CP tellurides attempted from this set and confirmed their preference for channel site occupation. Our joint computational and experimental approach for developing and validating screening tools that enable the rapid identification of synthesizable materials within a sparse class is likely transferable to other materials families to accelerate their discovery.
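The screening criterion quoted in this abstract (ΔHd < 65 meV/atom) amounts to a simple threshold filter over predicted decomposition enthalpies. The Python sketch below shows only that filtering step; the candidate labels and enthalpy values are hypothetical placeholders, and the descriptor Hδ that produces the predictions is not reproduced here.

```python
# Threshold stated in the abstract: candidates with predicted
# decomposition enthalpy below 65 meV/atom are treated as synthesizable.
THRESHOLD_MEV_PER_ATOM = 65.0


def screen_candidates(predicted_dhd):
    """Return the labels whose predicted enthalpy (meV/atom) is strictly below the threshold."""
    return [
        label
        for label, dhd in predicted_dhd.items()
        if dhd < THRESHOLD_MEV_PER_ATOM
    ]


# Hypothetical predictions, for illustration only (not data from the paper).
example_predictions = {
    "candidate_1": 12.0,
    "candidate_2": 65.0,   # exactly at the threshold, so excluded by the strict inequality
    "candidate_3": 140.5,
    "candidate_4": -3.2,
}
print(screen_candidates(example_predictions))
```

The strict inequality mirrors the wording of the abstract; whether boundary cases are included in the original workflow is not stated there.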
Affiliation(s)
- Nicholas R Singstock
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80303, United States
| | | | - Joseph T Perryman
- Department of Chemistry, University of California Davis, Davis, California 95616, United States
| | - Christopher Sutton
- Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Jesús M Velázquez
- Department of Chemistry, University of California Davis, Davis, California 95616, United States
| | - Charles B Musgrave
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80303, United States
- Materials Science and Engineering Program, University of Colorado Boulder, Boulder, Colorado 80303, United States
- Renewable and Sustainable Energy Institute, University of Colorado Boulder, Boulder, Colorado 80303, United States
|
61
|
Mao Y, Yang H, Sheng Y, Wang J, Ouyang R, Ye C, Yang J, Zhang W. Prediction and Classification of Formation Energies of Binary Compounds by Machine Learning: An Approach without Crystal Structure Information. ACS OMEGA 2021; 6:14533-14541. [PMID: 34124476 PMCID: PMC8190927 DOI: 10.1021/acsomega.1c01517] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/22/2021] [Accepted: 05/13/2021] [Indexed: 05/06/2023]
Abstract
It is widely believed that machine learning models could help to predict the formation energies of materials if all elemental and crystal structural details are known. In this paper, it is shown that even without detailed crystal structure information, the formation energies of binary compounds in various prototypes at the ground states can be reasonably evaluated using machine-learning feature abstraction to screen out the important features. By combining with the "white-box" sure independence screening and sparsifying operator (SISSO) approach, an interpretable and accurate formation energy model is constructed. The formation energies of 183 experimental and 439 calculated stable binary compounds (E_hull = 0) are predicted using this model, and both show reasonable agreement with experimental and Materials Project's calculated values. The descriptor set is capable of reflecting the formation energies of binary compounds and is also consistent with the common understanding that the formation energy is mainly determined by electronegativity, electron affinity, bond energy, and other atomic properties. As crystal structure parameters are not necessary prerequisites, it can be widely applied to the formation energy prediction and classification of binary compounds in large quantities.
Affiliation(s)
- Yuanqing Mao
- Department of Physics & Guangdong Provincial Key Laboratory of Computational Science and Material Design, Southern University of Science and Technology, Shenzhen 518055, China
| | - Hongliang Yang
- Department of Physics & Guangdong Provincial Key Laboratory of Computational Science and Material Design, Southern University of Science and Technology, Shenzhen 518055, China
| | - Ye Sheng
- Materials Genome Institute, Shanghai University, Shanghai 200444, China
| | - Jiping Wang
- Department of Physics & Guangdong Provincial Key Laboratory of Computational Science and Material Design, Southern University of Science and Technology, Shenzhen 518055, China
| | - Runhai Ouyang
- Materials Genome Institute, Shanghai University, Shanghai 200444, China
| | - Caichao Ye
- Department of Physics & Guangdong Provincial Key Laboratory of Computational Science and Material Design, Southern University of Science and Technology, Shenzhen 518055, China
- Academy for Advanced Interdisciplinary Studies, Southern University of Science and Technology, Shenzhen 518055, China
- Key Laboratory of Energy Conversion and Storage Technologies (Southern University of Science and Technology), Ministry of Education, Shenzhen 518055, China
| | - Jiong Yang
- Materials Genome Institute, Shanghai University, Shanghai 200444, China
| | - Wenqing Zhang
- Department of Physics & Guangdong Provincial Key Laboratory of Computational Science and Material Design, Southern University of Science and Technology, Shenzhen 518055, China
- Shenzhen Key Laboratory of Advanced Quantum Functional Materials and Devices, Southern University of Science and Technology, Shenzhen 518055, China
|
62
|
Wen H, Huang C, Guo S. The Application of Convolutional Neural Networks (CNNs) to Recognize Defects in 3D-Printed Parts. MATERIALS 2021; 14:ma14102575. [PMID: 34063484 PMCID: PMC8156518 DOI: 10.3390/ma14102575] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/10/2021] [Revised: 05/05/2021] [Accepted: 05/11/2021] [Indexed: 11/16/2022]
Abstract
Cracks and pores are two common defects in metallic additive manufacturing (AM) parts. In this paper, deep learning-based image analysis is performed for defect (cracks and pores) classification/detection based on SEM images of metallic AM parts. Three different levels of complexities, namely, defect classification, defect detection and defect image segmentation, are successfully achieved using a simple CNN model, the YOLOv4 model and the Detectron2 object detection library, respectively. The tuned CNN model can classify any single defect as either a crack or pore at almost 100% accuracy. The other two models can identify more than 90% of the cracks and pores in the testing images. In addition to the application of static image analysis, defect detection is also successfully applied on a video which mimics the AM process control images. The trained Detectron2 model can identify almost all the pores and cracks that exist in the original video. This study lays a foundation for future in situ process monitoring of the 3D printing process.
Affiliation(s)
- Hao Wen
- Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Chang Huang
- Department of Civil and Environmental Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Shengmin Guo
- Department of Mechanical and Industrial Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
- Correspondence: Tel.: +1-225-578-7619
|
63
|
Taking the leap between analytical chemistry and artificial intelligence: A tutorial review. Anal Chim Acta 2021; 1161:338403. [DOI: 10.1016/j.aca.2021.338403] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 12/04/2020] [Revised: 03/02/2021] [Accepted: 03/03/2021] [Indexed: 01/01/2023]
|
64
|
Joung J, Han M, Hwang J, Jeong M, Choi DH, Park S. Deep Learning Optical Spectroscopy Based on Experimental Database: Potential Applications to Molecular Design. JACS AU 2021; 1:427-438. [PMID: 34467305 PMCID: PMC8395663 DOI: 10.1021/jacsau.1c00035] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 01/29/2021] [Indexed: 06/13/2023]
Abstract
Accurate and reliable prediction of the optical and photophysical properties of organic compounds is important in various research fields. Here, we developed deep learning (DL) optical spectroscopy using a DL model and experimental database to predict seven optical and photophysical properties of organic compounds, namely, the absorption peak position and bandwidth, extinction coefficient, emission peak position and bandwidth, photoluminescence quantum yield (PLQY), and emission lifetime. Our DL model included the chromophore-solvent interaction to account for the effect of local environments on the optical and photophysical properties of organic compounds and was trained using an experimental database of 30 094 chromophore/solvent combinations. Our DL optical spectroscopy made it possible to reliably and quickly predict the aforementioned properties of organic compounds in solution, gas phase, film, and powder with root mean squared errors of 26.6 and 28.0 nm for absorption and emission peak positions, 603 and 532 cm⁻¹ for absorption and emission bandwidths, and 0.209, 0.371, and 0.262 for the logarithm of the extinction coefficient, PLQY, and emission lifetime, respectively. Finally, we demonstrated how a blue emitter with desired optical and photophysical properties could be efficiently virtually screened and developed by DL optical spectroscopy. DL optical spectroscopy can be efficiently used for developing chromophores and fluorophores in various research areas.
Affiliation(s)
| | | | - Jinhyo Hwang
- Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
| | - Minseok Jeong
- Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
| | - Dong Hoon Choi
- Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
| | - Sungnam Park
- Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
|
65
|
Kaufmann K, Lane H, Liu X, Vecchio KS. Efficient few-shot machine learning for classification of EBSD patterns. Sci Rep 2021; 11:8172. [PMID: 33854109 PMCID: PMC8046977 DOI: 10.1038/s41598-021-87557-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/28/2020] [Accepted: 03/31/2021] [Indexed: 02/02/2023] Open
Abstract
Deep learning is quickly becoming a standard approach to solving a range of materials science objectives, particularly in the field of computer vision. However, labeled datasets large enough to train neural networks from scratch can be challenging to collect. One approach to accelerating the training of deep learning models such as convolutional neural networks is the transfer of weights from models trained on unrelated image classification problems, commonly referred to as transfer learning. The powerful feature extractors learned previously can potentially be fine-tuned for a new classification problem without hindering performance. Transfer learning can also improve the results of training a model using a small amount of data, known as few-shot learning. Herein, we test the effectiveness of a few-shot transfer learning approach for the classification of electron backscatter diffraction (EBSD) pattern images to six space groups within the [Formula: see text] point group. Training history and performance metrics are compared with a model of the same architecture trained from scratch. In an effort to make this approach more explainable, visualization of filters, activation maps, and Shapley values are utilized to provide insight into the model's operations. The applicability to real-world phase identification and differentiation is demonstrated using dual phase materials that are challenging to analyze with traditional methods.
Affiliation(s)
- Kevin Kaufmann
- Department of NanoEngineering, UC San Diego, La Jolla, CA, 92093, USA
| | - Hobson Lane
- Tangible AI LLC, San Diego, CA, 92037, USA
- Department of Healthcare Research and Policy, UC San Diego-Extension, San Diego, CA, 92037, USA
| | - Xiao Liu
- Materials Science and Engineering Program, UC San Diego, La Jolla, CA, 92093, USA
| | - Kenneth S Vecchio
- Department of NanoEngineering, UC San Diego, La Jolla, CA, 92093, USA.
- Materials Science and Engineering Program, UC San Diego, La Jolla, CA, 92093, USA.
|
66
|
Using Social Media in Tourist Sentiment Analysis: A Case Study of Andalusia during the Covid-19 Pandemic. SUSTAINABILITY 2021. [DOI: 10.3390/su13073836] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/17/2022]
Abstract
This paper explores the role of social media in tourist sentiment analysis. To do this, it describes previous studies that have carried out tourist sentiment analysis using social media data, before analyzing changes in tourists’ sentiments and behaviors during the COVID-19 pandemic. In the case study, which focuses on Andalusia, the changes experienced by the tourism sector in the southern Spanish region as a result of the COVID-19 pandemic are assessed using the Andalusian Tourism Situation Survey (ECTA). This information is then compared with data obtained from a sentiment analysis based on the social network Twitter. On the basis of this comparative analysis, the paper concludes that it is possible to identify and classify tourists’ perceptions using sentiment analysis on a mass scale with the help of statistical software (RStudio and Knime). The sentiment analysis using Twitter data correlates with and is supplemented by information from the ECTA survey, with both analyses showing that tourists placed greater value on safety and preferred to travel individually to nearby, less crowded destinations since the pandemic began. Of the two analytical tools, sentiment analysis can be carried out on social media on a continuous basis and offers cost savings.
|
67
|
Rankine CD, Penfold TJ. Progress in the Theory of X-ray Spectroscopy: From Quantum Chemistry to Machine Learning and Ultrafast Dynamics. J Phys Chem A 2021; 125:4276-4293. [DOI: 10.1021/acs.jpca.0c11267] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/11/2022]
Affiliation(s)
- C. D. Rankine
- Chemistry—School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, U.K
| | - T. J. Penfold
- Chemistry—School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, U.K
|
68
|
Emami N, Ferdousi R. AptaNet as a deep learning approach for aptamer-protein interaction prediction. Sci Rep 2021; 11:6074. [PMID: 33727685 PMCID: PMC7971039 DOI: 10.1038/s41598-021-85629-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 09/26/2020] [Accepted: 03/03/2021] [Indexed: 02/08/2023] Open
Abstract
Aptamers are short oligonucleotides (DNA/RNA) or peptide molecules that can selectively bind to their specific targets with high specificity and affinity. As a powerful new class of amino acid ligands, aptamers have high potential in biosensing, therapeutic, and diagnostic fields. Here, we present AptaNet, a new deep neural network, to predict aptamer-protein interaction pairs by integrating features derived from both aptamers and the target proteins. Aptamers were encoded using two different strategies: k-mer and reverse complement k-mer frequency. Amino acid composition (AAC) and pseudo amino acid composition (PseAAC) were applied to represent target information using 24 physicochemical and conformational properties of the proteins. To handle the imbalance problem in the data, we applied a neighborhood cleaning algorithm. The predictor was constructed based on a deep neural network, and optimal features were selected using the random forest algorithm. As a result, 99.79% accuracy was achieved for the training dataset, and 91.38% accuracy was obtained for the testing dataset. AptaNet achieved high performance on our constructed aptamer-protein benchmark dataset. The results indicate that AptaNet can help identify novel aptamer-protein interacting pairs and build more-efficient insights into the relationship between aptamers and proteins. Our benchmark dataset and the source codes for AptaNet are available at https://github.com/nedaemami/AptaNet.
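The two aptamer encodings named in this abstract can be illustrated with a minimal sketch. It is not taken from the AptaNet repository: the function names, the restriction to the DNA alphabet ACGT, and the choice to merge each k-mer with its reverse complement under the lexicographically smaller key are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Assumption: DNA alphabet only; RNA or ambiguous bases would need extra handling.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_frequencies(seq, k):
    """Relative frequency of every possible DNA k-mer (4**k entries)."""
    windows = max(len(seq) - k + 1, 1)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {
        "".join(p): counts["".join(p)] / windows
        for p in product("ACGT", repeat=k)
    }


def rc_kmer_frequencies(seq, k):
    """Pool each k-mer with its reverse complement (hypothetical convention:
    the pair is stored under whichever string sorts first)."""
    merged = {}
    for kmer, freq in kmer_frequencies(seq, k).items():
        key = min(kmer, reverse_complement(kmer))
        merged[key] = merged.get(key, 0.0) + freq
    return merged


features = kmer_frequencies("ACGTAC", 2)
rc_features = rc_kmer_frequencies("ACGTAC", 2)
```

For k = 2 the plain encoding has 16 features, and pooling reverse complements reduces this to 10, because four dinucleotides (AT, TA, CG, GC) are their own reverse complement.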
Affiliation(s)
- Neda Emami
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran.
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran.
|
69
|
Wang M, Zhu H. Machine Learning for Transition-Metal-Based Hydrogen Generation Electrocatalysts. ACS Catal 2021. [DOI: 10.1021/acscatal.1c00178] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 02/08/2023]
Affiliation(s)
- Min Wang
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| | - Hongwei Zhu
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
|
70
|
Jha D, Gupta V, Ward L, Yang Z, Wolverton C, Foster I, Liao WK, Choudhary A, Agrawal A. Enabling deeper learning on big data for materials informatics applications. Sci Rep 2021; 11:4244. [PMID: 33608599 PMCID: PMC7895970 DOI: 10.1038/s41598-021-83193-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/07/2020] [Accepted: 01/21/2021] [Indexed: 11/08/2022] Open
Abstract
The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
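The idea of placing a shortcut connection around each individual layer can be sketched in a few lines of plain Python. This is only an illustration of the general residual pattern y = x + ReLU(Wx + b) under the assumption of equal input and output widths. It omits batch normalization and training, and the function names are hypothetical rather than the IRNet implementation.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def residual_layer(x, weights, bias):
    """Shortcut around a single layer: y = x + relu(W x + b).
    Assumes the layer keeps the vector width unchanged."""
    h = relu(dense(x, weights, bias))
    return [xi + hi for xi, hi in zip(x, h)]


def forward(x, layers):
    """Apply a stack of residual layers in sequence."""
    for weights, bias in layers:
        x = residual_layer(x, weights, bias)
    return x


# With all-zero weights every layer reduces to the identity, so the input
# passes unchanged through an arbitrarily deep stack.
zero_layer = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
deep_output = forward([1.0, -2.0], [zero_layer] * 50)

identity_layer = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
single_output = forward([1.0, -2.0], [identity_layer])
```

The zero-weight case shows why such shortcuts help with depth: the signal (and, during training, the gradient) always has a direct path through the additive term, regardless of how many layers are stacked.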
Affiliation(s)
- Dipendra Jha
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA
| | - Vishu Gupta
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA
| | - Logan Ward
- Computation Institute, University of Chicago, Chicago, USA
- Data Science and Learning Division, Argonne National Laboratory, Lemont, USA
| | - Zijiang Yang
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA
| | - Christopher Wolverton
- Department of Materials Science and Engineering, Northwestern University, Evanston, USA
| | - Ian Foster
- Computation Institute, University of Chicago, Chicago, USA
- Data Science and Learning Division, Argonne National Laboratory, Lemont, USA
| | - Wei-Keng Liao
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA
| | - Alok Choudhary
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA
| | - Ankit Agrawal
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA.
|
71
|
Cox T, Gvozdetskyi V, Bertolami M, Lee S, Shipley K, Lebedev OI, Zaikina JV. Clathrate XI K58Zn122Sb207: A New Branch on the Clathrate Family Tree. Angew Chem Int Ed Engl 2021. [DOI: 10.1002/ange.202011120] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Affiliation(s)
- Tori Cox
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
| | | | - Mark Bertolami
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
| | - Shannon Lee
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
- Ames Laboratory US DOE Iowa State University Ames Iowa 50011 USA
| | - Kristian Shipley
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
| | - Oleg I. Lebedev
- Laboratoire CRISMAT ENSICAEN CNRS UMR 6508 14050 Caen France
| | - Julia V. Zaikina
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
|
72
|
Piccinotti D, MacDonald KF, Gregory SA, Youngs I, Zheludev NI. Artificial intelligence for photonics and photonic materials. REPORTS ON PROGRESS IN PHYSICS. PHYSICAL SOCIETY (GREAT BRITAIN) 2021; 84:012401. [PMID: 33355315 DOI: 10.1088/1361-6633/abb4c7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/12/2023]
Abstract
Artificial intelligence (AI) is the most important new methodology in scientific research since the adoption of quantum mechanics and it is providing exciting results in numerous fields of science and technology. In this review we summarize research and discuss future opportunities for AI in the domains of photonics, nanophotonics, plasmonics and photonic materials discovery, including metamaterials.
Affiliation(s)
- Davide Piccinotti
- Optoelectronics Research Centre and Centre for Photonic Metamaterials, University of Southampton, Southampton, SO17 1BJ, United Kingdom
| | - Kevin F MacDonald
- Optoelectronics Research Centre and Centre for Photonic Metamaterials, University of Southampton, Southampton, SO17 1BJ, United Kingdom
| | - Simon A Gregory
- Defence Science and Technology Laboratory, Salisbury, SP4 0JQ, United Kingdom
| | - Ian Youngs
- Defence Science and Technology Laboratory, Salisbury, SP4 0JQ, United Kingdom
| | - Nikolay I Zheludev
- Optoelectronics Research Centre and Centre for Photonic Metamaterials, University of Southampton, Southampton, SO17 1BJ, United Kingdom
- Centre for Disruptive Photonic Technologies, The Photonics Institute, School of Physical and Mathematical Sciences, Nanyang Technological University, 637371 Singapore
|
73
|
Rahaman O, Gagliardi A. Deep Learning Total Energies and Orbital Energies of Large Organic Molecules Using Hybridization of Molecular Fingerprints. J Chem Inf Model 2020; 60:5971-5983. [PMID: 33118351 DOI: 10.1021/acs.jcim.0c00687] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 12/17/2022]
Abstract
The ability to predict material properties without the need for resource-consuming experimental efforts can immensely accelerate material and drug discovery. Although ab initio methods can be reliable and accurate in making such predictions, they are computationally too expensive on a large scale. The recent advancements in artificial intelligence and machine learning as well as the availability of large quantum mechanics derived datasets enable us to train models on these datasets as a benchmark and to make fast predictions on much larger datasets. The success of these machine learning models highly depends on the machine-readable fingerprints of the molecules that capture their chemical properties as well as topological information. In this work, we propose a common deep learning-based framework to combine different types of molecular fingerprints to enhance prediction accuracy. A graph neural network (GNN), many-body tensor representation (MBTR), and a set of simple molecular descriptors (MD) were used to predict the total energies, highest occupied molecular orbital (HOMO) energies, and lowest unoccupied molecular orbital (LUMO) energies of a dataset containing ∼62k large organic molecules with complex aromatic rings and remarkably diverse functional groups. The results demonstrate that a combination of best performing molecular fingerprints can produce better results than the individual ones. The simple and flexible deep learning framework developed in this work can be easily adapted to incorporate other types of molecular fingerprints.
Affiliation(s)
- Obaidur Rahaman
- Technische Universität München, Karlstr. 45, 80333 Munich, Germany
|
74
|
Wang MWH, Goodman JM, Allen TEH. Machine Learning in Predictive Toxicology: Recent Applications and Future Directions for Classification Models. Chem Res Toxicol 2020; 34:217-239. [PMID: 33356168 DOI: 10.1021/acs.chemrestox.0c00316] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 02/07/2023]
Abstract
In recent times, machine learning has become increasingly prominent in predictive toxicology as it has shifted from in vivo studies toward in silico studies. Currently, in vitro methods together with other computational methods such as quantitative structure-activity relationship modeling and absorption, distribution, metabolism, and excretion calculations are being used. An overview of machine learning and its applications in predictive toxicology is presented here, including support vector machines (SVMs), random forest (RF) and decision trees (DTs), neural networks, regression models, naïve Bayes, k-nearest neighbors, and ensemble learning. The recent successes of these machine learning methods in predictive toxicology are summarized, and a comparison of some models used in predictive toxicology is presented. In predictive toxicology, SVMs, RF, and DTs are the dominant machine learning methods due to the characteristics of the data available. Lastly, this review describes the current challenges facing the use of machine learning in predictive toxicology and offers insights into the possible areas of improvement in the field.
Affiliation(s)
- Marcus W H Wang
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Jonathan M Goodman
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Timothy E H Allen
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- MRC Toxicology Unit, University of Cambridge, Hodgkin Building, Lancaster Road, Leicester LE1 7HB, United Kingdom
|
75
|
Sizochenko N, Hofmann M. Predictive Modeling of Critical Temperatures in Superconducting Materials. Molecules 2020; 26:8. [PMID: 33375023 PMCID: PMC7792800 DOI: 10.3390/molecules26010008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/17/2020] [Revised: 12/19/2020] [Accepted: 12/21/2020] [Indexed: 01/03/2023] Open
Abstract
In this study, we have investigated quantitative relationships between critical temperatures of superconductive inorganic materials and the basic physicochemical attributes of these materials (also called quantitative structure-property relationships). We demonstrated that one of the most recent studies (titled "A data-driven statistical model for predicting the critical temperature of a superconductor" and published in Computational Materials Science by K. Hamidieh in 2018) reports on models based on a dataset in which 27% of the entries are duplicates. We aimed to deliver stable models for a properly cleaned dataset using the same modeling techniques (multiple linear regression, MLR, and gradient boosting decision trees, XGBoost). The predictive ability of our best XGBoost model (R² = 0.924, RMSE = 9.336 using 10-fold cross-validation) is comparable to the XGBoost model by the author of the initial dataset (R² = 0.920 and RMSE = 9.5 K in 10-fold cross-validation). At the same time, our best model is based on less sophisticated parameters, which allows one to make more accurate interpretations while maintaining a generalizable model. In particular, we found that the highest relative influence is attributed to variables that represent the thermal conductivity of materials. In addition to MLR and XGBoost, we explored the potential of other machine learning techniques (neural networks, NN, and random forests, RF).
Affiliation(s)
- Natalia Sizochenko
- Department of Informatics, Blanchardstown Campus, Technological University Dublin, 15 YV78 Dublin, Ireland
- Department of Informatics, Postdoctoral Institute for Computational Studies, Enfield, NH 03748, USA
| | - Markus Hofmann
- Department of Informatics, Blanchardstown Campus, Technological University Dublin, 15 YV78 Dublin, Ireland
|
76
|
Predicting materials properties without crystal structure: deep representation learning from stoichiometry. Nat Commun 2020; 11:6280. [PMID: 33293567 PMCID: PMC7722901 DOI: 10.1038/s41467-020-19964-7] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 06/09/2020] [Accepted: 11/04/2020] [Indexed: 01/31/2023] Open
Abstract
Machine learning has the potential to accelerate materials discovery by accurately predicting materials properties at a low computational cost. However, the model inputs remain a key stumbling block. Current methods typically use descriptors constructed from knowledge of either the full crystal structure — therefore only applicable to materials with already characterised structures — or structure-agnostic fixed-length representations hand-engineered from the stoichiometry. We develop a machine learning approach that takes only the stoichiometry as input and automatically learns appropriate and systematically improvable descriptors from data. Our key insight is to treat the stoichiometric formula as a dense weighted graph between elements. Compared to the state of the art for structure-agnostic methods, our approach achieves lower errors with less data. Predicting the structure of unknown materials’ compositions represents a challenge for high-throughput computational approaches. Here the authors introduce a new stoichiometry-based machine learning approach for predicting the properties of inorganic materials from their elemental compositions.
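The key insight stated here, treating a stoichiometric formula as a dense weighted graph between elements, can be sketched as a small data structure. This is one plausible reading, not the authors' code: the function name is hypothetical, and the choices to connect every ordered pair of distinct elements and to weight each edge by the fractional amount of the neighbouring element are assumptions.

```python
from itertools import permutations


def composition_graph(composition):
    """Build a fully connected weighted graph from element amounts.

    composition: dict mapping element symbol to its amount in the formula.
    Returns (nodes, edges), where nodes maps each element to its fractional
    amount and edges lists (source, neighbour, neighbour_weight) for every
    ordered pair of distinct elements.
    """
    total = sum(composition.values())
    nodes = {element: amount / total for element, amount in composition.items()}
    edges = [(a, b, nodes[b]) for a, b in permutations(nodes, 2)]
    return nodes, edges


# BaTiO3 written as element amounts.
nodes, edges = composition_graph({"Ba": 1, "Ti": 1, "O": 3})
```

Because the graph depends only on the formula, the same representation is available for materials whose crystal structure has not been characterised, which is the setting the abstract targets.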
|
77
|
Mirhosseini H, Kormath Madam Raghupathy R, Sahoo SK, Wiebeler H, Chugh M, Kühne TD. In silico investigation of Cu(In,Ga)Se2-based solar cells. Phys Chem Chem Phys 2020; 22:26682-26701. [PMID: 33236749 DOI: 10.1039/d0cp04712k] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/23/2022]
Abstract
Photovoltaics is one of the most promising and fastest-growing renewable energy technologies. Although the price-performance ratio of solar cells has improved significantly over recent years, further systematic investigations are needed to achieve higher performance and lower cost for future solar cells. In conjunction with experiments, computer simulations are powerful tools to investigate the thermodynamics and kinetics of solar cells. Over the last few years, we have developed and employed advanced computational techniques to gain a better understanding of solar cells based on copper indium gallium selenide (Cu(In,Ga)Se2). Furthermore, we have utilized state-of-the-art data-driven science and machine learning for the development of photovoltaic materials. In this Perspective, we review our results along with a survey of the field.
Affiliation(s)
- Hossein Mirhosseini
- Dynamics of Condensed Matter and Center for Sustainable Systems Design, Chair of Theoretical Chemistry, University of Paderborn, Warburger Str. 100, 33098 Paderborn, Germany.
|
78
|
Learning Representations of Inorganic Materials from Generative Adversarial Networks. Symmetry (Basel) 2020. [DOI: 10.3390/sym12111889] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022] Open
Abstract
The two most important aspects of material research using deep learning (DL) or machine learning (ML) are the characteristics of materials data and learning algorithms, where the proper characterization of materials data is essential for generating accurate models. At present, the characterization of materials based on the molecular composition includes some methods based on feature engineering, such as Magpie and One-hot. Although these characterization methods have achieved significant results in materials research, these methods based on feature engineering cannot guarantee the integrity of materials characterization. One possible approach is to learn the materials characterization via neural networks using the chemical knowledge and implicit composition rules shown in large-scale known materials. This article chooses an adversarial method to learn the composition of atoms using the Generative Adversarial Network (GAN), which makes sense for data symmetry. The total loss value of the discriminator on the test set is reduced from 4.1e13 to 0.3194, indicating that the designed GAN network can well capture the combination of atoms in real materials. We then use the trained discriminator weights for material characterization and predict bandgap, formation energy, critical temperature (Tc) of superconductors on the Open Quantum Materials Database (OQMD), Materials Project (MP), and SuperCond datasets. Experiments show that when using the same predictive model, our proposed method performs better than One-hot and Magpie. This article provides an effective method for characterizing materials based on molecular composition in addition to Magpie, One-hot, etc. In addition, the generator learned in this study generates hypothetical materials with the same distribution as known materials, and these hypotheses can be used as a source for new material discovery.
|
79
|
Cox T, Gvozdetskyi V, Bertolami M, Lee S, Shipley K, Lebedev OI, Zaikina JV. Clathrate XI K58Zn122Sb207: A New Branch on the Clathrate Family Tree. Angew Chem Int Ed Engl 2020; 60:415-423. [DOI: 10.1002/anie.202011120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/14/2020] [Indexed: 11/09/2022]
Affiliation(s)
- Tori Cox
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
| | | | - Mark Bertolami
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
| | - Shannon Lee
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
- Ames Laboratory US DOE Iowa State University Ames Iowa 50011 USA
| | - Kristian Shipley
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
| | - Oleg I. Lebedev
- Laboratoire CRISMAT ENSICAEN CNRS UMR 6508 14050 Caen France
| | - Julia V. Zaikina
- Department of Chemistry Iowa State University Ames Iowa 50011 USA
|
80
|
Roberts J, Song Y, Crocker M, Risko C. A Genetic Algorithmic Approach to Determine the Structure of Li-Al Layered Double Hydroxides. J Chem Inf Model 2020; 60:4845-4855. [PMID: 32794767 DOI: 10.1021/acs.jcim.0c00493] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Layered double hydroxides (LDH) demonstrate significant potential across a range of applications, including as catalysts, delivery vehicles for pharmaceuticals, environmental remediation, and supercapacitors. Explaining the mechanism of LDH action at the atomic scale in these and other applications is challenging, however, due to the difficulty in precisely defining the bulk and surface structure and chemical compositions. Here, we focus on the determination of the structure of lithium-aluminum (Li-Al) LDH, which has shown promise in the catalytic depolymerization of lignin, both directly as the catalyst and as a support for gold nanoparticles. While the relative positions of the Li and Al metals are generally well resolved by X-ray crystallography, it is the structures of the anionic layers, consisting of water and carbonate, that are less well established. Combinatorial analyses of all possible positions and rotations of the water and carbonate in the three-layered Li-Al LDH polytope reveal that the phase space is much too large to examine in any reasonable time frame in a one-by-one structure exploration. To overcome this limitation, we develop and deploy a genetic algorithm (GA) wherein fitness is determined by matching a calculated X-ray diffraction (XRD) pattern for a given structure to the known experimental XRD pattern. The GA approach results in structures of high fitness that portend the bulk Li-Al LDH structure. Importantly, the GA approach offers the potential to determine the structures of other LDH, and more generally layered materials, which are generally difficult to describe given the large chemical and structural space to be explored.
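The search strategy described in this abstract, a genetic algorithm whose fitness measures how well a calculated pattern matches a reference pattern, can be sketched with a toy stand-in. Everything here is an assumption for illustration: the "pattern" is a simple histogram of integer positions rather than a simulated XRD pattern, and the selection, crossover, and mutation settings are arbitrary defaults, not the authors' parameters.

```python
import random


def simulate_pattern(genome, n_bins):
    """Toy stand-in for an XRD simulator: count genome entries per bin."""
    pattern = [0] * n_bins
    for position in genome:
        pattern[position] += 1
    return pattern


def fitness(genome, target):
    """Negative squared mismatch between simulated and target patterns
    (0 is a perfect match)."""
    simulated = simulate_pattern(genome, len(target))
    return -sum((s - t) ** 2 for s, t in zip(simulated, target))


def evolve(target, genome_len, pop_size=30, generations=40,
           mutation_rate=0.2, seed=0):
    """Elitist GA: keep the best genome, breed the rest from the top half."""
    rng = random.Random(seed)
    n_bins = len(target)
    population = [[rng.randrange(n_bins) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda g: fitness(g, target), reverse=True)
        history.append(fitness(population[0], target))
        parents = population[:pop_size // 2]
        children = [population[0][:]]  # elitism: best genome survives as-is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(n_bins) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    best = max(population, key=lambda g: fitness(g, target))
    history.append(fitness(best, target))
    return best, history


target = simulate_pattern([1, 1, 4, 7, 7, 7], 10)
best, history = evolve(target, genome_len=6)
```

Because the best genome is always carried forward unchanged, the best fitness recorded in `history` never decreases from one generation to the next. In a real application the histogram would be replaced by a diffraction simulation of the candidate structure.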
Affiliation(s)
- Josiah Roberts
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Yang Song
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Mark Crocker
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Chad Risko
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
|
81
|
Casey AD, Son SF, Bilionis I, Barnes BC. Prediction of Energetic Material Properties from Electronic Structure Using 3D Convolutional Neural Networks. J Chem Inf Model 2020; 60:4457-4473. [PMID: 33054184 DOI: 10.1021/acs.jcim.0c00259] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 01/13/2023]
Affiliation(s)
- Alex D. Casey
- School of Mechanical Engineering, Purdue University, West Lafayette, Indiana 47907, United States
| | - Steven F. Son
- School of Mechanical Engineering, Purdue University, West Lafayette, Indiana 47907, United States
| | - Ilias Bilionis
- School of Mechanical Engineering, Purdue University, West Lafayette, Indiana 47907, United States
| | - Brian C. Barnes
- CCDC Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
|
82
|
Sheetal A, Feng Z, Savani K. Using Machine Learning to Generate Novel Hypotheses: Increasing Optimism About COVID-19 Makes People Less Willing to Justify Unethical Behaviors. Psychol Sci 2020; 31:1222-1235. [DOI: 10.1177/0956797620959594] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 11/15/2022] Open
Abstract
How can we nudge people to not engage in unethical behaviors, such as hoarding and violating social-distancing guidelines, during the COVID-19 pandemic? Because past research on antecedents of unethical behavior has not provided a clear answer, we turned to machine learning to generate novel hypotheses. We trained a deep-learning model to predict whether or not World Values Survey respondents perceived unethical behaviors as justifiable, on the basis of their responses to 708 other items. The model identified optimism about the future of humanity as one of the top predictors of unethicality. A preregistered correlational study (N = 218 U.S. residents) conceptually replicated this finding. A preregistered experiment (N = 294 U.S. residents) provided causal support: Participants who read a scenario conveying optimism about the COVID-19 pandemic were less willing to justify hoarding and violating social-distancing guidelines than participants who read a scenario conveying pessimism. The findings suggest that optimism can help reduce unethicality, and they document the utility of machine-learning methods for generating novel hypotheses.
Affiliation(s)
| | - Zhiyu Feng
- Department of Organization and Human Resources, Business School, Renmin University of China
| | - Krishna Savani
- Nanyang Business School, Nanyang Technological University
| |
|
83
|
Hanaoka K. Deep Neural Networks for Multicomponent Molecular Systems. ACS OMEGA 2020; 5:21042-21053. [PMID: 32875241 PMCID: PMC7450624 DOI: 10.1021/acsomega.0c02599] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/01/2020] [Accepted: 07/20/2020] [Indexed: 06/11/2023]
Abstract
Deep neural networks (DNNs) represent promising approaches to molecular machine learning (ML). However, their applicability remains limited to single-component materials and a general DNN model capable of handling various multicomponent molecular systems with composition data is still elusive, while current ML approaches for multicomponent molecular systems are still molecular descriptor-based. Here, a general DNN architecture called MEIA, which extends existing molecular DNN models to multicomponent systems, is proposed. Case studies showed that the MEIA architecture could extend two existing molecular DNN models to multicomponent systems with the same procedure, and that the obtained models could learn both the molecular structure and composition information, with the best model in each case study achieving equal or better accuracy than a widely used molecular descriptor-based model. Furthermore, the case studies also showed that, for ML tasks where the molecular structure information plays a minor role, the performance improvements by DNN models were small, while for ML tasks where the molecular structure information plays a major role, the performance improvements by DNN models were large, and DNN models showed notable predictive accuracies for an extremely sparse dataset, which cannot be modeled without the molecular structure information. The enhanced predictive ability of DNN models for sparse datasets of multicomponent systems will extend the applicability of ML in multicomponent material design. Furthermore, the general capability of MEIA to extend DNN models to multicomponent systems will provide new opportunities to utilize the progress of actively developed single-component DNNs for the modeling of multicomponent systems.
|
84
|
Mancuso JL, Mroz AM, Le KN, Hendon CH. Electronic Structure Modeling of Metal-Organic Frameworks. Chem Rev 2020; 120:8641-8715. [PMID: 32672939 DOI: 10.1021/acs.chemrev.0c00148] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
Abstract
Owing to their molecular building blocks, yet highly crystalline nature, metal-organic frameworks (MOFs) sit at the interface between molecule and material. Their diverse structures and compositions enable them to be useful materials as catalysts in heterogeneous reactions, electrical conductors in energy storage and transfer applications, chromophores in photoenabled chemical transformations, and beyond. In all cases, density functional theory (DFT) and higher-level methods for electronic structure determination provide valuable quantitative information about the electronic properties that underpin the functions of these frameworks. However, there are only two general modeling approaches in conventional electronic structure software packages: those that treat materials as extended, periodic solids, and those that treat materials as discrete molecules. Each approach has features and benefits; both have been widely employed to understand the emergent chemistry that arises from the formation of the metal-organic interface. This Review canvasses these approaches to date, with emphasis placed on the application of electronic structure theory to explore reactivity and electron transfer using periodic, molecular, and embedded models. This includes (i) computational chemistry considerations such as how functional, k-grid, and other model variables are selected to enable insights into MOF properties, (ii) extended solid models that treat MOFs as materials rather than molecules, (iii) the mechanics of cluster extraction and subsequent chemistry enabled by these molecular models, (iv) catalytic studies using both solids and clusters thereof, and (v) embedded, mixed-method approaches, which simulate a fraction of the material using one level of theory and the remainder of the material using another dissimilar theoretical implementation.
Affiliation(s)
- Jenna L Mancuso
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97405, United States
| | - Austin M Mroz
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97405, United States
| | - Khoa N Le
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97405, United States
| | - Christopher H Hendon
- Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97405, United States
| |
|
85
|
Sun Y, Liao H, Wang J, Chen B, Sun S, Ong SJH, Xi S, Diao C, Du Y, Wang JO, Breese MBH, Li S, Zhang H, Xu ZJ. Covalency competition dominates the water oxidation structure–activity relationship on spinel oxides. Nat Catal 2020. [DOI: 10.1038/s41929-020-0465-6] [Citation(s) in RCA: 140] [Impact Index Per Article: 35.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
86
|
Sigaki HYD, Lenzi EK, Zola RS, Perc M, Ribeiro HV. Learning physical properties of liquid crystals with deep convolutional neural networks. Sci Rep 2020; 10:7664. [PMID: 32376993 PMCID: PMC7203147 DOI: 10.1038/s41598-020-63662-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2019] [Accepted: 04/03/2020] [Indexed: 02/03/2023] Open
Abstract
Machine learning algorithms have been available since the 1990s, but it is much more recently that they have come into use also in the physical sciences. While these algorithms have already proven to be useful in uncovering new properties of materials and in simplifying experimental protocols, their usage in liquid crystals research is still limited. This is surprising because optical imaging techniques are often applied in this line of research, and it is precisely with images that machine learning algorithms have achieved major breakthroughs in recent years. Here we use convolutional neural networks to probe several properties of liquid crystals directly from their optical images and without using manual feature engineering. By optimizing simple architectures, we find that convolutional neural networks can predict physical properties of liquid crystals with exceptional accuracy. We show that these deep neural networks identify liquid crystal phases and predict the order parameter of simulated nematic liquid crystals almost perfectly. We also show that convolutional neural networks identify the pitch length of simulated samples of cholesteric liquid crystals and the sample temperature of an experimental liquid crystal with very high precision.
Affiliation(s)
- Higor Y D Sigaki
- Departamento de Física, Universidade Estadual de Maringá, Maringá, PR, 87020-900, Brazil
| | - Ervin K Lenzi
- Departamento de Física, Universidade Estadual de Ponta Grossa, Ponta Grossa, PR, 84030-900, Brazil
| | - Rafael S Zola
- Departamento de Física, Universidade Estadual de Maringá, Maringá, PR, 87020-900, Brazil
- Departamento de Física, Universidade Tecnológica Federal do Paraná, Apucarana, PR, 86812-460, Brazil
| | - Matjaž Perc
- Faculty of Natural Sciences and Mathematics, University of Maribor, Koroška cesta 160, 2000, Maribor, Slovenia
- Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
- Complexity Science Hub Vienna, Josefstädterstraße 39, 1080, Vienna, Austria
| | - Haroldo V Ribeiro
- Departamento de Física, Universidade Estadual de Maringá, Maringá, PR, 87020-900, Brazil.
| |
|
87
|
Noh J, Gu GH, Kim S, Jung Y. Uncertainty-Quantified Hybrid Machine Learning/Density Functional Theory High Throughput Screening Method for Crystals. J Chem Inf Model 2020; 60:1996-2003. [DOI: 10.1021/acs.jcim.0c00003] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Juhwan Noh
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Daejeon 34141, Republic of Korea
| | - Geun Ho Gu
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Daejeon 34141, Republic of Korea
| | - Sungwon Kim
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Daejeon 34141, Republic of Korea
| | - Yousung Jung
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Daejeon 34141, Republic of Korea
- Saudi Aramco–KAIST CO2 Management Center, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Daejeon 34141, Republic of Korea
| |
|
88
|
Zhao Y, Cui Y, Xiong Z, Jin J, Liu Z, Dong R, Hu J. Machine Learning-Based Prediction of Crystal Systems and Space Groups from Inorganic Materials Compositions. ACS OMEGA 2020; 5:3596-3606. [PMID: 32118175 PMCID: PMC7045551 DOI: 10.1021/acsomega.9b04012] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/26/2019] [Accepted: 01/31/2020] [Indexed: 05/25/2023]
Abstract
Structural information of materials, such as the crystal system and space group, is highly useful for analyzing their physical properties. However, the enormous composition space of materials makes experimental X-ray diffraction (XRD) or first-principles-based structure determination methods infeasible for large-scale material screening in the composition space. Herein, we propose and evaluate machine-learning algorithms for determining the structure type of materials, given only their compositions. We couple random forest (RF) and multilayer perceptron (MLP) neural network models with three types of features: Magpie, atom vector, and one-hot encoding (atom frequency) for the crystal system and space group prediction of materials. Four types of models for predicting crystal systems and space groups are proposed, trained, and evaluated, including one-versus-all binary classifiers, multiclass classifiers, polymorphism predictors, and multilabel classifiers. The synthetic minority over-sampling technique (SMOTE) is applied to mitigate the effects of imbalanced data sets. Our results demonstrate that RF with Magpie features generally outperforms other algorithms for binary and multiclass prediction of crystal systems and space groups, while MLP with atom frequency features is the best one for structural polymorphism prediction. For multilabel prediction, MLP with atom frequency and binary relevance with Magpie models are the best for predicting crystal systems and space groups, respectively. Our analysis of the related descriptors identifies a few key contributing features for structural-type prediction such as electronegativity, covalent radius, and Mendeleev number. Our work thus paves a way for fast composition-based structural screening of inorganic materials via predicted material structural properties.
Affiliation(s)
- Yong Zhao
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, South Carolina, United States
| | - Yuxin Cui
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, South Carolina, United States
| | - Zheng Xiong
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, South Carolina, United States
| | - Jing Jin
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, South Carolina, United States
| | - Zhonghao Liu
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, South Carolina, United States
| | - Rongzhi Dong
- School of Mechanical Engineering, Guizhou University, Guiyang 550025, China
| | - Jianjun Hu
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, South Carolina, United States
- School of Mechanical Engineering, Guizhou University, Guiyang 550025, China
| |
|
89
|
Selvaratnam B, Koodali RT, Miró P. Application of Symmetry Functions to Large Chemical Spaces Using a Convolutional Neural Network. J Chem Inf Model 2020; 60:1928-1935. [PMID: 32053367 DOI: 10.1021/acs.jcim.9b00835] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Affiliation(s)
- Balaranjan Selvaratnam
- Department of Chemistry, University of South Dakota, 57069 Vermillion, South Dakota, United States
| | - Ranjit T. Koodali
- Department of Chemistry, University of South Dakota, 57069 Vermillion, South Dakota, United States
| | - Pere Miró
- Department of Chemistry, University of South Dakota, 57069 Vermillion, South Dakota, United States
| |
|
90
|
Critical Temperature Prediction of Superconductors Based on Atomic Vectors and Deep Learning. Symmetry (Basel) 2020. [DOI: 10.3390/sym12020262] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
In this paper, a hybrid neural network (HNN) that combines a convolutional neural network (CNN) and long short-term memory neural network (LSTM) is proposed to extract the high-level characteristics of materials for critical temperature (Tc) prediction of superconductors. Firstly, by obtaining 73,452 inorganic compounds from the Materials Project (MP) database and building an atomic environment matrix, we obtained a vector representation (atomic vector) of 87 atoms by singular value decomposition (SVD) of the atomic environment matrix. Then, the obtained atom vector was used to implement the coded representation of the superconductors in the order of the atoms in the chemical formula of the superconductor. The experimental results of the HNN model trained with 12,413 superconductors were compared with three benchmark neural network algorithms and multiple machine learning algorithms using two commonly used material characterization methods. The experimental results show that the HNN method proposed in this paper can effectively extract the characteristic relationships between the atoms of superconductors, and it has high accuracy in predicting the Tc.
|
91
|
Shin HK. Electron configuration-based neural network model to predict physicochemical properties of inorganic compounds. RSC Adv 2020; 10:33268-33278. [PMID: 35515036 PMCID: PMC9056678 DOI: 10.1039/d0ra05873d] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2020] [Accepted: 09/01/2020] [Indexed: 11/21/2022] Open
Abstract
Registration, evaluation, and authorization of chemicals (REACH), the regulation of chemicals in use, imposes the characterization and report of the physicochemical properties of compounds. To cope with the financial burden of the experiments, the use of computational models is permitted for prediction of properties. Although a number of physicochemical property prediction models have been developed, their applicability domain is limited to organic molecules since most available data are concerned with organic molecules, and most of the molecular descriptors are restricted to organic molecule calculations. Prediction models developed for inorganic compounds were intended to predict endpoints relevant to novel material design. Therefore, no models were available for predicting endpoints of inorganic compounds that are significant to regulatory perspectives. In this study, boiling point, water solubility, melting point, and pyrolysis point prediction models were developed for inorganic compounds based on their composition. The electron configuration of each element in the molecule was used as a descriptor in this study. The dataset covered a wide range of endpoints and diverse elements in their structure. The performance of the models was measured using R², mean absolute error, and Spearman's correlation coefficient, and indicated good prediction accuracy of continuous endpoints and prioritization of inorganic compounds.
Affiliation(s)
- Hyun Kil Shin
- Toxicoinformatics Group, Department of Predictive Toxicology, Korea Institute of Toxicology, Daejeon, Republic of Korea
| |
|
92
|
Pathak Y, Juneja KS, Varma G, Ehara M, Priyakumar UD. Deep learning enabled inorganic material generator. Phys Chem Chem Phys 2020; 22:26935-26943. [DOI: 10.1039/d0cp03508d] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
A machine learning framework that generates material compositions exhibiting properties desired by the user.
Affiliation(s)
- Yashaswi Pathak
- International Institute of Information Technology, Hyderabad 500 032, India
| | | | - Girish Varma
- International Institute of Information Technology, Hyderabad 500 032, India
| | - Masahiro Ehara
- Research Center for Computational Science, Institute for Molecular Science, Okazaki 444-8585, Japan
| | | |
|
93
|
Toyao T, Maeno Z, Takakusagi S, Kamachi T, Takigawa I, Shimizu KI. Machine Learning for Catalysis Informatics: Recent Applications and Prospects. ACS Catal 2019. [DOI: 10.1021/acscatal.9b04186] [Citation(s) in RCA: 189] [Impact Index Per Article: 37.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Affiliation(s)
- Takashi Toyao
- Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
- Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan
| | - Zen Maeno
- Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
| | - Satoru Takakusagi
- Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
| | - Takashi Kamachi
- Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan
- Department of Life, Environment and Materials Science, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-ku, Fukuoka 811-0295, Japan
| | - Ichigaku Takigawa
- RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido 001-0021, Japan
| | - Ken-ichi Shimizu
- Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
- Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan
| |
|
94
|
Computational Screening of New Perovskite Materials Using Transfer Learning and Deep Learning. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9245510] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
As one of the most studied materials, perovskites exhibit a wealth of superior properties that lead to diverse applications. Computational prediction of novel stable perovskite structures has great potential in the discovery of new materials for solar panels, superconductors, thermoelectric and catalytic materials, etc. To address one of the key obstacles of machine learning based materials discovery, the lack of sufficient training data, this paper proposes a transfer learning based approach that exploits the high accuracy of the machine learning model trained with physics-informed structural and elemental descriptors. This gradient boosting regressor model (the transfer learning model) allows us to predict, with sufficient precision, the formation energy of a large number of materials for which only the structural information is available. The enlarged training set is then used to train a convolutional neural network model (the screening model) with the generic Magpie elemental features with high prediction power. Extensive experiments demonstrate the superior performance of our transfer learning model and screening model compared to the baseline models. We then applied the screening model to select promising new perovskite materials from 21,316 hypothetical perovskite structures, a large portion of which are confirmed by existing literature.
|
95
|
Cova TFGG, Pais AACC. Deep Learning for Deep Chemistry: Optimizing the Prediction of Chemical Patterns. Front Chem 2019; 7:809. [PMID: 32039134 PMCID: PMC6988795 DOI: 10.3389/fchem.2019.00809] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2019] [Accepted: 11/11/2019] [Indexed: 12/14/2022] Open
Abstract
Computational Chemistry is currently a synergistic assembly between ab initio calculations, simulation, machine learning (ML) and optimization strategies for describing, solving and predicting chemical data and related phenomena. These include accelerated literature searches, analysis and prediction of physical and quantum chemical properties, transition states, chemical structures, chemical reactions, and also new catalysts and drug candidates. The generalization of scalability to larger chemical problems, rather than specialization, is now the main principle for transforming chemical tasks in multiple fronts, for which systematic and cost-effective solutions have benefited from ML approaches, including those based on deep learning (e.g. quantum chemistry, molecular screening, synthetic route design, catalysis, drug discovery). The latter class of ML algorithms is capable of combining raw input into layers of intermediate features, enabling bench-to-bytes designs with the potential to transform several chemical domains. In this review, the most exciting developments concerning the use of ML in a range of different chemical scenarios are described. A range of different chemical problems and respective rationalization, that have hitherto been inaccessible due to the lack of suitable analysis tools, is thus detailed, evidencing the breadth of potential applications of these emerging multidimensional approaches. Focus is given to the models, algorithms and methods proposed to facilitate research on compound design and synthesis, materials design, prediction of binding, molecular activity, and soft matter behavior. 
The information produced by pairing Chemistry and ML, through data-driven analyses, neural network predictions and monitoring of chemical systems, allows (i) prompting the ability to understand the complexity of chemical data, (ii) streamlining and designing experiments, (iii) discovering new molecular targets and materials, and also (iv) planning or rethinking forthcoming chemical challenges. In fact, optimization engulfs all these tasks directly.
Affiliation(s)
- Tânia F. G. G. Cova
- Coimbra Chemistry Centre, CQC, Department of Chemistry, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
| | - Alberto A. C. C. Pais
- Coimbra Chemistry Centre, CQC, Department of Chemistry, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
| |
|
96
|
Jha D, Choudhary K, Tavazza F, Liao WK, Choudhary A, Campbell C, Agrawal A. Enhancing materials property prediction by leveraging computational and experimental data using deep transfer learning. Nat Commun 2019; 10:5316. [PMID: 31757948 PMCID: PMC6874674 DOI: 10.1038/s41467-019-13297-w] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2019] [Accepted: 10/24/2019] [Indexed: 01/11/2023] Open
Abstract
The current predictive modeling techniques applied to Density Functional Theory (DFT) computations have helped accelerate the process of materials discovery by providing significantly faster methods to scan materials candidates, thereby reducing the search space for future DFT computations and experiments. However, in addition to prediction error against DFT-computed properties, such predictive models also inherit the DFT-computation discrepancies against experimentally measured properties. To address this challenge, we demonstrate that using deep transfer learning, existing large DFT-computational data sets (such as the Open Quantum Materials Database (OQMD)) can be leveraged together with other smaller DFT-computed data sets as well as available experimental observations to build robust prediction models. We build a highly accurate model for predicting formation energy of materials from their compositions; using an experimental data set of 1,963 observations, the proposed approach yields a mean absolute error (MAE) of 0.06 eV/atom, which is significantly better than existing machine learning (ML) prediction modeling based on DFT computations and is comparable to the MAE of DFT-computation itself. Machine-learning approaches based on DFT computations can greatly enhance materials discovery. Here the authors leverage existing large DFT-computational data sets and experimental observations by deep transfer learning to predict the formation energy of materials from their elemental compositions with high accuracy.
Affiliation(s)
- Dipendra Jha
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA
| | - Kamal Choudhary
- Thermodynamics and Kinetics Group, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
| | - Francesca Tavazza
- Thermodynamics and Kinetics Group, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
| | - Wei-Keng Liao
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA
| | - Alok Choudhary
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA
| | - Carelyn Campbell
- Thermodynamics and Kinetics Group, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
| | - Ankit Agrawal
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA.
| |
|
97
|
Schütt KT, Gastegger M, Tkatchenko A, Müller KR, Maurer RJ. Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions. Nat Commun 2019; 10:5024. [PMID: 31729373 PMCID: PMC6858523 DOI: 10.1038/s41467-019-12875-2] [Citation(s) in RCA: 201] [Impact Index Per Article: 40.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2019] [Accepted: 09/25/2019] [Indexed: 12/03/2022] Open
Abstract
Machine learning advances chemistry and materials science by enabling large-scale exploration of chemical space based on quantum chemical calculations. While these models supply fast and accurate predictions of atomistic chemical properties, they do not explicitly capture the electronic degrees of freedom of a molecule, which limits their applicability for reactive chemistry and chemical analysis. Here we present a deep learning framework for the prediction of the quantum mechanical wavefunction in a local basis of atomic orbitals from which all other ground-state properties can be derived. This approach retains full access to the electronic structure via the wavefunction at force-field-like efficiency and captures quantum mechanics in an analytically differentiable representation. On several examples, we demonstrate that this opens promising avenues to perform inverse design of molecular structures for targeting electronic property optimisation and a clear path towards increased synergy of machine learning and quantum chemistry.
Affiliation(s)
- K T Schütt
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
| | - M Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
| | - A Tkatchenko
- Physics and Materials Science Research Unit, University of Luxembourg, L-1511, Luxembourg, Luxembourg.
| | - K-R Müller
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany.
- Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea.
- Max-Planck-Institut für Informatik, Saarbrücken, Germany.
| | - R J Maurer
- Department of Chemistry, University of Warwick, Gibbet Hill Road, CV4 7AL, Coventry, UK.
| |
|
98
|
Huang L, Ling C. Representing Multiword Chemical Terms through Phrase-Level Preprocessing and Word Embedding. ACS OMEGA 2019; 4:18510-18519. [PMID: 31737809 PMCID: PMC6854573 DOI: 10.1021/acsomega.9b02060] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/05/2019] [Accepted: 10/18/2019] [Indexed: 06/10/2023]
Abstract
In recent years, data-driven methods and artificial intelligence have been widely used in the chemoinformatics and materials informatics domains, for which success is critically determined by the availability of training data of good quality and in large quantity. A potential approach to break this bottleneck is to leverage the chemical literature, such as papers and patents, as an alternative data resource to high-throughput experiments and simulation. Compared to other domains where natural language processing techniques have established successes, the chemical literature contains a large portion of phrases of multiple words that create additional challenges for accurate identification and representation. Here, we introduce an approach suited to the chemistry domain that identifies multiword chemical terms and trains word representations at the phrase level. Through a series of specially designed experiments, we demonstrate that our multiword identifying and representing method effectively and accurately identifies multiword chemical terms from 119,166 chemical patents and is more robust and precise in preserving the semantic meaning of chemical phrases compared to the conventional approach, which represents constituent single words first and combines them afterward. Because the accurate representation of chemical terms is the first and essential step to provide learning features for downstream natural language processing tasks, our results pave the road to utilize the large volume of chemical literature in future data-driven studies.
Affiliation(s)
- Liyuan Huang
- Toyota Research Institute of North America, 1555 Woodridge Avenue, Ann Arbor, Michigan 48105, United States
| | - Chen Ling
- Toyota Research Institute of North America, 1555 Woodridge Avenue, Ann Arbor, Michigan 48105, United States
| |
|
99
|
A fast neural network approach for direct covariant forces prediction in complex multi-element extended systems. NAT MACH INTELL 2019. [DOI: 10.1038/s42256-019-0098-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
100
|
Paul A, Furmanchuk A, Liao W, Choudhary A, Agrawal A. Property Prediction of Organic Donor Molecules for Photovoltaic Applications Using Extremely Randomized Trees. Mol Inform 2019; 38:e1900038. [DOI: 10.1002/minf.201900038] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2019] [Accepted: 07/18/2019] [Indexed: 01/16/2023]
Affiliation(s)
- Arindam Paul
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Alona Furmanchuk
- Institute for Public Health and Medicine, Feinberg School of Medicine, Center for Health Information Partnerships, Northwestern University, Chicago, IL 60611, USA
| | - Wei‐keng Liao
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Alok Choudhary
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Ankit Agrawal
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA
| |
|