1
|
Smith NB, Garden AL. A Divide-and-Conquer Approach to Nanoparticle Global Optimisation Using Machine Learning. J Chem Inf Model 2024; 64:8743-8755. [PMID: 39546324 DOI: 10.1021/acs.jcim.4c01516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024]
Abstract
Global optimization of the structure of atomic nanoparticles is often hampered by the presence of many funnels on the potential energy surface. While broad funnels are readily encountered and easily exploited by the search, narrow funnels are more difficult to locate and explore, presenting a problem if the global minimum is situated in such a funnel. Here, a divide-and-conquer approach is applied to overcome the issue posed by the multifunnel effect using a machine learning approach, without using a priori knowledge of the potential energy surface. This approach begins with a truncated exploration to gather coarse-grained knowledge of the potential energy surface. This is then used to train a machine learning Gaussian mixture model to divide up the potential energy surface into separate regions, with each region then being explored in more detail (or conquered) separately. This scheme was tested on a variety of multifunnel systems and yielded significant improvements to the times taken to locate the global minima of Lennard-Jones (LJ) nanoparticles, LJ75 and LJ104, as well as two metallic systems, Au55 and Pd88. However, difficulties were encountered for LJ98, providing insight into how the scheme could be further improved.
Affiliation(s)
- Nicholas B Smith
  - Department of Chemistry, University of Otago, P.O. Box 56, Dunedin 9054, New Zealand
  - MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, P.O. Box 600, Wellington 6140, New Zealand
- Anna L Garden
  - Department of Chemistry, University of Otago, P.O. Box 56, Dunedin 9054, New Zealand
  - MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, P.O. Box 600, Wellington 6140, New Zealand
|
2
|
Forni T, Baldoni M, Le Piane F, Mercuri F. GrapheNet: a deep learning framework for predicting the physical and electronic properties of nanographenes using images. Sci Rep 2024; 14:24576. [PMID: 39426999 PMCID: PMC11490583 DOI: 10.1038/s41598-024-75841-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Accepted: 10/08/2024] [Indexed: 10/21/2024] Open
Abstract
In this work we introduce GrapheNet, a deep learning framework based on an Inception-Resnet architecture using image-like encoding of structural features for the prediction of the properties of nanographenes. The model is validated on datasets of computed structure/property data on graphene oxide and defected graphene nanoflakes. By exploiting the planarity of quasi-bidimensional systems, encoding structures into images, and leveraging the flexibility and power of deep learning in image processing, GrapheNet achieves significant accuracy in predicting the physicochemical properties of nanographenes. This approach is able to efficiently encode structures composed of hundreds of atoms, scaling efficiently with the size of the model and enabling the prediction of the properties of large systems, which contrasts with the limitations of current atomistic-level representations for deep learning applications. The proposed approach based on image encoding exhibits significant numerical accuracy and outperforms the computational efficiency of current representations of materials at the atomistic level, with significant advantages especially in the representation of nanostructures and large planar systems.
Affiliation(s)
- Tommaso Forni
  - DAIMON Lab, Istituto per lo Studio dei Materiali Nanostrutturati (ISMN), Consiglio Nazionale delle Ricerche (CNR), Via P. Gobetti 101, Bologna, 40129, Italy
  - Department of Control and Computer Engineering, Polytechnic University of Turin, Corso Castelfidardo 34/d, Turin, 10138, Italy
- Matteo Baldoni
  - DAIMON Lab, Istituto per lo Studio dei Materiali Nanostrutturati (ISMN), Consiglio Nazionale delle Ricerche (CNR), Via P. Gobetti 101, Bologna, 40129, Italy
- Fabio Le Piane
  - DAIMON Lab, Istituto per lo Studio dei Materiali Nanostrutturati (ISMN), Consiglio Nazionale delle Ricerche (CNR), Via P. Gobetti 101, Bologna, 40129, Italy
  - Department of Computer Science and Engineering, University of Bologna, via Zamboni 33, Bologna, 40126, Italy
- Francesco Mercuri
  - DAIMON Lab, Istituto per lo Studio dei Materiali Nanostrutturati (ISMN), Consiglio Nazionale delle Ricerche (CNR), Via P. Gobetti 101, Bologna, 40129, Italy
|
3
|
Zhuang Z, Barnard AS. Classification of battery compounds using structure-free Mendeleev encodings. J Cheminform 2024; 16:47. [PMID: 38671512 PMCID: PMC11055346 DOI: 10.1186/s13321-024-00836-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 03/29/2024] [Indexed: 04/28/2024] Open
Abstract
Machine learning is a valuable tool that can accelerate the discovery and design of materials occupying combinatorial chemical spaces. However, the prerequisite need for vast amounts of training data can be prohibitive when significant resources are needed to characterize or simulate candidate structures. Recent results have shown that structure-free encoding of complex materials, based entirely on chemical compositions, can overcome this impediment and perform well in unsupervised learning tasks. In this study, we extend this exploration to supervised classification, and show how structure-free encoding can accurately predict classes of material compounds for battery applications without time-consuming measurement of bonding networks, lattices or densities. SCIENTIFIC CONTRIBUTION: The comprehensive evaluation of structure-free encodings of complex materials in classification tasks, including binary and multi-class separation, inclusive of three classifiers based on different logic functions, is measured using four metrics and learning curves. The encoding is applied to two data sets from computational and experimental sources, and the outcomes are visualised using 5 approaches to confirm the suitability and superiority of Mendeleev encoding. These methods are general and accessible using source software, to provide simple, intuitive and interpretable materials informatics outcomes to accelerate materials design.
Affiliation(s)
- Zixin Zhuang
  - School of Computing, Australian National University, 145 Science Road, Acton, 2601, ACT, Australia
- Amanda S Barnard
  - School of Computing, Australian National University, 145 Science Road, Acton, 2601, ACT, Australia
|
4
|
Abstract
A significant challenge in the development of functional materials is understanding the growth and transformations of anisotropic colloidal metal nanocrystals. Theory and simulations can aid in the development and understanding of anisotropic nanocrystal syntheses. The focus of this review is on how results from first-principles calculations and classical techniques, such as Monte Carlo and molecular dynamics simulations, have been integrated into multiscale theoretical predictions useful in understanding shape-selective nanocrystal syntheses. Also, examples are discussed in which machine learning has been useful in this field. There are many areas at the frontier in condensed matter theory and simulation that are or could be beneficial in this area and these prospects for future progress are discussed.
Affiliation(s)
- Kristen A Fichthorn
  - Department of Chemical Engineering and Department of Physics, The Pennsylvania State University, University Park, Pennsylvania 16803, United States
|
5
|
Bhat N, Barnard AS, Birbilis N. Unsupervised machine learning discovers classes in aluminium alloys. Royal Society Open Science 2023; 10:220360. [PMID: 36756073 PMCID: PMC9890099 DOI: 10.1098/rsos.220360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Accepted: 10/26/2022] [Indexed: 06/18/2023]
Abstract
Aluminium (Al) alloys are critical to many applications. Although Al alloys have been commercially widespread for over a century, their development has predominantly taken a trial-and-error approach. Furthermore, many discrete studies regarding Al alloys, often application specific, have precluded a broader consolidation of Al alloy classification. Iterative label spreading (ILS), an unsupervised machine learning approach, was used to identify the different classes of Al alloys, drawing from a specifically curated dataset of 1154 Al alloys (including alloy composition and processing conditions). Using ILS, eight classes of Al alloys were identified based on a comprehensive feature set under two descriptors. Further, a decision tree classifier was used to validate the separation of classes.
Affiliation(s)
- Ninad Bhat
  - College of Engineering and Computer Science, The Australian National University, Acton, ACT 2601, Australia
- Amanda S. Barnard
  - College of Engineering and Computer Science, The Australian National University, Acton, ACT 2601, Australia
- Nick Birbilis
  - College of Engineering and Computer Science, The Australian National University, Acton, ACT 2601, Australia
|
6
|
Frömbgen T, Blasius J, Alizadeh V, Chaumont A, Brehm M, Kirchner B. Cluster Analysis in Liquids: A Novel Tool in TRAVIS. J Chem Inf Model 2022; 62:5634-5644. [PMID: 36315975 DOI: 10.1021/acs.jcim.2c01244] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
We present a novel cluster analysis implemented in our open-source software TRAVIS and its application to realistic and complex chemical systems. The underlying algorithm is exclusively based on atom distances. Using a two-dimensional model system, we first introduce different cluster analysis functions and their application to single snapshots and trajectories including periodicity and temporal propagation. Using molecular dynamics simulations of pure water with varying system size, we show that our cluster analysis is size-independent. Furthermore, we observe a similar clustering behavior of pure water in classical and ab initio molecular dynamics simulations, showing that our cluster analysis is universal. In order to emphasize the application to more complex systems and mixtures, we additionally apply the cluster analysis to ab initio molecular dynamics simulations of the [C2C1Im][OAc] ionic liquid and its mixture with water. Using that, we show that our cluster analysis is able to analyze the clustering of the individual components in a mixture as well as the clustering of the ionic liquid with water.
Affiliation(s)
- Tom Frömbgen
  - Mulliken Center for Theoretical Chemistry, University of Bonn, Beringstraße 4+6, D-53115 Bonn, Germany
- Jan Blasius
  - Mulliken Center for Theoretical Chemistry, University of Bonn, Beringstraße 4+6, D-53115 Bonn, Germany
- Vahideh Alizadeh
  - Mulliken Center for Theoretical Chemistry, University of Bonn, Beringstraße 4+6, D-53115 Bonn, Germany
- Alain Chaumont
  - Laboratoire MSM, UMR 7140 CNRS, Institut de Chimie, 4 Rue Blaise Pascal, F-67000 Strasbourg, France
- Martin Brehm
  - Institut für Chemie, Martin-Luther-Universität Halle-Wittenberg, Von-Danckelmann-Platz 4, D-06120 Halle (Saale), Germany
- Barbara Kirchner
  - Mulliken Center for Theoretical Chemistry, University of Bonn, Beringstraße 4+6, D-53115 Bonn, Germany
|
7
|
Tao Z, Fangfang X. Research on the therapeutic effects of drugs for patients with mental illness based on cluster analysis. Applied Nanoscience 2021. [DOI: 10.1007/s13204-021-02071-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
8
|
Zhang H, Barnard AS. Impact of atomistic or crystallographic descriptors for classification of gold nanoparticles. Nanoscale 2021; 13:11887-11898. [PMID: 34190263 DOI: 10.1039/d1nr02258j] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Machine learning models are known to be sensitive to the features used to train them, but there is currently no way to predict the impact of using different features prior to feature extraction. This is particularly important in fields such as nanotechnology that are highly multi-disciplinary, where samples can be characterised in many different ways depending on the preferences of individual researchers. Does it matter if nanomaterials are described using the interatomic coordinations or more complex order parameters? In this study we compare results of supervised and unsupervised learning on a single set of gold nanoparticles that has been characterised by two different descriptors, each with a unique feature space. We find that there are some consistencies, and model selection is descriptor-agnostic, but the level of detail and the type of information that can be extracted from the results is sensitive to the way the particles are described. Unsupervised clustering revealed that an atomistic descriptor provides a finer-grained interpretation and clusters that are sub-clusters of a more sophisticated crystallographic descriptor, which is consistent with both how the features were calculated, and how they are interpreted in the domain. A supervised classifier revealed that the types of features responsible for the separation are related to the bulk structure, regardless of the descriptor, but capture different types of information. For both the atomistic and crystallographic descriptors, the gradient boosting decision tree classifier gave superior results, with F1-scores of 0.96 and 0.98, respectively, and excellent precision and recall, even though the clustering presented a challenging multi-classification problem.
Affiliation(s)
- Haonan Zhang
  - School of Computing, Australian National University, Acton 2601, Australia
|
9
|
Deng B, Wu J. The Cultivation of Innovation and Entrepreneurship Skills and Teaching Strategies for College Students from the Perspective of Big Data. Arabian Journal for Science and Engineering 2021. [DOI: 10.1007/s13369-021-05893-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
10
|
Calvo F, Simon A, Parneix P, Falvo C, Dubosq C. Infrared Spectroscopy of Chemically Diverse Carbon Clusters: A Data-Driven Approach. J Phys Chem A 2021; 125:5509-5518. [PMID: 34138562 DOI: 10.1021/acs.jpca.1c03368] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Carbon clusters exhibit a broad diversity of topologies and shapes, encompassing fullerene-like cages, graphene-like flakes, and more disordered pretzel-like and branched structures. Here, we computationally examine their infrared spectra in relation to these structures from a statistical perspective. Individual spectra for broad samples of isomers were determined by means of the self-consistent charge density functional-based tight-binding method, and an interpolation scheme is designed to reproduce the spectral features by regression on a much smaller subset of the sample. This interpolation proceeds by encoding the structures using appropriate descriptors and selecting them through principal component analysis, with Gaussian regression or inverse distance weighting providing the nonlinear weighting functions. Metric learning is employed to reduce the global error on a preselected testing set. The interpolated spectra satisfactorily reproduce the specific spectral features and their dependence on the size and shape, enabling quantitative prediction away from the testing set. Finally, the classification of structures within the four proposed families is critically discussed through a statistical analysis of the sample based on iterative label spreading.
Affiliation(s)
- Florent Calvo
  - Univ. Grenoble Alpes, CNRS, LiPhy, 38000 Grenoble, France
- Aude Simon
  - Laboratoire de Chimie et Physique Quantiques LCPQ/FeRMI, UMR5626, Université de Toulouse (UPS) and CNRS, 31062 Toulouse, France
- Pascal Parneix
  - Université Paris-Saclay, CNRS, Institut des Sciences Moléculaires d'Orsay, 91405 Orsay, France
- Cyril Falvo
  - Univ. Grenoble Alpes, CNRS, LiPhy, 38000 Grenoble, France
  - Université Paris-Saclay, CNRS, Institut des Sciences Moléculaires d'Orsay, 91405 Orsay, France
- Clément Dubosq
  - Laboratoire de Chimie et Physique Quantiques LCPQ/FeRMI, UMR5626, Université de Toulouse (UPS) and CNRS, 31062 Toulouse, France
|
11
|
Du H, Feng L, Xu Y, Zhan E, Xu W. Clinical Influencing Factors of Acute Myocardial Infarction Based on Improved Machine Learning. Journal of Healthcare Engineering 2021; 2021:5569039. [PMID: 33854744 PMCID: PMC8019385 DOI: 10.1155/2021/5569039] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/04/2021] [Revised: 02/25/2021] [Accepted: 03/14/2021] [Indexed: 11/26/2022]
Abstract
At present, there is no method to predict or monitor patients with acute myocardial infarction (AMI), and there is no specific treatment method. In order to improve the analysis of clinical influencing factors of AMI, based on the machine learning algorithm, this paper uses the K-means algorithm to carry out multifactor analysis and constructs a hybrid model combined with the ART2 network. Moreover, this paper simulates and analyzes the model training process and builds a system structure model based on the KNN algorithm. After constructing the model system, this paper studies the clinical influencing factors of AMI and combines mathematical statistics and factor analysis to carry out statistical analysis of test results. The research results show that the system model constructed in this paper has a certain effect in the clinical analysis of AMI.
Affiliation(s)
- Hongwei Du
  - Department of Cardiology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Linxing Feng
  - Department of Cardiology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Yan Xu
  - Department of Cardiology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Enbo Zhan
  - Department of Cardiology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Wei Xu
  - Department of Cardiology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
|
12
|
Parker AJ, Barnard AS. Unsupervised structure classes vs. supervised property classes of silicon quantum dots using neural networks. Nanoscale Horizons 2021; 6:277-282. [PMID: 33527922 DOI: 10.1039/d0nh00637h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Machine learning classification is a useful technique to predict structure/property relationships in samples of nanomaterials where distributions of sizes and mixtures of shapes are persistent. The separation of classes, however, can either be supervised based on domain knowledge (human intelligence), or based entirely on unsupervised machine learning (artificial intelligence). This raises the questions of which approach is more reliable, and how they compare. In this study we combine an ensemble data set of electronic structure simulations of the size, shape and peak wavelength for the optical emission of hydrogen passivated silicon quantum dots with artificial neural networks to explore the utility of different types of classes. By comparing the domain-driven and data-driven approaches we find there is a disconnect between what we see (optical emission) and assume (that a particular color band represents a special class), and what the data supports. Contrary to expectation, controlling a limited set of structural characteristics is not specific enough to classify a quantum dot based on color, even though it is experimentally intuitive.
Affiliation(s)
- Amanda J Parker
  - CSIRO Data61, Door 34 Goods Shed Village St, Docklands, Victoria, Australia
|
13
|
Parker AJ, Motevalli B, Opletal G, Barnard AS. The pure and representative types of disordered platinum nanoparticles from machine learning. Nanotechnology 2021; 32:095404. [PMID: 33212430 DOI: 10.1088/1361-6528/abcc23] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
The development of interpretable structure/property relationships is a cornerstone of nanoscience, but can be challenging when the structural diversity and complexity exceed our ability to characterise them. This is often the case for imperfect, disordered and amorphous nanoparticles, where even the nomenclature can be unspecific. Disordered platinum nanoparticles have exhibited superior performance for some reactions, which makes a systematic way of describing them highly desirable. In this study we have used a diverse set of disordered platinum nanoparticles and machine learning to identify the pure and representative structures based on their similarity in 121 dimensions. We identify two prototypes that are representative of separable classes, and seven archetypes that are the pure structures on the convex hull with which all other possibilities can be described. Together these nine nanoparticles can explain all of the variance in the set, and can be described as either single crystal, twinned, spherical or branched; with or without roughened surfaces. This forms a robust sub-set of platinum nanoparticles upon which to base further work, and provides a theoretical basis for discussing structure/property relationships of platinum nanoparticles that are not geometrically ideal.
Affiliation(s)
- Amanda S Barnard
  - ANU Research School of Computer Science, Acton ACT 2601, Australia
|
14
|
Parker AJ, Barnard AS. Machine learning reveals multiple classes of diamond nanoparticles. Nanoscale Horizons 2020; 5:1394-1399. [PMID: 32840548 DOI: 10.1039/d0nh00382d] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/11/2023]
Abstract
Generating samples of nanoparticles with specific properties that allow for structural diversity, rather than requiring structural precision, is a more sustainable prospect for industry, where samples need to be both targeted to specific applications and cost effective. This can be better enabled by defining classes of nanoparticles and characterising the properties of the class as a whole. In this study, we use machine learning to predict the different classes of diamond nanoparticles based entirely on the structural features and explore the populations of these classes in terms of the size, shape, speciation and charge transfer properties. We identify 9 different types of diamond nanoparticles based on their similarity in 17 dimensions and, contrary to conventional wisdom, find that the fractions of sp2 or sp3 hybridized atoms are not strong determinants, and that the classes are only weakly related to size. Each class has been described in such a way as to enable rapid assignment using microanalysis techniques.
Affiliation(s)
- Amanda J Parker
  - Data61 CSIRO, Door 34 Goods Shed Village St, Docklands, Victoria, Australia
|
15
|
Barnard AS, Opletal G. Predicting structure/property relationships in multi-dimensional nanoparticle data using t-distributed stochastic neighbour embedding and machine learning. Nanoscale 2019; 11:23165-23172. [PMID: 31777891 DOI: 10.1039/c9nr03940f] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
Combining researchers' domain expertise and advanced dimension reduction methods, we demonstrate how visually comparing the distribution of nanoparticles mapped from multiple dimensions to a two-dimensional plane can rapidly identify possible single-structure/property relationships and, to a lesser extent, multi-structure/property relationships. These relationships can be further investigated and confirmed with machine learning, using genetic programming to inform the choice of property-specific models and their hyper-parameters. In our nanodiamond case study, we visually identify and confirm a strong relationship between the size and the probability of observation (stability) and a more complicated (and visually ambiguous) relationship between the ionisation potential and band gaps and a range of different structural, chemical and statistical surface features, making these properties more difficult to engineer in practice.
Affiliation(s)
- A S Barnard
  - CSIRO Data61, Docklands, Victoria, Australia
- G Opletal
  - CSIRO Data61, Docklands, Victoria, Australia
|