1
He S, Segura Abarrategi J, Bediaga H, Arrasate S, González-Díaz H. On the additive artificial intelligence-based discovery of nanoparticle neurodegenerative disease drug delivery systems. BEILSTEIN JOURNAL OF NANOTECHNOLOGY 2024; 15:535-555. [PMID: 38774585 PMCID: PMC11106676 DOI: 10.3762/bjnano.15.47] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/23/2024] [Indexed: 05/24/2024]
Abstract
Neurodegenerative diseases are characterized by slowly progressing neuronal cell death. Conventional drug treatment strategies often fail because of poor solubility, low bioavailability, and the inability of the drugs to effectively cross the blood-brain barrier. Therefore, the development of new neurodegenerative disease drugs (NDDs) requires immediate attention. Nanoparticle (NP) systems are of increasing interest for transporting NDDs to the central nervous system. However, discovering effective nanoparticle neuronal disease drug delivery systems (N2D3Ss) is challenging because of the vast number of combinations of NP and NDD compounds, as well as the various assays involved. Artificial intelligence/machine learning (AI/ML) algorithms have the potential to accelerate this process by predicting the most promising NDD and NP candidates for assaying. Nevertheless, the relatively limited amount of reported data on N2D3S activity compared to assayed NDDs makes AI/ML analysis challenging. In this work, the IFPTML technique, which combines information fusion (IF), perturbation theory (PT), and machine learning (ML), was employed to address this challenge. Initially, we conducted the fusion into a unified dataset comprising 4403 NDD assays from ChEMBL and 260 NP cytotoxicity assays from journal articles. Through a resampling process, three new working datasets were generated, each containing 500,000 cases. We utilized linear discriminant analysis (LDA) along with artificial neural network (ANN) algorithms, such as multilayer perceptron (MLP) and deep learning networks (DLN), to construct linear and non-linear IFPTML models. The IFPTML-LDA models exhibited sensitivity (Sn) and specificity (Sp) values in the range of 70% to 73% (>375,000 training cases) and 70% to 80% (>125,000 validation cases), respectively. In contrast, the IFPTML-MLP and IFPTML-DLN achieved Sn and Sp values in the range of 85% to 86% for both training and validation series. 
Additionally, IFPTML-ANN models showed an area under the receiver operating curve (AUROC) of approximately 0.93 to 0.95. These results indicate that the IFPTML models could serve as valuable tools in the design of drug delivery systems for neurosciences.
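The sensitivity (Sn), specificity (Sp), and AUROC figures quoted in this abstract follow the standard definitions for binary classifiers. As a minimal sketch of how such values are computed, assuming labels of 1 for active and 0 for inactive cases: the function names, the 0.5 threshold, and the toy labels and scores below are illustrative assumptions, not data or code from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (Sn, Sp) for binary labels where 1 = active and 0 = inactive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # actives predicted active
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # actives missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # inactives predicted inactive
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # inactives flagged active
    sn = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return sn, sp


def auroc(y_true, scores):
    """AUROC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): four active and four inactive cases with model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

sn, sp = sensitivity_specificity(y_true, y_pred)
auc = auroc(y_true, scores)
print(f"Sn = {sn:.2%}, Sp = {sp:.2%}, AUROC = {auc:.4f}")
```

The pairwise AUROC loop is quadratic in the number of cases, so it suits small illustrations only; for datasets of the size described in the abstract a rank-based implementation would be the more practical choice.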
Affiliation(s)
- Shan He
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº6, 48940 Leioa, Greater Bilbao, Basque Country, Spain
- Julen Segura Abarrategi
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Harbil Bediaga
  - IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº6, 48940 Leioa, Greater Bilbao, Basque Country, Spain
  - Painting Department, Fine Arts Faculty, University of the Basque Country UPV/EHU, 48940, Leioa, Biscay, Basque Country, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Instituto Biofisika (UPV/EHU-CSIC), 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
2
Lu S, Jayaraman A. Pair-Variational Autoencoders for Linking and Cross-Reconstruction of Characterization Data from Complementary Structural Characterization Techniques. JACS AU 2023; 3:2510-2521. [PMID: 37772182 PMCID: PMC10523369 DOI: 10.1021/jacsau.3c00275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 07/11/2023] [Accepted: 07/11/2023] [Indexed: 09/30/2023]
Abstract
In materials research, structural characterization often requires multiple complementary techniques to obtain a holistic morphological view of a synthesized material. Depending on the availability and accessibility of the different characterization techniques (e.g., scattering, microscopy, spectroscopy), each research facility or academic research lab may have access to high-throughput capability in one technique but face limitations (sample preparation, resolution, access time) with other technique(s). Furthermore, one type of structural characterization data may be easier to interpret than another (e.g., microscopy images are easier to interpret than small-angle scattering profiles). Thus, it is useful to have machine learning models that can be trained on paired structural characterization data from multiple techniques (easy and difficult to interpret, fast and slow in data collection or sample preparation) so that the model can generate one set of characterization data from the other. In this paper we demonstrate one such machine learning workflow, Pair-Variational Autoencoders (PairVAE), that works with data from small-angle X-ray scattering (SAXS) that present information about bulk morphology and images from scanning electron microscopy (SEM) that present two-dimensional local structural information on the sample. Using paired SAXS and SEM data of newly observed block copolymer assembled morphologies [open access data from Doerk G. S.; et al. Sci. Adv. 2023, 9 (2), eadd3687], we train our PairVAE. After successful training, we demonstrate that the PairVAE can generate SEM images of the block copolymer morphology when it takes as input that sample's corresponding SAXS 2D pattern and vice versa.
This method can be extended to other soft material morphologies as well and serves as a valuable tool for easy interpretation of 2D SAXS patterns as well as an engine for generating ensembles of similar microscopy images to create a database for other downstream calculations of structure-property relationships.
Affiliation(s)
- Shizhao Lu
  - Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, Delaware 19716, United States
- Arthi Jayaraman
  - Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, Delaware 19716, United States
  - Department of Materials Science and Engineering, University of Delaware, Newark, Delaware 19716, United States
3
Kirschbaum T, von Seggern B, Dzubiella J, Bande A, Noé F. Machine Learning Frontier Orbital Energies of Nanodiamonds. J Chem Theory Comput 2023; 19:4461-4473. [PMID: 37053438 DOI: 10.1021/acs.jctc.2c01275] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 04/15/2023]
Abstract
Nanodiamonds have a wide range of applications including catalysis, sensing, tribology, and biomedicine. To leverage nanodiamond design via machine learning, we introduce the new data set ND5k, consisting of 5089 diamondoid and nanodiamond structures and their frontier orbital energies. ND5k structures are optimized via tight-binding density functional theory (DFTB) and their frontier orbital energies are computed using density functional theory (DFT) with the PBE0 hybrid functional. From this data set we derive a qualitative design suggestion for nanodiamonds in photocatalysis. We also compare recent machine learning models for predicting frontier orbital energies of structures similar to those they were trained on (interpolation on ND5k), and we test their ability to extrapolate predictions to larger structures. For both the interpolation and the extrapolation task, we find the best performance using the equivariant message passing neural network PaiNN. The second-best results are achieved with a message passing neural network using a tailored set of atomic descriptors proposed here.
Affiliation(s)
- Thorren Kirschbaum
  - Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Börries von Seggern
  - Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
  - Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Arnimallee 22, 14195 Berlin, Germany
- Joachim Dzubiella
  - Institute of Physics, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 3, 79104 Freiburg im Breisgau, Germany
- Annika Bande
  - Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
  - Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178 Berlin, Germany
  - Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
4
Wang Z, Sun Z, Yin H, Liu X, Wang J, Zhao H, Pang CH, Wu T, Li S, Yin Z, Yu XF. Data-Driven Materials Innovation and Applications. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2022; 34:e2104113. [PMID: 35451528 DOI: 10.1002/adma.202104113] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 05/30/2021] [Revised: 03/19/2022] [Indexed: 05/07/2023]
Abstract
Owing to rapid developments that have improved the accuracy and efficiency of both experimental and computational investigative methodologies, the massive amounts of data generated have led the field of materials science into the fourth paradigm of data-driven scientific research. This transition requires the development of authoritative and up-to-date frameworks for data-driven approaches for material innovation. A critical discussion on the current advances in the data-driven discovery of materials with a focus on frameworks, machine-learning algorithms, material-specific databases, descriptors, and targeted applications in the field of inorganic materials is presented. Frameworks for rationalizing data-driven material innovation are described, and a critical review of essential subdisciplines is presented, including: i) advanced data-intensive strategies and machine-learning algorithms; ii) material databases and related tools and platforms for data generation and management; iii) molecular descriptors commonly used in data-driven processes. Furthermore, an in-depth discussion on the broad applications of material innovation, such as energy conversion and storage, environmental decontamination, flexible electronics, optoelectronics, superconductors, metallic glasses, and magnetic materials, is provided. Finally, how these subdisciplines (with insights into the synergy of materials science, computational tools, and mathematics) support data-driven paradigms is outlined, and the opportunities and challenges in data-driven material innovation are highlighted.
Affiliation(s)
- Zhuo Wang
  - Materials Interfaces Center, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, P. R. China
  - Department of Chemical and Environmental Engineering, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
- Zhehao Sun
  - Research School of Chemistry, The Australian National University, ACT, 2601, Australia
- Hang Yin
  - Research School of Chemistry, The Australian National University, ACT, 2601, Australia
- Xinghui Liu
  - Department of Chemistry, Sungkyunkwan University (SKKU), 2066 Seoburo, Jangan-Gu, Suwon, 16419, Republic of Korea
- Jinlan Wang
  - School of Physics, Southeast University, Nanjing, 211189, P. R. China
- Haitao Zhao
  - Materials Interfaces Center, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, P. R. China
- Cheng Heng Pang
  - Department of Chemical and Environmental Engineering, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
  - Municipal Key Laboratory of Clean Energy Conversion Technologies, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
- Tao Wu
  - Key Laboratory for Carbonaceous Wastes Processing and Process Intensification Research of Zhejiang Province, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
  - New Materials Institute, University of Nottingham, Ningbo, China, Ningbo, 315100, P. R. China
- Shuzhou Li
  - School of Materials Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore
- Zongyou Yin
  - Research School of Chemistry, The Australian National University, ACT, 2601, Australia
- Xue-Feng Yu
  - Materials Interfaces Center, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, P. R. China
5
Ting JYC, Li S, Barnard AS. Causal Paths Allowing Simultaneous Control of Multiple Nanoparticle Properties Using Multi‐Target Bayesian Inference. ADVANCED THEORY AND SIMULATIONS 2022. [DOI: 10.1002/adts.202200330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Sichao Li
  - School of Computing, Australian National University, Acton 2601, Australia
- Amanda S. Barnard
  - School of Computing, Australian National University, Acton 2601, Australia
6
Machine learning assisted optimization of blending process of polyphenylene sulfide with elastomer using high speed twin screw extruder. Sci Rep 2021; 11:24079. [PMID: 34911974 PMCID: PMC8674312 DOI: 10.1038/s41598-021-03513-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2021] [Accepted: 12/06/2021] [Indexed: 11/08/2022]
Abstract
Random forest regression was applied to optimize the melt-blending process of polyphenylene sulfide (PPS) with poly(ethylene-glycidyl methacrylate-methyl acrylate) (E-GMA-MA) elastomer to improve the Charpy impact strength. A training dataset was constructed using four elastomers with different GMA and MA contents by varying the elastomer content up to 20 wt% and the screw rotation speed of the extruder up to 5000 rpm at a fixed barrel temperature of 300 °C. Besides the controlled parameters, the following measured parameters were incorporated into the descriptors for the regression: motor torque, polymer pressure, and polymer temperatures monitored by infrared-ray thermometers installed at four positions (T1 to T4) as well as the melt viscosity and elastomer particle diameter of the product. The regression without prior knowledge revealed that the polymer temperature T1 just after the first kneading block is an important parameter next to the elastomer content. High impact strength required high elastomer content and T1 below 320 °C. The polymer temperature T1 was much higher than the barrel temperature and increased with the screw speed due to the heat of shear. The overheating caused thermal degradation, leading to a decrease in the melt viscosity and an increase in the particle diameter at high screw speed. We thus reduced the barrel temperature to keep T1 around 310 °C. This increased the impact strength from 58.6 kJ m⁻² as the maximum in the training dataset to 65.3 and 69.0 kJ m⁻² at elastomer contents of 20 and 30 wt%, respectively.
7
Li S, Barnard AS. Inverse Design of Nanoparticles Using Multi‐Target Machine Learning. ADVANCED THEORY AND SIMULATIONS 2021. [DOI: 10.1002/adts.202100414] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/09/2022]
Affiliation(s)
- Sichao Li
  - School of Computing, Australian National University, Acton, Australian Capital Territory 2601, Australia
- Amanda S. Barnard
  - School of Computing, Australian National University, Acton, Australian Capital Territory 2601, Australia
8
Diéguez-Santana K, González-Díaz H. Towards machine learning discovery of dual antibacterial drug-nanoparticle systems. NANOSCALE 2021; 13:17854-17870. [PMID: 34671801 DOI: 10.1039/d1nr04178a] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/13/2023]
Abstract
Artificial Intelligence/Machine Learning (AI/ML) algorithms may speed up the design of DADNP systems formed by Antibacterial Drugs (AD) and Nanoparticles (NP). In this work, we used the IFPTML = Information Fusion (IF) + Perturbation-Theory (PT) + Machine Learning (ML) algorithm for the first time to study a large dataset of putative DADNP systems composed of >165 000 ChEMBL AD assays and 300 NP assays vs. multiple bacteria species. We trained alternative models with Linear Discriminant Analysis (LDA), Artificial Neural Networks (ANN), Bayesian Networks (BNN), K-Nearest Neighbour (KNN) and other algorithms. The IFPTML-LDA model was simpler, with values of Sp ≈ 90% and Sn ≈ 74% in both training (>124 K cases) and validation (>41 K cases) series. The IFPTML-ANN and KNN models are notably more complicated, although they are more balanced, with Sn ≈ Sp ≈ 88.5%-99.0% and AUROC ≈ 0.94-0.99 in both series. We also carried out a simulation (>1900 calculations) of the expected behavior for putative DADNPs in 72 different biological assays. The putative DADNPs studied are formed by 27 different drugs with multiple classes of NP and types of coats. In addition, we tested the validity of our additive model with 80 DADNP complexes experimentally synthesized and biologically tested (reported in >45 papers). All these DADNPs show values of MIC < 50 μg mL⁻¹ (cutoff used), better than the MIC of AD and NP alone (synergistic or additive effect). The assays involve DADNP complexes with 10 types of NP, 6 coating materials, NP size range 5-100 nm vs. 15 different antibiotics, and 12 bacteria species. The IFPTML-LDA model correctly classified 100% (80 out of 80) of the DADNP complexes as biologically active. The IFPTML additive strategy may become a useful tool to assist the design of DADNP systems for antibacterial therapy, taking into consideration only information about the AD and NP components separately.
Affiliation(s)
- Karel Diéguez-Santana
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Basque Center for Biophysics CSIC-UPVEH, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
9
Rincón-López J, Almanza-Arjona YC, Riascos AP, Rojas-Aguirre Y. When Cyclodextrins Met Data Science: Unveiling Their Pharmaceutical Applications through Network Science and Text-Mining. Pharmaceutics 2021; 13:1297. [PMID: 34452258 PMCID: PMC8399453 DOI: 10.3390/pharmaceutics13081297] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/19/2021] [Revised: 08/14/2021] [Accepted: 08/16/2021] [Indexed: 12/21/2022]
Abstract
We present a data-driven approach to unveil the pharmaceutical technologies of cyclodextrins (CDs) by analyzing a dataset of CD pharmaceutical patents. First, we implemented network science techniques to represent CD patents as a single structure and provide a framework for unsupervised detection of keywords in the patent dataset. Guided by those keywords, we further mined the dataset to examine the patenting trends according to CD-based dosage forms. CD patents formed complex networks, evidencing the supremacy of CDs for solubility enhancement and how this has triggered cutting-edge applications based on or beyond the solubility improvement. The networks exposed the significance of CDs to formulate aqueous solutions, tablets, and powders. Additionally, they highlighted the role of CDs in formulations of anti-inflammatory drugs, cancer therapies, and antiviral strategies. Text-mining showed that the trends in CDs for aqueous solutions, tablets, and powders are going upward. Gels seem to be promising, while patches and fibers are emerging. Cyclodextrins' potential in suspensions and emulsions is yet to be recognized and can become an opportunity area. This is the first unsupervised/supervised data-mining approach aimed at depicting a landscape of CDs to identify trending and emerging technologies and uncover opportunity areas in CD pharmaceutical research.
Affiliation(s)
- Juliana Rincón-López
  - Instituto de Investigaciones en Materiales, Universidad Nacional Autónoma de México, Ciudad Universitaria, Mexico City 04510, Mexico
- Yara C. Almanza-Arjona
  - Instituto de Ciencias Aplicadas y Tecnología, Universidad Nacional Autónoma de México, Ciudad Universitaria, Mexico City 04510, Mexico
- Alejandro P. Riascos
  - Instituto de Física, Universidad Nacional Autónoma de México, Ciudad Universitaria, Mexico City 04510, Mexico
- Yareli Rojas-Aguirre
  - Instituto de Investigaciones en Materiales, Universidad Nacional Autónoma de México, Ciudad Universitaria, Mexico City 04510, Mexico
10
Zhang H, Barnard AS. Impact of atomistic or crystallographic descriptors for classification of gold nanoparticles. NANOSCALE 2021; 13:11887-11898. [PMID: 34190263 DOI: 10.1039/d1nr02258j] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Machine learning models are known to be sensitive to the features used to train them, but there is currently no way to predict the impact of using different features prior to feature extraction. This is particularly important to fields such as nanotechnology that are highly multi-disciplinary, and samples can be characterised many different ways depending on the preferences of individual researchers. Does it matter if nanomaterials are described using the interatomic coordinations or more complex order parameters? In this study we compare results of supervised and unsupervised learning on a single set of gold nanoparticles that has been characterised by two different descriptors, each with a unique feature space. We find that there are some consistencies, and model selection is descriptor-agnostic, but the level of detail and the type of information that can be extracted from the results is sensitive to the way the particles are described. Unsupervised clustering revealed that an atomistic descriptor provides a finer-grained interpretation and clusters that are sub-clusters of a more sophisticated crystallographic descriptor, which is consistent with both how the features were calculated, and how they are interpreted in the domain. A supervised classifier revealed that the types of features responsible for the separation are related to the bulk structure, regardless of the descriptor, but capture different types of information. For both the atomistic and crystallographic descriptor the gradient boosting decision tree classifier gave superior results of F1-scores of 0.96 and 0.98, respectively, with excellent precision and recall, even though the clustering presented a challenging multi-classification problem.
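The F1-scores reported in this abstract combine precision and recall into a single number. The following is a minimal sketch of the standard per-class and macro-averaged F1 calculation for a multi-class problem. The class names and counts are hypothetical toy values, not results from the paper, and the abstract does not state which averaging scheme the authors used.

```python
def f1_from_counts(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


def macro_f1(counts_by_class):
    """Unweighted mean of per-class F1 over a dict of (tp, fp, fn) tuples."""
    f1_values = [f1_from_counts(*counts)[2] for counts in counts_by_class.values()]
    return sum(f1_values) / len(f1_values)


# Hypothetical per-class counts as (true positives, false positives, false negatives).
toy_counts = {
    "cluster_A": (48, 2, 2),
    "cluster_B": (45, 5, 5),
}

for name, counts in toy_counts.items():
    p, r, f1 = f1_from_counts(*counts)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")

macro = macro_f1(toy_counts)
print(f"macro F1 = {macro:.2f}")
```

Macro averaging weights every class equally, which is one reasonable choice when cluster sizes differ; a support-weighted average would favor the larger classes instead.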
Affiliation(s)
- Haonan Zhang
  - School of Computing, Australian National University, Acton 2601, Australia
11
Ortega-Tenezaca B, González-Díaz H. IFPTML mapping of nanoparticle antibacterial activity vs. pathogen metabolic networks. NANOSCALE 2021; 13:1318-1330. [PMID: 33410431 DOI: 10.1039/d0nr07588d] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 06/12/2023]
Abstract
Nanoparticles are useful antimicrobial drug-release systems, but some nanoparticles also exhibit antibacterial activity. However, investigation of their antibacterial activity is a difficult and slow process due to the numerous combinations of nanoparticle size, shape, and composition vs. biological tests, assay organisms, and multiple activity parameters to be measured. Additionally, the overuse of antibiotics has led to the emergence of resistant bacterial strains with different metabolic networks. Computational models may speed up this process, but the models reported to date do not consider all the previous factors, and the data sources are dispersed and not curated. Thus, herein, we used an information fusion, perturbation-theory machine learning (IFPTML) approach, which we introduce here for the first time, to fit a model for the discovery of antibacterial nanoparticles. The dataset studied had 15 classes of nanoparticles (1-100 nm) with most cases in the range of 1-50 nm vs. >20 pathogenic bacteria species with different metabolic networks. The nanoparticles studied included metal nanoparticles of Au, Ag, and Cu; oxide nanoparticles of Zn, Cu, La, Al, Fe, Sn, Ti, Cd, and Si; and metal salt nanoparticles of CuI and CdS. We used the SOFT.PTML software (our own application) with a user-friendly interface for the IFPTML calculations and a control statistics package. Using SOFT.PTML, we found a linear logistic regression equation that could model 4 biological activity parameters using only 8 variables with χ² = 2265.75, p-level <0.05, sensitivity, Sn = 79.4, and specificity, Sp = 99.3, for 3213 cases (nanoparticle-bacteria pairs) in the training series. The model had Sn = 80.8 and Sp = 99.3 for 2114 cases in the external validation series. We also developed a random forest non-linear model with higher values of Sn and Sp = 98-99% in the training/validation series, although it was more complicated to use. 
SOFT.PTML has been demonstrated to be a useful tool for the analysis of complex data in nanotechnology. We also introduced a new anabolism-catabolism unbalance index of metabolic networks to reveal the biological connotation of the IFPTML predictions for antibacterial nanoparticles. These new models open a new door for the discovery of NPs vs. new bacterial species and strains with different topological structures of their metabolic networks.
Affiliation(s)
- Bernabé Ortega-Tenezaca
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
  - Amazon State University UEA, Puyo, Pastaza, Ecuador
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
  - Center for Investigation on Technologies of Information and Communication (CITIC), University of Coruña (UDC), Campus de Elviña s/n, 15071 A Coruña, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Basque Center for Biophysics CSIC-UPVEH, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
12
Parker AJ, Barnard AS. Machine learning reveals multiple classes of diamond nanoparticles. NANOSCALE HORIZONS 2020; 5:1394-1399. [PMID: 32840548 DOI: 10.1039/d0nh00382d] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
Generating samples of nanoparticles with specific properties that allow for structural diversity, rather than requiring structural precision, is a more sustainable prospect for industry, where samples need to be both targeted to specific applications and cost effective. This can be better enabled by defining classes of nanoparticles and characterising the properties of the class as a whole. In this study, we use machine learning to predict the different classes of diamond nanoparticles based entirely on the structural features and explore the populations of these classes in terms of the size, shape, speciation and charge transfer properties. We identify 9 different types of diamond nanoparticles based on their similarity in 17 dimensions and, contrary to conventional wisdom, find that the fractions of sp² or sp³ hybridized atoms are not strong determinants, and that the classes are only weakly related to size. Each class has been described in such a way as to enable rapid assignment using microanalysis techniques.
Affiliation(s)
- Amanda J Parker
  - Data61 CSIRO, Door 34 Goods Shed Village St, Docklands, Victoria, Australia