1051
Duan Q, Lee J. Fast-developing machine learning support complex system research in environmental chemistry. NEW J CHEM 2020. [DOI: 10.1039/c9nj05717j] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022]
Abstract
Machine learning will radically accelerate analysis of complex material networks in environmental chemistry.
Affiliation(s)
- Qiannan Duan
  - Department of Environment Science, Shaanxi Normal University, Xi’an 710062, China
  - State Key Laboratory of Pollution Control and Resource Reuse
- Jianchao Lee
  - Department of Environment Science, Shaanxi Normal University, Xi’an 710062, China
1052
Cowan MJ, Mpourmpakis G. Towards elucidating structure of ligand-protected nanoclusters. Dalton Trans 2020; 49:9191-9202. [DOI: 10.1039/d0dt01418d] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/13/2022]
Abstract
Developing a centralized database for ligand-protected nanoclusters can fuel machine learning and data-science-based approaches towards theoretical structure prediction.
Affiliation(s)
- Michael J. Cowan
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, USA
- Giannis Mpourmpakis
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, USA
1053
Liu S, Wang D, Maljovec D, Anirudh R, Thiagarajan JJ, Jacobs SA, Van Essen BC, Hysom D, Yeom JS, Gaffney J, Peterson L, Robinson PB, Bhatia H, Pascucci V, Spears BK, Bremer PT. Scalable Topological Data Analysis and Visualization for Evaluating Data-Driven Models in Scientific Applications. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:291-300. [PMID: 31484123 DOI: 10.1109/tvcg.2019.2934594] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
With the rapid adoption of machine learning techniques for large-scale applications in science and engineering comes the convergence of two grand challenges in visualization. First, the utilization of black box models (e.g., deep neural networks) calls for advanced techniques in exploring and interpreting model behaviors. Second, the rapid growth in computing has produced enormous datasets that require techniques that can handle millions or more samples. Although some solutions to these interpretability challenges have been proposed, they typically do not scale beyond thousands of samples, nor do they provide the high-level intuition scientists are looking for. Here, we present the first scalable solution to explore and analyze high-dimensional functions often encountered in the scientific data analysis pipeline. By combining a new streaming neighborhood graph construction, the corresponding topology computation, and a novel data aggregation scheme, namely topology aware datacubes, we enable interactive exploration of both the topological and the geometric aspect of high-dimensional data. Following two use cases from high-energy-density (HED) physics and computational biology, we demonstrate how these capabilities have led to crucial new insights in both applications.
1054
Dylla MT, Dunn A, Anand S, Jain A, Snyder GJ. Machine Learning Chemical Guidelines for Engineering Electronic Structures in Half-Heusler Thermoelectric Materials. RESEARCH (WASHINGTON, D.C.) 2020; 2020:6375171. [PMID: 32395718 PMCID: PMC7193307 DOI: 10.34133/2020/6375171] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 01/13/2020] [Accepted: 03/02/2020] [Indexed: 12/17/2022]
Abstract
Half-Heusler materials are strong candidates for thermoelectric applications due to their high weighted mobilities and power factors, which are known to be correlated to valley degeneracy in the electronic band structure. However, there are over 50 known semiconducting half-Heusler phases, and it is not clear how the chemical composition affects the electronic structure. While all the n-type electronic structures have their conduction band minimum at either the Γ- or X-point, there is more diversity in the p-type electronic structures, and the valence band maximum can be at either the Γ-, L-, or W-point. Here, we use high throughput computation and machine learning to compare the valence bands of known half-Heusler compounds and discover new chemical guidelines for promoting the highly degenerate W-point to the valence band maximum. We do this by constructing an "orbital phase diagram" to cluster the variety of electronic structures expressed by these phases into groups, based on the atomic orbitals that contribute most to their valence bands. Then, with the aid of machine learning, we develop new chemical rules that predict the location of the valence band maximum in each of the phases. These rules can be used to engineer band structures with band convergence and high valley degeneracy.
Affiliation(s)
- Maxwell T. Dylla
  - Department of Materials Science and Engineering, Northwestern University, IL 60208, USA
- Alexander Dunn
  - Department of Materials Science and Engineering, UC Berkeley, CA 94720, USA
  - Lawrence Berkeley National Laboratory, Energy Technologies Area, CA 94720, USA
- Shashwat Anand
  - Department of Materials Science and Engineering, Northwestern University, IL 60208, USA
- Anubhav Jain
  - Lawrence Berkeley National Laboratory, Energy Technologies Area, CA 94720, USA
- G. Jeffrey Snyder
  - Department of Materials Science and Engineering, Northwestern University, IL 60208, USA
1055
Laghuvarapu S, Pathak Y, Priyakumar UD. BAND NN: A Deep Learning Framework for Energy Prediction and Geometry Optimization of Organic Small Molecules. J Comput Chem 2019; 41:790-799. [DOI: 10.1002/jcc.26128] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 09/09/2019] [Revised: 11/13/2019] [Accepted: 11/21/2019] [Indexed: 12/26/2022]
Affiliation(s)
- Siddhartha Laghuvarapu
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
- Yashaswi Pathak
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
- U. Deva Priyakumar
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
1056
Toyao T, Maeno Z, Takakusagi S, Kamachi T, Takigawa I, Shimizu KI. Machine Learning for Catalysis Informatics: Recent Applications and Prospects. ACS Catal 2019. [DOI: 10.1021/acscatal.9b04186] [Citation(s) in RCA: 189] [Impact Index Per Article: 37.8] [Indexed: 12/16/2022]
Affiliation(s)
- Takashi Toyao
  - Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
  - Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan
- Zen Maeno
  - Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
- Satoru Takakusagi
  - Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
- Takashi Kamachi
  - Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan
  - Department of Life, Environment and Materials Science, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-ku, Fukuoka 811-0295, Japan
- Ichigaku Takigawa
  - RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido 001-0021, Japan
- Ken-ichi Shimizu
  - Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
  - Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan
1057
Nijamudheen A, Datta A. Gold-Catalyzed Cross-Coupling Reactions: An Overview of Design Strategies, Mechanistic Studies, and Applications. Chemistry 2019; 26:1442-1487. [PMID: 31657487 DOI: 10.1002/chem.201903377] [Citation(s) in RCA: 101] [Impact Index Per Article: 20.2] [Received: 07/24/2019] [Revised: 10/28/2019] [Indexed: 12/14/2022]
Abstract
Transition-metal-catalyzed cross-coupling reactions are central to many organic synthesis methodologies. Traditionally, Pd, Ni, Cu, and Fe catalysts are used to promote these reactions. Recently, many studies have shown that both homogeneous and heterogeneous Au catalysts can be used for activating selective cross-coupling reactions. Here, an overview of the past studies, current trends, and future directions in the field of gold-catalyzed coupling reactions is presented. Design strategies to accomplish selective homocoupling and cross-coupling reactions under both homogeneous and heterogeneous conditions, computational and experimental mechanistic studies, and their applications in diverse fields are critically reviewed. Specific topics covered are: oxidant-assisted and oxidant-free reactions; strain-assisted reactions; dual Au and photoredox catalysis; bimetallic synergistic reactions; mechanisms of reductive elimination processes; enzyme-mimicking Au chemistry; cluster and surface reactions; and plasmonic catalysis. In the relevant sections, theoretical and computational studies of Au(I)/Au(III) chemistry are discussed and the predictions from the calculations are compared with the experimental observations to derive useful design strategies.
Affiliation(s)
- A Nijamudheen
  - School of Chemical Sciences, Indian Association for the Cultivation of Sciences, 2A & 2B Raja S C Mullick Road, Kolkata, 700032, India
  - Department of Chemical & Biomedical Engineering, Florida A&M University-Florida State University, Joint College of Engineering, 2525 Pottsdamer Street, Tallahassee, FL, 32310, USA
- Ayan Datta
  - School of Chemical Sciences, Indian Association for the Cultivation of Sciences, 2A & 2B Raja S C Mullick Road, Kolkata, 700032, India
1058
Mallawaarachchi S, Liu Y, Thang SH, Cheng W, Premaratne M. Machine learning based temperature prediction of poly(N-isopropylacrylamide)-capped plasmonic nanoparticle solutions. Phys Chem Chem Phys 2019; 21:24808-24819. [PMID: 31687699 DOI: 10.1039/c9cp04544a] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/21/2022]
Abstract
The temperature-dependent optical properties of gold nanoparticles that are capped with the thermo-sensitive polymer: 'poly(N-isopropylacrylamide)' (PNIPAM), have been studied extensively for several years. Also, their suitability to function as nanoscopic thermometers for bio-sensing applications has been suggested numerous times. In an attempt to establish this, many have studied the temperature-dependent optical resonance characteristics of these particles; however, developing a simple mathematical relationship between the optical measurements and the solution temperature remains an open challenge. In this paper, we attempt to systematically address this problem using machine learning techniques to quickly and accurately predict the solution-temperature, based on spectroscopic data. Our emphasis is on establishing a simple and practically useful solution to this problem. Our dataset comprises spectroscopic absorption data from both nanorods and nanobipyramids capped with PNIPAM, measured at discretely varied and pre-set temperature states. Specific regions of the spectroscopic data are selected as features for prediction using random forest (RF), gradient boosting (GB) and adaptive boosting (AB) regression techniques. Our prediction results indicate that RF and GB techniques can be used successfully to predict solution temperatures instantly to within 1 °C of accuracy.
Affiliation(s)
- Sudaraka Mallawaarachchi
  - Advanced Computing and Simulation Laboratory (AχL), Department of Electrical and Computer Systems Engineering, Monash University, Clayton, Victoria 3800, Australia
1059
Cova TFGG, Pais AACC. Deep Learning for Deep Chemistry: Optimizing the Prediction of Chemical Patterns. Front Chem 2019; 7:809. [PMID: 32039134 PMCID: PMC6988795 DOI: 10.3389/fchem.2019.00809] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 07/16/2019] [Accepted: 11/11/2019] [Indexed: 12/14/2022] Open
Abstract
Computational Chemistry is currently a synergistic assembly between ab initio calculations, simulation, machine learning (ML) and optimization strategies for describing, solving and predicting chemical data and related phenomena. These include accelerated literature searches, analysis and prediction of physical and quantum chemical properties, transition states, chemical structures, chemical reactions, and also new catalysts and drug candidates. The generalization of scalability to larger chemical problems, rather than specialization, is now the main principle for transforming chemical tasks in multiple fronts, for which systematic and cost-effective solutions have benefited from ML approaches, including those based on deep learning (e.g. quantum chemistry, molecular screening, synthetic route design, catalysis, drug discovery). The latter class of ML algorithms is capable of combining raw input into layers of intermediate features, enabling bench-to-bytes designs with the potential to transform several chemical domains. In this review, the most exciting developments concerning the use of ML in a range of different chemical scenarios are described. A range of different chemical problems and respective rationalization, that have hitherto been inaccessible due to the lack of suitable analysis tools, is thus detailed, evidencing the breadth of potential applications of these emerging multidimensional approaches. Focus is given to the models, algorithms and methods proposed to facilitate research on compound design and synthesis, materials design, prediction of binding, molecular activity, and soft matter behavior. 
The information produced by pairing Chemistry and ML, through data-driven analyses, neural network predictions and monitoring of chemical systems, allows (i) prompting the ability to understand the complexity of chemical data, (ii) streamlining and designing experiments, (iii) discovering new molecular targets and materials, and also (iv) planning or rethinking forthcoming chemical challenges. In fact, optimization engulfs all these tasks directly.
Affiliation(s)
- Tânia F. G. G. Cova
  - Coimbra Chemistry Centre, CQC, Department of Chemistry, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
- Alberto A. C. C. Pais
  - Coimbra Chemistry Centre, CQC, Department of Chemistry, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
1060
von der Esch B, Dietschreit JCB, Peters LDM, Ochsenfeld C. Finding Reactive Configurations: A Machine Learning Approach for Estimating Energy Barriers Applied to Sirtuin 5. J Chem Theory Comput 2019; 15:6660-6667. [PMID: 31765138 DOI: 10.1021/acs.jctc.9b00876] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/15/2022]
Abstract
Sirtuin 5 is a class III histone deacetylase that, unlike its classification, mainly catalyzes desuccinylation and demalonylation reactions. It is an interesting drug target that we use here to test new ideas for calculating reaction pathways of large molecular systems such as enzymes. A major issue with most schemes (e.g., adiabatic mapping) is that the resulting activation barrier height heavily depends on the chosen educt conformation. This makes the selection of the initial structure decisive for the success of the characterization. Here, we apply machine learning to a large number of molecular dynamics frames and potential energy barriers obtained by quantum mechanics/molecular mechanics calculations in order to identify (1) suitable start-conformations for reaction path calculations and (2) structural features relevant for the first step of the desuccinylation reaction catalyzed by Sirtuin 5. The latter generally aids the understanding of reaction mechanisms and important interactions in active centers. Using our novel approach, we found eleven key features that govern the reactivity. We were able to estimate reaction barriers with a mean absolute error of 3.6 kcal/mol and identified reactive configurations.
Affiliation(s)
- Beatriz von der Esch
  - Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstr. 7, D-81377 München, Germany
- Johannes C B Dietschreit
  - Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstr. 7, D-81377 München, Germany
- Laurens D M Peters
  - Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstr. 7, D-81377 München, Germany
- Christian Ochsenfeld
  - Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstr. 7, D-81377 München, Germany
1061
Huang Z, Liu X, Zang J. The inverse design of structural color using machine learning. NANOSCALE 2019; 11:21748-21758. [PMID: 31498348 DOI: 10.1039/c9nr06127d] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 06/10/2023]
Abstract
Efficiently identifying optical structures with desired functionalities, referred to as inverse design, can dramatically accelerate the invention of new photonic devices, and this is especially useful in the design of large scale integrated photonic chips. Structural color with high-resolution, high-saturation, and low-loss holds great promise in image display, data storage and information security. However, the inverse design of structural color remains an open challenge, and this impedes practical application. Here, we propose an inverse design strategy for structural color using machine learning (ML) technologies. The supervised learning (SL) models are trained with the geometries and colors of dielectric arrays to capture accurate geometry-color relationships, and these are then applied to a reinforcement learning (RL) algorithm in order to find the optical structural geometries for the desired color. Our work succeeds in finding simple and accurate models to describe geometry-color relationships, which significantly improves the efficiency of the design. This strategy provides a systematic method to directly encode generic functionality into a set of structures and geometries, paving the way for the inverse design of functional photonic devices.
Affiliation(s)
- Zhao Huang
  - School of Optical and Electronic Information and Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, China
1062
Zhang Y, He X, Chen Z, Bai Q, Nolan AM, Roberts CA, Banerjee D, Matsunaga T, Mo Y, Ling C. Unsupervised discovery of solid-state lithium ion conductors. Nat Commun 2019; 10:5260. [PMID: 31748523 PMCID: PMC6868160 DOI: 10.1038/s41467-019-13214-1] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 07/12/2019] [Accepted: 10/24/2019] [Indexed: 11/09/2022] Open
Abstract
Although machine learning has gained great interest in the discovery of functional materials, the advancement of reliable models is impeded by the scarcity of available materials property data. Here we propose and demonstrate a distinctive approach for materials discovery using unsupervised learning, which does not require labeled data and thus alleviates the data scarcity challenge. Using solid-state Li-ion conductors as a model problem, unsupervised materials discovery utilizes a limited quantity of conductivity data to prioritize a candidate list from a wide range of Li-containing materials for further accurate screening. Our unsupervised learning scheme discovers 16 new fast Li-conductors with conductivities of 10⁻⁴–10⁻¹ S cm⁻¹ predicted in ab initio molecular dynamics simulations. These compounds have structures and chemistries distinct from known systems, demonstrating the capability of unsupervised learning for discovering materials over a wide materials space with limited property data. Predictions of new solid-state Li-ion conductors are challenging due to the diverse chemistries and compositions involved. Here the authors combine unsupervised learning techniques and molecular dynamics simulations to discover new compounds with high Li-ion conductivity.
Affiliation(s)
- Ying Zhang
  - Toyota Research Institute of North America, Ann Arbor, MI, 48105, USA
- Xingfeng He
  - Department of Materials Science and Engineering, University of Maryland, College Park, MD, 20742, USA
- Zhiqian Chen
  - Department of Computer Science, Virginia Tech, 7054 Haycock Road, Falls Church, VA, 2043, USA
- Qiang Bai
  - Department of Materials Science and Engineering, University of Maryland, College Park, MD, 20742, USA
- Adelaide M Nolan
  - Department of Materials Science and Engineering, University of Maryland, College Park, MD, 20742, USA
- Charles A Roberts
  - Toyota Research Institute of North America, Ann Arbor, MI, 48105, USA
- Debasish Banerjee
  - Toyota Research Institute of North America, Ann Arbor, MI, 48105, USA
- Tomoya Matsunaga
  - Toyota Research Institute of North America, Ann Arbor, MI, 48105, USA
- Yifei Mo
  - Department of Materials Science and Engineering, University of Maryland, College Park, MD, 20742, USA
  - Maryland Energy Innovation Institute, University of Maryland, College Park, MD, 20742, USA
- Chen Ling
  - Toyota Research Institute of North America, Ann Arbor, MI, 48105, USA
1063
Schran C, Behler J, Marx D. Automated Fitting of Neural Network Potentials at Coupled Cluster Accuracy: Protonated Water Clusters as Testing Ground. J Chem Theory Comput 2019; 16:88-99. [DOI: 10.1021/acs.jctc.9b00805] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Indexed: 12/20/2022]
Affiliation(s)
- Christoph Schran
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Jörg Behler
  - Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstrasse 6, 37077 Göttingen, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
1064
Malek A, Eslamibidgoli MJ, Mokhtari M, Wang Q, Eikerling MH, Malek K. Virtual Materials Intelligence for Design and Discovery of Advanced Electrocatalysts. Chemphyschem 2019; 20:2946-2955. [PMID: 31587461 DOI: 10.1002/cphc.201900570] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/06/2019] [Revised: 09/06/2019] [Indexed: 11/08/2022]
Abstract
Similar to advancements gained from big data in genomics, security, internet of things, and e-commerce, the materials workflow could be made more efficient and prolific through advances in streamlining data sources, autonomous materials synthesis, rapid characterization, big data analytics, and self-learning algorithms. In electrochemical materials science, data sets are large, unstructured/heterogeneous, and difficult to process and analyze from a single data channel or platform. Computer-aided materials design together with advances in data mining, machine learning, and predictive analytics are expected to provide inexpensive and accelerated pathways towards tailor-made functionally optimized energy materials. Fundamental research in the field of electrochemical energy materials focuses primarily on complex interfacial phenomena and kinetic electrocatalytic processes. This perspective article critically assesses AI-driven modeling and computational approaches that are currently applied to those objects. An application-driven materials intelligence platform is introduced, and its functionalities are scrutinized considering the development of electrocatalyst materials for CO2 conversion as a use case.
Affiliation(s)
- Ali Malek
  - NRC-EME, 4250 Wesbrook Mall, Vancouver, BC, V6T 1W5, Canada
  - Department of Chemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
- Mehrdad Mokhtari
  - Department of Chemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
- Qianpu Wang
  - NRC-EME, 4250 Wesbrook Mall, Vancouver, BC, V6T 1W5, Canada
- Michael H Eikerling
  - Department of Chemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
  - Institute of Energy and Climate Research, IEK-13: Modelling and Simulation of Energy Materials, Forschungszentrum Jülich, 52425, Jülich, Germany
- Kourosh Malek
  - NRC-EME, 4250 Wesbrook Mall, Vancouver, BC, V6T 1W5, Canada
  - Department of Chemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
1065
Prediction of Absorption Spectrum Shifts in Dyes Adsorbed on Titania. Sci Rep 2019; 9:16983. [PMID: 31740733 PMCID: PMC6861231 DOI: 10.1038/s41598-019-53534-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/31/2019] [Accepted: 11/04/2019] [Indexed: 01/04/2023] Open
Abstract
Dye adsorption on metal-oxide films often results in small to substantial absorption shifts relative to the solution phase, with undesirable consequences for the performance of dye-sensitized solar cells and optical sensors. While density functional theory is frequently used to model such behaviour, it is too time-consuming for rapid assessment. In this paper, we explore the use of supervised machine learning to predict whether dye adsorption on titania is likely to induce a change in its absorption characteristics. The physicochemical features of each dye were encoded as a numeric vector whose elements are the counts of molecular fragments and topological indices. Various classification models were subsequently trained to predict the type of absorption shift i.e. blue, red or unchanged (|Δλ| ≤ 10 nm). The models were able to predict the nature of the shift with a good likelihood (~80%) of success when applied to unseen data.
1066
Machine learning for target discovery in drug development. Curr Opin Chem Biol 2019; 56:16-22. [PMID: 31734566 DOI: 10.1016/j.cbpa.2019.10.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/22/2019] [Revised: 10/01/2019] [Accepted: 10/03/2019] [Indexed: 12/15/2022]
Abstract
The discovery of macromolecular targets for bioactive agents is currently a bottleneck for the informed design of chemical probes and drug leads. Typically, activity profiling against genetically manipulated cell lines or chemical proteomics is pursued to shed light on their biology and deconvolute drug-target networks. By taking advantage of the ever-growing wealth of publicly available bioactivity data, learning algorithms now provide an attractive means to generate statistically motivated research hypotheses and thereby prioritize biochemical screens. Here, we highlight recent successes in machine intelligence for target identification and discuss challenges and opportunities for drug discovery.
1067
Alvarez-Ramírez F, Ruiz-Morales Y. Database of Nuclear Independent Chemical Shifts (NICS) versus NICSZZ of Polycyclic Aromatic Hydrocarbons (PAHs). J Chem Inf Model 2019; 60:611-620. [DOI: 10.1021/acs.jcim.9b00909] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/26/2023]
Affiliation(s)
- Fernando Alvarez-Ramírez
  - Instituto Mexicano del Petróleo, Eje Central Lázaro Cárdenas Norte 152, Mexico City 07730, Mexico
- Yosadara Ruiz-Morales
  - Instituto Mexicano del Petróleo, Eje Central Lázaro Cárdenas Norte 152, Mexico City 07730, Mexico
1068
Huang L, Ling C. Representing Multiword Chemical Terms through Phrase-Level Preprocessing and Word Embedding. ACS OMEGA 2019; 4:18510-18519. [PMID: 31737809 PMCID: PMC6854573 DOI: 10.1021/acsomega.9b02060] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/05/2019] [Accepted: 10/18/2019] [Indexed: 06/10/2023]
Abstract
In recent years, data-driven methods and artificial intelligence have been widely used in chemoinformatic and material informatics domains, for which the success is critically determined by the availability of training data with good quality and large quantity. A potential approach to break this bottleneck is by leveraging the chemical literature such as papers and patents as alternative data resources to high throughput experiments and simulation. Compared to other domains where natural language processing techniques have established successes, the chemical literature contains a large portion of phrases of multiple words that create additional challenges for accurate identification and representation. Here, we introduce a chemistry domain suitable approach to identify multiword chemical terms and train word representations at the phrase level. Through a series of specially designed experiments, we demonstrate that our multiword identifying and representing method effectively and accurately identifies multiword chemical terms from 119,166 chemical patents and is more robust and precise to preserve the semantic meaning of chemical phrases compared to the conventional approach, which represents constituent single words first and combines them afterward. Because the accurate representation of chemical terms is the first and essential step to provide learning features for downstream natural language processing tasks, our results pave the road to utilize the large volume of chemical literature in future data-driven studies.
Affiliation(s)
- Liyuan Huang
  - Toyota Research Institute of North America, 1555 Woodridge Avenue, Ann Arbor, Michigan 48105, United States
- Chen Ling
  - Toyota Research Institute of North America, 1555 Woodridge Avenue, Ann Arbor, Michigan 48105, United States
1069
Adamczyk JM, Ghosh S, Braden TL, Hogan CJ, Toberer ES. Alloyed Thermoelectric PbTe-SnTe Films Formed via Aerosol Deposition. ACS COMBINATORIAL SCIENCE 2019; 21:753-759. [PMID: 31610114 DOI: 10.1021/acscombsci.9b00145] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022]
Abstract
The discovery of new thermoelectric materials has the potential to benefit from advances in high-throughput methodologies. Traditional synthesis and characterization routes for thermoelectrics are time-consuming serial processes. In contrast, high-throughput materials discovery is commonly done by thin film growth, which may produce microstructures that are metastable or compositionally graded and, therefore, are challenging to characterize. As a middle ground between bulk synthesis and thin film deposition, we find that the aerosol deposition process can rapidly produce samples that exhibit electronic property trends consistent with those produced by traditional bulk means. We demonstrate rapid growth of discrete thermoelectric thick films of varying chemical compositions (Pb1-xSnxTe) from PbTe and SnTe polydisperse micrometer-sized powder feedstocks. The high deposition rate (near 1 μm min⁻¹) and resultant microstructures are advantageous as the diffusion length scales promote rapid thermal treatment and equilibrium phase formation. Room-temperature high-throughput measurements of the Seebeck coefficient and resistivity are compared to traditionally produced bulk materials. The Seebeck coefficient of the films follows the trends of traditional samples, but the resistivity is found to be more sensitive to microstructural effects. Ultimately, we demonstrate a framework for exploratory materials science using aerosol deposition and high-throughput characterization instrumentation.
Affiliation(s)
- Jesse M. Adamczyk
- Department of Material Science, Colorado School of Mines, Golden, Colorado 80401, United States
| | - Souvik Ghosh
- Department of Mechanical Engineering, University of Minnesota, Minneapolis, Minnesota 55455, United States
| | - Tara L. Braden
- Department of Material Science, Colorado School of Mines, Golden, Colorado 80401, United States
| | - Christopher J. Hogan
- Department of Mechanical Engineering, University of Minnesota, Minneapolis, Minnesota 55455, United States
| | - Eric S. Toberer
- Department of Physics, Colorado School of Mines, Golden, Colorado 80401, United States
| |
|
1070
|
Masood H, Toe CY, Teoh WY, Sethu V, Amal R. Machine Learning for Accelerated Discovery of Solar Photocatalysts. ACS Catal 2019. [DOI: 10.1021/acscatal.9b02531] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Indexed: 12/25/2022]
Affiliation(s)
- Hassan Masood
- Particles and Catalysis Research Group, School of Chemical Engineering, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Cui Ying Toe
- Particles and Catalysis Research Group, School of Chemical Engineering, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Wey Yang Teoh
- Particles and Catalysis Research Group, School of Chemical Engineering, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Vidhyasaharan Sethu
- School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney, New South Wales 2052, Australia
| | - Rose Amal
- Particles and Catalysis Research Group, School of Chemical Engineering, The University of New South Wales, Sydney, New South Wales 2052, Australia
| |
|
1071
|
Himanen L, Geurts A, Foster AS, Rinke P. Data-Driven Materials Science: Status, Challenges, and Perspectives. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2019; 6:1900808. [PMID: 31728276 PMCID: PMC6839624 DOI: 10.1002/advs.201900808] [Citation(s) in RCA: 150] [Impact Index Per Article: 30.0] [Received: 04/08/2019] [Revised: 06/20/2019] [Indexed: 05/06/2023]
Abstract
Data-driven science is heralded as a new paradigm in materials science. In this field, data is the new resource, and knowledge is extracted from materials datasets that are too big or complex for traditional human reasoning, typically with the intent to discover new or improved materials or materials phenomena. Multiple factors, including the open science movement, national funding, and progress in information technology, have fueled its development. Such related tools as materials databases, machine learning, and high-throughput methods are now established as parts of the materials research toolset. However, there are a variety of challenges that impede progress in data-driven materials science: data veracity, integration of experimental and computational data, data longevity, standardization, and the gap between industrial interests and academic efforts. In this perspective article, the historical development and current state of data-driven materials science, building from the early evolution of open science to the rapid expansion of materials data infrastructures, are discussed. Key successes and challenges so far are also reviewed, providing a perspective on the future development of the field.
Affiliation(s)
- Lauri Himanen
- Department of Applied Physics, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
| | - Amber Geurts
- Department of Applied Physics, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
- Department of Management Studies, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
- TNO, Netherlands Organization for Applied Scientific Research, Expertise Center for Strategy and Policy, Anna van Beurenplein 1, DA 2595 The Hague, Netherlands
| | - Adam Stuart Foster
- Department of Applied Physics, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
- Graduate School Materials Science in Mainz, Staudinger Weg 9, 55128 Mainz, Germany
- WPI Nano Life Science Institute (WPI-NanoLSI), Kanazawa University, Kakuma-machi, Kanazawa 920-1192, Japan
| | - Patrick Rinke
- Department of Applied Physics, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
- Theoretical Chemistry and Catalysis Research Centre, Technische Universität München, Lichtenbergstr. 4, D-85747 Garching, Germany
| |
|
1072
|
Feng C, Sharman E, Ye S, Luo Y, Jiang J. A neural network protocol for predicting molecular bond energy. Sci China Chem 2019. [DOI: 10.1007/s11426-019-9619-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022]
|
1073
|
Yuan R, Tian Y, Xue D, Xue D, Zhou Y, Ding X, Sun J, Lookman T. Accelerated Search for BaTiO3-Based Ceramics with Large Energy Storage at Low Fields Using Machine Learning and Experimental Design. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2019; 6:1901395. [PMID: 31728287 PMCID: PMC6839636 DOI: 10.1002/advs.201901395] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/07/2019] [Revised: 08/17/2019] [Indexed: 06/01/2023]
Abstract
The problem considered is that of maximizing the energy storage density of Pb-free BaTiO3-based dielectrics at low electric fields. It is demonstrated how varying the size of the combinatorial search space influences the efficiency of material discovery by comparing the performance of two machine-learning-based approaches that involve different levels of physical insight. The search starts from physical intuition, which provides guiding principles for finding better performers in the crossover region of the composition-temperature phase diagram between the ferroelectric phase and the relaxor ferroelectric phase. Such an approach is limiting for multidopant solid solutions and motivates the use of two data-driven machine learning and design strategies with a feedback loop to experiments. Strategy I considers learning and property prediction on all the compounds, and strategy II learns to preselect compounds in the crossover region on which prediction is carried out. By performing only two active learning loops via strategy II, the compound (Ba0.86Ca0.14)(Ti0.79Zr0.11Hf0.10)O3 is synthesized with the largest energy storage density ≈73 mJ cm⁻³ at a field of 20 kV cm⁻¹, and an insight into the relative performance of the strategies using varying levels of knowledge is provided.
Affiliation(s)
- Ruihao Yuan
- State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, China
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
| | - Yuan Tian
- State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, China
| | - Dezhen Xue
- State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, China
| | - Deqing Xue
- State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, China
| | - Yumei Zhou
- State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, China
| | - Xiangdong Ding
- State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, China
| | - Jun Sun
- State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, China
| | - Turab Lookman
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
| |
|
1074
|
Deringer VL, Caro MA, Csányi G. Machine Learning Interatomic Potentials as Emerging Tools for Materials Science. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2019; 31:e1902765. [PMID: 31486179 DOI: 10.1002/adma.201902765] [Citation(s) in RCA: 209] [Impact Index Per Article: 41.8] [Received: 04/30/2019] [Revised: 07/26/2019] [Indexed: 05/22/2023]
Abstract
Atomic-scale modeling and understanding of materials have made remarkable progress, but they are still fundamentally limited by the large computational cost of explicit electronic-structure methods such as density-functional theory. This Progress Report shows how machine learning (ML) is currently enabling a new degree of realism in materials modeling: by "learning" electronic-structure data, ML-based interatomic potentials give access to atomistic simulations that reach similar accuracy levels but are orders of magnitude faster. A brief introduction to the new tools is given, and then, applications to some select problems in materials science are highlighted: phase-change materials for memory devices; nanoparticle catalysts; and carbon-based electrodes for chemical sensing, supercapacitors, and batteries. It is hoped that the present work will inspire the development and wider use of ML-based interatomic potentials in diverse areas of materials research.
Affiliation(s)
- Volker L Deringer
- Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, UK
- Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK
| | - Miguel A Caro
- Department of Electrical Engineering and Automation and Department of Applied Physics, Aalto University, Espoo, 02150, Finland
| | - Gábor Csányi
- Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, UK
| |
|
1075
|
J B, M M B, Chanda K. Evolutionary approaches in protein engineering towards biomaterial construction. RSC Adv 2019; 9:34720-34734. [PMID: 35530663 PMCID: PMC9074691 DOI: 10.1039/c9ra06807d] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/28/2019] [Accepted: 10/01/2019] [Indexed: 11/29/2022] Open
Abstract
The tailoring of proteins for specific applications by evolutionary methods is a highly active area of research. Rational design and directed evolution are the two main strategies to reengineer proteins or create chimeric structures. Rational engineering is often limited by insufficient knowledge about proteins' structure-function relationships; directed evolution overcomes this restriction but poses challenges in the screening of candidates. A combination of these protein engineering approaches will allow us to create protein variants with a wide range of desired properties. Herein, we focus on the application of these approaches towards the generation of protein biomaterials that are known for biodegradability, biocompatibility and biofunctionality, from combinations of natural, synthetic, or engineered proteins and protein domains. Potential applications depend on the enhancement of biofunctional, mechanical, or other desired properties. Examples include scaffolds for tissue engineering, thermostable enzymes for industrial biocatalysis, and other therapeutic applications.
Affiliation(s)
- Brindha J
- Department of Chemistry, School of Advanced Science, Vellore Institute of Technology, Chennai Campus Vandalur-Kelambakkam Road Chennai-600 127 Tamil Nadu India
| | - Balamurali M M
- Department of Chemistry, School of Advanced Science, Vellore Institute of Technology, Chennai Campus Vandalur-Kelambakkam Road Chennai-600 127 Tamil Nadu India
| | - Kaushik Chanda
- Department of Chemistry, School of Advanced Science, Vellore Institute of Technology Vellore-632014 Tamil Nadu India
| |
|
1076
|
Kanekal KH, Bereau T. Resolution limit of data-driven coarse-grained models spanning chemical space. J Chem Phys 2019; 151:164106. [DOI: 10.1063/1.5119101] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/17/2022] Open
Affiliation(s)
- Kiran H. Kanekal
- Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
| | - Tristan Bereau
- Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
| |
|
1077
|
Schleder GR, Padilha ACM, Reily Rocha A, Dalpian GM, Fazzio A. Ab Initio Simulations and Materials Chemistry in the Age of Big Data. J Chem Inf Model 2019; 60:452-459. [DOI: 10.1021/acs.jcim.9b00781] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/30/2022]
Affiliation(s)
- Gabriel Ravanhani Schleder
- Federal University of ABC (UFABC), Santo André, São Paulo, Brazil
- Brazilian Nanotechnology National Laboratory (LNNano)/CNPEM, Campinas, São Paulo, Brazil
| | | | | | | | - Adalberto Fazzio
- Federal University of ABC (UFABC), Santo André, São Paulo, Brazil
- Brazilian Nanotechnology National Laboratory (LNNano)/CNPEM, Campinas, São Paulo, Brazil
| |
|
1078
|
Cheng L, Kovachki NB, Welborn M, Miller TF. Regression Clustering for Improved Accuracy and Training Costs with Molecular-Orbital-Based Machine Learning. J Chem Theory Comput 2019; 15:6668-6677. [DOI: 10.1021/acs.jctc.9b00884] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Indexed: 01/19/2023]
Affiliation(s)
- Lixue Cheng
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Nikola B. Kovachki
- Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California 91125, United States
| | - Matthew Welborn
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Thomas F. Miller
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| |
|
1079
|
Liang J, Xu Y, Liu R, Zhu X. QM-sym, a symmetrized quantum chemistry database of 135 kilo molecules. Sci Data 2019; 6:213. [PMID: 31628326 PMCID: PMC6802082 DOI: 10.1038/s41597-019-0237-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/25/2019] [Accepted: 09/10/2019] [Indexed: 11/08/2022] Open
Abstract
Applying deep learning methods in materials science research is an important way of solving the time-consuming problems of typical ab initio quantum chemistry methodology, but due to the size of large molecules, large and uncharted fields still exist. Implementing symmetry information can significantly reduce the calculation complexity of structures, as they can be simplified to the minimum symmetric units. Because there are few quantum chemistry databases that include symmetry information, we constructed a new one, named QM-sym, by designing an algorithm to generate 135k organic molecules with the Cnh symmetry composite. Those generated molecules were optimized to a stable state using Gaussian 09. The geometric, electronic, energetic, and thermodynamic properties of the molecules were calculated, including their orbital degeneracy states and orbital symmetry around the HOMO-LUMO. The basic symmetric units were also included. This database provides consistent and comprehensive quantum chemical properties for structures with Cnh symmetries. QM-sym can be used as a benchmark for machine learning models in quantum chemistry or as a dataset for training new symmetry-based models.
Affiliation(s)
- Jiechun Liang
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, Shenzhen, Guangdong, 518172, China
| | - Yanheng Xu
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, Shenzhen, Guangdong, 518172, China
| | - Rulin Liu
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, Shenzhen, Guangdong, 518172, China
| | - Xi Zhu
- Shenzhen Institute of Artificial Intelligence and Robotics for Society (AIRS), 2001 Longxiang Road, Longgang District, Shenzhen, Guangdong, 518172, China.
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, Shenzhen, Guangdong, 518172, China.
| |
|
1080
|
Schmidt J, Benavides-Riveros CL, Marques MAL. Machine Learning the Physical Nonlocal Exchange-Correlation Functional of Density-Functional Theory. J Phys Chem Lett 2019; 10:6425-6431. [PMID: 31596092 DOI: 10.1021/acs.jpclett.9b02422] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 06/10/2023]
Abstract
We train a neural network as the universal exchange-correlation functional of density-functional theory that simultaneously reproduces both the exact exchange-correlation energy and the potential. This functional is extremely nonlocal but retains the computational scaling of traditional local or semilocal approximations. It therefore holds the promise of solving some of the delocalization problems that plague density-functional theory, while maintaining the computational efficiency that characterizes the Kohn-Sham equations. Furthermore, by using automatic differentiation, a capability present in modern machine-learning frameworks, we impose the exact mathematical relation between the exchange-correlation energy and the potential, leading to a fully consistent method. We demonstrate the feasibility of our approach by looking at one-dimensional systems with two strongly correlated electrons, where density-functional methods are known to fail, and investigate the behavior and performance of our functional by varying the degree of nonlocality.
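The relation this abstract refers to, that the exchange-correlation potential is the functional derivative of the exchange-correlation energy with respect to the density, can be checked numerically on a grid. The sketch below is not the authors' neural-network functional and does not use automatic differentiation. It assumes a toy local functional E[n] = -c Σ n_i^(4/3) Δx with a hypothetical constant c, and uses central finite differences as a stand-in for the derivative a machine-learning framework would supply. All function names are illustrative.

```python
def xc_energy(density, dx, c=1.0):
    """Toy local functional E[n] = -c * sum_i n_i**(4/3) * dx (illustrative only)."""
    return -c * sum(n ** (4.0 / 3.0) for n in density) * dx


def xc_potential_numeric(density, dx, i, h=1e-6, c=1.0):
    """Discretized functional derivative v_i = (1/dx) * dE/dn_i,
    estimated with a central finite difference at grid point i."""
    up = list(density)
    up[i] += h
    down = list(density)
    down[i] -= h
    return (xc_energy(up, dx, c) - xc_energy(down, dx, c)) / (2.0 * h * dx)


def xc_potential_exact(density, i, c=1.0):
    """Analytic derivative of the toy functional: v(n) = -(4/3) * c * n**(1/3)."""
    return -(4.0 / 3.0) * c * density[i] ** (1.0 / 3.0)


# Hypothetical density sampled on a uniform grid with spacing dx.
density = [0.1, 0.4, 0.9, 0.4, 0.1]
dx = 0.5
errors = [
    abs(xc_potential_numeric(density, dx, i) - xc_potential_exact(density, i))
    for i in range(len(density))
]
```

Obtaining the potential by differentiating the energy, rather than fitting the two separately, is what keeps them mutually consistent; automatic differentiation gives the same derivative without the step-size error of finite differences.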
Affiliation(s)
- Jonathan Schmidt
- Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, 06120 Halle (Saale), Germany
| | | | - Miguel A L Marques
- Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, 06120 Halle (Saale), Germany
| |
|
1081
|
Abstract
Materials discovery has become significantly facilitated and accelerated by high-throughput ab initio computations. This ability to rapidly design interesting novel compounds has displaced the materials innovation bottleneck to the development of synthesis routes for the desired material. As there is no fundamental theory for materials synthesis, one might attempt a data-driven approach for predicting inorganic materials synthesis, but this is impeded by the lack of a comprehensive database containing synthesis processes. To overcome this limitation, we have generated a dataset of “codified recipes” for solid-state synthesis automatically extracted from scientific publications. The dataset consists of 19,488 synthesis entries retrieved from 53,538 solid-state synthesis paragraphs by using text mining and natural language processing approaches. Every entry contains information about target material, starting compounds, operations used and their conditions, as well as the balanced chemical equation of the synthesis reaction. The dataset is publicly available and can be used for data mining of various aspects of inorganic materials synthesis. Measurement(s): solid-state synthesis data. Technology Type(s): natural language processing.
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.9906608
|
1082
|
Moliner M, Román-Leshkov Y, Corma A. Machine Learning Applied to Zeolite Synthesis: The Missing Link for Realizing High-Throughput Discovery. Acc Chem Res 2019; 52:2971-2980. [PMID: 31553162 DOI: 10.1021/acs.accounts.9b00399] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Indexed: 12/16/2022]
Abstract
Zeolites are microporous crystalline materials with well-defined cavities and pores, which can be prepared under different pore topologies and chemical compositions. Their preparation is typically defined by multiple interconnected variables (e.g., reagent sources, molar ratios, aging treatments, reaction time and temperature, among others), but unfortunately their distinctive influence, particularly on the nucleation and crystallization processes, is still far from being understood. Thus, the discovery and/or optimization of specific zeolites is closely related to the exploration of the parametric space through trial-and-error methods, generally by studying the influence of each parameter individually. In the past decade, machine learning (ML) methods have rapidly evolved to address complex problems involving highly nonlinear or massively combinatorial processes that conventional approaches cannot solve. Considering the vast and interconnected multiparametric space in zeolite synthesis, coupled with our poor understanding of the mechanisms involved in their nucleation and crystallization, the use of ML is especially timely for improving zeolite synthesis. Indeed, the complex space of zeolite synthesis requires drawing inferences from incomplete and imperfect information, for which ML methods are very well-suited to replace the intuition-based approaches traditionally used to guide experimentation. In this Account, we contend that both existing and new ML approaches can provide the "missing link" needed to complete the traditional zeolite synthesis workflow used in our quest to rationalize zeolite synthesis. 
Within this context, we have made important efforts on developing ML tools in different critical areas, such as (1) data-mining tools to process the large amount of data generated using high-throughput platforms; (2) novel complex algorithms to predict the formation of energetically stable hypothetical zeolites and guide the synthesis of new zeolite structures; (3) new "ab initio" organic structure directing agent predictions to direct the synthesis of hypothetical or known zeolites; (4) an automated tool for nonsupervised data extraction and classification from published research articles. ML has already revolutionized many areas in materials science by enhancing our ability to map intricate behavior to process variables, especially in the absence of well-understood mechanisms. Undoubtedly, ML is a burgeoning field with many future opportunities for further breakthroughs to advance the design of molecular sieves. For this reason, this Account includes an outlook of future research directions based on current challenges and opportunities. We envision this Account will become a hallmark reference for both well-established and new researchers in the field of zeolite synthesis.
Affiliation(s)
- Manuel Moliner
- Instituto de Tecnología Química, Universitat Politècnica de València-Consejo Superior de Investigaciones Científicas, Avenida de los Naranjos s/n, 46022 València, Spain
| | - Yuriy Román-Leshkov
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Avelino Corma
- Instituto de Tecnología Química, Universitat Politècnica de València-Consejo Superior de Investigaciones Científicas, Avenida de los Naranjos s/n, 46022 València, Spain
| |
|
1083
|
García-Muelas R, López N. Statistical learning goes beyond the d-band model providing the thermochemistry of adsorbates on transition metals. Nat Commun 2019; 10:4687. [PMID: 31615991 PMCID: PMC6794282 DOI: 10.1038/s41467-019-12709-1] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 02/25/2019] [Accepted: 08/23/2019] [Indexed: 12/30/2022] Open
Abstract
The rational design of heterogeneous catalysts relies on the efficient survey of mechanisms by density functional theory (DFT). However, massive reaction networks cannot be sampled effectively as they grow exponentially with the size of reactants. Here we present a statistical principal component analysis and regression applied to the DFT thermochemical data of 71 C1–C2 species on 12 close-packed metal surfaces. Adsorption is controlled by covalent (d-band center) and ionic terms (reduction potential), modulated by conjugation and conformational contributions. All formation energies can be reproduced from only three key intermediates (predictors) calculated with DFT. The results agree with accurate experimental measurements having error bars comparable to those of DFT. The procedure can be extended to single-atom and near-surface alloys, reducing the number of explicit DFT calculations needed by a factor of 20, thus paving the way for a rapid and accurate survey of whole reaction networks on multimetallic surfaces. Assessing catalytic mechanisms using DFT calculations greatly aids catalyst design, but is impractical for large molecules. Here the authors develop a statistical learning-based thermochemical model for estimating adsorption of organics onto metals, retaining DFT accuracy while reducing the number of calculations by a factor of 20.
Affiliation(s)
- Rodrigo García-Muelas
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology (BIST), Av. Països Catalans 16, 43007, Tarragona, Spain.
| | - Núria López
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology (BIST), Av. Països Catalans 16, 43007, Tarragona, Spain.
| |
|
1084
|
Kruse H, Šponer J. Revisiting the Potential Energy Surface of the Stacked Cytosine Dimer: FNO-CCSD(T) Interaction Energies, SAPT Decompositions, and Benchmarking. J Phys Chem A 2019; 123:9209-9222. [PMID: 31560201 DOI: 10.1021/acs.jpca.9b05940] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Abstract
Nucleobase stacking interactions are crucial for the stability of nucleic acids. This study investigates base stacking energies of the cytosine homodimer in different configurations, including intermolecular separation plots, detailed twist dependence, and displaced structures. Highly accurate ab initio quantum chemical single point energies using an energy function based on MP2 complete basis set extrapolation ([6 → 7]ZaPa-NR) and a CCSD(T)/cc-pVTZ-F12 high-level correction are presented as new reference data, providing the most accurate stacking energies of nucleobase dimers currently available. Accurate SAPT2+(3)δMP2 energy decomposition is used to obtain detailed insights into the nature of base stacking interactions at varying vertical distances and twist values. The ab initio symmetry adapted perturbation theory (SAPT) energy decomposition suggests that the base stacking originates from an intricate interplay between dispersion attraction, short-range exchange-repulsion, and Coulomb interaction. The interpretation of the SAPT data is a complex issue as key energy terms vary substantially in the region of optimal (low energy) base stacking geometries. Thus, attempts to highlight one leading stabilizing SAPT base stacking term may be misleading and the outcome strongly depends on the used geometries within the range of geometries sampled in nucleic acids upon thermal fluctuations. Modern dispersion-corrected density functional theory (among them DSD-BLYP-D3, ωB97M-V, and ωB97M-D3BJ) is benchmarked and often reaches up to spectroscopic accuracy (below 1 kJ/mol). The classical AMBER force field is benchmarked with multiple different sets of point-charges (e.g. HF, DFT, and MP2-based) and is found to produce reasonable agreement with the benchmark data.
Affiliation(s)
- Holger Kruse
- Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, CZ-61265 Brno, Czech Republic
| | - Jiří Šponer
- Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, CZ-61265 Brno, Czech Republic; Central European Institute of Technology, Masaryk University, Kamenice 753/5, 62500 Brno, Czech Republic
| |
|
1085
|
Son K, Lee KB. Prediction of learning curves of 2 dental CAD software programs, part 2: Differences in learning effects by type of dental personnel. J Prosthet Dent 2019; 123:747-752. [PMID: 31590976 DOI: 10.1016/j.prosdent.2019.05.026] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/29/2019] [Revised: 05/23/2019] [Accepted: 05/23/2019] [Indexed: 01/28/2023]
Abstract
STATEMENT OF PROBLEM: Dental computer-aided design (CAD) software programs are essential elements of the digital workflow. Therefore, it is necessary to study the learning effect of dental CAD software programs for efficient use. PURPOSE: The purpose of this in vitro study was to predict the learning curve of dental CAD software programs according to dental personnel by using the Wright model and to investigate the tendency of dental personnel to reduce working time according to repeated learning. MATERIAL AND METHODS: A total of 36 participants were recruited, including an equal number of dentists, dental technicians, and dental students (12 each). A custom abutment design was evaluated by using exocad CAD and Deltanine CAD software programs. The design was carried out in the following order: 4 steps repeated 3 times each. This study applied the formula of the Wright model to predict the working time over 500 repetitions. In the statistical analysis, 3-repetition and 500-repetition times were analyzed with the Kruskal-Wallis H test and Friedman test (α=.05), and a post hoc comparison was performed by using the Mann-Whitney U-test and Bonferroni correction method (α=.017). RESULTS: Three repetitions resulted in shorter working time in the dental technician group. The 3-repetition time decreased statistically for all dental personnel (P<.001). The time for 500 repetitions showed a statistically significant difference according to the type of dental personnel (P=.036), but no significant difference was found after the fourth iteration (fifth iteration: P=.076). Furthermore, the estimated time of 500 iterations decreased statistically significantly from the first to the 500th iteration (P<.001). CONCLUSIONS: All dental personnel showed learning effects of dental CAD software programs. Although the dental technician group initially showed less working time, after initial learning, the same learning effect appeared, regardless of the type of dental personnel.
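The Wright learning-curve model named in this abstract is usually written T_n = T_1 · n^b, where T_n is the time for the n-th repetition and b = log2(learning rate). The abstract does not describe how the parameters were estimated, so the sketch below assumes a least-squares fit in log-log space to a few observed repetitions, followed by extrapolation to the 500th repetition. The timings and the function names `fit_wright` and `predict_time` are hypothetical.

```python
import math


def fit_wright(times):
    """Fit T_n = T_1 * n**b by least squares on log(T_n) versus log(n).

    times[k] is the observed working time of repetition k + 1.
    Returns (T_1, b)."""
    xs = [math.log(n) for n in range(1, len(times) + 1)]
    ys = [math.log(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope_num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope_den = sum((x - x_mean) ** 2 for x in xs)
    b = slope_num / slope_den
    t1 = math.exp(y_mean - b * x_mean)
    return t1, b


def predict_time(t1, b, n):
    """Predicted working time of the n-th repetition."""
    return t1 * n ** b


# Hypothetical timings (seconds) that follow an 80% learning curve exactly.
observed = [600.0 * n ** math.log2(0.8) for n in (1, 2, 3)]
t1, b = fit_wright(observed)
learning_rate = 2 ** b  # fraction of time remaining each time n doubles
t500 = predict_time(t1, b, 500)
```

A learning rate of 0.8 means each doubling of the repetition count cuts the working time to 80% of its previous value, which is why the predicted curve flattens after the first few repetitions.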
Affiliation(s)
- KeunBaDa Son
- Graduate student, Department of Dental Science, Graduate School, Kyungpook National University, Daegu, Republic of Korea
- Kyu-Bok Lee
- Professor, Department of Prosthodontics, School of Dentistry, Kyungpook National University, Daegu, Republic of Korea
1086
Pruksawan S, Lambard G, Samitsu S, Sodeyama K, Naito M. Prediction and optimization of epoxy adhesive strength from a small dataset through active learning. SCIENCE AND TECHNOLOGY OF ADVANCED MATERIALS 2019; 20:1010-1021. [PMID: 31692965 PMCID: PMC6818118 DOI: 10.1080/14686996.2019.1673670] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/26/2019] [Revised: 09/25/2019] [Accepted: 09/25/2019] [Indexed: 05/27/2023]
Abstract
Machine learning is emerging as a powerful tool for the discovery of novel high-performance functional materials. However, experimental datasets in the polymer-science field are typically limited and they are expensive to build. Their size (< 100 samples) limits the development of chemical intuition from experimentalists, as it constrains the use of machine-learning algorithms for extracting relevant information. We tackle this issue to predict and optimize adhesive materials by combining laboratory experimental design, an active learning pipeline and Bayesian optimization. We start from an initial dataset of 32 adhesive samples that were prepared from various molecular-weight bisphenol A-based epoxy resins and polyetheramine curing agents, mixing ratios and curing temperatures, and our data-driven method allows us to propose an optimal preparation of an adhesive material with a very high adhesive joint strength measured at 35.8 ± 1.1 MPa after three active learning cycles (five proposed preparations per cycle). A Gradient boosting machine learning model was used for the successive prediction of the adhesive joint strength in the active learning pipeline, and the model achieved a respectable accuracy with a coefficient of determination, root mean square error and mean absolute error of 0.85, 4.0 MPa and 3.0 MPa, respectively. This study demonstrates the important impact of active learning to accelerate the design and development of tailored highly functional materials from very small datasets.
Affiliation(s)
- Sirawit Pruksawan
- Data-driven Polymer Design Group, Research and Services Division of Materials Data and Integrated System (MaDIS), National Institute for Materials Science (NIMS), Tsukuba, Japan
- Program in Materials Science and Engineering, Graduate School of Pure and Applied Sciences, University of Tsukuba, Tsukuba, Japan
- Guillaume Lambard
- Energy Materials Design Group, Research and Services Division of Materials Data and Integrated System (MaDIS), National Institute for Materials Science (NIMS), Tsukuba, Japan
- Sadaki Samitsu
- Data-driven Polymer Design Group, Research and Services Division of Materials Data and Integrated System (MaDIS), National Institute for Materials Science (NIMS), Tsukuba, Japan
- Keitaro Sodeyama
- Energy Materials Design Group, Research and Services Division of Materials Data and Integrated System (MaDIS), National Institute for Materials Science (NIMS), Tsukuba, Japan
- Masanobu Naito
- Data-driven Polymer Design Group, Research and Services Division of Materials Data and Integrated System (MaDIS), National Institute for Materials Science (NIMS), Tsukuba, Japan
- Program in Materials Science and Engineering, Graduate School of Pure and Applied Sciences, University of Tsukuba, Tsukuba, Japan
- Department of Advanced Materials Science, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
1087
Rai BK, Sresht V, Yang Q, Unwalla R, Tu M, Mathiowetz AM, Bakken GA. Comprehensive Assessment of Torsional Strain in Crystal Structures of Small Molecules and Protein–Ligand Complexes using ab Initio Calculations. J Chem Inf Model 2019; 59:4195-4208. [DOI: 10.1021/acs.jcim.9b00373] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 01/06/2023]
Affiliation(s)
- Gregory A. Bakken
- Simulation and Modeling Sciences, Pfizer Worldwide Research and Development, Eastern Point Road, Groton, Connecticut 06340, United States
1088
A fast neural network approach for direct covariant forces prediction in complex multi-element extended systems. NAT MACH INTELL 2019. [DOI: 10.1038/s42256-019-0098-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 11/09/2022]
1089
Timoshenko J, Frenkel AI. “Inverting” X-ray Absorption Spectra of Catalysts by Machine Learning in Search for Activity Descriptors. ACS Catal 2019. [DOI: 10.1021/acscatal.9b03599] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Indexed: 12/26/2022]
Affiliation(s)
- Janis Timoshenko
- Department of Interface Science, Fritz-Haber-Institute of the Max Planck Society, 14195 Berlin, Germany
- Anatoly I. Frenkel
- Department of Materials Science and Chemical Engineering, Stony Brook University, Stony Brook, New York 11794, United States
- Chemistry Division, Brookhaven National Laboratory, Upton, New York 11973, United States
1090
Zhang K, Tan C, Zhao W, Guo E, Tian X. Computation-Guided Design of Ni-Mn-Sn Ferromagnetic Shape Memory Alloy with Giant Magnetocaloric Effect and Excellent Mechanical Properties and High Working Temperature via Multielement Doping. ACS APPLIED MATERIALS & INTERFACES 2019; 11:34827-34840. [PMID: 31461258 DOI: 10.1021/acsami.9b08640] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/10/2023]
Abstract
Ni-Mn-Sn ferromagnetic shape memory alloys (FSMAs) have promise for application in efficient solid-state refrigeration. However, simultaneously achieving a giant magnetocaloric effect (MCE), excellent mechanical properties, and a high working temperature in these materials remains a challenge. Computation-guided materials design techniques provide an efficient way to design and identify new magnetocaloric materials. Herein, a new strategy of multidoping is presented. First, we conduct a detailed and comprehensive first-principles study and predict that Ni-Mn-Sn FSMAs co-doped with 6.25 atom % Cu and 6.25-12.5 atom % Co can realize the multiobjective optimization of magnetocaloric material. This prediction is then confirmed by experiment: we report a Ni40Co8Mn37Sn9Cu6 FSMA that exhibits a magnetic entropy change (34.8 J/(kg K)) that is large among prevalent MCE materials at high temperature (∼344 K), and whose compression stress and strain (∼1072.0 MPa and ∼11.9%) are both the largest among Ni-Mn-based MCE materials. Notably, the effects of Co and Cu doping are not simply additive because they play opposite roles in Curie temperature (TC) and martensitic transformation temperature (TM). Balancing their effects to combine their merits within a very narrow window is therefore the key step. This approach of multielement doping holds promise to be extended to other magnetocaloric materials to enhance their multiple properties simultaneously.
1091
Westermayr J, Gastegger M, Menger MFSJ, Mai S, González L, Marquetand P. Machine learning enables long time scale molecular photodynamics simulations. Chem Sci 2019; 10:8100-8107. [PMID: 31857878 PMCID: PMC6849489 DOI: 10.1039/c9sc01742a] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Received: 04/09/2019] [Accepted: 08/02/2019] [Indexed: 02/04/2023] Open
Abstract
Photo-induced processes are fundamental in nature but accurate simulations of their dynamics are seriously limited by the cost of the underlying quantum chemical calculations, hampering their application for long time scales. Here we introduce a method based on machine learning to overcome this bottleneck and enable accurate photodynamics on nanosecond time scales, which are otherwise out of reach with contemporary approaches. Instead of expensive quantum chemistry during molecular dynamics simulations, we use deep neural networks to learn the relationship between a molecular geometry and its high-dimensional electronic properties. As an example, the time evolution of the methylenimmonium cation for one nanosecond is used to demonstrate that machine learning algorithms can outperform standard excited-state molecular dynamics approaches in their computational efficiency while delivering the same accuracy.
Affiliation(s)
- Julia Westermayr
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, 1090 Vienna, Austria
- Michael Gastegger
- Machine Learning Group, Technical University of Berlin, 10587 Berlin, Germany
- Maximilian F S J Menger
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, 1090 Vienna, Austria
- Dipartimento di Chimica e Chimica Industriale, University of Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy
- Sebastian Mai
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, 1090 Vienna, Austria
- Leticia González
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, 1090 Vienna, Austria
- Philipp Marquetand
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, 1090 Vienna, Austria
1092
Liang J, Zhu X. Phillips-Inspired Machine Learning for Band Gap and Exciton Binding Energy Prediction. J Phys Chem Lett 2019; 10:5640-5646. [PMID: 31479611 DOI: 10.1021/acs.jpclett.9b02232] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 06/10/2023]
Abstract
In this work, inspired by Phillips's ionicity theory in solid-state physics, we directly sort out the critical factors of the band gap's feature correlations in the machine learning architected with the Lasso algorithm. Even based on a small 2D materials data set, we can fundamentally approach an accurate and rational model about the band gap and exciton binding energy with robust transferability to other databases. Our machine learning outputs can reveal the exact physics pictures behind the predicted quantity as well as the "secondary understanding" of the correlation between the approximated physics models in exciton. This work stresses the significant value of physics endorsement on the machine learning (ML) algorithm and provides a symbolic regression solution for the "few-shot" training scheme for ML technology in materials science. Moreover, physics-inspired secondary understanding could be an essential supplement for ML in scientific research fields.
Affiliation(s)
- Jiechun Liang
- School of Science and Engineering, The Chinese University of Hong Kong, 2001 Longxiang Road, Longgang District, Shenzhen, Guangdong, China 518172
- Xi Zhu
- School of Science and Engineering, The Chinese University of Hong Kong, 2001 Longxiang Road, Longgang District, Shenzhen, Guangdong, China 518172
- Shenzhen Institute of Artificial Intelligence and Robotics for Society (AIRS), 2001 Longxiang Road, Longgang District, Shenzhen, Guangdong, China 518172
1093
Charlton SGV, White MA, Jana S, Eland LE, Jayathilake PG, Burgess JG, Chen J, Wipat A, Curtis TP. Regulating, Measuring, and Modeling the Viscoelasticity of Bacterial Biofilms. J Bacteriol 2019; 201:e00101-19. [PMID: 31182499 PMCID: PMC6707926 DOI: 10.1128/jb.00101-19] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 12/13/2022] Open
Abstract
Biofilms occur in a broad range of environments under heterogeneous physicochemical conditions, such as in bioremediation plants, on surfaces of biomedical implants, and in the lungs of cystic fibrosis patients. In these scenarios, biofilms are subjected to shear forces, but the mechanical integrity of these aggregates often prevents their disruption or dispersal. Biofilms' physical robustness is the result of the multiple biopolymers secreted by constituent microbial cells which are also responsible for numerous biological functions. A better understanding of the role of these biopolymers and their response to dynamic forces is therefore crucial for understanding the interplay between biofilm structure and function. In this paper, we review experimental techniques in rheology, which help quantify the viscoelasticity of biofilms, and modeling approaches from soft matter physics that can assist our understanding of the rheological properties. We describe how these methods could be combined with synthetic biology approaches to control and investigate the effects of secreted polymers on the physical properties of biofilms. We argue that without an integrated approach of the three disciplines, the links between genetics, composition, and interaction of matrix biopolymers and the viscoelastic properties of biofilms will be much harder to uncover.
Affiliation(s)
- Samuel G V Charlton
- School of Engineering, Newcastle University, Newcastle upon Tyne, United Kingdom
- Michael A White
- Interdisciplinary Computing & Complex BioSystems Research Group, School of Computing, Newcastle University, Newcastle upon Tyne, United Kingdom
- Saikat Jana
- School of Engineering, Newcastle University, Newcastle upon Tyne, United Kingdom
- Lucy E Eland
- Interdisciplinary Computing & Complex BioSystems Research Group, School of Computing, Newcastle University, Newcastle upon Tyne, United Kingdom
- J Grant Burgess
- School of Natural & Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Jinju Chen
- School of Engineering, Newcastle University, Newcastle upon Tyne, United Kingdom
- Anil Wipat
- Interdisciplinary Computing & Complex BioSystems Research Group, School of Computing, Newcastle University, Newcastle upon Tyne, United Kingdom
- Thomas P Curtis
- School of Engineering, Newcastle University, Newcastle upon Tyne, United Kingdom
1094
Janet JP, Duan C, Yang T, Nandy A, Kulik HJ. A quantitative uncertainty metric controls error in neural network-driven chemical discovery. Chem Sci 2019; 10:7913-7922. [PMID: 31588334 PMCID: PMC6764470 DOI: 10.1039/c9sc02298h] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 05/11/2019] [Accepted: 07/11/2019] [Indexed: 12/14/2022] Open
Abstract
Machine learning (ML) models, such as artificial neural networks, have emerged as a complement to high-throughput screening, enabling characterization of new compounds in seconds instead of hours. The promise of ML models to enable large-scale chemical space exploration can only be realized if it is straightforward to identify when molecules and materials are outside the model's domain of applicability. Established uncertainty metrics for neural network models are either costly to obtain (e.g., ensemble models) or rely on feature engineering (e.g., feature space distances), and each has limitations in estimating prediction errors for chemical space exploration. We introduce the distance to available data in the latent space of a neural network ML model as a low-cost, quantitative uncertainty metric that works for both inorganic and organic chemistry. The calibrated performance of this approach exceeds widely used uncertainty metrics and is readily applied to models of increasing complexity at no additional cost. Tightening latent distance cutoffs systematically drives down predicted model errors below training errors, thus enabling predictive error control in chemical discovery or identification of useful data points for active learning.
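The idea of using distance to available data in a latent space as an uncertainty signal can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the latent vectors here are made-up 2D points rather than activations from a trained network, and averaging over the k nearest training points, the Euclidean metric, and the cutoff value are all assumptions chosen for the example.

```python
import math

def latent_distance(query, train_latents, k=3):
    """Mean Euclidean distance from `query` to its k nearest training points.

    `query` and each entry of `train_latents` are latent-space vectors,
    e.g. activations of the last hidden layer of a neural network.
    """
    dists = sorted(math.dist(query, z) for z in train_latents)
    k = min(k, len(dists))
    return sum(dists[:k]) / k

def within_domain(query, train_latents, cutoff, k=3):
    """Accept a prediction only if the query lies close to the training data."""
    return latent_distance(query, train_latents, k) <= cutoff

# Hypothetical latent vectors of four training compounds.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
near_query = (0.5, 0.5)   # surrounded by training data
far_query = (5.0, 5.0)    # far from anything seen in training

for name, q in [("near", near_query), ("far", far_query)]:
    d = latent_distance(q, train)
    print(name, round(d, 3), within_domain(q, train, cutoff=1.0))
```

Tightening `cutoff` keeps fewer predictions but restricts them to regions better covered by training data, which is the trade-off the abstract describes.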
Affiliation(s)
- Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Tzuhsiung Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
1095
Winter R, Montanari F, Steffen A, Briem H, Noé F, Clevert DA. Efficient multi-objective molecular optimization in a continuous latent space. Chem Sci 2019; 10:8016-8024. [PMID: 31853357 PMCID: PMC6836962 DOI: 10.1039/c9sc01928f] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Received: 04/18/2019] [Accepted: 07/02/2019] [Indexed: 12/21/2022] Open
Abstract
One of the main challenges in small molecule drug discovery is finding novel chemical compounds with desirable properties. In this work, we propose a novel method that combines in silico prediction of molecular properties such as biological activity or pharmacokinetics with an in silico optimization algorithm, namely Particle Swarm Optimization. Our method takes a starting compound as input and proposes new molecules with more desirable (predicted) properties. It navigates a machine-learned continuous representation of a drug-like chemical space guided by a defined objective function. The objective function combines multiple in silico prediction models, defined desirability ranges and substructure constraints. We demonstrate that our proposed method is able to consistently find more desirable molecules for the studied tasks in relatively short time. We hope that our method can support medicinal chemists in accelerating and improving the lead optimization process.
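Particle Swarm Optimization over a continuous space can be illustrated with a small, generic sketch. Everything specific here is an assumption for illustration: the objective is a toy squared distance to a hypothetical target point (standing in for a combined property score, and minimized rather than maximized), the 3-dimensional space stands in for a learned molecular representation, and the inertia and acceleration coefficients are common textbook defaults rather than the paper's settings.

```python
import random

def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO.

    Returns (best_position, best_value, history), where history[i] is the
    best value found after i iterations.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    history = [gbest_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

# Toy objective: squared distance to a hypothetical "ideal" latent point.
target = [1.0, -2.0, 0.5]
def toy_objective(z):
    return sum((a - b) ** 2 for a, b in zip(z, target))

best, best_val, history = pso_minimize(toy_objective, dim=3, bounds=(-5.0, 5.0))
print("best value:", best_val)
```

In a molecular setting the objective would instead decode each latent point and score it with property prediction models, which this sketch does not attempt.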
Affiliation(s)
- Robin Winter
- Department of Digital Technologies, Bayer AG, Berlin, Germany
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Andreas Steffen
- Department of Digital Technologies, Bayer AG, Berlin, Germany
- Hans Briem
- Department of Digital Technologies, Bayer AG, Berlin, Germany
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
1096
Kennedy GF, Zhang J, Bond AM. Automatically Identifying Electrode Reaction Mechanisms Using Deep Neural Networks. Anal Chem 2019; 91:12220-12227. [PMID: 31466438 DOI: 10.1021/acs.analchem.9b01891] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/17/2022]
Abstract
At present, electrochemical mechanisms are most commonly identified subjectively based on the experience of the researcher. This subjectivity is reflected in bias to particular mechanisms as well as lack of quantifiable confidence in the chosen mechanism compared to potential alternative mechanisms. In this paper we demonstrate that a deep neural network trained to recognize dc cyclic voltammograms for three commonly encountered mechanisms provides correct classifications within 5 ms without the problem of subjectivity. To mimic experimental data, the impact of noise, uncompensated resistance, and dependence on scan rate, factors that are relevant to practical studies, has also been investigated. Outcomes with two experimental data sets are also presented.
Affiliation(s)
- Gareth F Kennedy
- School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
- Jie Zhang
- School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
- ARC Centre of Excellence for Electromaterials Science, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
- Alan M Bond
- School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
- ARC Centre of Excellence for Electromaterials Science, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
1097
Paul A, Furmanchuk A, Liao W, Choudhary A, Agrawal A. Property Prediction of Organic Donor Molecules for Photovoltaic Applications Using Extremely Randomized Trees. Mol Inform 2019; 38:e1900038. [DOI: 10.1002/minf.201900038] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/08/2019] [Accepted: 07/18/2019] [Indexed: 01/16/2023]
Affiliation(s)
- Arindam Paul
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA
- Alona Furmanchuk
- Institute for Public Health and Medicine, Feinberg School of Medicine, Center for Health Information Partnerships, Northwestern University, Chicago, IL 60611, USA
- Wei‐keng Liao
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA
- Alok Choudhary
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA
- Ankit Agrawal
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA
1098
Predicting the onset temperature (Tg) of GexSe1-x glass transition: a feature selection based two-stage support vector regression method. Sci Bull (Beijing) 2019; 64:1195-1203. [PMID: 36659690 DOI: 10.1016/j.scib.2019.06.026] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 05/29/2019] [Revised: 06/18/2019] [Accepted: 06/26/2019] [Indexed: 01/21/2023]
Abstract
Despite the usage of both experimental and topological methods, realizing a rapid and accurate measurement of the onset temperature (Tg) of GexSe1-x glass transition remains an open challenge. In this paper, a predictive model for the Tg in GexSe1-x glass system is presented by a machine learning method named feature selection based two-stage support vector regression (FSTS-SVR). Firstly, Pearson correlation coefficient (PCC) is used to select features highly correlated with Tg from the candidate features of GexSe1-x glass system. Secondly, in order to simulate the two-stage characteristic of Tg which is caused by structural variation with a turning point at x = 0.33 via the structural analysis, SVR is utilized to build predictive models for two stages separately and then the two achieved models are synthesized using a minimum error based model for Tg prediction. Compared with the topological and other methods based on SVR, the FSTS-SVR gives the highest predictive accuracy with the root mean square error (RMSE) and mean absolute percentage error (MAPE) of 10.64 K and 2.38%, respectively. This method is also expected to be more efficient for the prediction of Tg of other glass systems with the multi-stage characteristic.
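The two ingredients described in this abstract, Pearson-based feature selection and separate models on either side of the turning point at x = 0.33, can be sketched as follows. This is a simplified stand-in rather than the FSTS-SVR method itself: ordinary least-squares lines replace support vector regression so the example runs with the standard library alone, the minimum-error synthesis step is omitted, the 0.8 correlation threshold is an assumption, and all numbers are synthetic, not measured Tg values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_features(features, target, threshold=0.8):
    """Keep feature names whose |PCC| with the target reaches the threshold."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) >= threshold]

def fit_line(xs, ys):
    """Least-squares slope and intercept (stand-in for one SVR stage)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def fit_two_stage(xs, ys, turning_point=0.33):
    """Fit one model below/at the turning point and one above it."""
    low = [(x, y) for x, y in zip(xs, ys) if x <= turning_point]
    high = [(x, y) for x, y in zip(xs, ys) if x > turning_point]
    return fit_line(*zip(*low)), fit_line(*zip(*high))

def predict_two_stage(model, x, turning_point=0.33):
    """Route x to the stage that covers it and evaluate that stage's line."""
    slope, intercept = model[0] if x <= turning_point else model[1]
    return slope * x + intercept

# Synthetic feature table: feature_a tracks the target, feature_b does not.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {"feature_a": [2.0, 4.0, 6.0, 8.0, 10.5],
            "feature_b": [3.0, 1.0, 4.0, 1.0, 5.0]}
selected = select_features(features, target)

# Synthetic composition/temperature pairs with a change of trend near x = 0.33.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
ys = [300.0, 400.0, 500.0, 600.0, 580.0, 500.0, 420.0]
model = fit_two_stage(xs, ys)
print(selected, predict_two_stage(model, 0.25), predict_two_stage(model, 0.45))
```

Fitting each stage separately lets the model follow the change of trend at the turning point, which a single global fit would smooth over.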
1099
Cho H, Choi IS. Enhanced Deep-Learning Prediction of Molecular Properties via Augmentation of Bond Topology. ChemMedChem 2019; 14:1604-1609. [PMID: 31389167 DOI: 10.1002/cmdc.201900458] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/06/2019] [Revised: 08/07/2019] [Indexed: 12/13/2022]
Abstract
Deep learning has made great strides in tackling chemical problems, but still lacks full-fledged representations for three-dimensional (3D) molecular structures for its inner working. For example, the molecular graph, commonly used in chemistry and recently adapted to the graph convolutional network (GCN), is inherently a 2D representation of 3D molecules. Herein we propose an advanced version of the GCN, called 3DGCN, which receives 3D molecular information from a molecular graph augmented by information on bond direction. While outperforming state-of-the-art deep-learning models in the prediction of chemical and biological properties, 3DGCN has the ability to both generalize and distinguish molecular rotations in 3D, beyond 2D, which has great impact on drug discovery and development, not to mention the design of chemical reactions.
Affiliation(s)
- Hyeoncheol Cho
- Center for Cell-Encapsulation Research, Department of Chemistry, KAIST, Daejeon, 34141, Korea
- Insung S Choi
- Center for Cell-Encapsulation Research, Department of Chemistry, KAIST, Daejeon, 34141, Korea
1100
Maji S, Shrestha LK, Ariga K. Nanoarchitectonics for Nanocarbon Assembly and Composite. J Inorg Organomet Polym Mater 2019. [DOI: 10.1007/s10904-019-01294-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/09/2023]