1
Lelièvre T, Pigeon T, Stoltz G, Zhang W. Analyzing Multimodal Probability Measures with Autoencoders. J Phys Chem B 2024; 128:2607-2631. [PMID: 38466759 DOI: 10.1021/acs.jpcb.3c07075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Finding collective variables to describe some important coarse-grained information on physical systems, in particular metastable states, remains a key issue in molecular dynamics. Recently, machine learning techniques have been intensively used to complement and possibly bypass expert knowledge in order to construct collective variables. Our focus here is on neural network approaches based on autoencoders. We study some relevant mathematical properties of the loss function considered for training autoencoders and provide physical interpretations based on conditional variances and minimum energy paths. We also consider various extensions in order to better describe physical systems, by incorporating more information on transition states at saddle points, and/or allowing for multiple decoders in order to describe several transition paths. Our results are illustrated on toy two-dimensional systems and on alanine dipeptide.
Affiliation(s)
- Tony Lelièvre
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
- Thomas Pigeon
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
- IFP Energies Nouvelles, Rond-Point de l'Echangeur de Solaize, BP 3, 69360 Solaize, France
- Gabriel Stoltz
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
- Wei Zhang
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
- Zuse Institute Berlin, Takustraße 7, 14195 Berlin, Germany
2
Swinburne TD. Coarse-Graining and Forecasting Atomic Material Simulations with Descriptors. Phys Rev Lett 2023; 131:236101. [PMID: 38134806 DOI: 10.1103/physrevlett.131.236101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 07/21/2023] [Accepted: 11/13/2023] [Indexed: 12/24/2023]
Abstract
Atomic simulations of materials require significant resources to generate, store, and analyze. Here, descriptor functions are proposed as a general, metric latent space for atomic structures, ideal for use in large-scale simulations. Descriptors can regress a broad range of properties, including character-dependent dislocation densities, stress states, or radial distribution functions. A vector autoregressive model can generate trajectories over yield points, resample from new initial conditions and forecast trajectory futures. A forecast confidence, essential for practical application, is derived by propagating forecasts through the Mahalanobis outlier distance, providing a powerful tool to assess coarse-grained models. Application to nanoparticles and yielding of nanoscale dislocation networks confirms low uncertainty forecasts are accurate and resampling allows for the propagation of smooth property distributions. Yielding is associated with a collapse in the intrinsic dimension of the descriptor manifold, which is discussed in relation to the yield surface.
Affiliation(s)
- Thomas D Swinburne
- Aix-Marseille Université, CNRS, CINaM UMR 7325, Campus de Luminy, 13288 Marseille, France
3
Siddiqui GA, Stebani JA, Wragg D, Koutsourelakis PS, Casini A, Gagliardi A. Application of Machine Learning Algorithms to Metadynamics for the Elucidation of the Binding Modes and Free Energy Landscape of Drug/Target Interactions: a Case Study. Chemistry 2023; 29:e202302375. [PMID: 37555841 DOI: 10.1002/chem.202302375] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/28/2023] [Accepted: 08/09/2023] [Indexed: 08/10/2023]
Abstract
In the context of drug discovery, computational methods have been able to accelerate the challenging process of designing and optimizing a new drug candidate. Amongst the possible atomistic simulation approaches, metadynamics (metaD) has proven very powerful. However, the choice of collective variables (CVs) is not trivial for complex systems. To automate the process of CV identification, two different machine learning algorithms, namely DeepLDA and Autoencoder, were applied in this study to the metaD simulation of a well-researched drug/target complex, consisting of a pharmacologically relevant non-canonical DNA secondary structure (G-quadruplex) and a metallodrug acting as its stabilizer, as well as solvent molecules.
Affiliation(s)
- Gohar Ali Siddiqui
- Professorship of Simulation of Nanosystems for Energy Conversion Department of Electrical and Computer Engineering School of Computation, Information and Technology, Technical University of Munich (TUM), Hans-Piloty-Str. 1, 85748, Garching b. München, Germany
- Julia A Stebani
- Chair of Medicinal and Bioinorganic Chemistry Department of Chemistry, School of Natural Sciences, Technical University of Munich (TUM), Lichtenbergstr. 4, 85748, Garching b. München, Germany
- Darren Wragg
- Chair of Medicinal and Bioinorganic Chemistry Department of Chemistry, School of Natural Sciences, Technical University of Munich (TUM), Lichtenbergstr. 4, 85748, Garching b. München, Germany
- Phaedon-Stelios Koutsourelakis
- Professorship for Data-driven Materials Modeling School of Engineering and Design, Technical University of Munich (TUM), Boltzmannstr. 15, 85748, Garching b. München, Germany
- Angela Casini
- Chair of Medicinal and Bioinorganic Chemistry Department of Chemistry, School of Natural Sciences, Technical University of Munich (TUM), Lichtenbergstr. 4, 85748, Garching b. München, Germany
- Alessio Gagliardi
- Professorship of Simulation of Nanosystems for Energy Conversion Department of Electrical and Computer Engineering School of Computation, Information and Technology, Technical University of Munich (TUM), Hans-Piloty-Str. 1, 85748, Garching b. München, Germany
4
Karcz MJ, Messina L, Kawasaki E, Rajaonson S, Bathellier D, Nastar M, Schuler T, Bourasseau E. Semi-supervised generative approach to chemical disorder: application to point-defect formation in uranium-plutonium mixed oxides. Phys Chem Chem Phys 2023; 25:23069-23080. [PMID: 37605928 DOI: 10.1039/d3cp02790b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2023]
Abstract
Chemical disorder has a major impact on the characterization of the atomic-scale properties of highly complex chemical compounds, such as the properties of point defects. Due to the vast number of possible atomic configurations, the study of such properties becomes intractable if treated with direct sampling. In this work, we propose an alternative approach, in which samples are selected based on the local atomic composition around the defect, and the defect formation energy is obtained as a function of this local composition with a reduced computational cost. We apply this approach to (U, Pu)O2 nuclear fuels. The formation-energy distribution is computed using machine-learning generative methods, and used to investigate the impact of chemical disorder and the range of influence of local composition on the defect properties. The predicted distributions are then used to calculate the concentration of thermal defects. This approach allows, for the first time, the computation of the latter property with a physically meaningful exploration of the configuration space, and opens the way to a more efficient determination of physico-chemical properties in other chemically disordered compounds such as high-entropy alloys.
Affiliation(s)
- Maciej J Karcz
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
- Université Paris-Saclay, CEA, LIST, F-91120, Palaiseau, France
- Luca Messina
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
- Eiji Kawasaki
- Université Paris-Saclay, CEA, LIST, F-91120, Palaiseau, France
- Serenah Rajaonson
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
- Didier Bathellier
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
- Maylise Nastar
- Université Paris-Saclay, CEA, Service de Recherche en Corrosion et Comportement des Matériaux, SRMP, F-91191 Gif-sur-Yvette, France
- Thomas Schuler
- Université Paris-Saclay, CEA, Service de Recherche en Corrosion et Comportement des Matériaux, SRMP, F-91191 Gif-sur-Yvette, France
- Emeric Bourasseau
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
5
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
- Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Karl N. Kirschner
- Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany