1
Fallani A, Medrano Sandonas L, Tkatchenko A. Inverse mapping of quantum properties to structures for chemical space of small organic molecules. Nat Commun 2024; 15:6061. [PMID: 39025883 PMCID: PMC11258234 DOI: 10.1038/s41467-024-50401-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 07/01/2024] [Indexed: 07/20/2024]
Abstract
Computer-driven molecular design combines the principles of chemistry, physics, and artificial intelligence to identify chemical compounds with tailored properties. While quantum-mechanical (QM) methods, coupled with machine learning, already offer a direct mapping from 3D molecular structures to their properties, effective methodologies for the inverse mapping in chemical space remain elusive. We address this challenge by demonstrating the possibility of parametrizing a chemical space with a finite set of QM properties. Our proof-of-concept implementation achieves an approximate property-to-structure mapping, the QIM model (which stands for "Quantum Inverse Mapping"), by forcing a variational auto-encoder with a property encoder to obtain a common internal representation for both structures and properties. After validating this mapping for small drug-like molecules, we illustrate its capabilities with an explainability study as well as by the generation of de novo molecular structures with targeted properties and transition pathways between conformational isomers. Our findings thus provide a proof-of-principle demonstration aiming to enable the inverse property-to-structure design in diverse chemical spaces.
Affiliation(s)
- Alessio Fallani
  - Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg.
- Leonardo Medrano Sandonas
  - Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg.
  - Institute for Materials Science and Max Bergmann Center of Biomaterials, TU Dresden, 01062, Dresden, Germany.
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg.
2
Greenstein BL, Elsey DC, Hutchison GR. Determining best practices for using genetic algorithms in molecular discovery. J Chem Phys 2023; 159:091501. [PMID: 37655763 DOI: 10.1063/5.0158053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Accepted: 08/09/2023] [Indexed: 09/02/2023]
Abstract
Genetic algorithms (GAs) are a powerful tool to search large chemical spaces for inverse molecular design. However, GAs have multiple hyperparameters that have not been thoroughly investigated for chemical space searches. In this tutorial, we examine the general effects of a number of hyperparameters, such as population size, elitism rate, selection method, mutation rate, and convergence criteria, on key GA performance metrics. We show that using a self-termination method with a minimum Spearman's rank correlation coefficient of 0.8 between generations maintained for 50 consecutive generations along with a population size of 32, a 50% elitism rate, three-way tournament selection, and a 40% mutation rate provides the best balance of finding the overall champion, maintaining good coverage of elite targets, and improving relative speedup for general use in molecular design GAs.
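The recommended settings in this abstract can be read as a concrete configuration. The following is a minimal sketch, not the authors' implementation: the string genome, the toy `fitness` function, the single-point `crossover`, and the point `mutate` operator are all hypothetical stand-ins chosen for illustration. The abstract does not say which quantities are ranked to obtain the Spearman coefficient between generations, so `converged` simply takes a list of precomputed coefficients.

```python
import random

POP_SIZE = 32          # population size from the abstract
ELITISM_RATE = 0.5     # fraction of the population copied unchanged
TOURNAMENT_SIZE = 3    # three-way tournament selection
MUTATION_RATE = 0.4    # probability that a child is mutated
MIN_RHO = 0.8          # minimum Spearman coefficient between generations
PATIENCE = 50          # consecutive generations the threshold must hold

ALPHABET = "ABCD"      # hypothetical building blocks for the toy genome


def fitness(genome):
    """Toy objective (assumption): number of 'A' blocks in the genome."""
    return genome.count("A")


def tournament(population, rng):
    """Return the fittest of TOURNAMENT_SIZE randomly drawn individuals."""
    return max(rng.sample(population, TOURNAMENT_SIZE), key=fitness)


def crossover(a, b, rng):
    """Single-point crossover of two equal-length genomes (assumption)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(genome, rng):
    """Replace one randomly chosen position with a random block."""
    i = rng.randrange(len(genome))
    return genome[:i] + rng.choice(ALPHABET) + genome[i + 1:]


def next_generation(population, rng):
    """Build one generation using elitism, tournaments, and mutation."""
    ranked = sorted(population, key=fitness, reverse=True)
    n_elite = int(ELITISM_RATE * len(population))
    new_population = ranked[:n_elite]          # elites survive unchanged
    while len(new_population) < len(population):
        parent_a = tournament(population, rng)
        parent_b = tournament(population, rng)
        child = crossover(parent_a, parent_b, rng)
        if rng.random() < MUTATION_RATE:
            child = mutate(child, rng)
        new_population.append(child)
    return new_population


def converged(correlations):
    """True once the last PATIENCE coefficients are all at least MIN_RHO."""
    recent = correlations[-PATIENCE:]
    return len(recent) == PATIENCE and all(rho >= MIN_RHO for rho in recent)
```

In a real chemical-space search, the genome would encode a molecule and `fitness` would call a property predictor. Because the elite half is copied unchanged, the best fitness in the population cannot decrease from one generation to the next.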
Affiliation(s)
- Brianna L Greenstein
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, USA
- Danielle C Elsey
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, USA
- Geoffrey R Hutchison
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, USA
3
Tom R, Gao S, Yang Y, Zhao K, Bier I, Buchanan EA, Zaykov A, Havlas Z, Michl J, Marom N. Inverse Design of Tetracene Polymorphs with Enhanced Singlet Fission Performance by Property-Based Genetic Algorithm Optimization. Chem Mater 2023; 35:1373-1386. [PMID: 36999121 PMCID: PMC10042130 DOI: 10.1021/acs.chemmater.2c03444] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/17/2022] [Revised: 01/06/2023] [Indexed: 06/19/2023]
Abstract
The efficiency of solar cells may be improved by using singlet fission (SF), in which one singlet exciton splits into two triplet excitons. SF occurs in molecular crystals. A molecule may crystallize in more than one form, a phenomenon known as polymorphism. Crystal structure may affect SF performance. In the common form of tetracene, SF is experimentally known to be slightly endoergic. A second, metastable polymorph of tetracene has been found to exhibit better SF performance. Here, we conduct inverse design of the crystal packing of tetracene using a genetic algorithm (GA) with a fitness function tailored to simultaneously optimize the SF rate and the lattice energy. The property-based GA successfully generates more structures predicted to have higher SF rates and provides insight into packing motifs associated with improved SF performance. We find a putative polymorph predicted to have superior SF performance to the two forms of tetracene, whose structures have been determined experimentally. The putative structure has a lattice energy within 1.5 kJ/mol of the most stable common form of tetracene.
Affiliation(s)
- Rithwik Tom
  - Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Siyu Gao
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Yi Yang
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Kaiji Zhao
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Imanuel Bier
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Eric A. Buchanan
  - Department of Chemistry, University of Colorado, Boulder, Colorado 80309, United States
- Alexandr Zaykov
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 16610 Prague 6, Czech Republic
  - Department of Physical Chemistry, University of Chemistry and Technology, 166 28 Prague 6, Czech Republic
- Zdeněk Havlas
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 16610 Prague 6, Czech Republic
- Josef Michl
  - Department of Chemistry, University of Colorado, Boulder, Colorado 80309, United States
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 16610 Prague 6, Czech Republic
- Noa Marom
  - Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
4
Hiener DC, Folmsbee DL, Langkamp LA, Hutchison GR. Evaluating fast methods for static polarizabilities on extended conjugated oligomers. Phys Chem Chem Phys 2022; 24:23173-23181. [PMID: 36128891 DOI: 10.1039/d2cp02375j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Given the importance of accurate polarizability calculations to many chemical applications, coupled with the need for efficiency when calculating the properties of sets of molecules or large oligomers, we present a benchmark study examining possible calculation methods for polarizable materials. We first investigate the accuracy of the additive model used in GFN2, a highly-efficient semi-empirical tight-binding method, and the D4 dispersion model, comparing its predicted additive polarizabilities to ωB97XD results for a subset of PubChemQC and a compiled benchmark set of molecules spanning polarizabilities from approximately 3 Å³ to 600 Å³, with some compounds in the range of approximately 1200-1400 Å³. Although we find additive GFN2 polarizabilities, and thus D4, to have large errors with polarizability calculations on large conjugated oligomers, it would appear an empirical quadratic correction can largely remedy this. We also compare the accuracy of DFT polarizability calculations run using basis sets of varying size and level of augmentation, determining that a non-augmented basis set may be used for large, highly polarizable species in conjunction with a linear correction factor to achieve accuracy extremely close to that of aug-cc-pVTZ.
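One way to picture the empirical quadratic correction mentioned in this abstract is as a least-squares polynomial fit from the cheap additive polarizability to the reference value. The sketch below assumes the form reference ≈ a·x² + b·x + c, which the abstract does not specify; the function names are hypothetical, and the numbers are synthetic placeholders, not data from the study.

```python
import numpy as np


def fit_quadratic_correction(additive, reference):
    """Least-squares fit of reference ~ a*x**2 + b*x + c, with x = additive."""
    return np.polyfit(additive, reference, deg=2)


def apply_correction(coeffs, additive):
    """Evaluate the fitted polynomial at additive polarizabilities."""
    return np.polyval(coeffs, additive)


# Synthetic placeholder values in cubic angstroms (assumption, not study data).
additive = np.array([10.0, 50.0, 100.0, 200.0, 400.0])
reference = 0.002 * additive**2 + 1.1 * additive + 0.5

coeffs = fit_quadratic_correction(additive, reference)
corrected = apply_correction(coeffs, additive)
```

The linear correction factor described for non-augmented basis sets could be sketched the same way with `deg=1`.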
Affiliation(s)
- Danielle C Hiener
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
- Dakota L Folmsbee
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
- Luke A Langkamp
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
- Geoffrey R Hutchison
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania, 15261, USA
5
Greenstein BL, Hiener DC, Hutchison GR. Computational Evolution of High-Performing Unfused Non-Fullerene Acceptors for Organic Solar Cells. J Chem Phys 2022; 156:174107. [DOI: 10.1063/5.0087299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
Abstract
Materials optimization for organic solar cells (OSCs) is a highly active field, with many approaches using empirical experimental synthesis, computational brute-force approaches to screen candidates in a given subset of chemical space, or generative machine learning methods which often require significant training sets. While these methods may find high-performing materials, they can be inefficient and time-consuming. Genetic algorithms (GAs) are an alternative approach, allowing for the "virtual synthesis" of molecules and a prediction of their "fitness" for some property, with new candidates suggested based on good characteristics of previously generated molecules. In this work, a GA is used to discover high-performing unfused non-fullerene acceptors (NFAs) based on an empirical prediction of power conversion efficiency (PCE) and provides design rules for future work. The electron withdrawing/donating strength, as well as the sequence and symmetry of those units are examined. The utilization of a GA over a brute force approach resulted in speedups up to 1.8 × 10¹². New types of units not frequently seen in OSCs are suggested, and in total 5,426 NFAs are discovered with the GA. Of these, 1,087 NFAs are predicted to have a PCE greater than 18%, which is roughly the current record efficiency. While the symmetry of the sequence showed no correlation with PCE, analysis of the sequence arrangement revealed that higher performance can be achieved with a donor core and acceptor end groups. Future NFA designs should consider this strategy as an alternative to the current A-D-A'-D-A architecture.