1
Abarbanel OD, Hutchison GR. QupKake: Integrating Machine Learning and Quantum Chemistry for Micro-pKa Predictions. J Chem Theory Comput 2024. [PMID: 38832803 DOI: 10.1021/acs.jctc.4c00328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
Accurate prediction of micro-pKa values is crucial for understanding and modulating the acidity and basicity of organic molecules, with applications in drug discovery, materials science, and environmental chemistry. This work introduces QupKake, a novel method that combines graph neural network models with semiempirical quantum mechanical (QM) features to achieve exceptional accuracy and generalization in micro-pKa prediction. QupKake outperforms state-of-the-art models on a variety of benchmark data sets, with root-mean-square errors between 0.5 and 0.8 pKa units on five external test sets. Feature importance analysis reveals the crucial role of QM features in both the reaction site enumeration and micro-pKa prediction models. QupKake represents a significant advancement in micro-pKa prediction, offering a powerful tool for various applications in chemistry and beyond.
Affiliation(s)
- Omri D Abarbanel
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
- Geoffrey R Hutchison
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
- Department of Chemical and Petroleum Engineering, University of Pittsburgh, 3700 O'Hara Street, Pittsburgh, Pennsylvania 15261, United States
2
Li CH, Tabor DP. Reorganization Energy Predictions with Graph Neural Networks Informed by Low-Cost Conformers. J Phys Chem A 2023; 127:3484-3489. [PMID: 37017992 PMCID: PMC10848248 DOI: 10.1021/acs.jpca.2c09030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 02/21/2023] [Indexed: 04/06/2023]
Abstract
A critical bottleneck for the design of high-conductivity organic materials is finding molecules with low reorganization energy. To enable high-throughput virtual screening campaigns for many types of organic electronic materials, a fast reorganization energy prediction method compared to density functional theory is needed. However, the development of low-cost machine-learning-based models for calculating the reorganization energy has proven to be challenging. In this paper, we combine a 3D graph-based neural network (GNN) recently benchmarked for drug design applications, ChIRo, with low-cost conformational features for reorganization energy predictions. By comparing the performance of ChIRo to another 3D GNN, SchNet, we find evidence that the bond-invariant property of ChIRo enables the model to learn from low-cost conformational features more efficiently. Through an ablation study with a 2D GNN, we find that using low-cost conformational features on top of 2D features informs the model for making more accurate predictions. Our results demonstrate the feasibility of reorganization energy predictions on the benchmark QM9 data set without needing DFT-optimized geometries and demonstrate the types of features needed for robust models that work on diverse chemical spaces. Furthermore, we show that ChIRo informed with low-cost conformational features achieves comparable performance with the previously reported structure-based model on π-conjugated hydrocarbon molecules. We expect this class of methods can be applied to the high-throughput screening of high-conductivity organic electronics candidates.
Affiliation(s)
- Cheng-Han Li
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, United States
- Daniel P. Tabor
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, United States
3
Bhat V, Sornberger P, Pokuri BSS, Duke R, Ganapathysubramanian B, Risko C. Electronic, redox, and optical property prediction of organic π-conjugated molecules through a hierarchy of machine learning approaches. Chem Sci 2022; 14:203-213. [PMID: 36605753 PMCID: PMC9769113 DOI: 10.1039/d2sc04676h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Accepted: 11/16/2022] [Indexed: 11/18/2022] Open
Abstract
Accelerating the development of π-conjugated molecules for applications such as energy generation and storage, catalysis, sensing, pharmaceuticals, and (semi)conducting technologies requires rapid and accurate evaluation of the electronic, redox, or optical properties. While high-throughput computational screening has proven to be a tremendous aid in this regard, machine learning (ML) and other data-driven methods can further enable orders of magnitude reduction in time while at the same time providing dramatic increases in the chemical space that is explored. However, the lack of benchmark datasets containing the electronic, redox, and optical properties that characterize the diverse, known chemical space of organic π-conjugated molecules limits ML model development. Here, we present a curated dataset containing 25k molecules with density functional theory (DFT) and time-dependent DFT (TDDFT) evaluated properties that include frontier molecular orbitals, ionization energies, relaxation energies, and low-lying optical excitation energies. Using the dataset, we train a hierarchy of ML models, ranging from classical models such as ridge regression to sophisticated graph neural networks, with molecular SMILES representation as input. We observe that graph neural networks augmented with contextual information allow for significantly better predictions across a wide array of properties. Our best-performing models also provide an uncertainty quantification for the predictions. To democratize access to the data and trained models, an interactive web platform has been developed and deployed.
Affiliation(s)
- Vinayak Bhat
- Department of Chemistry and Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
- Parker Sornberger
- Department of Chemistry and Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
- Rebekah Duke
- Department of Chemistry and Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
- Chad Risko
- Department of Chemistry and Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
4
Hiener DC, Folmsbee DL, Langkamp LA, Hutchison GR. Evaluating fast methods for static polarizabilities on extended conjugated oligomers. Phys Chem Chem Phys 2022; 24:23173-23181. [PMID: 36128891 DOI: 10.1039/d2cp02375j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Given the importance of accurate polarizability calculations to many chemical applications, coupled with the need for efficiency when calculating the properties of sets of molecules or large oligomers, we present a benchmark study examining possible calculation methods for polarizable materials. We first investigate the accuracy of the additive model used in GFN2, a highly-efficient semi-empirical tight-binding method, and the D4 dispersion model, comparing its predicted additive polarizabilities to ωB97XD results for a subset of PubChemQC and a compiled benchmark set of molecules spanning polarizabilities from approximately 3 Å³ to 600 Å³, with some compounds in the range of approximately 1200-1400 Å³. Although we find additive GFN2 polarizabilities, and thus D4, to have large errors with polarizability calculations on large conjugated oligomers, it would appear an empirical quadratic correction can largely remedy this. We also compare the accuracy of DFT polarizability calculations run using basis sets of varying size and level of augmentation, determining that a non-augmented basis set may be used for large, highly polarizable species in conjunction with a linear correction factor to achieve accuracy extremely close to that of aug-cc-pVTZ.
Affiliation(s)
- Danielle C Hiener
- Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
- Dakota L Folmsbee
- Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
- Luke A Langkamp
- Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
- Geoffrey R Hutchison
- Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
- Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA.
5
Greenstein BL, Hutchison GR. Organic Photovoltaic Efficiency Predictor: Data-Driven Models for Non-Fullerene Acceptor Organic Solar Cells. J Phys Chem Lett 2022; 13:4235-4243. [PMID: 35522056 DOI: 10.1021/acs.jpclett.2c00866] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
In the design of organic solar cells, there has been a need for materials with high power conversion efficiencies. Scharber's model is commonly used to predict efficiency; however, it exhibits poor performance with new non-fullerene acceptor (NFA) devices, since it was designed for fullerene-based devices. In this work, an empirical model is proposed that can be a more accurate alternative for NFA organic solar cells. Additionally, many screening studies use computationally expensive methods. A model based on using semiempirical simplified time-dependent density functional theory (sTD-DFT) as an alternative method can accelerate the calculations and yield a similar accuracy. The models presented in this paper, termed organic photovoltaic efficiency predictor (OPEP) models, have shown significantly lower errors than previous models, with OPEP/B3LYP yielding errors of 1.53% and OPEP/sTD-DFT of 1.55%. The proposed computational models can be used for the fast and accurate screening of new high-efficiency NFAs/donor pairs.
Affiliation(s)
- Brianna L Greenstein
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
- Geoffrey R Hutchison
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
- Department of Chemical and Petroleum Engineering, University of Pittsburgh, 3700 O'Hara Street, Pittsburgh, Pennsylvania 15261, United States
6
Abarbanel OD, Rozon J, Hutchison GR. Strategies for Computer-Aided Discovery of Novel Open-Shell Polymers. J Phys Chem Lett 2022; 13:2158-2164. [PMID: 35226497 DOI: 10.1021/acs.jpclett.2c00509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Organic π-conjugated polymers with a triplet ground state have been the focus of recent research for their interesting and unique electronic properties, arising from the presence of the two unpaired electrons. These compounds are usually built from alternating electron-donating and electron-accepting monomer pairs which lower the HOMO-LUMO gap and yield a triplet state instead of the typical singlet ground state. In this paper, we use density functional theory calculations to explore the design rules that govern the creation of a ground-state triplet conjugated polymer and find that a small HOMO-LUMO gap in the singlet state is the best predictor for the existence of a triplet ground state, compared to previous use of a pro-quinoidal bonding character. This work can accelerate the discovery of new stable triplet materials by reducing the computational resources needed for electronic-state calculations and the number of potential candidates for synthesis.
Affiliation(s)
- Omri D Abarbanel
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
- Julisa Rozon
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
- Geoffrey R Hutchison
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
- Department of Chemical and Petroleum Engineering, University of Pittsburgh, 3700 O'Hara Street, Pittsburgh, Pennsylvania 15261, United States