1
Tuca E, DiLabio G, Otero-de-la-Roza A. Minimal Basis Set Hartree-Fock Corrected with Atom-Centered Potentials for Molecular Crystal Modeling and Crystal Structure Prediction. J Chem Inf Model 2022; 62:4107-4121. [PMID: 35980964 DOI: 10.1021/acs.jcim.2c00656] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
Crystal structure prediction (CSP), determining the experimentally observable structure of a molecular crystal from the molecular diagram, is an important challenge with technologically relevant applications in materials manufacturing and drug design. For the purpose of screening the randomly generated candidate crystal structures, CSP protocols require energy ranking methods that are fast and can accurately capture the small energy differences between molecular crystals. In addition, a good ranking method should also produce accurate equilibrium geometries, both intramolecular and intermolecular. In this article, we explore the combination of minimal-basis-set Hartree-Fock (HF) with atom-centered potentials (ACPs) as a method for modeling the structure and energetics of molecular crystals. The ACPs are developed for the H, C, N, and O atoms and fitted to a set of reference data at the B86bPBE-XDM level in order to mitigate basis-set incompleteness and missing correlation. In particular, ACPs are developed in combination with two methods: HF-D3/MINIs and HF-3c. The application of ACPs greatly improves the performance of HF-D3/MINIs for lattice energies, crystal energy differences, energy-volume and energy-strain relations, and crystal geometries. In the case of HF-3c, the improvement in the crystal energy differences is much smaller than in HF-D3/MINIs, but lattice energies and particularly crystal geometries are considerably better when ACPs are used. The resulting methods may be useful for CSP but also for quick calculation of molecular crystal lattice energies and geometries.
Affiliation(s)
- Emilian Tuca
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna V1V 1V7, British Columbia, Canada
- Gino DiLabio
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna V1V 1V7, British Columbia, Canada
- Alberto Otero-de-la-Roza
- Departamento de Química Física y Analítica and MALTA-Consolider Team, Facultad de Química, Universidad de Oviedo, 33006 Oviedo, Spain
2
Prasad VK, Otero-de-la-Roza A, DiLabio GA. Small-Basis Set Density-Functional Theory Methods Corrected with Atom-Centered Potentials. J Chem Theory Comput 2022; 18:2913-2930. [PMID: 35412817 DOI: 10.1021/acs.jctc.2c00036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Density functional theory (DFT) is currently the most popular method for modeling noncovalent interactions and thermochemistry. The accurate calculation of noncovalent interaction energies, reaction energies, and barrier heights requires choosing an appropriate functional and, typically, a relatively large basis set. Deficiencies of the density-functional approximation and the use of a limited basis set are the leading sources of error in the calculation of noncovalent and thermochemical properties in molecular systems. In this article, we present three new DFT methods based on the BLYP, M06-2X, and CAM-B3LYP functionals in combination with the 6-31G* basis set and corrected with atom-centered potentials (ACPs). ACPs are one-electron potentials that have the same form as effective-core potentials, except they do not replace any electrons. The ACPs developed in this work are used to generate energy corrections to the underlying DFT/basis-set method such that the errors in predicted chemical properties are minimized while maintaining the low computational cost of the parent methods. ACPs were developed for the elements H, B, C, N, O, F, Si, P, S, and Cl. The ACP parameters were determined using an extensive training set of 118655 data points, mostly of complete basis set coupled-cluster level quality. The target molecular properties for the ACP-corrected methods include noncovalent interaction energies, molecular conformational energies, reaction energies, barrier heights, and bond separation energies. The ACPs were tested first on the training set and then on a validation set of 42567 additional data points. We show that the ACP-corrected methods can predict the target molecular properties with accuracy close to complete basis set wavefunction theory methods, but at a computational cost of double-ζ DFT methods. 
This makes the new BLYP/6-31G*-ACP, M06-2X/6-31G*-ACP, and CAM-B3LYP/6-31G*-ACP methods uniquely suited to the calculation of noncovalent, thermochemical, and kinetic properties in large molecular systems.
Affiliation(s)
- Viki Kumar Prasad
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
- Alberto Otero-de-la-Roza
- Departamento de Química Física y Analítica, Facultad de Química, Universidad de Oviedo, MALTA Consolider Team, Oviedo E-33006, Spain
- Gino A DiLabio
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
3
Prasad VK, Otero-de-la-Roza A, DiLabio GA. Fast and Accurate Quantum Mechanical Modeling of Large Molecular Systems Using Small Basis Set Hartree-Fock Methods Corrected with Atom-Centered Potentials. J Chem Theory Comput 2022; 18:2208-2232. [PMID: 35313106 DOI: 10.1021/acs.jctc.1c01128] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/19/2022]
Abstract
There has been significant interest in developing fast and accurate quantum mechanical methods for modeling large molecular systems. In this work, by utilizing a machine learning regression technique, we have developed new low-cost quantum mechanical approaches to model large molecular systems. The developed approaches rely on using one-electron Gaussian-type functions called atom-centered potentials (ACPs) to correct for the basis set incompleteness and the lack of correlation effects in the underlying minimal or small basis set Hartree-Fock (HF) methods. In particular, ACPs are proposed for ten elements common in organic and bioorganic chemistry (H, B, C, N, O, F, Si, P, S, and Cl) and four different base methods: two minimal basis sets (MINIs and MINIX) plus a double-ζ basis set (6-31G*) in combination with dispersion-corrected HF (HF-D3/MINIs, HF-D3/MINIX, HF-D3/6-31G*) and the HF-3c method. The new ACPs are trained on a very large set (73 832 data points) of noncovalent properties (interaction and conformational energies) and validated additionally on a set of 32 048 data points. All reference data are of complete basis set coupled-cluster quality, mostly CCSD(T)/CBS. The proposed ACP-corrected methods are shown to give errors in the tenths of a kcal/mol range for noncovalent interaction energies and up to 2 kcal/mol for molecular conformational energies. More importantly, the average errors are similar in the training and validation sets, confirming the robustness and applicability of these methods outside the boundaries of the training set. In addition, the performance of the new ACP-corrected methods is similar to complete basis set density functional theory (DFT) but at a cost that is orders of magnitude lower, and the proposed ACPs can be used in any computational chemistry program that supports effective-core potentials without modification. 
It is also shown that ACPs improve the description of covalent and noncovalent bond geometries of the underlying methods and that the improvement brought about by the application of the ACPs is directly related to the number of atoms to which they are applied, allowing the treatment of systems containing some atoms for which ACPs are not available. Overall, the ACP-corrected methods proposed in this work constitute an alternative accurate, economical, and reliable quantum mechanical approach to describe the geometries, interaction energies, and conformational energies of systems with hundreds to thousands of atoms.
Affiliation(s)
- Viki Kumar Prasad
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
- Alberto Otero-de-la-Roza
- MALTA Consolider Team, Departamento de Química Física y Analítica, Facultad de Química, Universidad de Oviedo, E-33006 Oviedo, Spain
- Gino A DiLabio
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
4
Prasad VK, Pei Z, Edelmann S, Otero-de-la-Roza A, DiLabio GA. BH9, a New Comprehensive Benchmark Data Set for Barrier Heights and Reaction Energies: Assessment of Density Functional Approximations and Basis Set Incompleteness Potentials. J Chem Theory Comput 2021; 18:151-166. [PMID: 34911294 DOI: 10.1021/acs.jctc.1c00694] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 01/13/2023]
Abstract
The calculation of accurate reaction energies and barrier heights is essential in computational studies of reaction mechanisms and thermochemistry. To assess methods regarding their ability to predict these two properties, high-quality benchmark sets are required that comprise a reasonably large and diverse set of organic reactions. Due to the time-consuming nature of both locating transition states and computing accurate reference energies for reactions involving large molecules, previous benchmark sets have been limited in scope, the number of reactions considered, and the size of the reactant and product molecules. Recent advances in coupled-cluster theory, in particular local correlation methods like DLPNO-CCSD(T), now allow the calculation of reaction energies and barrier heights for relatively large systems. In this work, we present a comprehensive and diverse benchmark set of barrier heights and reaction energies based on DLPNO-CCSD(T)/CBS called BH9. BH9 comprises 449 chemical reactions belonging to nine types common in organic chemistry and biochemistry. We examine the accuracy of DLPNO-CCSD(T) vis-a-vis canonical CCSD(T) for a subset of BH9 and conclude that, although there is a penalty in using the DLPNO approximation, the reference data are accurate enough to serve as a benchmark for density functional theory (DFT) methods. We then present two applications of the BH9 set. First, we examine the performance of several density functional approximations commonly used in thermochemical and mechanistic studies. Second, we assess our basis set incompleteness potentials regarding their ability to mitigate basis set incompleteness errors. The number of data points, the diversity of the reactions considered, and the relatively large size of the reactant molecules make BH9 the most comprehensive thermochemical benchmark set to date and a useful tool for the development and assessment of computational methods.
Affiliation(s)
- Viki Kumar Prasad
- Department of Chemistry, University of British Columbia, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
- Zhipeng Pei
- Department of Chemistry, University of British Columbia, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
- Simon Edelmann
- Department of Chemistry, University of British Columbia, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
- Alberto Otero-de-la-Roza
- Departamento de Química Física y Analítica and MALTA Consolider Team, Facultad de Química, Universidad de Oviedo, 33006 Oviedo, Spain
- Gino A DiLabio
- Department of Chemistry, University of British Columbia, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
5
Prasad VK, Khalilian MH, Otero-de-la-Roza A, DiLabio GA. BSE49, a diverse, high-quality benchmark dataset of separation energies of chemical bonds. Sci Data 2021; 8:300. [PMID: 34815431 PMCID: PMC8611007 DOI: 10.1038/s41597-021-01088-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/06/2021] [Accepted: 11/01/2021] [Indexed: 01/23/2023] Open
Abstract
We present an extensive and diverse dataset of bond separation energies associated with the homolytic cleavage of covalently bonded molecules (A-B) into their corresponding radical fragments (A· and B·). Our dataset contains two different classifications of model structures referred to as "Existing" (molecules with associated experimental data) and "Hypothetical" (molecules with no associated experimental data). In total, the dataset consists of 4502 datapoints (1969 datapoints from the Existing and 2533 datapoints from the Hypothetical classes). The dataset covers 49 unique X-Y type single bonds (except H-H, H-F, and H-Cl), where X and Y are H, B, C, N, O, F, Si, P, S, and Cl atoms. All reference data were calculated at the (RO)CBS-QB3 level of theory. The reference bond separation energies are non-relativistic ground-state energy differences and contain no zero-point energy corrections. This new dataset of bond separation energies (BSE49) is presented as a high-quality reference dataset for assessing and developing computational chemistry methods.
Affiliation(s)
- Viki Kumar Prasad
- Department of Chemistry, University of British Columbia, Kelowna, British Columbia, V1V 1V7, Canada
- M Hossein Khalilian
- Department of Chemistry, University of British Columbia, Kelowna, British Columbia, V1V 1V7, Canada
- Alberto Otero-de-la-Roza
- Departamento de Química Física y Analítica, Facultad de Química, Universidad de Oviedo, MALTA Consolider Team, E-33006, Oviedo, Spain
- Gino A DiLabio
- Department of Chemistry, University of British Columbia, Kelowna, British Columbia, V1V 1V7, Canada
6
Romero-Montalvo E, DiLabio GA. Computational Study of Hydrogen Bond Interactions in Water Cluster–Organic Molecule Complexes. J Phys Chem A 2021; 125:3369-3377. [DOI: 10.1021/acs.jpca.1c01377] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/14/2022]
Affiliation(s)
- Eduardo Romero-Montalvo
- Department of Chemistry, University of British Columbia, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
- Gino A. DiLabio
- Department of Chemistry, University of British Columbia, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
- Faculty of Management, University of British Columbia, 1137 Alumni Avenue, Kelowna, British Columbia, Canada V1V 1V7
7
Mehta N, Goerigk L. Assessing the Applicability of the Geometric Counterpoise Correction in B2PLYP/Double-ζ Calculations for Thermochemistry, Kinetics, and Noncovalent Interactions. Aust J Chem 2021. [DOI: 10.1071/ch21133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
Abstract
We present a proof-of-concept study of the suitability of Kruse and Grimme’s geometric counterpoise correction (gCP) for basis set superposition errors (BSSEs) in double-hybrid density functional calculations with a double-ζ basis set. The gCP approach only requires geometrical information as an input and no orbital/density information is needed. Therefore, this correction is practically free of any additional cost. gCP is trained against the Boys and Bernardi counterpoise correction across a set of 528 noncovalently bound dimers. We investigate the suitability of the approach for the B2PLYP/def2-SVP level of theory, and reveal error compensation effects—missing London dispersion and the BSSE—associated with B2PLYP/def2-SVP calculations, and present B2PLYP-gCP-D3(BJ)/def2-SVP with the reparametrised DFT-D3(BJ) and gCP corrections as a more balanced alternative. Benchmarking results on the S66x8 benchmark set for noncovalent interactions and the GMTKN55 database for main-group thermochemistry, kinetics, and noncovalent interactions show a statistical improvement of the B2PLYP-gCP-D3(BJ) scheme over plain B2PLYP and B2PLYP-D3(BJ). B2PLYP-D3(BJ) shows significant overestimation of interaction energies, barrier heights with larger deviations from the reference values, and wrong relative stabilities in conformers, all of which can be associated with BSSE. We find that the gCP-corrected method represents a significant improvement over B2PLYP-D3(BJ), particularly for intramolecular noncovalent interactions. These findings encourage future developments of efficient double-hybrid DFT strategies that can be applied when double-hybrid calculations with large basis sets are not feasible due to system size.
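For orientation, the Boys and Bernardi counterpoise correction that gCP is trained against can be written as follows. This is the standard textbook form, not taken from the abstract; the notation is an assumption made here: subscripts name the system, superscripts name the basis set, and all energies are evaluated at the dimer geometry.

```latex
% Uncorrected interaction energy: each monomer in its own basis
\Delta E_{\mathrm{int}} = E_{AB}^{AB} - E_{A}^{A} - E_{B}^{B}

% BSSE estimate: energy lowering of each monomer in the full dimer basis
\Delta E_{\mathrm{BSSE}} = \left( E_{A}^{A} - E_{A}^{AB} \right)
                         + \left( E_{B}^{B} - E_{B}^{AB} \right)

% Counterpoise-corrected interaction energy
\Delta E_{\mathrm{int}}^{\mathrm{CP}}
  = \Delta E_{\mathrm{int}} + \Delta E_{\mathrm{BSSE}}
  = E_{AB}^{AB} - E_{A}^{AB} - E_{B}^{AB}
```

Evaluating the BSSE term this way requires extra monomer calculations in the dimer basis. A geometry-only scheme such as gCP instead estimates that term from atomic positions, which is why the abstract describes it as practically free.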
8
Mancini G, Fusè M, Lazzari F, Chandramouli B, Barone V. Unsupervised search of low-lying conformers with spectroscopic accuracy: A two-step algorithm rooted into the island model evolutionary algorithm. J Chem Phys 2020; 153:124110. [DOI: 10.1063/5.0018314] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/14/2022] Open
Affiliation(s)
- Giordano Mancini
- Scuola Normale Superiore, Piazza dei Cavalieri 7, 56125 Pisa, Italy
- Marco Fusè
- Scuola Normale Superiore, Piazza dei Cavalieri 7, 56125 Pisa, Italy
- Federico Lazzari
- Scuola Normale Superiore, Piazza dei Cavalieri 7, 56125 Pisa, Italy
- Vincenzo Barone
- Scuola Normale Superiore, Piazza dei Cavalieri 7, 56125 Pisa, Italy
9
Otero-de-la-Roza A, DiLabio GA. Improved Basis-Set Incompleteness Potentials for Accurate Density-Functional Theory Calculations in Large Systems. J Chem Theory Comput 2020; 16:4176-4191. [PMID: 32470304 DOI: 10.1021/acs.jctc.0c00102] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
Abstract
The accurate calculation of chemical properties using density-functional theory (DFT) requires the use of a nearly complete basis set. In chemical systems involving hundreds to thousands of atoms, the cost of the calculations places practical limitations on the number of basis functions that can be used. Therefore, in most practical applications of DFT to large systems, there exists a basis-set incompleteness error (BSIE). In this article, we present the next iteration of the basis-set incompleteness potentials (BSIPs), one-electron potentials designed to correct for basis-set incompleteness error. The ultimate goal associated with the development of BSIPs is to allow the calculation of molecular properties using DFT with near-complete-basis-set results at a computational cost that is similar to a small basis set calculation. In this work, we develop BSIPs for 10 atoms in the first and second rows (H, B-F, Si-Cl) and 15 common basis sets of the Pople, Dunning, Karlsruhe, and Huzinaga types. Our new BSIPs are constructed to minimize BSIE in the calculation of reaction energies, barrier heights, noncovalent binding energies, and intermolecular distances. The BSIPs were obtained using a training set of 15 944 data points. The fitting approach employed a regularized linear least-squares method with variable selection (the LASSO method), which results in a much better fit to the training data than our previous BSIPs while, at the same time, reducing the computational cost of BSIP development. The proposed BSIPs are tested on various benchmark sets and demonstrate excellent performance in practice. Our new BSIPs are also transferable; i.e., they can be used to correct BSIE in calculations that employ density functionals other than the one used in the BSIP development (B3LYP). Finally, BSIPs can be used in any quantum chemistry program that has implemented effective-core potentials without changes to the software.
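The "regularized linear least-squares method with variable selection" named in this abstract can be illustrated with a minimal, generic LASSO sketch. It is not the authors' fitting code: the coordinate-descent solver, the toy design matrix `X`, the target `y`, and the penalty `lam` are all assumptions chosen so that the variable-selection effect (coefficients driven exactly to zero) is easy to see.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=50):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n)
    return w


# Hypothetical toy data: three mutually orthogonal candidate terms (columns).
# y depends strongly on column 0, not at all on column 1, weakly on column 2.
X = [[1, 1, 1],
     [1, -1, 1],
     [1, 1, -1],
     [1, -1, -1]]
y = [2.05, 2.05, 1.95, 1.95]

w = lasso_coordinate_descent(X, y, lam=0.1)
print(w)  # columns 1 and 2 are dropped; column 0 is kept but shrunk
```

The L1 penalty is what performs the selection: any term whose correlation with the residual stays below `lam` gets a coefficient of exactly zero, so the fitted model uses only a subset of the candidate terms.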
Affiliation(s)
- A Otero-de-la-Roza
- Departamento de Química Física y Analítica and MALTA-Consolider Team, Facultad de Química, Universidad de Oviedo, 33006 Oviedo, Spain
- Gino A DiLabio
- Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
- Faculty of Management, University of British Columbia, Okanagan, 1137 Alumni Avenue, Kelowna, British Columbia V1V 1V7, Canada
10
Sethio D, Martins JBL, Lawson Daku LM, Hagemann H, Kraka E. Modified Density Functional Dispersion Correction for Inorganic Layered MFX Compounds (M = Ca, Sr, Ba, Pb and X = Cl, Br, I). J Phys Chem A 2020; 124:1619-1633. [PMID: 31999454 DOI: 10.1021/acs.jpca.9b10357] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
MFX (M = Ca, Ba, Sr, Pb and X = Cl, Br, I) compounds have received considerable attention due to their technological application as X-ray detectors, pressure sensors, and optical data storage materials, when doped with rare-earth ions. MFX compounds belong to the class of layered materials with a tetragonal matlockite crystal structure, characterized by weakly stacked double-halide layers along the crystallographic c-axis. These layers predominantly determine phase transitions and elastic and mechanical properties. However, the correct description of the lattice parameter c is a challenge for most standard DFT functionals, which tend to overestimate it. Because of the weak interactions between the halide layers, dispersion-corrected functionals seem to be a better choice. We investigated 11 different inorganic layered MFX compounds for which experimental data are available, with standard and dispersion-corrected functionals, to assess their performance in reproducing the lattice parameter c and the structural and vibrational properties of the MFX compounds. Our results revealed that these functionals do not describe the weak interactions between the halide layers in a balanced way. Therefore, we modified Grimme's popular DFT-D2 dispersion correction scheme in two different ways by (i) replacing the dispersion coefficients and van der Waals radii with those of noble gas atoms or (ii) increasing the van der Waals radii of the MFX atoms up to 40%. Comparison with the available experimental data revealed that the latter approach applied to the PBE (Perdew-Burke-Ernzerhof)-D2 functional with 30% increased van der Waals radii, which we coined PBE-D2* (Srvdw 1.30), is best suited to fine-tune the description of the weak interlayer interactions in MFX compounds, thus significantly improving the description of their structural, vibrational, and mechanical properties. 
Work is in progress applying this new, computationally inexpensive scheme to other inorganic layered compounds and periodic systems with weakly stacked layers.
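To make the effect of enlarging the van der Waals radii concrete, here is a minimal sketch of a single DFT-D2 pair term with an adjustable radius scale factor. The functional form (geometric-mean C6, Fermi-type damping with steepness d = 20) follows the usual statement of Grimme's D2 scheme. The `radius_scale` argument, the C6 and R0 values, and the distance are hypothetical illustration values in arbitrary units, not parameters from the paper, and `s6 = 0.75` is assumed here as the value commonly quoted for PBE.

```python
import math


def d2_damping(r, r_r, d=20.0):
    """Fermi-type damping used in DFT-D2; equals 0.5 when r == r_r."""
    return 1.0 / (1.0 + math.exp(-d * (r / r_r - 1.0)))


def d2_pair_energy(r, c6_i, c6_j, r0_i, r0_j, s6=0.75, radius_scale=1.0):
    """Dispersion energy of one atom pair; radius_scale enlarges the vdW radii."""
    c6_ij = math.sqrt(c6_i * c6_j)         # geometric-mean combination rule
    r_r = radius_scale * (r0_i + r0_j)     # (scaled) sum of vdW radii
    return -s6 * c6_ij / r**6 * d2_damping(r, r_r)


# Hypothetical pair: C6 = 10.0 and R0 = 1.5 for both atoms, separation 3.5.
e_standard = d2_pair_energy(3.5, 10.0, 10.0, 1.5, 1.5)
e_scaled = d2_pair_energy(3.5, 10.0, 10.0, 1.5, 1.5, radius_scale=1.30)
print(e_standard, e_scaled)
```

With larger radii the damping function switches on at longer distances, so at a fixed separation the pair term becomes less attractive. This is the kind of adjustment the abstract describes for tuning weak interlayer interactions.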
Affiliation(s)
- Daniel Sethio
- Computational and Theoretical Chemistry Group (CATCO), Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, United States
- João B L Martins
- Institute of Chemistry, University of Brasilia, Brasilia, DF 70910-900, Brazil
- Latévi Max Lawson Daku
- Department of Physical Chemistry, University of Geneva, 30 Quai Ernest-Ansermet, CH-1211 Geneva 4, Switzerland
- Hans Hagemann
- Department of Physical Chemistry, University of Geneva, 30 Quai Ernest-Ansermet, CH-1211 Geneva 4, Switzerland
- Elfi Kraka
- Computational and Theoretical Chemistry Group (CATCO), Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, United States
11
Kim T, Ri C, Yun H, An R, Han G, Chae S, Kim G, Jong G, Jon Y. A Novel Method for Calculation of Molecular Energies and Charge Distributions by Thermodynamic Formalization. Sci Rep 2019; 9:20264. [PMID: 31889056 PMCID: PMC6937252 DOI: 10.1038/s41598-019-56312-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/15/2019] [Accepted: 12/03/2019] [Indexed: 11/29/2022] Open
Abstract
The paper describes a new approach to thermodynamic formalization for calculating the molecular energy and charge distribution in the ground state by means of the variational equation of DFT. To formalize the molecular calculation thermodynamically, the pseudo chemical potential (PCP) is introduced: a molecule is divided into multi-phase (atom), one-component (electron) systems, and the energy of the system is represented as a PCP. A procedure for calculating the molecular energy and atomic charges by PCP is put forward; the approach is then shown to be valid, and its efficiency (accuracy and calculation speed) is verified.
Affiliation(s)
- TongIl Kim
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
- ChungIl Ri
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
- HakSung Yun
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
- RyongNam An
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
- GwangBok Han
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
- SungIl Chae
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
- GyongNam Kim
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
- GwangChol Jong
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
- Yung Jon
- Institute of Chemistry and Biology, University of Science, Pyongyang, 950003, Democratic People's Republic of Korea
12
Li W, Miao W, Cui J, Fang C, Su S, Li H, Hu L, Lu Y, Chen G. Efficient Corrections for DFT Noncovalent Interactions Based on Ensemble Learning Models. J Chem Inf Model 2019; 59:1849-1857. [PMID: 30912940 DOI: 10.1021/acs.jcim.8b00878] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 01/14/2023]
Abstract
Machine learning has exhibited powerful capabilities in many areas. However, machine learning models are mostly database dependent, requiring a new model if the database changes. Therefore, a universal model is highly desired to accommodate the widest variety of databases. Fortunately, this universality may be achieved by ensemble learning, which can integrate multiple learners to meet the demands of diversified databases. Therefore, we propose a general procedure for learning ensemble establishment based on noncovalent interactions (NCIs) databases. Additionally, accurate NCI computation is quite demanding for first-principles methods, for which a competent machine learning model can be an efficient solution to obtain high NCI accuracy with minimal computational resources. In regard to these aspects, multiple schemes of ensemble learning models (Bagging, Boosting, and Stacking frameworks), are explored in this study. The models are based on various low levels of density functional theory (DFT) calculations for the benchmark databases S66, S22, and X40. All NCIs computed by the DFT calculations can be improved to high-level accuracy (root-mean-square error RMSE = 0.22 kcal/mol in contrast to CCSD(T)/CBS benchmark) by established ensemble learning models. Compared with single machine learning models, ensemble models show better accuracy (RMSE of the best model is further lowered by ∼25%), robustness and goodness-of-fit according to evaluation parameters suggested by the OECD. Among ensemble learning models, heterogeneous Stacking ensemble models show the most valuable application potential. The standardized procedure of constructing learning ensembles has been well utilized on several NCI data sets, and this procedure may also be applicable for other chemical databases.
Affiliation(s)
- Wenze Li
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Wei Miao
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Jingxia Cui
- Institute of Functional Material Chemistry, Faculty of Chemistry, Northeast Normal University, Changchun, 130024, China
- Chao Fang
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Shunting Su
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Hongzhi Li
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- LiHong Hu
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Yinghua Lu
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Institute of Functional Material Chemistry, Faculty of Chemistry, Northeast Normal University, Changchun, 130024, China
- GuanHua Chen
- Department of Chemistry, The University of Hong Kong, Hong Kong, China
13
García JS, Brémond É, Campetella M, Ciofini I, Adamo C. Small Basis Set Allowing the Recovery of Dispersion Interactions with Double-Hybrid Functionals. J Chem Theory Comput 2019; 15:2944-2953. [DOI: 10.1021/acs.jctc.8b01203] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Affiliation(s)
- Juan Sanz García
- Chimie ParisTech, PSL Research University, CNRS, Institute of Chemistry for Life and Health Sciences, 11, rue Pierre et Marie Curie, F-75005 Paris, France
- Éric Brémond
- Univ Paris Diderot, Sorbonne Paris Cité, ITODYS, UMR CNRS 7086, 15 rue J.-A. de Baïf, F-75013 Paris, France
- Marco Campetella
- Chimie ParisTech, PSL Research University, CNRS, Institute of Chemistry for Life and Health Sciences, 11, rue Pierre et Marie Curie, F-75005 Paris, France
- Ilaria Ciofini
- Chimie ParisTech, PSL Research University, CNRS, Institute of Chemistry for Life and Health Sciences, 11, rue Pierre et Marie Curie, F-75005 Paris, France
- Carlo Adamo
- Chimie ParisTech, PSL Research University, CNRS, Institute of Chemistry for Life and Health Sciences, 11, rue Pierre et Marie Curie, F-75005 Paris, France
- Institut Universitaire de France, 103 Boulevard Saint Michel, F-75005 Paris, France
14
PEPCONF, a diverse data set of peptide conformational energies. Sci Data 2019; 6:180310. [PMID: 30667382 PMCID: PMC6343515 DOI: 10.1038/sdata.2018.310] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 10/15/2018] [Accepted: 11/30/2018] [Indexed: 12/13/2022] Open
Abstract
We present an extensive and diverse database of peptide conformational energies. Our database contains five different classes of model geometries: dipeptides, tripeptides, and disulfide-bridged, bioactive, and cyclic peptides. In total, the database consists of 3775 conformational energy data points and 4530 conformer geometries. All the reference energies have been calculated at the LC-ωPBE-XDM/aug-cc-pVTZ level of theory, which is shown to yield conformational energies with an accuracy in the order of tenths of a kcal/mol when compared to complete-basis-set coupled-cluster reference data. The peptide conformational data set (PEPCONF) is presented as a high-quality reference set for the development and benchmarking of molecular-mechanics and semi-empirical electronic structure methods, which are the most commonly used techniques in the modeling of medium to large proteins.
15
Chandramouli B, Del Galdo S, Fusè M, Barone V, Mancini G. Two-level stochastic search of low-energy conformers for molecular spectroscopy: implementation and validation of MM and QM models. Phys Chem Chem Phys 2019; 21:19921-19934. [DOI: 10.1039/c9cp03557e] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/20/2022]
Abstract
The search for stationary points in the molecular potential energy surfaces (PES) is a problem of increasing relevance in molecular sciences especially for large, flexible systems featuring several large-amplitude internal motions.
Affiliation(s)
- Vincenzo Barone
  - Scuola Normale Superiore, 56126 Pisa, Italy
  - Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Pisa
- Giordano Mancini
  - Scuola Normale Superiore, 56126 Pisa, Italy
  - Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Pisa
16
Goerigk L, Mehta N. A Trip to the Density Functional Theory Zoo: Warnings and Recommendations for the User. Aust J Chem 2019. [DOI: 10.1071/ch19023] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Indexed: 12/14/2022]
Abstract
This account is written for general users of density functional theory (DFT) methods as well as experimental researchers who are new to the field and would like to conduct such calculations. Its main emphasis lies on how to find a way through the confusing ‘zoo’ of DFT by addressing common misconceptions and highlighting those modern methods that should ideally be used in calculations of energetic properties and geometries. A particular focus is on highly popular methods and the important fact that popularity does not imply accuracy. In this context, we present a new analysis of the openly available data published in Swart and co-workers’ famous annual ‘DFT poll’ (http://www.marcelswart.eu/dft-poll/) to demonstrate the existing communication gap between the DFT user and developer communities. We show that despite considerable methodological advances in the field, the perception of some parts of the user community regarding their favourite approaches has changed little. It is hoped that this account makes a contribution towards changing this status and that users are inspired to adjust their current computational protocols to accommodate strategies that are based on proven robustness, accuracy, and efficiency rather than popularity.
17
Tahchieva DN, Bakowies D, Ramakrishnan R, von Lilienfeld OA. Torsional Potentials of Glyoxal, Oxalyl Halides, and Their Thiocarbonyl Derivatives: Challenges for Popular Density Functional Approximations. J Chem Theory Comput 2018; 14:4806-4817. [PMID: 30011363 DOI: 10.1021/acs.jctc.8b00174] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
The reliability of popular density functionals was studied for the description of torsional profiles of 36 molecules: glyoxal, oxalyl halides, and their thiocarbonyl derivatives. HF and 18 functionals of varying complexity, from local density to range-separated hybrid approximations and double-hybrid, have been considered and benchmarked against CCSD(T)-level rotational profiles. For molecules containing heavy halogens, most functionals fail to reproduce barrier heights accurately and a number of functionals introduce spurious minima. Dispersion corrections show no improvement. Calibrated torsion-corrected atom-centered potentials rectify the shortcomings of PBE and also improve on σ-hole based intermolecular binding in dimers and crystals.
Affiliation(s)
- Diana N Tahchieva
  - Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials (MARVEL), Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Dirk Bakowies
  - Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials (MARVEL), Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Raghunathan Ramakrishnan
  - Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials (MARVEL), Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- O Anatole von Lilienfeld
  - Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials (MARVEL), Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland