1
Zhu Y, Peng J, Xu C, Lan Z. Unsupervised Machine Learning in the Analysis of Nonadiabatic Molecular Dynamics Simulation. J Phys Chem Lett 2024:9601-9619. [PMID: 39270134 DOI: 10.1021/acs.jpclett.4c01751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2024]
Abstract
All-atom, full-dimensional simulations of nonadiabatic molecular dynamics (NAMD) in large realistic systems have received considerable research interest in recent years. However, such NAMD simulations normally generate an enormous amount of time-dependent high-dimensional data, which makes analyzing the results a significant challenge. Considerable effort has been devoted to developing novel, easy-to-use analysis tools based on unsupervised machine learning (ML) methods for identifying photoinduced reaction channels and understanding complicated molecular motions in NAMD simulations. Here, we survey recent advances in this field, focusing on how to use unsupervised ML methods to analyze trajectory-based NAMD simulation results. Our purpose is to offer a comprehensive discussion of several essential components of this analysis protocol, including the selection of ML methods, the construction of molecular descriptors, the establishment of analytical frameworks, their advantages and limitations, and persistent challenges.
Affiliation(s)
- Yifei Zhu
- MOE Key Laboratory of Environmental Theoretical Chemistry, SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, School of Environment, South China Normal University, Guangzhou 510006, P. R. China
- Jiawei Peng
- MOE Key Laboratory of Environmental Theoretical Chemistry, SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, School of Environment, South China Normal University, Guangzhou 510006, P. R. China
- Chao Xu
- MOE Key Laboratory of Environmental Theoretical Chemistry, SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, School of Environment, South China Normal University, Guangzhou 510006, P. R. China
- Zhenggang Lan
- MOE Key Laboratory of Environmental Theoretical Chemistry, SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, School of Environment, South China Normal University, Guangzhou 510006, P. R. China
2
Mo W, Ni S, Zhou M, Wen J, Qi D, Huang J, Yang Y, Xu Y, Wang X, Zhao Z. An electron density clustering based adaptive segmentation method for protein Raman spectrum calculation. Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy 2024; 314:124155. [PMID: 38552542 DOI: 10.1016/j.saa.2024.124155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 03/10/2024] [Accepted: 03/12/2024] [Indexed: 04/20/2024]
Abstract
Raman spectroscopy is a powerful technique for protein detection, but calculating Raman spectra is a long-standing challenge because of the large sizes and complex structures of protein molecules. Dividing proteins into fragments can greatly accelerate the calculation, but this usually introduces large errors into the resulting spectra because interactions between fragments are ignored. In this paper, we propose a new adaptive segmentation method, electron density clustering, which divides proteins according to the strength of interactions and molecular shapes and structures. It can reduce the errors of the resulting Raman spectra by about 20% compared to the uniform segmentation method without a significant increase in computational cost. This method can facilitate the validation and analysis of detected Raman spectra of proteins and promote the application of Raman spectroscopy in biological detection.
Affiliation(s)
- Wenbo Mo
- National Key Laboratory of Plasma Physics, Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China; Department of Engineering Physics, Tsinghua University, 100084 Beijing, China
- Shuang Ni
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Minjie Zhou
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Jiaxing Wen
- National Key Laboratory of Plasma Physics, Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Daojian Qi
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Jinglin Huang
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Yue Yang
- National Key Laboratory of Plasma Physics, Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Yang Xu
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Xuewu Wang
- Department of Engineering Physics, Tsinghua University, 100084 Beijing, China
- Zongqing Zhao
- National Key Laboratory of Plasma Physics, Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China.
3
Dral PO. AI in computational chemistry through the lens of a decade-long journey. Chem Commun (Camb) 2024; 60:3240-3258. [PMID: 38444290 DOI: 10.1039/d4cc00010b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
This article gives a perspective on the progress of AI tools in computational chemistry through the lens of the author's decade-long contributions put in the wider context of the trends in this rapidly expanding field. This progress over the last decade is tremendous: while a decade ago we had a glimpse of what was to come through many proof-of-concept studies, now we witness the emergence of many AI-based computational chemistry tools that are mature enough to make faster and more accurate simulations increasingly routine. Such simulations in turn allow us to validate and even revise experimental results, deepen our understanding of the physicochemical processes in nature, and design better materials, devices, and drugs. The rapid introduction of powerful AI tools gives rise to unique challenges and opportunities that are discussed in this article too.
Affiliation(s)
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China.
4
Pios SV, Gelin MF, Ullah A, Dral PO, Chen L. Artificial-Intelligence-Enhanced On-the-Fly Simulation of Nonlinear Time-Resolved Spectra. J Phys Chem Lett 2024; 15:2325-2331. [PMID: 38386692 DOI: 10.1021/acs.jpclett.4c00107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2024]
Abstract
Time-resolved spectroscopy is an important tool for unraveling the minute details of structural changes in molecules of biological and technological significance. The nonlinear femtosecond signals detected for such systems must be interpreted, but it is a challenging task for which theoretical simulations are often indispensable. Accurate simulations of transient absorption or two-dimensional electronic spectra are, however, computationally very expensive, prohibiting the wider adoption of existing first-principles methods. Here, we report an artificial-intelligence-enhanced protocol to drastically reduce the computational cost of simulating nonlinear time-resolved electronic spectra, which makes such simulations affordable for polyatomic molecules of increasing size. The protocol is based on the doorway-window approach for the on-the-fly surface-hopping simulations. We show its applicability for the prototypical molecule of pyrazine for which it produces spectra with high precision with respect to ab initio reference while cutting the computational cost by at least 95% compared to pure first-principles simulations.
Affiliation(s)
- Sebastian V Pios
- Zhejiang Laboratory, Hangzhou, Zhejiang 311100, People's Republic of China
- Maxim F Gelin
- School of Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, People's Republic of China
- Arif Ullah
- School of Physics and Optoelectronic Engineering, Anhui University, Hefei, Anhui 230601, People's Republic of China
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, People's Republic of China
- Lipeng Chen
- Zhejiang Laboratory, Hangzhou, Zhejiang 311100, People's Republic of China
5
Ye K, Wang S, Huang Y, Hu M, Zhou D, Luo Y, Ye S, Zhang G, Jiang J. Machine Learning Prediction of Molecular Binding Profiles on Metal-Porphyrin via Spectroscopic Descriptors. J Phys Chem Lett 2024; 15:1956-1961. [PMID: 38346267 DOI: 10.1021/acs.jpclett.3c03002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
The study of molecular adsorption is crucial for understanding various chemical processes. Spectroscopy offers a convenient and non-invasive way of probing structures of adsorbed states and can be used for real-time observation of molecular binding profiles, including both structural and energetic information. However, deciphering atomic structures from spectral information using the first-principles approach is computationally expensive and time-consuming because of the sophistication of recording spectra, chemical structures, and their relationship. Here, we demonstrate the feasibility of a data-driven machine learning approach for predicting binding energy and structural information directly from vibrational spectra of the adsorbate by using CO adsorption on iron porphyrin as an example. Our trained machine learning model is not only interpretable but also readily transferred to similar metal-nitrogen-carbon systems with comparable accuracy. This work shows the potential of using structure-encoded spectroscopic descriptors in machine learning models for the study of adsorbed states of molecules on transition metal complexes.
Affiliation(s)
- Ke Ye
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Song Wang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Yan Huang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Min Hu
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Donglai Zhou
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Yi Luo
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Hefei National Laboratory, University of Science and Technology of China, Hefei, Anhui 230088, P. R. China
- Sheng Ye
- School of Artificial Intelligence, Anhui University, Hefei, Anhui 230601, P. R. China
- Guozhen Zhang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Hefei National Laboratory, University of Science and Technology of China, Hefei, Anhui 230088, P. R. China
- Jun Jiang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
- Hefei National Laboratory, University of Science and Technology of China, Hefei, Anhui 230088, P. R. China
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
6
Tu C, Huang W, Liang S, Wang K, Tian Q, Yan W. High-throughput virtual screening of organic second-order nonlinear optical chromophores within the donor-π-bridge-acceptor framework. Phys Chem Chem Phys 2024; 26:2363-2375. [PMID: 38167888 DOI: 10.1039/d3cp04046a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
In view of the theoretical importance and huge application potential of second-order nonlinear optical (NLO) materials, it is of great significance to conduct high-throughput virtual screening (HTVS) on a compound library to find candidate NLO chromophores. Under the donor-π-bridge-acceptor structural framework, a virtual compound library (size = 27 090) was constructed by enumeration of structural fragments. The key property adopted for optimization is the static first hyperpolarizability (β0). By combining machine learning and quantum chemical calculations, we performed an HTVS procedure to sieve out NLO chromophores and examined the response mechanism of the selected optimal NLO chromophores. We found: (a) The multi-layer perceptron/extended connectivity fingerprint combination with a 20% selection ratio gives the highest prediction accuracy for the studied systems. (b) The two optimal donors are bis(4-diphenylaminophenyl)aminyl and bis(4-tert-butylphenyl)aminyl; the optimal π-bridges are composed of two thiophenyl, selenophenyl or furanyl units; and the two optimal acceptors are tri-s-triazinyl and 2,3-dicyanopyrazinyl. (c) The no. 1 candidate molecule exhibits a calculated β0 of 8.55 × 10⁴ a.u. (d) According to the two-level model, the difference in NLO responses of the optimal 16 molecules comes from the synergistic interaction of ES1, Δμ and f. In addition, the sizable Δμ and f allow the studied optimal molecules to obtain a large NLO response while keeping a not-too-low excitation energy (retaining good optical transparency in the restricted range of the visible spectrum region). (e) With further modification of the acceptor, the designed DPA-π-TRZ-A' (A' = CN or NO2, π = oligo-thiophenyl or selenophenyl) systems can exhibit a rather large NLO response (maximum β0 = 3.17 × 10⁵ a.u.), and hence should have considerable potential as second-order NLO chromophores. With the above observations, we expect to provide some insight for the research community into the HTVS of organic second-order NLO chromophores.
Affiliation(s)
- Chunyun Tu
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang, 550005, P. R. China.
- Weijiang Huang
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang, 550005, P. R. China.
- Sheng Liang
- School of Mathematics and Information Science, Guiyang University, Guiyang, 550005, P. R. China
- Kui Wang
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang, 550005, P. R. China.
- Qin Tian
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang, 550005, P. R. China.
- Wei Yan
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang, 550005, P. R. China.
7
Chen Z, Yam VWW. Encoding Hole-Particle Information in the Multi-Channel MolOrbImage for Machine-Learned Excited-State Energies of Large Photofunctional Materials. J Am Chem Soc 2023; 145:24098-24107. [PMID: 37874942 DOI: 10.1021/jacs.3c07766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2023]
Abstract
We present a novel class of one-electron multi-channel molecular orbital images (MolOrbImages) designed for the prediction of excited-state energetics in conjunction with the state-of-the-art VGG-type machine-learning architecture. By representing hole and particle states in the excitation process as channels of MolOrbImages, the revised VGG model achieves excellent prediction accuracy for both low-lying singlet and triplet states, with mean absolute errors (MAEs) of <0.08 and <0.1 eV for QM9 molecules and large photofunctional materials with up to 560 atoms, respectively. Remarkably, the model demonstrates exceptional performance (MAE < 1 kcal/mol) for the T1 state of QM9 molecules, making it a non-system-specific model that approaches chemical accuracy. The general rules attained, for instance, the improved performance with well-defined MO energies and the reduced overfitting concern via the inclusion of physically insightful hole-particle information, provide invaluable guidelines for the further design of orbital-based descriptors targeting molecular excited states.
Affiliation(s)
- Ziyong Chen
- Institute of Molecular Functional Materials and Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong, China
- Vivian Wing-Wah Yam
- Institute of Molecular Functional Materials and Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong, China
- Hong Kong Quantum AI Lab Ltd., Hong Kong Science Park, Hong Kong, China
8
Zhang Y, Jiang B. Universal machine learning for the response of atomistic systems to external fields. Nat Commun 2023; 14:6424. [PMID: 37827998 PMCID: PMC10570356 DOI: 10.1038/s41467-023-42148-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/16/2023] [Accepted: 10/01/2023] [Indexed: 10/14/2023]
Abstract
Machine learned interatomic interaction potentials have enabled efficient and accurate molecular simulations of closed systems. However, external fields, which can greatly change the chemical structure and/or reactivity, have been seldom included in current machine learning models. This work proposes a universal field-induced recursively embedded atom neural network (FIREANN) model, which integrates a pseudo field vector-dependent feature into atomic descriptors to represent system-field interactions with rigorous rotational equivariance. This "all-in-one" approach correlates various response properties like dipole moment and polarizability with the field-dependent potential energy in a single model, very suitable for spectroscopic and dynamics simulations in molecular and periodic systems in the presence of electric fields. Especially for periodic systems, we find that FIREANN can overcome the intrinsic multiple-value issue of the polarization by training atomic forces only. These results validate the universality and capability of the FIREANN method for efficient first-principles modeling of complicated systems in strong external fields.
Affiliation(s)
- Yaolong Zhang
- Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, 230026, China
- École Polytechnique Fédérale de Lausanne, 1015, Lausanne, Switzerland
- Bin Jiang
- Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, 230026, China.
- Hefei National Laboratory, University of Science and Technology of China, Hefei, 230088, China.
9
Chen MS, Mao Y, Snider A, Gupta P, Montoya-Castillo A, Zuehlsdorff TJ, Isborn CM, Markland TE. Elucidating the Role of Hydrogen Bonding in the Optical Spectroscopy of the Solvated Green Fluorescent Protein Chromophore: Using Machine Learning to Establish the Importance of High-Level Electronic Structure. J Phys Chem Lett 2023; 14:6610-6619. [PMID: 37459252 DOI: 10.1021/acs.jpclett.3c01444] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 07/28/2023]
Abstract
Hydrogen bonding interactions with chromophores in chemical and biological environments play a key role in determining their electronic absorption and relaxation processes, which are manifested in their linear and multidimensional optical spectra. For chromophores in the condensed phase, the large number of atoms needed to simulate the environment has traditionally prohibited the use of high-level excited-state electronic structure methods. By leveraging transfer learning, we show how to construct machine-learned models to accurately predict the high-level excitation energies of a chromophore in solution from only 400 high-level calculations. We show that when the electronic excitations of the green fluorescent protein chromophore in water are treated using EOM-CCSD embedded in a DFT description of the solvent the optical spectrum is correctly captured and that this improvement arises from correctly treating the coupling of the electronic transition to electric fields, which leads to a larger response upon hydrogen bonding between the chromophore and water.
Affiliation(s)
- Michael S Chen
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Yuezhi Mao
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Andrew Snider
- Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Prachi Gupta
- Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Andrés Montoya-Castillo
- Department of Chemistry, University of Colorado, Boulder, Boulder, Colorado 80309, United States
- Tim J Zuehlsdorff
- Department of Chemistry, Oregon State University, Corvallis, Oregon 97331, United States
- Christine M Isborn
- Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Thomas E Markland
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
10
Chen Z, Yam VWW. Machine-Learned Electronically Excited States with the MolOrbImage Generated from the Molecular Ground State. J Phys Chem Lett 2023; 14:1955-1961. [PMID: 36787423 DOI: 10.1021/acs.jpclett.3c00014] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/18/2023]
Abstract
We present a general machine learning framework for probing the electronic state properties using the novel quantum descriptor MolOrbImage. Each pixel of the MolOrbImage records the quantum information generated by the integration of the physical operator with a pair of bra and ket molecular orbital (MO) states. Inspired by the success of deep convolutional neural networks (NNs) in computer vision, we have implemented the convolutional-layer-dominated MO-NN model. Using the orbital energy and electron repulsion integral MolOrbImages, the MO-NN model achieves promising prediction accuracies against the ADC(2)/cc-pVTZ reference for transition energies to both low-lying singlet [mean absolute error (MAE) < 0.16 eV] and triplet (MAE < 0.14 eV) states. An apparent improvement in the prediction of oscillator strength, which has been shown to be challenging previously, has been demonstrated in this study. Moreover, the transferability test indicates the remarkable extrapolation capacity of the MO-NN model to describe the out of data set systems.
Affiliation(s)
- Ziyong Chen
- Institute of Molecular Functional Materials and Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong 999077, China
- Vivian Wing-Wah Yam
- Institute of Molecular Functional Materials and Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong 999077, China
- Hong Kong Quantum AI Lab Ltd., Hong Kong Science Park, Hong Kong 999077, China
11
Zhang Y, Lin Q, Jiang B. Atomistic neural network representations for chemical dynamics simulations of molecular, condensed phase, and interfacial systems: Efficiency, representability, and generalization. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Yaolong Zhang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
- Qidong Lin
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
- Bin Jiang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
12
Tu C, Huang W, Liang S, Wang K, Tian Q, Yan W. Combining machine learning and quantum chemical calculations for high-throughput virtual screening of thermally activated delayed fluorescence molecular materials: the impact of selection strategy and structural mutations. RSC Adv 2022; 12:30962-30975. [PMID: 36349007 PMCID: PMC9619240 DOI: 10.1039/d2ra05643g] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/07/2022] [Accepted: 10/09/2022] [Indexed: 11/23/2022]
Abstract
In view of the theoretical importance and huge application potential of Thermally Activated Delayed Fluorescence (TADF) materials, it is of great significance to conduct High-Throughput Virtual Screening (HTVS) on compound libraries to find TADF candidate molecules. This research focuses on the computational design of pure organic TADF molecules. By combining machine learning and quantum chemical calculations, using cheminformatics tools, and introducing the concept of selection and mutation from evolutionary theory, we have designed a computational program for HTVS of TADF molecular materials; in particular, we explored the impact of selection strategy and structural mutations on the HTVS results. An initial compound library (size = 103) constructed by enumeration of typical donors and acceptors was evolved by successively applying selection and 10 different structural mutations. A group fingerprint similarity (ΔMSPR) index was proposed to account for the similarity between two compound libraries of comparable size. Based on the computed data, we found that mixing selection and mutations into the evolution map does have a great impact on the HTVS results: (a) Except for the fast mutation Sub2, all the mutations can effectively concentrate 'good' molecules in a compound library, and hence give large material abundance (typically >0.8) for high mutation generations (n_g ≥ 6). (b) The mean energy gap exhibits a fast convergent trend toward very low values, hence the studied mutations (except Sub2) can cooperate very well with the studied DA substrates to generate optimal molecules, and the group fingerprint similarity retains high enough values for large n_g, which can be associated with the apparent convergence in molecular skeletons as n_g increases. (c) The distribution of skeleton frequencies for a specific mutation is generally uneven, with one dominant skeleton. The overall numbers of common and generic cores for all mutations are 11 and 7 at n_g = 9. Hence, in a sense, the 'optimal' skeletons seem unique and useful in realizing low energy gaps. With these observations and the development of related HTVS software, we expect to provide insight and tools to the research community for HTVS of molecular (TADF) materials.
Affiliation(s)
- Chunyun Tu
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang 550005, P. R. China
- Weijiang Huang
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang 550005, P. R. China
- Sheng Liang
- School of Mathematics and Information Science, Guiyang University, Guiyang 550005, P. R. China
- Kui Wang
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang 550005, P. R. China
- Qin Tian
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang 550005, P. R. China
- Wei Yan
- School of Chemistry and Materials Engineering, Guiyang University, Guiyang 550005, P. R. China
13
Parker KA, Schultz JD, Singh N, Wasielewski MR, Beratan DN. Mapping Simulated Two-Dimensional Spectra to Molecular Models Using Machine Learning. J Phys Chem Lett 2022; 13:7454-7461. [PMID: 35930790 DOI: 10.1021/acs.jpclett.2c01913] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Two-dimensional (2D) spectroscopy encodes molecular properties and dynamics into expansive spectral data sets. Translating these data into meaningful chemical insights is challenging because of the many ways chemical properties can influence the spectra. To address the task of extracting chemical information from 2D spectroscopy, we study the capacity of simple feedforward neural networks (NNs) to map simulated 2D electronic spectra to underlying physical Hamiltonians. We examined hundreds of simulated 2D spectra corresponding to monomers and dimers with varied Franck-Condon active vibrations and monomer-monomer electronic couplings. We find the NNs are able to correctly characterize most Hamiltonian parameters in this study with an accuracy above 90%. Our results demonstrate that NNs can aid in interpreting 2D spectra, leading from spectroscopic features to underlying effective Hamiltonians.
Affiliation(s)
- Kelsey A Parker
- Department of Chemistry, Duke University, Durham, North Carolina 27708, United States
| | - Jonathan D Schultz
- Department of Chemistry and Institute for Sustainability and Energy, Northwestern University, Evanston, Illinois 60208-3113, United States
| | - Niven Singh
- Program in Computational Biology and Bioinformatics, Center for Genomics and Computational Biology, Duke University School of Medicine, Durham, North Carolina 27710, United States
| | - Michael R Wasielewski
- Department of Chemistry and Institute for Sustainability and Energy, Northwestern University, Evanston, Illinois 60208-3113, United States
| | - David N Beratan
- Department of Chemistry, Duke University, Durham, North Carolina 27708, United States
- Department of Biochemistry, Duke University, Durham, North Carolina 27710, United States
- Department of Physics, Duke University, Durham, North Carolina 27708, United States
| |
|
14
|
Ni S, Yang Q, Huang J, Zhou M, Wei L, Yang Y, Wen J, Mo W, Le W, Qi D, Jin L, Li B, Zhao Z, Du K. Constructing high-accuracy theoretical Raman spectra of SARS-CoV-2 spike proteins based on a large fragment method. Chem Phys Lett 2022; 800:139663. [PMID: 35529782 PMCID: PMC9055380 DOI: 10.1016/j.cplett.2022.139663] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/11/2022] [Revised: 03/27/2022] [Accepted: 04/26/2022] [Indexed: 01/17/2023]
Abstract
In order to control COVID-19, rapid and accurate detection of the pathogen, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is an urgent task. The target spike proteins of SARS-CoV-2 have been detected experimentally via Raman spectroscopy. However, high-accuracy theoretical Raman spectra of the spike proteins, which could serve as a standard reference for clinical diagnostic purposes, are lacking. In this paper, we propose a large fragment method to construct high-precision Raman spectra for the spike proteins. The large fragment method not only reduces the calculation error but also improves the accuracy of the protein Raman spectra by completely calculating the interactions within the large fragment. The Pearson correlation coefficient of the theoretical Raman spectra is 0.929 or greater. Compared with the experimental spectra, the characteristic patterns are easily visible. This work provides a detection standard for the spike proteins, which should bring us a step closer to the fast recognition of SARS-CoV-2 via Raman spectroscopy.
Affiliation(s)
- Shuang Ni
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Qiang Yang
- China Academy of Engineering Physics, 621900 Mianyang, China
| | - Jinling Huang
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Minjie Zhou
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Lai Wei
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Yue Yang
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Jiaxin Wen
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Department of Engineering Physics, Tsinghua University, 100084 Beijing, China
| | - Wenbo Mo
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
- Department of Engineering Physics, Tsinghua University, 100084 Beijing, China
| | - Wei Le
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Daojian Qi
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Lei Jin
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Bo Li
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Zongqin Zhao
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| | - Kai Du
- Laser Fusion Research Center, China Academy of Engineering Physics, 621900 Mianyang, China
| |
|
15
|
Chen Z, Bononi FC, Sievers CA, Kong WY, Donadio D. UV-Visible Absorption Spectra of Solvated Molecules by Quantum Chemical Machine Learning. J Chem Theory Comput 2022; 18:4891-4902. [PMID: 35913220 DOI: 10.1021/acs.jctc.1c01181] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/30/2022]
Abstract
Predicting UV-visible absorption spectra is essential to understand photochemical processes and design energy materials. Quantum chemical methods can deliver accurate calculations of UV-visible absorption spectra, but they are computationally expensive, especially for large systems or when one computes line shapes from thermal averages. Here, we present an approach to predict UV-visible absorption spectra of solvated aromatic molecules by quantum chemistry (QC) and machine learning (ML). We show that an ML model, trained on the high-level QC calculation of the excitation energy of a set of aromatic molecules, can accurately predict the line shape of the lowest-energy UV-visible absorption band of several related molecules with less than 0.1 eV deviation with respect to reference experimental spectra. Applying linear decomposition analysis on the excitation energies, we unveil that our ML models probe vertical excitations of these aromatic molecules primarily by learning the atomic environment of their phenyl rings, which aligns with the physical origin of the π→π* electronic transition. Our study provides an effective workflow that combines ML with quantum chemical methods to accelerate the calculations of UV-visible absorption spectra for various molecular systems.
Affiliation(s)
- Zekun Chen
- Department of Chemistry, University of California Davis 95616, California, United States
| | - Fernanda C Bononi
- Department of Chemistry, University of California Davis 95616, California, United States
| | - Charles A Sievers
- Department of Chemistry, University of California Davis 95616, California, United States
| | - Wang-Yeuk Kong
- Department of Chemistry, University of California Davis 95616, California, United States
| | - Davide Donadio
- Department of Chemistry, University of California Davis 95616, California, United States
| |
|
16
|
Yan J, Rodríguez-Martínez X, Pearce D, Douglas H, Bili D, Azzouzi M, Eisner F, Virbule A, Rezasoltani E, Belova V, Dörling B, Few S, Szumska AA, Hou X, Zhang G, Yip HL, Campoy-Quiles M, Nelson J. Identifying structure-absorption relationships and predicting absorption strength of non-fullerene acceptors for organic photovoltaics. Energy & Environmental Science 2022; 15:2958-2973. [PMID: 35923416 PMCID: PMC9277517 DOI: 10.1039/d2ee00887d] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/18/2022] [Accepted: 05/20/2022] [Indexed: 06/15/2023]
Abstract
Non-fullerene acceptors (NFAs) are excellent light harvesters, yet the origin of their high optical extinction is not well understood. In this work, we investigate the absorption strength of NFAs by building a database of time-dependent density functional theory (TDDFT) calculations of ∼500 π-conjugated molecules. The calculations are first validated by comparison with experimental measurements in solution and solid state using common fullerene and non-fullerene acceptors. We find that the molar extinction coefficient (ε d,max) shows reasonable agreement between calculation in vacuum and experiment for molecules in solution, highlighting the effectiveness of TDDFT for predicting optical properties of organic π-conjugated molecules. We then perform a statistical analysis based on molecular descriptors to identify which features are important in defining the absorption strength. This allows us to identify structural features that are correlated with high absorption strength in NFAs and could be used to guide molecular design: highly absorbing NFAs should possess a planar, linear, and fully conjugated molecular backbone with highly polarisable heteroatoms. We then exploit a random decision forest algorithm to draw predictions for ε d,max using a computational framework based on extended tight-binding Hamiltonians, which shows reasonable predicting accuracy with lower computational cost than TDDFT. This work provides a general understanding of the relationship between molecular structure and absorption strength in π-conjugated organic molecules, including NFAs, while introducing predictive machine-learning models of low computational cost.
Affiliation(s)
- Jun Yan
- Department of Physics, Imperial College London SW7 2AZ London UK
| | - Xabier Rodríguez-Martínez
- Electronic and Photonic Materials (EFM), Department of Physics, Chemistry and Biology (IFM), Linköping University Linköping SE 581 83 Sweden
- Instituto de Ciencia de Materiales de Barcelona, ICMAB-CSIC, Campus UAB Bellaterra 08193 Spain
| | - Drew Pearce
- Department of Physics, Imperial College London SW7 2AZ London UK
| | - Hana Douglas
- Department of Physics, Imperial College London SW7 2AZ London UK
| | - Danai Bili
- Department of Physics, Imperial College London SW7 2AZ London UK
| | - Mohammed Azzouzi
- Department of Physics, Imperial College London SW7 2AZ London UK
| | - Flurin Eisner
- Department of Physics, Imperial College London SW7 2AZ London UK
| | - Alise Virbule
- Department of Physics, Imperial College London SW7 2AZ London UK
| | | | - Valentina Belova
- Instituto de Ciencia de Materiales de Barcelona, ICMAB-CSIC, Campus UAB Bellaterra 08193 Spain
| | - Bernhard Dörling
- Instituto de Ciencia de Materiales de Barcelona, ICMAB-CSIC, Campus UAB Bellaterra 08193 Spain
| | - Sheridan Few
- Department of Physics, Imperial College London SW7 2AZ London UK
- Sustainability Research Institute, School of Earth and Environment, University of Leeds LS2 9JT Leeds UK
| | - Anna A Szumska
- Department of Physics, Imperial College London SW7 2AZ London UK
| | - Xueyan Hou
- Department of Physics, Imperial College London SW7 2AZ London UK
| | - Guichuan Zhang
- Institute of Polymer Optoelectronic Materials and Devices, State Key Laboratory of Luminescent Materials and Devices, South China University of Technology Guangzhou 510640 P. R. China
| | - Hin-Lap Yip
- Institute of Polymer Optoelectronic Materials and Devices, State Key Laboratory of Luminescent Materials and Devices, South China University of Technology Guangzhou 510640 P. R. China
- Department of Materials Science and Engineering, City University of Hong Kong, Tat Chee Avenue Kowloon Hong Kong
| | - Mariano Campoy-Quiles
- Instituto de Ciencia de Materiales de Barcelona, ICMAB-CSIC, Campus UAB Bellaterra 08193 Spain
| | - Jenny Nelson
- Department of Physics, Imperial College London SW7 2AZ London UK
| |
|
17
|
Cerdán L, Roca-Sanjuán D. Reconstruction of Nuclear Ensemble Approach Electronic Spectra Using Probabilistic Machine Learning. J Chem Theory Comput 2022; 18:3052-3064. [PMID: 35481363 PMCID: PMC9097286 DOI: 10.1021/acs.jctc.2c00004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2022] [Indexed: 11/29/2022]
Abstract
The theoretical prediction of molecular electronic spectra by means of quantum mechanical (QM) computations is fundamental to gain a deep insight into many photophysical and photochemical processes. A computational strategy that is attracting significant attention is the so-called Nuclear Ensemble Approach (NEA), that relies on generating a representative ensemble of nuclear geometries around the equilibrium structure and computing the vertical excitation energies (ΔE) and oscillator strengths (f) and phenomenologically broadening each transition with a line-shaped function with empirical full-width δ. Frequently, the choice of δ is carried out by visually finding the trade-off between artificial vibronic features (small δ) and over-smoothing of electronic signatures (large δ). Nevertheless, this approach is not satisfactory, as it relies on a subjective perception and may lead to spectral inaccuracies overall when the number of sampled configurations is limited due to an excessive computational burden (high-level QM methods, complex systems, solvent effects, etc.). In this work, we have developed and tested a new approach to reconstruct NEA spectra, dubbed GMM-NEA, based on the use of Gaussian Mixture Models (GMMs), a probabilistic machine learning algorithm, that circumvents the phenomenological broadening assumption and, in turn, the use of δ altogether. We show that GMM-NEA systematically outperforms other data-driven models to automatically select δ overall for small datasets. In addition, we report the use of an algorithm to detect anomalous QM computations (outliers) that can affect the overall shape and uncertainty of the NEA spectra. Finally, we apply GMM-NEA to predict the photolysis rate for HgBrOOH, a compound involved in Earth's atmospheric chemistry.
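The conventional broadening step that the abstract contrasts with GMM-NEA can be sketched as follows. This is only an illustrative sketch, not the authors' code: it assumes a Gaussian line shape, treats δ as a full width at half maximum, omits the physical prefactors of the absorption cross section (so the result is in arbitrary units), and the function name `nea_spectrum` is hypothetical.

```python
import math


def nea_spectrum(energies, strengths, delta, grid):
    """Gaussian-broadened ensemble spectrum in arbitrary units.

    energies  : vertical excitation energies, one per sampled geometry (eV)
    strengths : matching oscillator strengths f
    delta     : empirical width, assumed here to be a Gaussian FWHM (eV)
    grid      : energies at which the spectrum is evaluated (eV)
    """
    # Convert the assumed FWHM into a Gaussian standard deviation.
    sigma = delta / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    n_samples = len(energies)
    spectrum = []
    for e in grid:
        # Each sampled geometry contributes one Gaussian weighted by f.
        total = sum(
            f * norm * math.exp(-0.5 * ((e - de) / sigma) ** 2)
            for de, f in zip(energies, strengths)
        )
        spectrum.append(total / n_samples)  # ensemble average
    return spectrum


# Toy ensemble of three sampled geometries (made-up numbers).
toy = nea_spectrum([4.9, 5.0, 5.2], [0.10, 0.12, 0.08], 0.3, [4.5, 5.0, 5.5])
```

In this picture a small `delta` leaves one narrow peak per sampled geometry, while a large `delta` smears distinct bands together, which is the trade-off the abstract describes.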
Affiliation(s)
- Luis Cerdán
- Institut de Ciència Molecular, Universitat de València, València 46071, Spain
| | - Daniel Roca-Sanjuán
- Institut de Ciència Molecular, Universitat de València, València 46071, Spain
| |
|
18
|
Ren H, Zhang Q, Wang Z, Zhang G, Liu H, Guo W, Mukamel S, Jiang J. Machine learning recognition of protein secondary structures based on two-dimensional spectroscopic descriptors. Proc Natl Acad Sci U S A 2022; 119:e2202713119. [PMID: 35476517 PMCID: PMC9171355 DOI: 10.1073/pnas.2202713119] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/15/2022] [Accepted: 03/28/2022] [Indexed: 11/29/2022]
Abstract
Protein secondary structure discrimination is crucial for understanding their biological function. It is not generally possible to invert spectroscopic data to yield the structure. We present a machine learning protocol which uses two-dimensional UV (2DUV) spectra as pattern recognition descriptors, aiming at automated protein secondary structure determination from spectroscopic features. Accurate secondary structure recognition is obtained for homologous (97%) and nonhomologous (91%) protein segments, randomly selected from simulated model datasets. The advantage of 2DUV descriptors over one-dimensional linear absorption and circular dichroism spectra lies in the cross-peak information that reflects interactions between local regions of the protein. Thanks to their ultrafast (∼200 fs) nature, 2DUV measurements can be used in the future to probe conformational variations in the course of protein dynamics.
Affiliation(s)
- Hao Ren
- School of Materials Science and Engineering, China University of Petroleum (East China), Qingdao 266580, Shandong, China
| | - Qian Zhang
- School of Materials Science and Engineering, China University of Petroleum (East China), Qingdao 266580, Shandong, China
| | - Zhengjie Wang
- School of Materials Science and Engineering, China University of Petroleum (East China), Qingdao 266580, Shandong, China
| | - Guozhen Zhang
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Hongzhang Liu
- School of Materials Science and Engineering, China University of Petroleum (East China), Qingdao 266580, Shandong, China
| | - Wenyue Guo
- School of Materials Science and Engineering, China University of Petroleum (East China), Qingdao 266580, Shandong, China
| | - Shaul Mukamel
- Department of Chemistry and Physics & Astronomy, University of California, Irvine, CA 92697
| | - Jun Jiang
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| |
|
19
|
Fan J, Lan H, Ning W, Zhong R, Chen F, Yan G, Cai K. Modeling amide-I vibrations of alanine dipeptide in solution by using neural network protocol. Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy 2022; 268:120675. [PMID: 34890871 DOI: 10.1016/j.saa.2021.120675] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/05/2021] [Revised: 10/27/2021] [Accepted: 11/26/2021] [Indexed: 06/13/2023]
Abstract
Infrared spectroscopy is a powerful tool for understanding the molecular structure and function of polypeptides. Theoretical interpretation of IR spectra that relies on ab initio calculations may be very costly in computational resources. Herein, we developed a neural network (NN) modeling protocol to evaluate a model dipeptide's backbone amide-I spectra. DFT calculations were performed for the amide-I vibrational motions and structural parameters of alanine dipeptide (ALAD) conformers in different micro-environments ranging from polar to non-polar ones. The obtained backbone dihedrals, C=O bond lengths and amide-I frequencies of ALAD were gathered together for the NN architecture. The applications of the built NN protocols for the prediction of amide-I frequencies of ALAD in other solvation conditions are quite satisfactory, with much less computational cost compared with electronic structure calculations. The results show that this cost-effective way enables us to decipher the polypeptide's dynamic secondary structures and biological functions with their backbone vibrational probes.
Affiliation(s)
- Jianping Fan
- College of Chemistry and Materials Science, Fujian Provincial Key Laboratory of Advanced Materials Oriented Chemical Engineering, Fujian Normal University, Fuzhou 350007, PR China; Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen 361005, PR China; Fujian Provincial Key Laboratory of Featured Biochemical and Chemical Materials, Ningde Normal University, Ningde 352100, PR China
| | - Huaying Lan
- College of Chemistry and Materials Science, Fujian Provincial Key Laboratory of Advanced Materials Oriented Chemical Engineering, Fujian Normal University, Fuzhou 350007, PR China; Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen 361005, PR China
| | - Wenfeng Ning
- College of Chemistry and Materials Science, Fujian Provincial Key Laboratory of Advanced Materials Oriented Chemical Engineering, Fujian Normal University, Fuzhou 350007, PR China; Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen 361005, PR China
| | - Rongzhen Zhong
- College of Chemistry and Materials Science, Fujian Provincial Key Laboratory of Advanced Materials Oriented Chemical Engineering, Fujian Normal University, Fuzhou 350007, PR China; Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen 361005, PR China
| | - Feng Chen
- Fujian Provincial Key Laboratory of Featured Biochemical and Chemical Materials, Ningde Normal University, Ningde 352100, PR China
| | - Guiyang Yan
- Fujian Provincial Key Laboratory of Featured Biochemical and Chemical Materials, Ningde Normal University, Ningde 352100, PR China
| | - Kaicong Cai
- College of Chemistry and Materials Science, Fujian Provincial Key Laboratory of Advanced Materials Oriented Chemical Engineering, Fujian Normal University, Fuzhou 350007, PR China; Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen 361005, PR China; Fujian Provincial Key Laboratory of Featured Biochemical and Chemical Materials, Ningde Normal University, Ningde 352100, PR China
| |
|
20
|
Low K, Coote ML, Izgorodina EI. Inclusion of More Physics Leads to Less Data: Learning the Interaction Energy as a Function of Electron Deformation Density with Limited Training Data. J Chem Theory Comput 2022; 18:1607-1618. [PMID: 35175045 DOI: 10.1021/acs.jctc.1c01264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
Machine learning (ML) approaches to predicting quantum mechanical (QM) properties have made great strides toward achieving the computational chemist's holy grail of structure-based property prediction. In contrast to direct ML methods, which encode a molecule with only structural information, in this work, we show that QM descriptors improve ML predictions of dimer interaction energy, both in terms of accuracy and data efficiency, by incorporating electronic information into the descriptor. We present the electron deformation density interaction energy machine learning (EDDIE-ML) model, which predicts the interaction energy as a function of Hartree-Fock electron deformation density. We compare its performance with leading direct ML schemes and modern DFT methods for the prediction of interaction energies for dimers of varying charge type, size, and intermolecular separation. Under a low-data regime, EDDIE-ML outperforms other direct ML schemes and is the only model readily transferrable to larger, more complex systems including base pair trimers and porous cages. The underlying physical connection between the density and interaction energy enables EDDIE-ML to reach an accuracy comparable to modern DFT functionals in fewer training data points compared to other ML methods.
Affiliation(s)
- Kaycee Low
- Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
| | - Michelle L Coote
- Research School of Chemistry, Australian National University, Canberra, Australian Capital Territory 0200, Australia
| | - Ekaterina I Izgorodina
- Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
| |
|
21
|
Zhao L, Zhang J, Zhang Y, Ye S, Zhang G, Chen X, Jiang B, Jiang J. Accurate Machine Learning Prediction of Protein Circular Dichroism Spectra with Embedded Density Descriptors. JACS Au 2021; 1:2377-2384. [PMID: 34977905 PMCID: PMC8715543 DOI: 10.1021/jacsau.1c00449] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/08/2021] [Indexed: 05/08/2023]
Abstract
A data-driven approach to simulate circular dichroism (CD) spectra is appealing for fast protein secondary structure determination, yet the challenge of predicting electric and magnetic transition dipole moments poses a substantial barrier for the goal. To address this problem, we designed a new machine learning (ML) protocol in which ordinary pure geometry-based descriptors are replaced with alternative embedded density descriptors and electric and magnetic transition dipole moments are successfully predicted with an accuracy comparable to first-principle calculation. The ML model is able to not only simulate protein CD spectra nearly 4 orders of magnitude faster than conventional first-principle simulation but also obtain CD spectra in good agreement with experiments. Finally, we predicted a series of CD spectra of the Trp-cage protein associated with continuous changes of protein configuration along its folding path, showing the potential of our ML model for supporting real-time CD spectroscopy study of protein dynamics.
Affiliation(s)
- Luyuan Zhao
- Hefei National Laboratory for Physical Sciences at the Microscale, Collaborative Innovation Center of Chemistry for Energy Materials, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
| | - Jinxiao Zhang
- Guangxi Key Laboratory of Electrochemical and Magneto-chemical Functional Materials, College of Chemistry and Bioengineering, Guilin University of Technology, Guilin 541006, P. R. China
| | - Yaolong Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, Collaborative Innovation Center of Chemistry for Energy Materials, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
| | - Sheng Ye
- School of Artificial Intelligence, Anhui University, Hefei, Anhui 230601, P. R. China
| | - Guozhen Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, Collaborative Innovation Center of Chemistry for Energy Materials, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
| | - Xin Chen
- Gusu Laboratory of Materials, Suzhou, Jiangsu 215123, P. R. China
| | - Bin Jiang
- Hefei National Laboratory for Physical Sciences at the Microscale, Collaborative Innovation Center of Chemistry for Energy Materials, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
| | - Jun Jiang
- Hefei National Laboratory for Physical Sciences at the Microscale, Collaborative Innovation Center of Chemistry for Energy Materials, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
| |
|
22
|
Gandolfi M, Ceotto M. Unsupervised Machine Learning Neural Gas Algorithm for Accurate Evaluations of the Hessian Matrix in Molecular Dynamics. J Chem Theory Comput 2021; 17:6733-6746. [PMID: 34705463 PMCID: PMC8582248 DOI: 10.1021/acs.jctc.1c00707] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/14/2021] [Indexed: 11/29/2022]
Abstract
The Hessian matrix of the potential energy of molecular systems is employed not only in geometry optimizations or high-order molecular dynamics integrators but also in many other molecular procedures, such as instantaneous normal mode analysis, force field construction, instanton calculations, and semiclassical initial value representation molecular dynamics, to name a few. Here, we present an algorithm for the calculation of the approximated Hessian in molecular dynamics. The algorithm belongs to the family of unsupervised machine learning methods, and it is based on the neural gas idea, where neurons are molecular configurations whose Hessians are adopted for groups of molecular dynamics configurations with similar geometries. The method is tested on several molecular systems of different dimensionalities both in terms of accuracy and computational time versus calculating the Hessian matrix at each time-step, that is, without any approximation, and other Hessian approximation schemes. Finally, the method is applied to the on-the-fly, full-dimensional simulation of a small synthetic peptide (the 46 atom N-acetyl-l-phenylalaninyl-l-methionine amide) at the level of DFT-B3LYP-D/6-31G* theory, from which the semiclassical vibrational power spectrum is calculated.
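The neural gas idea mentioned in the abstract can be illustrated with the generic, textbook update rule, in which every neuron moves toward a new sample by an amount that decays with its distance rank. The sketch below is not the authors' algorithm: representing a configuration as a flat coordinate list, using plain Euclidean distance, and the names `neural_gas_step`, `nearest_neuron`, `eps` and `lam` are all assumptions made for illustration.

```python
import math


def neural_gas_step(neurons, x, eps=0.1, lam=1.0):
    """One generic neural gas update toward sample x.

    Neurons are ranked by distance to x (rank 0 = closest) and each one is
    moved toward x with a step eps * exp(-rank / lam).
    """
    order = sorted(range(len(neurons)), key=lambda i: math.dist(neurons[i], x))
    for rank, i in enumerate(order):
        step = eps * math.exp(-rank / lam)
        neurons[i] = [w + step * (xi - w) for w, xi in zip(neurons[i], x)]
    return neurons


def nearest_neuron(neurons, x):
    """Index of the neuron closest to x.

    In a Hessian-reuse scheme, this would be the reference configuration
    whose stored Hessian is adopted for x (an assumption of this sketch).
    """
    return min(range(len(neurons)), key=lambda i: math.dist(neurons[i], x))


# Toy usage with two 2D "configurations" (made-up numbers).
neurons = [[0.0, 0.0], [10.0, 10.0]]
neural_gas_step(neurons, [1.0, 1.0], eps=0.5, lam=1.0)
group = nearest_neuron(neurons, [9.0, 9.0])
```

The design point is that an expensive quantity is computed only at the neuron geometries and then shared by all nearby configurations; how similarity is measured for real molecular geometries is left open here.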
Affiliation(s)
- Michele Gandolfi
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Michele Ceotto
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| |
|
23
|
Abstract
Numerous linear and non-linear spectroscopic techniques have been developed to elucidate structural and functional information of complex systems ranging from natural systems, such as proteins and light-harvesting systems, to synthetic systems, such as solar cell materials and light-emitting diodes. The obtained experimental data can be challenging to interpret due to the complexity and potential overlapping spectral signatures. Therefore, computational spectroscopy plays a crucial role in the interpretation and understanding of spectral observables of complex systems. Computational modeling of various spectroscopic techniques has seen significant developments in the past decade, when it comes to the systems that can be addressed, the size and complexity of the sample types, the accuracy of the methods, and the spectroscopic techniques that can be addressed. In this Perspective, I will review the computational spectroscopy methods that have been developed and applied for infrared and visible spectroscopies in the condensed phase. I will discuss some of the questions that this has allowed answering. Finally, I will discuss current and future challenges and how these may be addressed.
Affiliation(s)
- Thomas L C Jansen
- Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands
| |
|
24
|
Kwac K, Freedman H, Cho M. Machine Learning Approach for Describing Water OH Stretch Vibrations. J Chem Theory Comput 2021; 17:6353-6365. [PMID: 34498885 DOI: 10.1021/acs.jctc.1c00540] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
A machine learning approach employing neural networks is developed to calculate the vibrational frequency shifts and transition dipole moments of the symmetric and antisymmetric OH stretch vibrations of a water molecule surrounded by water molecules. We employed the atom-centered symmetry functions (ACSFs), polynomial functions, and Gaussian-type orbital-based density vectors as descriptor functions and compared their performances in predicting vibrational frequency shifts using the trained neural networks. The ACSFs perform best in modeling the frequency shifts of the OH stretch vibration of water among the types of descriptor functions considered in this paper. However, the differences in performance among these three descriptors are not significant. We also tried a feature selection method called CUR matrix decomposition to assess the importance and leverage of the individual functions in the set of selected descriptor functions. We found that a significant number of those functions included in the set of descriptor functions give redundant information in describing the configuration of the water system. We here show that the predicted vibrational frequency shifts by trained neural networks successfully describe the solvent-solute interaction-induced fluctuations of OH stretch frequencies.
Affiliation(s)
- Kijeong Kwac
- Center for Molecular Spectroscopy and Dynamics, Institute for Basic Science (IBS), Seoul 02841, Republic of Korea
| | - Holly Freedman
- Center for Molecular Spectroscopy and Dynamics, Institute for Basic Science (IBS), Seoul 02841, Republic of Korea
| | - Minhaeng Cho
- Center for Molecular Spectroscopy and Dynamics, Institute for Basic Science (IBS), Seoul 02841, Republic of Korea
- Department of Chemistry, Korea University, Seoul 02841, Republic of Korea
| |
|
25
|
Westermayr J, Marquetand P. Machine Learning for Electronically Excited States of Molecules. Chem Rev 2021; 121:9873-9926. [PMID: 33211478 PMCID: PMC8391943 DOI: 10.1021/acs.chemrev.0c00749] [Citation(s) in RCA: 167] [Impact Index Per Article: 55.7] [Received: 07/17/2020] [Indexed: 12/11/2022]
Abstract
Electronically excited states of molecules are at the heart of photochemistry, photophysics, as well as photobiology and also play a role in material science. Their theoretical description requires highly accurate quantum chemical calculations, which are computationally expensive. In this review, we focus on not only how machine learning is employed to speed up such excited-state simulations but also how this branch of artificial intelligence can be used to advance this exciting research field in all its aspects. Discussed applications of machine learning for excited states include excited-state dynamics simulations, static calculations of absorption spectra, as well as many others. In order to put these studies into context, we discuss the promises and pitfalls of the involved machine learning techniques. Since the latter are mostly based on quantum chemistry calculations, we also provide a short introduction into excited-state electronic structure methods and approaches for nonadiabatic dynamics simulations and describe tricks and problems when using them in machine learning for excited states of molecules.
Affiliation(s)
- Julia Westermayr
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
| | - Philipp Marquetand
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Data Science @ Uni Vienna, University of Vienna, Währinger Strasse 29, 1090 Vienna, Austria
|
27
|
Zobel JP, González L. The Quest to Simulate Excited-State Dynamics of Transition Metal Complexes. JACS Au 2021; 1:1116-1140. [PMID: 34467353 PMCID: PMC8397362 DOI: 10.1021/jacsau.1c00252] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/07/2021] [Indexed: 05/15/2023]
Abstract
This Perspective describes current computational efforts in the field of simulating photodynamics of transition metal complexes. We present the typical workflows and feature the strengths and limitations of the different contemporary approaches. From electronic structure methods suitable to describe transition metal complexes to approaches able to simulate their nuclear dynamics under the effect of light, we give particular attention to build a bridge between theory and experiment by critically discussing the different models commonly adopted in the interpretation of spectroscopic experiments and the simulation of particular observables. Thereby, we review all the studies of excited-state dynamics on transition metal complexes, both in gas phase and in solution from reduced to full dimensionality.
Affiliation(s)
- J. Patrick Zobel
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währingerstr. 19, 1090 Vienna, Austria
| | - Leticia González
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währingerstr. 19, 1090 Vienna, Austria
- Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währingerstr. 19, 1090 Vienna, Austria
|
28
|
AI-based spectroscopic monitoring of real-time interactions between SARS-CoV-2 and human ACE2. Proc Natl Acad Sci U S A 2021; 118:2025879118. [PMID: 34185681 PMCID: PMC8256048 DOI: 10.1073/pnas.2025879118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022] Open
Abstract
The COVID-19 caused by SARS-CoV-2 virus has posed a tremendous threat to human health. The interactions between human angiotensin-converting enzyme 2 and the spike glycoprotein of SARS-CoV-2 hold the key to understanding the molecular mechanism to develop treatment and vaccines. However, the simulation of these interactions in fluctuating surroundings is challenging because it requires many electronic structure calculations at the quantum mechanics level for a large number of representative configurations. We report a machine learning protocol that can efficiently predict the IR spectra of SARS-CoV-2 with high efficiency and characterize fine changes in IR spectra associated with variations of protein secondary structures. Machine learning provides a cost-effective tool for monitoring of real-time interactions between the SARS-CoV-2 and human ACE2. The novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), invades a human cell via human angiotensin-converting enzyme 2 (hACE2) as the entry, causing the severe coronavirus disease (COVID-19). The interactions between hACE2 and the spike glycoprotein (S protein) of SARS-CoV-2 hold the key to understanding the molecular mechanism to develop treatment and vaccines, yet the dynamic nature of these interactions in fluctuating surroundings is very challenging to probe by those structure determination techniques requiring the structures of samples to be fixed. Here we demonstrate, by a proof-of-concept simulation of infrared (IR) spectra of S protein and hACE2, that time-resolved spectroscopy may monitor the real-time structural information of the protein−protein complexes of interest, with the help of machine learning. Our machine learning protocol is able to identify fine changes in IR spectra associated with variation of the secondary structures of S protein of the coronavirus. Further, it is three to four orders of magnitude faster than conventional quantum chemistry calculations. 
We expect our machine learning protocol would accelerate the development of real-time spectroscopy study of protein dynamics.
|
29
|
Ren H, Li H, Zhang Q, Liang L, Guo W, Huang F, Luo Y, Jiang J. A machine learning vibrational spectroscopy protocol for spectrum prediction and spectrum-based structure recognition. Fundamental Research 2021. [DOI: 10.1016/j.fmre.2021.05.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022] Open
|
30
|
Zhang J, Ye S, Zhong K, Zhang Y, Chong Y, Zhao L, Zhou H, Guo S, Zhang G, Jiang B, Mukamel S, Jiang J. A Machine-Learning Protocol for Ultraviolet Protein-Backbone Absorption Spectroscopy under Environmental Fluctuations. J Phys Chem B 2021; 125:6171-6178. [PMID: 34086461 DOI: 10.1021/acs.jpcb.1c03296] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Ultraviolet (UV) absorption spectra are commonly used for characterizing the global structure of proteins. However, the theoretical interpretation of UV spectra is hindered by the large number of required expensive ab initio calculations of excited states spanning a huge conformation space. We present a machine-learning (ML) protocol for far-UV (FUV) spectra of proteins, which can predict FUV spectra of proteins with comparable accuracy to density functional theory (DFT) calculations but with 3-4 orders of magnitude reduced computational cost. It further shows excellent predictive power and transferability that can be used to probe structural mutations and protein folding pathways.
Affiliation(s)
- Jinxiao Zhang
- Guangxi Key Laboratory of Electrochemical and Magneto-chemical Functional Materials, College of Chemistry and Bioengineering, Guilin University of Technology, Guilin 541006, China
| | - Sheng Ye
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Kai Zhong
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Yaolong Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Yuanyuan Chong
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Luyuan Zhao
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Huiting Zhou
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Sibei Guo
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Guozhen Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Bin Jiang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Shaul Mukamel
- Departments of Chemistry and Physics & Astronomy, University of California, Irvine, California 92697, United States
| | - Jun Jiang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
|
31
|
Asakura M, Okuno M. Hyper-Raman Spectroscopic Investigation of Amide Bands of N-Methylacetamide in Liquid/Solution Phase. J Phys Chem Lett 2021; 12:4780-4785. [PMID: 33988365 DOI: 10.1021/acs.jpclett.1c01215] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
We have demonstrated hyper-Raman (HR) spectroscopy of N-methylacetamide (NMA) for the first time. Fundamental knowledge of amide bands in HR spectra has been obtained. HR spectra of NMA exhibit various amide bands with different intensity patterns from Raman and IR spectra. The amide III and II signals were strongly observed, suggesting the possible application of HR spectroscopy to analyze secondary structures, complementary to IR and Raman spectroscopy. The peak positions of HR amide bands sharply reflect the hydrogen-bonding environment around the molecule. The depolarization ratios of the amide II and III bands at 532 nm excitation suggest the resonance HR effect via the π-π* transition. In contrast, that of the amide I band of neat NMA indicates the contribution of high energy transitions to its signal enhancement. This work proposes that HR spectroscopy can be a powerful tool for studying the molecular structure and environment of biomolecules with peptide bonds.
Affiliation(s)
- Masaya Asakura
- Department of Basic Science, Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro, Tokyo 153-8902 Japan
| | - Masanari Okuno
- Department of Basic Science, Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro, Tokyo 153-8902 Japan
|
32
|
Joung J, Han M, Hwang J, Jeong M, Choi DH, Park S. Deep Learning Optical Spectroscopy Based on Experimental Database: Potential Applications to Molecular Design. JACS Au 2021; 1:427-438. [PMID: 34467305 PMCID: PMC8395663 DOI: 10.1021/jacsau.1c00035] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 01/29/2021] [Indexed: 06/13/2023]
Abstract
Accurate and reliable prediction of the optical and photophysical properties of organic compounds is important in various research fields. Here, we developed deep learning (DL) optical spectroscopy using a DL model and experimental database to predict seven optical and photophysical properties of organic compounds, namely, the absorption peak position and bandwidth, extinction coefficient, emission peak position and bandwidth, photoluminescence quantum yield (PLQY), and emission lifetime. Our DL model included the chromophore-solvent interaction to account for the effect of local environments on the optical and photophysical properties of organic compounds and was trained using an experimental database of 30 094 chromophore/solvent combinations. Our DL optical spectroscopy made it possible to reliably and quickly predict the aforementioned properties of organic compounds in solution, gas phase, film, and powder with the root mean squared errors of 26.6 and 28.0 nm for absorption and emission peak positions, 603 and 532 cm⁻¹ for absorption and emission bandwidths, and 0.209, 0.371, and 0.262 for the logarithm of the extinction coefficient, PLQY, and emission lifetime, respectively. Finally, we demonstrated how a blue emitter with desired optical and photophysical properties could be efficiently virtually screened and developed by DL optical spectroscopy. DL optical spectroscopy can be efficiently used for developing chromophores and fluorophores in various research areas.
Affiliation(s)
| | | | - Jinhyo Hwang
- Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
| | - Minseok Jeong
- Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
| | - Dong Hoon Choi
- Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
| | - Sungnam Park
- Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
|
33
|
Dong SS, Govoni M, Galli G. Machine learning dielectric screening for the simulation of excited state properties of molecules and materials. Chem Sci 2021; 12:4970-4980. [PMID: 34163744 PMCID: PMC8179553 DOI: 10.1039/d1sc00503k] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/27/2021] [Accepted: 02/12/2021] [Indexed: 11/21/2022] Open
Abstract
Accurate and efficient calculations of absorption spectra of molecules and materials are essential for the understanding and rational design of broad classes of systems. Solving the Bethe-Salpeter equation (BSE) for electron-hole pairs usually yields accurate predictions of absorption spectra, but it is computationally expensive, especially if thermal averages of spectra computed for multiple configurations are required. We present a method based on machine learning to evaluate a key quantity entering the definition of absorption spectra: the dielectric screening. We show that our approach yields a model for the screening that is transferable between multiple configurations sampled during first principles molecular dynamics simulations; hence it leads to a substantial improvement in the efficiency of calculations of finite temperature spectra. We obtained computational gains of one to two orders of magnitude for systems with 50 to 500 atoms, including liquids, solids, nanostructures, and solid/liquid interfaces. Importantly, the models of dielectric screening derived here may be used not only in the solution of the BSE but also in developing functionals for time-dependent density functional theory (TDDFT) calculations of homogeneous and heterogeneous systems. Overall, our work provides a strategy to combine machine learning with electronic structure calculations to accelerate first principles simulations of excited-state properties.
Affiliation(s)
- Sijia S Dong
- Materials Science Division and Center for Molecular Engineering, Argonne National Laboratory, Lemont, IL 60439, USA
- Pritzker School of Molecular Engineering, The University of Chicago, Chicago, IL 60637, USA
| | - Marco Govoni
- Materials Science Division and Center for Molecular Engineering, Argonne National Laboratory, Lemont, IL 60439, USA
- Pritzker School of Molecular Engineering, The University of Chicago, Chicago, IL 60637, USA
| | - Giulia Galli
- Materials Science Division and Center for Molecular Engineering, Argonne National Laboratory, Lemont, IL 60439, USA
- Pritzker School of Molecular Engineering, The University of Chicago, Chicago, IL 60637, USA
|
34
|
He H, Yan S, Lyu D, Xu M, Ye R, Zheng P, Lu X, Wang L, Ren B. Deep Learning for Biospectroscopy and Biospectral Imaging: State-of-the-Art and Perspectives. Anal Chem 2021; 93:3653-3665. [PMID: 33599125 DOI: 10.1021/acs.analchem.0c04671] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Indexed: 12/18/2022]
Abstract
With the advances in instrumentation and sampling techniques, there is an explosive growth of data from molecular and cellular samples. The call to extract more information from the large data sets has greatly challenged the conventional chemometrics method. Deep learning, which utilizes very large data sets for finding hidden features therein and for making accurate predictions for a wide range of applications, has been applied in an unbelievable pace in biospectroscopy and biospectral imaging in the recent 3 years. In this Feature, we first introduce the background and basic knowledge of deep learning. We then focus on the emerging applications of deep learning in the data preprocessing, feature detection, and modeling of the biological samples for spectral analysis and spectroscopic imaging. Finally, we highlight the challenges and limitations in deep learning and the outlook for future directions.
Affiliation(s)
- Hao He
- School of Aerospace Engineering, Xiamen University, Xiamen, 361000, China
| | - Sen Yan
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Danya Lyu
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Mengxi Xu
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Ruiqian Ye
- School of Aerospace Engineering, Xiamen University, Xiamen, 361000, China
| | - Peng Zheng
- School of Aerospace Engineering, Xiamen University, Xiamen, 361000, China
| | - Xinyu Lu
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Lei Wang
- School of Aerospace Engineering, Xiamen University, Xiamen, 361000, China
| | - Bin Ren
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
|
35
|
Ye S, Zhong K, Zhang J, Hu W, Hirst JD, Zhang G, Mukamel S, Jiang J. A Machine Learning Protocol for Predicting Protein Infrared Spectra. J Am Chem Soc 2020; 142:19071-19077. [PMID: 33126795 DOI: 10.1021/jacs.0c06530] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Indexed: 02/08/2023]
Abstract
Infrared (IR) absorption provides important chemical fingerprints of biomolecules. Protein secondary structure determination from IR spectra is tedious since its theoretical interpretation requires repeated expensive quantum-mechanical calculations in a fluctuating environment. Herein we present a novel machine learning protocol that uses a few key structural descriptors to rapidly predict amide I IR spectra of various proteins and agrees well with experiment. Its transferability enabled us to distinguish protein secondary structures, probe atomic structure variations with temperature, and monitor protein folding. This approach offers a cost-effective tool to model the relationship between protein spectra and their biological/chemical properties.
Affiliation(s)
- Sheng Ye
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Kai Zhong
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Jinxiao Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Wei Hu
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Jonathan D Hirst
- School of Chemistry, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
| | - Guozhen Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Shaul Mukamel
- Departments of Chemistry, and Physics & Astronomy, University of California, Irvine, California 92697, United States
| | - Jun Jiang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
|
36
|
Westermayr J, Marquetand P. Deep learning for UV absorption spectra with SchNarc: First steps toward transferability in chemical compound space. J Chem Phys 2020; 153:154112. [DOI: 10.1063/5.0021915] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 02/06/2023] Open
Affiliation(s)
- J. Westermayr
- Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
| | - P. Marquetand
- Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
- Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
- Faculty of Chemistry, Data Science @ Uni Vienna, University of Vienna, Währinger Str. 29, 1090 Vienna, Austria
|
37
|
Chen MS, Zuehlsdorff TJ, Morawietz T, Isborn CM, Markland TE. Exploiting Machine Learning to Efficiently Predict Multidimensional Optical Spectra in Complex Environments. J Phys Chem Lett 2020; 11:7559-7568. [PMID: 32808797 DOI: 10.1021/acs.jpclett.0c02168] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 05/26/2023]
Abstract
The excited-state dynamics of chromophores in complex environments determine a range of vital biological and energy capture processes. Time-resolved, multidimensional optical spectroscopies provide a key tool to investigate these processes. Although theory has the potential to decode these spectra in terms of the electronic and atomistic dynamics, the need for large numbers of excited-state electronic structure calculations severely limits first-principles predictions of multidimensional optical spectra for chromophores in the condensed phase. Here, we leverage the locality of chromophore excitations to develop machine learning models to predict the excited-state energy gap of chromophores in complex environments for efficiently constructing linear and multidimensional optical spectra. By analyzing the performance of these models, which span a hierarchy of physical approximations, across a range of chromophore-environment interaction strengths, we provide strategies for the construction of machine learning models that greatly accelerate the calculation of multidimensional optical spectra from first principles.
Affiliation(s)
- Michael S Chen
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
| | - Tim J Zuehlsdorff
- Chemistry and Chemical Biology, University of California Merced, Merced, California 95343, United States
| | - Tobias Morawietz
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
| | - Christine M Isborn
- Chemistry and Chemical Biology, University of California Merced, Merced, California 95343, United States
| | - Thomas E Markland
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
|
38
|
Abstract
We present a machine learning (ML) method to accelerate the nuclear ensemble approach (NEA) for computing absorption cross sections. ML-NEA is used to calculate cross sections on vast ensembles of nuclear geometries to reduce the error due to insufficient statistical sampling. The electronic properties (excitation energies and oscillator strengths) are calculated with a reference electronic structure method only for a relatively few points in the ensemble. The KREG model (kernel-ridge-regression-based ML combined with the RE descriptor) as implemented in MLatom is used to predict these properties for the remaining tens of thousands of points in the ensemble without incurring much of additional computational cost. We demonstrate for two examples, benzene and a 9-dicyanomethylene derivative of acridine, that ML-NEA can produce statistically converged cross sections even for very challenging cases and even with as few as several hundreds of training points.
Affiliation(s)
- Bao-Xin Xue
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | | | - Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
|
39
|
Zhang Y, Ye S, Zhang J, Hu C, Jiang J, Jiang B. Efficient and Accurate Simulations of Vibrational and Electronic Spectra with Symmetry-Preserving Neural Network Models for Tensorial Properties. J Phys Chem B 2020; 124:7284-7290. [PMID: 32786714 DOI: 10.1021/acs.jpcb.0c06926] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Indexed: 02/06/2023]
Abstract
Machine learning has revolutionized the high-dimensional representations for molecular properties such as potential energy. However, there are scarce machine learning models targeting tensorial properties, which are rotationally covariant. Here, we propose tensorial neural network (NN) models to learn both tensorial response and transition properties in which atomic coordinate vectors are multiplied with scalar NN outputs or their derivatives to preserve the rotationally covariant symmetry. This strategy keeps structural descriptors symmetry invariant so that the resulting tensorial NN models are as efficient as their scalar counterparts. We validate the performance and universality of this approach by learning response properties of water oligomers and liquid water and transition dipole moment of a model structural unit of proteins. Machine-learned tensorial models have enabled efficient simulations of vibrational spectra of liquid water and ultraviolet spectra of realistic proteins, promising feasible and accurate spectroscopic simulations for biomolecules and materials.
Affiliation(s)
- Yaolong Zhang
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
- Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Sheng Ye
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
- Chinese Academy of Sciences Center for Excellence in Nanoscience, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Jinxiao Zhang
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
- Chinese Academy of Sciences Center for Excellence in Nanoscience, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Ce Hu
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
- Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Jun Jiang
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
- Chinese Academy of Sciences Center for Excellence in Nanoscience, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Bin Jiang
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
- Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
|
40
|
Yu H, Wang Y, Wang X, Zhang J, Ye S, Huang Y, Luo Y, Sharman E, Chen S, Jiang J. Using Machine Learning to Predict the Dissociation Energy of Organic Carbonyls. J Phys Chem A 2020; 124:3844-3850. [PMID: 32315178 DOI: 10.1021/acs.jpca.0c01280] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/30/2022]
Abstract
Bond dissociation energy (BDE), an indicator of the strength of chemical bonds, exhibits great potential for evaluating and screening high-performance materials and catalysts, which are of critical importance in industrial applications. However, the measurement or computation of BDE via conventional experimental or theoretical methods is usually costly and involved, substantially preventing the BDE from being applied to large-scale and high-throughput studies. Therefore, a potentially more efficient approach for estimating BDE is highly desirable. To this end, we combined first-principles calculations and machine learning techniques, including neural networks and random forest, to explore the inner relationships between carbonyl structure and its BDE. Results show that machine learning can not only effectively reproduce the computed BDEs of carbonyls but also in turn serve as guidance for the rational design of carbonyl structure aimed at optimizing performance.
Affiliation(s)
- Haishan Yu
- Hefei National Laboratory for Physical Sciences at the Microscale, Department of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Ying Wang
- Key Laboratory of Cluster Science of Ministry of Education, School of Chemistry and Chemical Engineering, Beijing Institute of Technology, Beijing 100081, China
| | - Xijun Wang
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh 27606, North Carolina, United States
| | - Jinxiao Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, Department of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Sheng Ye
- Hefei National Laboratory for Physical Sciences at the Microscale, Department of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Yan Huang
- Hefei National Laboratory for Physical Sciences at the Microscale, Department of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Yi Luo
- Hefei National Laboratory for Physical Sciences at the Microscale, Department of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Edward Sharman
- Department of Neurology, University of California, Irvine 92697, California, United States
| | - Shilu Chen
- Key Laboratory of Cluster Science of Ministry of Education, School of Chemistry and Chemical Engineering, Beijing Institute of Technology, Beijing 100081, China
| | - Jun Jiang
- Hefei National Laboratory for Physical Sciences at the Microscale, Department of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
|
41
|
Carbone MR, Topsakal M, Lu D, Yoo S. Machine-Learning X-Ray Absorption Spectra to Quantitative Accuracy. Phys Rev Lett 2020; 124:156401. [PMID: 32357067 DOI: 10.1103/physrevlett.124.156401] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 12/20/2019] [Accepted: 03/30/2020] [Indexed: 05/13/2023]
Abstract
Simulations of excited state properties, such as spectral functions, are often computationally expensive and therefore not suitable for high-throughput modeling. As a proof of principle, we demonstrate that graph-based neural networks can be used to predict the x-ray absorption near-edge structure spectra of molecules to quantitative accuracy. Specifically, the predicted spectra reproduce nearly all prominent peaks, with 90% of the predicted peak locations within 1 eV of the ground truth. Besides its own utility in spectral analysis and structure inference, our method can be combined with structure search algorithms to enable high-throughput spectrum sampling of the vast material configuration space, which opens up new pathways to material design and discovery.
Affiliation(s)
- Matthew R Carbone
- Department of Chemistry, Columbia University, New York, New York 10027, USA
| | - Mehmet Topsakal
- Nuclear Science and Technology Department, Brookhaven National Laboratory, Upton, New York 11973, USA
| | - Deyu Lu
- Center for Functional Nanomaterials, Brookhaven National Laboratory, Upton, New York 11973, USA
| | - Shinjae Yoo
- Computational Science Initiative, Brookhaven National Laboratory, Upton, New York 11973, USA
|
42
|
Wang X, Ye S, Hu W, Sharman E, Liu R, Liu Y, Luo Y, Jiang J. Electric Dipole Descriptor for Machine Learning Prediction of Catalyst Surface-Molecular Adsorbate Interactions. J Am Chem Soc 2020; 142:7737-7743. [PMID: 32297511 DOI: 10.1021/jacs.0c01825] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 11/28/2022]
Abstract
Evaluating catalyst surface-molecular adsorbate interactions holds the key to the rational design of catalysts. Finding an experimentally measurable and theoretically computable descriptor for evaluating surface-adsorbate interactions is a significant step toward achieving this goal. Here we show that the electric dipole moment can serve as a convenient yet accurate descriptor for establishing structure-property relationships for molecular adsorbates on metal catalyst surfaces. By training a machine learning neural network with a large data set of first-principles calculations, we achieve quick and accurate predictions of molecular adsorption energy and transferred charge. The model trained on NO/CO@Au(111) can be extended to study additional substrates such as Au(001) or Ag(111), thus exhibiting extraordinary transferability. These findings validate the effectiveness of the electric dipole descriptor, providing an efficient modality for future catalyst design.
Affiliation(s)
- Xijun Wang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China; Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina 27606, United States
| | - Sheng Ye
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Wei Hu
- Shandong Provincial Key Laboratory of Molecular Engineering, School of Chemistry and Pharmaceutical Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong 250353, People's Republic of China
| | - Edward Sharman
- Department of Neurology, University of California, Irvine, California 92697, United States
| | - Ran Liu
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Yan Liu
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Yi Luo
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
| | - Jun Jiang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
|
43
|
Zhao W, Li Q, Huang XH, Bie LH, Gao J. Toward the Prediction of Multi-Spin State Charges of a Heme Model by Random Forest Regression. Front Chem 2020; 8:162. [PMID: 32296675 PMCID: PMC7136535 DOI: 10.3389/fchem.2020.00162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2020] [Accepted: 02/24/2020] [Indexed: 11/13/2022]
Abstract
The random forest regression (RFR) model was introduced to predict the multiple spin state charges of a heme model, which is important for molecular dynamics simulation of the spin crossover phenomenon. In this work, a multiple spin state structure data set with 39,368 structures of the simplified heme-oxygen binding model was built from non-adiabatic dynamics simulation trajectories. The ESP charges of each atom were calculated and used as the real-valued response. The conformational adapted charge (CAC) model of three spin states was constructed by an RFR model using symmetry functions. The results show that our RFR model can effectively predict on-the-fly atomic charges for varying conformations, as well as the atomic charges of different spin states in the same conformation, thus balancing accuracy and efficiency. The average mean absolute error of the predicted charges of each spin state is <0.02 e. The comparison studies on descriptors showed a maximum 0.06 e improvement in the prediction of the charge of Fe2+ by using 11 manually selected structural parameters. We hope that this model can not only provide variable parameters for developing a multi-spin state force field but also facilitate automation, thus enabling large-scale simulations of atomistic systems.
Affiliation(s)
- Li-Hua Bie
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Jun Gao
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
|
44
|
Abstract
As the quantum chemistry (QC) community embraces machine learning (ML), the number of new methods and applications based on the combination of QC and ML is surging. In this Perspective, a view of the current state of affairs in this new and exciting research field is offered, challenges of using machine learning in quantum chemistry applications are described, and potential future developments are outlined. Specifically, examples of how machine learning is used to improve the accuracy and accelerate quantum chemical research are shown. Generalization and classification of existing techniques are provided to ease the navigation in the sea of literature and to guide researchers entering the field. The emphasis of this Perspective is on supervised machine learning.
Affiliation(s)
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
|
45
|
Hu W, Ye S, Zhang Y, Li T, Zhang G, Luo Y, Mukamel S, Jiang J. Machine Learning Protocol for Surface-Enhanced Raman Spectroscopy. J Phys Chem Lett 2019; 10:6026-6031. [PMID: 31538788 DOI: 10.1021/acs.jpclett.9b02517] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Indexed: 05/18/2023]
Abstract
Surface-enhanced Raman spectroscopy (SERS) is a powerful technique that can capture the electronic-vibrational "fingerprint" of molecules on surfaces. Ab initio prediction of the Raman response is a long-standing challenge because of the diversity of interfacial structures. Here we show that a cost-effective machine learning (ML) random forest method can predict SERS signals of a trans-1,2-bis(4-pyridyl)ethylene (BPE) molecule adsorbed on a gold substrate. Using geometric descriptors extracted from quantum chemistry simulations of thousands of ab initio molecular dynamics conformations, the ML protocol predicts vibrational frequencies and Raman intensities. The resulting spectra agree with density functional theory calculations and experiment. Predicted SERS responses of the molecule on different surfaces, or under external electric fields and in solvent environments, demonstrate the good transferability of the protocol.
Affiliation(s)
- Wei Hu
- Shandong Provincial Key Laboratory of Molecular Engineering, School of Chemistry and Pharmaceutical Engineering, Qilu University of Technology, Jinan, Shandong 250353, P.R. China
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China
| | - Sheng Ye
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China
| | - Yujin Zhang
- School of Electronic and Information Engineering (Department of Physics), Qilu University of Technology, Jinan, Shandong 250353, P.R. China
| | - Tianduo Li
- Shandong Provincial Key Laboratory of Molecular Engineering, School of Chemistry and Pharmaceutical Engineering, Qilu University of Technology, Jinan, Shandong 250353, P.R. China
| | - Guozhen Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China
| | - Yi Luo
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China
| | - Shaul Mukamel
- Departments of Chemistry and Physics and Astronomy, University of California, Irvine, California 92697, United States
| | - Jun Jiang
- Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China
|