1. Mahato KD, Kumar U. Optimized Machine learning techniques Enable prediction of organic dyes photophysical Properties: Absorption Wavelengths, emission Wavelengths, and quantum yields. Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy 2024; 308:123768. [PMID: 38134661 DOI: 10.1016/j.saa.2023.123768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/05/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023]
Abstract
Applications of organic dyes, ranging from basic research to industry, depend on their photophysical properties. Two needs must be addressed: (1) knowledge of the photophysical properties of existing dyes well before they are applied, and (2) discovery of new organic dyes with desired photophysical properties, either to upgrade existing applications or to develop new ones. Both share the goal of estimating photophysical properties with high accuracy, at minimal cost in time and money, before any laboratory experiment is undertaken. For this purpose, machine learning-based techniques are the most suitable approach. In this study, we applied optimized machine-learning techniques to a dataset of 3066 organic dyes and evaluated the models using three metrics: root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). The Quadratic Support Vector Machine (QSVM) was the best predictive model, with RMSE = 16.614, MAE = 10.837, and R² = 0.961 for absorption wavelengths and RMSE = 23.636, MAE = 16.278, and R² = 0.929 for emission wavelengths. These R² values are 0.7% and 0.4% greater than the recently reported values of 0.954 and 0.925 obtained with the Gradient Boost Regression Tree (GBRT) model for absorption and emission wavelengths, respectively. Furthermore, we estimated the quantum yield and found that the Coarse Gaussian Support Vector Machine (CGSVM) outperformed all other examined models. To further validate these models, we compared the predicted results with the experimental results for selected dyes. The proposed automated approach can be used to predict photophysical properties without much computer programming knowledge.
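The three evaluation metrics named in this abstract, and the relative R² gains it quotes, can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the function names and the four sample wavelengths are hypothetical, and only the R² values 0.961, 0.954, 0.929, and 0.925 are taken from the abstract.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_gain_percent(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old

# Hypothetical measured and predicted values, for illustration only.
measured = [450.0, 520.0, 610.0, 700.0]
predicted = [460.0, 510.0, 600.0, 710.0]

print(rmse(measured, predicted), mae(measured, predicted), r2(measured, predicted))

# R² values quoted in the abstract (QSVM versus the earlier GBRT report).
print(round(relative_gain_percent(0.961, 0.954), 1))  # absorption
print(round(relative_gain_percent(0.929, 0.925), 1))  # emission
```

Read this way, the quoted 0.7% and 0.4% are relative gains in R², not absolute differences.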
Affiliation(s)
- Kapil Dev Mahato: Department of Physics, National Institute of Technology Jamshedpur, Jharkhand 831014, India
- Uday Kumar: Department of Physics, National Institute of Technology Jamshedpur, Jharkhand 831014, India

2. Hung SH, Ye ZR, Cheng CF, Chen B, Tsai MK. Enhanced Predictions for the Experimental Photophysical Data Using the Featurized Schnet-Bondstep Approach. J Chem Theory Comput 2023. [PMID: 37126224 DOI: 10.1021/acs.jctc.3c00054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2023]
Abstract
We report an assessment of modifications to the SchNET model for predicting experimental molecular photophysical properties, including absorption energy (ΔEabs), emission energy (ΔEemi), and photoluminescence quantum yield (PLQY). The solution environment was introduced outside the interaction layers of SchNET so as not to overly amplify the solute-solvent interactions, a choice supported by the changes in prediction errors with and without the solvent effect. Two featurization schemes within the Schnet-bondstep framework, based on the concepts of reduced-atomic-number and reduced-atomic-neighbor, were demonstrated. These featurized models provide fine predictions for ΔEabs and ΔEemi, with errors of less than 0.1 eV. The corresponding predictions of PLQY were shown to be comparable to those of the previous graph convolution network model.
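Because this abstract reports errors in energy units while other entries in this list report wavelength-based errors, a conversion between the two can help with side-by-side reading. The sketch below uses the standard relation λ = hc/E with hc ≈ 1239.841984 eV·nm; the function names and the 2.5 eV example energy are assumptions chosen for illustration and do not come from the paper.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm

def ev_to_nm(energy_ev):
    """Convert a photon energy in eV to a wavelength in nm."""
    return HC_EV_NM / energy_ev

def wavelength_spread_nm(energy_ev, error_ev):
    """Width in nm of the energy interval [E - error, E + error]."""
    return ev_to_nm(energy_ev - error_ev) - ev_to_nm(energy_ev + error_ev)

# Hypothetical transition energy; 0.1 eV is the error bound quoted in the abstract.
example_energy_ev = 2.5
print(ev_to_nm(example_energy_ev))
print(wavelength_spread_nm(example_energy_ev, 0.1) / 2.0)
```

For a transition near 2.5 eV, an error of 0.1 eV corresponds to roughly 20 nm on either side; the same energy error maps to a larger wavelength error at lower energies.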
Affiliation(s)
- Sheng-Hsuan Hung: Department of Chemistry, National Taiwan Normal University, Taipei 11677, Taiwan
- Zong-Rong Ye: Department of Chemistry, National Taiwan Normal University, Taipei 11677, Taiwan
- Chi-Feng Cheng: Department of Chemistry, National Taiwan Normal University, Taipei 11677, Taiwan
- Berlin Chen: Department of Computer Science and Information Engineering, National Taiwan Normal University, Taipei 11677, Taiwan
- Ming-Kang Tsai: Department of Chemistry, National Taiwan Normal University, Taipei 11677, Taiwan; Department of Chemistry, Fu-Jen Catholic University, New Taipei City 24205, Taiwan

3. Ksenofontov AA, Isaev YI, Lukanov MM, Makarov DM, Eventova VA, Khodov IA, Berezin MB. Accurate prediction of 11B NMR chemical shift of BODIPYs via machine learning. Phys Chem Chem Phys 2023; 25:9472-9481. [PMID: 36935644 DOI: 10.1039/d3cp00253e] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/08/2023]
Abstract
In this article, we present a model, based on the RFR machine learning method with ISIDA fragment descriptors, for predicting the 11B NMR chemical shift of BODIPYs. The model is freely available at https://ochem.eu/article/146458. It predicts the 11B NMR chemical shift with high quality (RMSE, 5CV (FINALE training set) = 0.40 ppm; RMSE (TEST set) = 0.14 ppm). In addition, we compared its "cost" and user-friendliness with those of quantum-chemical calculations using the DFT/GIAO approach. The model's 11B NMR chemical shift prediction accuracy (RMSE) is more than three times better than that of the DFT/GIAO calculations, and the model is tremendously faster. As a result, we provide all researchers with a convenient tool, together with the database we collected, for predicting the 11B NMR chemical shift of boron-containing dyes. We believe that the new model will make it easier for researchers to correctly interpret experimentally determined 11B NMR chemical shifts and to select better conditions for an NMR experiment.
Affiliation(s)
- Alexander A Ksenofontov: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia
- Yaroslav I Isaev: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia; Ivanovo State University of Chemistry and Technology, 7, Sheremetevskiy Avenue, Ivanovo 153000, Russia
- Michail M Lukanov: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia
- Dmitry M Makarov: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia
- Varvara A Eventova: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia; Ivanovo State University of Chemistry and Technology, 7, Sheremetevskiy Avenue, Ivanovo 153000, Russia
- Ilya A Khodov: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia
- Mechail B Berezin: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia

4. Ksenofontov AA, Lukanov MM, Bocharov PS. Can machine learning methods accurately predict the molar absorption coefficient of different classes of dyes? Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy 2022; 279:121442. [PMID: 35660154 DOI: 10.1016/j.saa.2022.121442] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/07/2022] [Revised: 05/25/2022] [Accepted: 05/26/2022] [Indexed: 06/15/2023]
Abstract
In this article, we provide a convenient tool that lets researchers predict the molar absorption coefficient for a wide range of dyes without any computational cost. The new model is based on the RFR method (ALogPS, OEstate + Fragmentor + QNPR) and predicts the molar absorption coefficient with an accuracy (5-fold cross-validation RMSE) of 0.26 log units. This accuracy was achieved because the model was trained on data for more than 20,000 unique dye molecules. To our knowledge, this is the first model for predicting the molar absorption coefficient trained on such a large and diverse set of dyes. The model is available at https://ochem.eu/article/145413. We hope that the new model will allow researchers to predict dyes with practically significant spectral characteristics and to verify existing experimental data.
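The accuracy figure above is a 5-fold cross-validation RMSE. As a rough illustration of how such a number is obtained, the sketch below holds out each fifth of the data in turn, fits on the rest, and pools the held-out squared errors. It is not the authors' pipeline: a one-feature least-squares line stands in for the RFR model, the fold assignment (every fifth sample) and the pooling convention are assumptions, and the data are synthetic.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_rmse(xs, ys, k=5):
    """Pooled RMSE over all held-out predictions from k-fold cross-validation."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        # Hold out every k-th sample, starting at index `fold`.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        for i in test_idx:
            squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(squared_errors) / n)

# Synthetic data: an exact line, then the same line with alternating offsets.
xs = [float(i) for i in range(10)]
exact_ys = [2.0 * x + 1.0 for x in xs]
noisy_ys = [y + (0.5 if i % 2 == 0 else -0.5) for i, y in enumerate(exact_ys)]

print(k_fold_rmse(xs, exact_ys))
print(k_fold_rmse(xs, noisy_ys))
```

Each sample is predicted exactly once by a model that never saw it, so the resulting RMSE estimates error on unseen molecules rather than fit quality on the training set.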
Affiliation(s)
- Alexander A Ksenofontov: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia
- Michail M Lukanov: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia; Ivanovo State University of Chemistry and Technology, 7, Sheremetevskiy Avenue, Ivanovo 153000, Russia
- Pavel S Bocharov: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia

5. Joung JF, Han M, Jeong M, Park S. Beyond Woodward-Fieser Rules: Design Principles of Property-Oriented Chromophores Based on Explainable Deep Learning Optical Spectroscopy. J Chem Inf Model 2022; 62:2933-2942. [PMID: 35476584 DOI: 10.1021/acs.jcim.2c00173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
An adequate understanding of molecular structure-property relationships is important for developing new molecules with desired properties. Although deep learning optical spectroscopy (DLOS) has been successfully applied to predict the optical and photophysical properties of organic chromophores, how specific functional groups and solvents affect the optical properties is not clearly understood. Here, we developed an explainable DLOS method by applying the integrated gradients method to DLOS. The integrated gradients method yields attributions that indicate how much each functional group contributes to the optical properties, including the absorption wavelength and bandwidth, extinction coefficient, emission wavelength and bandwidth, photoluminescence quantum yield, and lifetime. The attributions of 54 functional groups and 9 solvent molecules to seven optical properties are quantified and can be used to estimate the optical properties of chromophores, as in the Woodward-Fieser rules. Unlike the Woodward-Fieser rules, which cover only the absorption wavelength, the attributions obtained in this work can be applied to estimate all seven optical properties, which significantly extends the Woodward-Fieser rules. In addition, we demonstrated a strategy for using the attributions to design molecules and tune their optical properties. The design of molecular structures using attributions can revolutionize the development of optimal molecules.
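The integrated gradients method mentioned above attributes a model's output to its inputs by averaging gradients along a straight path from a baseline to the actual input and scaling by the input difference. The sketch below illustrates that definition only: the toy function stands in for a trained DLOS network, gradients are taken by finite differences rather than automatic differentiation, and all names and values are hypothetical.

```python
def numerical_gradient(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of f at point x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad

def integrated_gradients(f, x, baseline, steps=50):
    """Attribution_i = (x_i - baseline_i) * mean gradient_i along the straight path."""
    totals = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps  # midpoint rule on [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = numerical_gradient(f, point)
        for i in range(len(x)):
            totals[i] += grad[i]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]

def toy_model(v):
    """Hypothetical stand-in for a trained property predictor."""
    return v[0] * v[1] + v[2] ** 2

features = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_model, features, baseline)

print(attributions)
print(sum(attributions), toy_model(features) - toy_model(baseline))
```

The attributions sum to the difference between the model output at the input and at the baseline, which is the property that lets per-group contributions be added up in the manner of an additive rule.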
Affiliation(s)
- Joonyoung F Joung: Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
- Minhi Han: Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
- Minseok Jeong: Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea
- Sungnam Park: Department of Chemistry and Research Institute for Natural Science, Korea University, Seoul 02841, Korea

7. Rusanov AI, Dmitrieva OA, Mamardashvili NZ, Tetko IV. More Is Not Always Better: Local Models Provide Accurate Predictions of Spectral Properties of Porphyrins. Int J Mol Sci 2022; 23:ijms23031201. [PMID: 35163123 PMCID: PMC8835262 DOI: 10.3390/ijms23031201] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/30/2021] [Accepted: 01/19/2022] [Indexed: 02/05/2023] Open
Abstract
The development of new functional materials based on porphyrins requires fast and accurate prediction of their spectral properties. The models available in the literature for the absorption wavelength and extinction coefficient of the Soret band have low accuracy for this class of compounds. We collected spectral data for porphyrins to extend the literature set and compared the performance of global and local models built with different machine learning methods. Interestingly, extending the public database produced models with lower accuracy than the models we built using porphyrins only. The latter model gave an acceptable RMSE = 2.61 for prediction of the absorption band of 335 porphyrins synthesized in our laboratory, but had low accuracy (RMSE = 0.52) for the extinction coefficient. Developing models using only compounds from our laboratory significantly decreased the errors for these compounds (RMSE = 0.5 and 0.042 for the absorption band and extinction coefficient, respectively), but limited their applicability to these homologous series. When developing models, one should keep their potential use clearly in mind and select the strategy that yields the most accurate predictions for the target application. The models and data are publicly available.
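The contrast between global and local models described above can be illustrated with a deliberately simple toy case. The sketch below is not the authors' workflow: it fits a one-feature least-squares line, the two compound classes and their descriptor-property relations are invented, and errors are evaluated in-sample for brevity.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse_of_line(line, xs, ys):
    """RMSE of a fitted (slope, intercept) pair on the given samples."""
    slope, intercept = line
    return math.sqrt(sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs))

# Hypothetical target class (the compounds we care about).
local_x = [0.0, 1.0, 2.0, 3.0, 4.0]
local_y = [3.0 * x + 400.0 for x in local_x]

# Hypothetical additional compounds that follow a different trend.
other_x = [float(i) for i in range(10)]
other_y = [-2.0 * x + 500.0 for x in other_x]

local_model = fit_line(local_x, local_y)
global_model = fit_line(local_x + other_x, local_y + other_y)

# Both models are scored on the target class only.
local_rmse = rmse_of_line(local_model, local_x, local_y)
global_rmse = rmse_of_line(global_model, local_x, local_y)
print(local_rmse, global_rmse)
```

When the added compounds follow a different structure-property trend, the pooled fit compromises between the two classes and the error on the target class grows, at the price that the local model says nothing reliable about compounds outside its class.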
Affiliation(s)
- Aleksey I. Rusanov: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, 153045 Ivanovo, Russia
- Olga A. Dmitrieva: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, 153045 Ivanovo, Russia
- Nugzar Zh. Mamardashvili: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, 153045 Ivanovo, Russia
- Igor V. Tetko: G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, 153045 Ivanovo, Russia; Helmholtz Munich, Institute of Structural Biology, Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), D-85764 Neuherberg, Germany; BIGCHEM GmbH, D-85716 Unterschleißheim, Germany
- Correspondence: Tel.: +49-89-3187-3575
