1. Duprat F, Ploix JL, Dreyfus G. Can Graph Machines Accurately Estimate 13C NMR Chemical Shifts of Benzenic Compounds? Molecules 2024; 29:3137. [PMID: 38999091 PMCID: PMC11243075 DOI: 10.3390/molecules29133137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 06/27/2024] [Accepted: 06/28/2024] [Indexed: 07/14/2024] [Open access]
Abstract
In the organic laboratory, the 13C nuclear magnetic resonance (NMR) spectrum of a newly synthesized compound remains an essential step in elucidating its structure. For the chemist, the interpretation of such a spectrum, which is a set of chemical-shift values, is made easier if he/she has a tool capable of predicting with sufficient accuracy the carbon-shift values from the structure he/she intends to prepare. As there are few open-source methods for accurately estimating this property, we applied our graph-machine approach to build models capable of predicting the chemical shifts of carbons. For this study, we focused on benzene compounds, building an optimized model derived from training a database of 10,577 chemical shifts originating from 2026 structures that contain up to ten types of non-carbon atoms, namely H, O, N, S, P, Si, and halogens. It provides a training root-mean-squared relative error (RMSRE) of 0.5%, i.e., a root-mean-squared error (RMSE) of 0.6 ppm, and a mean absolute error (MAE) of 0.4 ppm for estimating the chemical shifts of the 10k carbons. The predictive capability of the graph-machine model is also compared with that of three commercial packages on a dataset of 171 original benzenic structures (1012 chemical shifts). The graph-machine model proves to be very efficient in predicting chemical shifts, with an RMSE of 0.9 ppm, and compares favorably with the RMSEs of 3.4, 1.8, and 1.9 ppm computed with the ChemDraw v. 23.1.1.3, ACD v. 11.01, and MestReNova v. 15.0.1-35756 packages respectively. Finally, a Docker-based tool is proposed to predict the carbon chemical shifts of benzenic compounds solely from their SMILES codes.
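The three error measures quoted in this abstract (RMSE, MAE, and RMSRE) can be illustrated with a minimal Python sketch. The function name `shift_errors` and the sample shifts are hypothetical, and the relative error is assumed here to be taken with respect to the observed shift; the paper may define RMSRE differently.

```python
import math


def shift_errors(observed, predicted):
    """Return (RMSE, MAE, RMSRE) for paired chemical shifts given in ppm."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    # Root-mean-squared error, in ppm
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    # Mean absolute error, in ppm
    mae = sum(abs(r) for r in residuals) / n
    # Root-mean-squared relative error (dimensionless); assumption: each
    # residual is divided by the observed shift of the same carbon
    rmsre = math.sqrt(sum((r / o) ** 2 for r, o in zip(residuals, observed)) / n)
    return rmse, mae, rmsre


# Made-up aromatic 13C shifts, for illustration only (not data from the paper)
observed = [128.5, 136.2, 115.0, 159.8]
predicted = [128.9, 135.6, 115.3, 160.7]
rmse, mae, rmsre = shift_errors(observed, predicted)
print(f"RMSE = {rmse:.2f} ppm, MAE = {mae:.2f} ppm, RMSRE = {100 * rmsre:.2f}%")
```

Because aromatic carbons typically resonate between roughly 110 and 160 ppm, an RMSE of 0.6 ppm corresponds to a relative error of about 0.5%, which matches the two training figures quoted above.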
Affiliation(s)
- François Duprat
  - Chimie Moléculaire, Macromoléculaire, Matériaux, ESPCI Paris, PSL University, 10 Rue Vauquelin, 75005 Paris, France
- Jean-Luc Ploix
  - Chimie Moléculaire, Macromoléculaire, Matériaux, ESPCI Paris, PSL University, 10 Rue Vauquelin, 75005 Paris, France
- Gérard Dreyfus
  - Chimie Moléculaire, Macromoléculaire, Matériaux, ESPCI Paris, PSL University, 10 Rue Vauquelin, 75005 Paris, France

2. Han C, Zhang D, Xia S, Zhang Y. Accurate Prediction of NMR Chemical Shifts: Integrating DFT Calculations with Three-Dimensional Graph Neural Networks. J Chem Theory Comput 2024; 20:5250-5258. [PMID: 38842505 PMCID: PMC11209944 DOI: 10.1021/acs.jctc.4c00422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 05/25/2024] [Accepted: 05/29/2024] [Indexed: 06/07/2024]
Abstract
Computer prediction of NMR chemical shifts plays an increasingly important role in molecular structure assignment and elucidation for organic molecule studies. Density functional theory (DFT) and gauge-including atomic orbital (GIAO) have established a framework to predict NMR chemical shifts but often at a significant computational expense with a limited prediction accuracy. Recent advancements in deep learning methods, especially graph neural networks (GNNs), have shown promise in improving the accuracy of predicting experimental chemical shifts, either by using 2D molecular topological features or 3D conformational representation. This study presents a new 3D GNN model to predict 1H and 13C chemical shifts, CSTShift, that combines atomic features with DFT-calculated shielding tensor descriptors, capturing both isotropic and anisotropic shielding effects. Utilizing the NMRShiftDB2 data set and conducting DFT optimization and GIAO calculations at the B3LYP/6-31G(d) level, we prepared the NMRShiftDB2-DFT data set of high-quality 3D structures and shielding tensors with corresponding experimentally measured 1H and 13C chemical shifts. The developed CSTShift models achieve the state-of-the-art prediction performance on both the NMRShiftDB2-DFT test data set and external CHESHIRE data set. Further case studies on identifying correct structures from two groups of constitutional isomers show its capability for structure assignment and elucidation. The source code and data are accessible at https://yzhang.hpc.nyu.edu/IMA.
Affiliation(s)
- Chao Han
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Dongdong Zhang
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Song Xia
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Yingkai Zhang
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China

3. Sajed T, Sayeeda Z, Lee BL, Berjanskii M, Wang F, Gautam V, Wishart DS. Accurate Prediction of 1H NMR Chemical Shifts of Small Molecules Using Machine Learning. Metabolites 2024; 14:290. [PMID: 38786767 PMCID: PMC11123270 DOI: 10.3390/metabo14050290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 05/11/2024] [Accepted: 05/16/2024] [Indexed: 05/25/2024] [Open access]
Abstract
NMR is widely considered the gold standard for organic compound structure determination. As such, NMR is routinely used in organic compound identification, drug metabolite characterization, natural product discovery, and the deconvolution of metabolite mixtures in biofluids (metabolomics and exposomics). In many cases, compound identification by NMR is achieved by matching measured NMR spectra to experimentally collected NMR spectral reference libraries. Unfortunately, the number of available experimental NMR reference spectra, especially for metabolomics, medical diagnostics, or drug-related studies, is quite small. This experimental gap could be filled by predicting NMR chemical shifts for known compounds using computational methods such as machine learning (ML). Here, we describe how a deep learning algorithm that is trained on a high-quality, "solvent-aware" experimental dataset can be used to predict 1H chemical shifts more accurately than any other known method. The new program, called PROSPRE (PROton Shift PREdictor) can accurately (mean absolute error of <0.10 ppm) predict 1H chemical shifts in water (at neutral pH), chloroform, dimethyl sulfoxide, and methanol from a user-submitted chemical structure. PROSPRE (pronounced "prosper") has also been used to predict 1H chemical shifts for >600,000 molecules in many popular metabolomic, drug, and natural product databases.
Affiliation(s)
- Tanvir Sajed
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Zinat Sayeeda
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Brian L. Lee
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Mark Berjanskii
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Fei Wang
  - Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
- Vasuk Gautam
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- David S. Wishart
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
  - Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
  - Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB T6G 2B7, Canada
  - Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB T6G 2H7, Canada

4. Venetos MC, Elkin M, Delaney C, Hartwig JF, Persson KA. Deconvolution and Analysis of the 1H NMR Spectra of Crude Reaction Mixtures. J Chem Inf Model 2024; 64:3008-3020. [PMID: 38573053 DOI: 10.1021/acs.jcim.3c01864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2024]
Abstract
Nuclear magnetic resonance (NMR) spectroscopy is an important analytical technique in synthetic organic chemistry, but its integration into high-throughput experimentation workflows has been limited by the necessity of manually analyzing the NMR spectra of new chemical entities. Current efforts to automate the analysis of NMR spectra rely on comparisons to databases of reported spectra for known compounds and, therefore, are incompatible with the exploration of new chemical space. By reframing the NMR spectrum of a reaction mixture as a joint probability distribution, we have used Hamiltonian Monte Carlo Markov Chain and density functional theory to fit the predicted NMR spectra to those of crude reaction mixtures. This approach enables the deconvolution and analysis of the spectra of mixtures of compounds without relying on reported spectra. The utility of our approach to analyze crude reaction mixtures is demonstrated with the experimental spectra of reactions that generate a mixture of isomers, such as Wittig olefination and C-H functionalization reactions. The correct identification of compounds in a reaction mixture and their relative concentrations is achieved with a mean absolute error as low as 1%.
Affiliation(s)
- Maxwell C Venetos
  - Department of Materials Science and Engineering, University of California, Berkeley, California 94720, United States
- Masha Elkin
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Connor Delaney
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- John F Hartwig
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Kristin A Persson
  - Department of Materials Science and Engineering, University of California, Berkeley, California 94720, United States
  - Molecular Foundry, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States

5. Dwivedi R, Maurya AK, Ahmed H, Farrag M, Pomin VH. Nuclear magnetic resonance-based structural elucidation of novel marine glycans and derived oligosaccharides. Magn Reson Chem 2024; 62:269-285. [PMID: 37439410 DOI: 10.1002/mrc.5377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 06/23/2023] [Accepted: 06/26/2023] [Indexed: 07/14/2023]
Abstract
Marine glycans of defined structures are unique representatives among all kinds of structurally complex glycans endowed with important biological actions. Besides their unique biological properties, these marine sugars also enable advanced structure-activity relationship (SAR) studies given their distinct and defined structures. However, the natural high molecular weights (MWs) of these marine polysaccharides, sometimes even bigger than 100 kDa, pose a problem in many biophysical and analytical studies. Hence, the preparation of low MW oligosaccharides becomes a strategy to overcome the problem. Regardless of the polymeric or oligomeric lengths of these molecules, structural elucidation is mandatory for SAR studies. For this, nuclear magnetic resonance (NMR) spectroscopy plays a pivotal role. Here, we revisit the NMR-based structural elucidation of a series of marine sulfated poly/oligosaccharides discovered in our laboratory within the last 2 years. This set of structures includes the α-glucan extracted from the bivalve Marcia hiantina; the two sulfated galactans extracted from the red alga Botryocladia occidentalis; the fucosylated chondroitin sulfate isolated from the sea cucumber Pentacta pygmaea; the oligosaccharides produced from the fucosylated chondroitin sulfates from this sea cucumber species and from another species, Holothuria floridana; and the sulfated fucan from this latter species. Specific 1H and 13C chemical shifts, generated by various 1D and 2D homonuclear and heteronuclear NMR spectra, are exploited as the primary source of information in the structural elucidation of these marine glycans.
Affiliation(s)
- Rohini Dwivedi
  - Department of BioMolecular Sciences, University of Mississippi, University, Mississippi, USA
- Antim K Maurya
  - Department of BioMolecular Sciences, University of Mississippi, University, Mississippi, USA
- Hoda Ahmed
  - Department of BioMolecular Sciences, University of Mississippi, University, Mississippi, USA
- Marwa Farrag
  - Department of BioMolecular Sciences, University of Mississippi, University, Mississippi, USA
- Vitor H Pomin
  - Department of BioMolecular Sciences, University of Mississippi, University, Mississippi, USA
  - Research Institute of Pharmaceutical Sciences, School of Pharmacy, University of Mississippi, University, Mississippi, USA

6. Rahman M, Dannatt HRW, Blundell CD, Hughes LP, Blade H, Carson J, Tatman BP, Johnston ST, Brown SP. Polymorph Identification for Flexible Molecules: Linear Regression Analysis of Experimental and Calculated Solution- and Solid-State NMR Data. J Phys Chem A 2024; 128:1793-1816. [PMID: 38427685 PMCID: PMC10945485 DOI: 10.1021/acs.jpca.3c07732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 02/06/2024] [Accepted: 02/07/2024] [Indexed: 03/03/2024]
Abstract
The Δδ regression approach of Blade et al. [ J. Phys. Chem. A 2020, 124(43), 8959-8977] for accurately discriminating between solid forms using a combination of experimental solution- and solid-state NMR data with density functional theory (DFT) calculation is here extended to molecules with multiple conformational degrees of freedom, using furosemide polymorphs as an exemplar. As before, the differences in measured 1H and 13C chemical shifts between solution-state NMR and solid-state magic-angle spinning (MAS) NMR (Δδexperimental) are compared to those determined by gauge-including projector augmented wave (GIPAW) calculations (Δδcalculated) by regression analysis and a t-test, allowing the correct furosemide polymorph to be precisely identified. Monte Carlo random sampling is used to calculate solution-state NMR chemical shifts, reducing computation times by avoiding the need to systematically sample the multidimensional conformational landscape that furosemide occupies in solution. The solvent conditions should be chosen to match the molecule's charge state between the solution and solid states. The Δδ regression approach indicates whether or not correlations between Δδexperimental and Δδcalculated are statistically significant; the approach is differently sensitive to the popular root mean squared error (RMSE) method, being shown to exhibit a much greater dynamic range. An alternative method for estimating solution-state NMR chemical shifts by approximating the measured solution-state dynamic 3D behavior with an ensemble of 54 furosemide crystal structures (polymorphs and cocrystals) from the Cambridge Structural Database (CSD) was also successful in this case, suggesting new avenues for this method that may overcome its current dependency on the prior determination of solution dynamic 3D structures.
Affiliation(s)
- Mohammed Rahman
  - Department of Physics, University of Warwick, Coventry CV4 7AL, U.K.
  - Department of Chemistry, University of Warwick, Coventry CV4 7AL, U.K.
- Leslie P. Hughes
  - Oral Product Development, Pharmaceutical Technology & Development, Operations, AstraZeneca, Macclesfield SK10 2NA, U.K.
- Helen Blade
  - Oral Product Development, Pharmaceutical Technology & Development, Operations, AstraZeneca, Macclesfield SK10 2NA, U.K.
- Jake Carson
  - Mathematics Institute at Warwick, University of Warwick, Coventry CV4 7AL, U.K.
- Ben P. Tatman
  - Department of Physics, University of Warwick, Coventry CV4 7AL, U.K.
  - Department of Chemistry, University of Warwick, Coventry CV4 7AL, U.K.
- Steven P. Brown
  - Department of Physics, University of Warwick, Coventry CV4 7AL, U.K.

7. Ai WJ, Li J, Cao D, Liu S, Yuan YY, Li Y, Tan GS, Xu KP, Yu X, Kang F, Zou ZX, Wang WX. A Very Deep Graph Convolutional Network for 13C NMR Chemical Shift Calculations with Density Functional Theory Level Performance for Structure Assignment. J Nat Prod 2024. [PMID: 38359467 DOI: 10.1021/acs.jnatprod.3c00862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2024]
Abstract
Nuclear magnetic resonance (NMR) chemical shift calculations are powerful tools for structure elucidation and have been extensively employed in both natural product and synthetic chemistry. However, density functional theory (DFT) NMR chemical shift calculations are usually time-consuming, while fast data-driven methods often lack reliability, making it challenging to apply them to computationally intensive tasks with a high requirement on quality. Herein, we have constructed a 54-layer-deep graph convolutional network for 13C NMR chemical shift calculations, which achieved high accuracy with low time-cost and performed competitively with DFT NMR chemical shift calculations on structure assignment benchmarks. Our model utilizes a semiempirical method, GFN2-xTB, and is compatible with a broad variety of organic systems, including those composed of hundreds of atoms or elements ranging from H to Rn. We used this model to resolve the controversial J/K ring junction problem of maitotoxin, which is the largest whole molecule assigned by NMR calculations to date. This model has been developed into user-friendly software, providing a useful tool for routine rapid structure validation and assignation as well as a new approach to elucidate the large structures that were previously unsuitable for NMR calculations.
Affiliation(s)
- Wen-Jing Ai
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Jing Li
  - Department of Pharmacy, National Clinical Research Center for Geriatric Disorder, in Xiangya Hospital, Central South University, Changsha, Hunan 410013, People's Republic of China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Shao Liu
  - Department of Pharmacy, National Clinical Research Center for Geriatric Disorder, in Xiangya Hospital, Central South University, Changsha, Hunan 410013, People's Republic of China
- Yi-Yun Yuan
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Yan Li
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Gui-Shan Tan
  - Department of Pharmacy, National Clinical Research Center for Geriatric Disorder, in Xiangya Hospital, Central South University, Changsha, Hunan 410013, People's Republic of China
- Kang-Ping Xu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Xia Yu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Fenghua Kang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Zhen-Xing Zou
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
- Wen-Xuan Wang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, People's Republic of China
  - Hunan Prima Drug Research Center Co., Ltd, Hunan Research Center for Drug Safety Evaluation, Hunan Key Laboratory of Pharmacodynamics and Safety Evaluation of New Drugs, Changsha, Hunan 410331, People's Republic of China

8. Maste S, Sharma B, Pongratz T, Grabe B, Hiller W, Erlach MB, Kremer W, Kalbitzer HR, Marx D, Kast SM. The accuracy limit of chemical shift predictions for species in aqueous solution. Phys Chem Chem Phys 2024; 26:6386-6395. [PMID: 38315169 DOI: 10.1039/d3cp05471c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]
Abstract
Interpreting NMR experiments benefits from first-principles predictions of chemical shifts. Reaching the accuracy limit of theory is relevant for unambiguous structural analysis and dissecting theoretical approximations. Since accurate chemical shift measurements are based on using internal reference compounds such as trimethylsilylpropanesulfonate (DSS), a detailed comparison of experimental with theoretical data requires simultaneous consideration of both target and reference species ensembles in the same solvent environment. Here we show that ab initio molecular dynamics simulations to generate liquid-state ensembles of target and reference compounds, including explicitly their short-range solvation environments and combined with quantum-mechanical solvation models, allow for predicting highly accurate 1H (∼0.1-0.5 ppm) and aliphatic 13C (∼1.5 ppm) chemical shifts for aqueous solutions of the model compounds trimethylamine N-oxide (TMAO) and N-methylacetamide (NMA), referenced to DSS without any system-specific adjustments. This encompasses the two peptide bond conformations of NMA identified by NMR. The results are used to derive a general-purpose guideline set for predictive NMR chemical shift calculations of NMA in the liquid state and to identify artifacts of force field models. Accurate predictions are only obtained if a sufficient number of explicit water molecules is included in the quantum-mechanical calculations, disproving a purely electrostatic model of the solvent effect on chemical shifts.
Affiliation(s)
- Stefan Maste
  - Fakultät für Chemie und Chemische Biologie, Technische Universität Dortmund, Otto-Hahn-Straße 4a, 44227 Dortmund, Germany.
- Bikramjit Sharma
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany.
- Tim Pongratz
  - Fakultät für Chemie und Chemische Biologie, Technische Universität Dortmund, Otto-Hahn-Straße 4a, 44227 Dortmund, Germany.
- Bastian Grabe
  - Fakultät für Chemie und Chemische Biologie, Technische Universität Dortmund, Otto-Hahn-Straße 4a, 44227 Dortmund, Germany.
- Wolf Hiller
  - Fakultät für Chemie und Chemische Biologie, Technische Universität Dortmund, Otto-Hahn-Straße 4a, 44227 Dortmund, Germany.
- Markus Beck Erlach
  - Fakultät für Biologie und Vorklinische Medizin, Universität Regensburg, 93040 Regensburg, Germany
- Werner Kremer
  - Fakultät für Biologie und Vorklinische Medizin, Universität Regensburg, 93040 Regensburg, Germany
- Hans Robert Kalbitzer
  - Fakultät für Biologie und Vorklinische Medizin, Universität Regensburg, 93040 Regensburg, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany.
- Stefan M Kast
  - Fakultät für Chemie und Chemische Biologie, Technische Universität Dortmund, Otto-Hahn-Straße 4a, 44227 Dortmund, Germany.

9. Kuhn S, Kolshorn H, Steinbeck C, Schlörer N. Twenty years of nmrshiftdb2: A case study of an open database for analytical chemistry. Magn Reson Chem 2024; 62:74-83. [PMID: 38112483 DOI: 10.1002/mrc.5418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 11/10/2023] [Accepted: 11/10/2023] [Indexed: 12/21/2023]
Abstract
In October 2003, 20 years ago, the open-source and open-content database NMRshiftDB was announced. Since then, the database, renamed as nmrshiftdb2 later, has been continuously available and is one of the longer-running projects in the field of open data in chemistry. After 20 years, we evaluate the success of the project and present lessons learnt for similar projects.
Affiliation(s)
- Stefan Kuhn
  - Institute of Computer Science, University of Tartu, Tartu, Estonia, and School of Computer Science and Informatics, De Montfort University, Leicester, UK
- Heinz Kolshorn
  - Department Chemie, Johannes Gutenberg-Universität Mainz, Mainz, Germany
- Christoph Steinbeck
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-Universität Jena, Jena, Germany
- Nils Schlörer
  - NMR-Plattform, Friedrich-Schiller-Universität Jena, Jena, Germany

10. Hack J, Jordan M, Schmitt A, Raru M, Zorn HS, Seyfarth A, Eulenberger I, Geitner R. Ilm-NMR-P31: an open-access 31P nuclear magnetic resonance database and data-driven prediction of 31P NMR shifts. J Cheminform 2023; 15:122. [PMID: 38111059 PMCID: PMC10729349 DOI: 10.1186/s13321-023-00792-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 12/07/2023] [Indexed: 12/20/2023] [Open access]
Abstract
This publication introduces a novel open-access 31P Nuclear Magnetic Resonance (NMR) shift database. With 14,250 entries encompassing 13,730 distinct molecules from 3,648 references, this database offers a comprehensive repository of organic and inorganic compounds. Emphasizing single-phosphorus atom compounds, the database facilitates data mining and machine learning endeavors, particularly in signal prediction and Computer-Assisted Structure Elucidation (CASE) systems. Additionally, the article compares different models for 31P NMR shift prediction, showcasing the database's potential utility. Hierarchically Ordered Spherical Environment (HOSE) code-based models and Graph Neural Networks (GNNs) perform exceptionally well with a mean squared error of 11.9 and 11.4 ppm respectively, achieving accuracy comparable to quantum chemical calculations.
Affiliation(s)
- Jasmin Hack
  - Institute of Chemistry and Bioengineering, Group of Physical Chemistry/Catalysis, Technical University Ilmenau, Weimarer Str. 32, 98693, Ilmenau, Germany
- Moritz Jordan
  - Institute of Chemistry and Bioengineering, Group of Physical Chemistry/Catalysis, Technical University Ilmenau, Weimarer Str. 32, 98693, Ilmenau, Germany
- Alina Schmitt
  - Institute of Chemistry and Bioengineering, Group of Physical Chemistry/Catalysis, Technical University Ilmenau, Weimarer Str. 32, 98693, Ilmenau, Germany
- Melissa Raru
  - Institute of Chemistry and Bioengineering, Group of Physical Chemistry/Catalysis, Technical University Ilmenau, Weimarer Str. 32, 98693, Ilmenau, Germany
- Hannes Sönke Zorn
  - Institute of Chemistry and Bioengineering, Group of Physical Chemistry/Catalysis, Technical University Ilmenau, Weimarer Str. 32, 98693, Ilmenau, Germany
- Alex Seyfarth
  - Institute of Chemistry and Bioengineering, Group of Physical Chemistry/Catalysis, Technical University Ilmenau, Weimarer Str. 32, 98693, Ilmenau, Germany
- Isabel Eulenberger
  - Institute of Chemistry and Bioengineering, Group of Physical Chemistry/Catalysis, Technical University Ilmenau, Weimarer Str. 32, 98693, Ilmenau, Germany
- Robert Geitner
  - Institute of Chemistry and Bioengineering, Group of Physical Chemistry/Catalysis, Technical University Ilmenau, Weimarer Str. 32, 98693, Ilmenau, Germany.

11. Beran GJO. Frontiers of molecular crystal structure prediction for pharmaceuticals and functional organic materials. Chem Sci 2023; 14:13290-13312. [PMID: 38033897 PMCID: PMC10685338 DOI: 10.1039/d3sc03903j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/02/2023] [Indexed: 12/02/2023] [Open access]
Abstract
The reliability of organic molecular crystal structure prediction has improved tremendously in recent years. Crystal structure predictions for small, mostly rigid molecules are quickly becoming routine. Structure predictions for larger, highly flexible molecules are more challenging, but their crystal structures can also now be predicted with increasing rates of success. These advances are ushering in a new era where crystal structure prediction drives the experimental discovery of new solid forms. After briefly discussing the computational methods that enable successful crystal structure prediction, this perspective presents case studies from the literature that demonstrate how state-of-the-art crystal structure prediction can transform how scientists approach problems involving the organic solid state. Applications to pharmaceuticals, porous organic materials, photomechanical crystals, organic semi-conductors, and nuclear magnetic resonance crystallography are included. Finally, efforts to improve our understanding of which predicted crystal structures can actually be produced experimentally and other outstanding challenges are discussed.
Affiliation(s)
- Gregory J O Beran
  - Department of Chemistry, University of California Riverside, Riverside, CA 92521, USA

12. Rull H, Fischer M, Kuhn S. NMR shift prediction from small data quantities. J Cheminform 2023; 15:114. [PMID: 38012793 PMCID: PMC10683292 DOI: 10.1186/s13321-023-00785-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Accepted: 11/16/2023] [Indexed: 11/29/2023] [Open access]
Abstract
Prediction of chemical shift in NMR using machine learning methods is typically done with the maximum amount of data available to achieve the best results. In some cases, such large amounts of data are not available, e.g. for heteronuclei. We demonstrate a novel machine learning model that is able to achieve better results than other models for relevant datasets with comparatively low amounts of data. We show this by predicting [Formula: see text] and [Formula: see text] NMR chemical shifts of small molecules in specific solvents.
Affiliation(s)
- Herman Rull
  - Department of Computer Science, Tartu University, Narva mnt 18, Tartu, 51009, Tartumaa, Estonia
- Markus Fischer
  - Institute for Medical Physics and Biophysics, Leipzig University, Härtelstr. 16-18, 04107, Leipzig, Sachsen, Germany
- Stefan Kuhn
  - Department of Computer Science, Tartu University, Narva mnt 18, Tartu, 51009, Tartumaa, Estonia.

13. Hu G, Qiu M. Machine learning-assisted structure annotation of natural products based on MS and NMR data. Nat Prod Rep 2023; 40:1735-1753. [PMID: 37519196 DOI: 10.1039/d3np00025g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
Covering: up to March 2023. Machine learning (ML) has emerged as a popular tool for analyzing the structures of natural products (NPs). This review presents a summary of the recent advancements in ML-assisted mass spectrometry (MS) and nuclear magnetic resonance (NMR) data analysis to establish the chemical structures of NPs. First, ML-based MS/MS analyses that rely on library matching are discussed, which involve the use of ML algorithms to calculate similarity, predict MS/MS fragments, and form molecular fingerprints. Then, ML-assisted MS/MS structural annotation without library matching is reviewed. Furthermore, the cases of ML algorithms in assisting structural studies of NPs based on NMR are discussed from four perspectives: NMR prediction, functional group identification, structural categorization and quantum chemical calculation. Finally, the review concludes with a discussion of the challenges and the trends associated with the structural establishment of NPs based on ML algorithms.
Affiliation(s)
- Guilin Hu
  - State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
  - University of the Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Minghua Qiu
  - State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
  - University of the Chinese Academy of Sciences, Beijing 100049, People's Republic of China
14
Ke Z, Weng J, Xu X. Calculating 13C NMR chemical shifts of large molecules using the eXtended ONIOM method at high accuracy with a low cost. J Comput Chem 2023; 44:2347-2357. [PMID: 37572044 DOI: 10.1002/jcc.27201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 07/15/2023] [Accepted: 07/24/2023] [Indexed: 08/14/2023]
Abstract
Fragmentation-based methods for nuclear magnetic resonance (NMR) chemical shift calculations have become more and more popular in first-principles calculations of large molecules. However, there are many options for a fragmentation-based method to select, such as theoretical methods, fragmentation schemes, the number of levels of theory, etc. It is important to study the optimal combination of the options to achieve a good balance between accuracy and efficiency. Here we investigate different combinations of options used by a fragmentation-based method, the eXtended ONIOM (XO) method, for 13C chemical shift calculations on a set of organic and biological molecules. We found that: (1) introducing Hartree-Fock exchange into density functional theory (DFT) could reduce the calculation error due to fragmentation in contrast to pure DFT functionals, while a hybrid functional, xOPBE, is generally recommended; (2) fragmentation schemes generated from the molecular tailoring approach (MTA) with small level parameter n, for example, n = 2 and the degree-based fragmentation method (DBFM) with n = 1, are sufficient to achieve satisfactory accuracy; (3) the two-level XO (XO2) NMR calculation is superior to the calculation with only one level of theory, as the second level (i.e., low level) of theory provides a way to well describe the long-range effect. These findings are beneficial to practical applications of fragmentation-based methods for NMR chemical shift calculations of large molecules.
Affiliation(s)
- Zhipeng Ke
  - Institute of Photochemistry and Photofunctional Materials, University of Shanghai for Science and Technology, Shanghai, China
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Ministry of Education Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Jingwei Weng
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Ministry of Education Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Xin Xu
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Ministry of Education Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
  - Hefei National Laboratory, Hefei, China
15
Góñez KV, García JS, Sardina FJ, Pazos Y, Saá Á, Martín Pastor M. J-filter: An experiment to simplify and isolate specific signals in 1H NMR spectra of complex mixtures based on scalar coupling constants. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2023; 61:615-622. [PMID: 37727038 DOI: 10.1002/mrc.5396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Revised: 09/04/2023] [Accepted: 09/07/2023] [Indexed: 09/21/2023]
Abstract
One-dimensional selective NMR experiments relying on a J-filter element are proposed to isolate specific signals in crowded 1H spectral regions. The J-filter allows the editing or filtering of signals in a region of interest of the spectrum by exploiting the specific values of their 1H-1H coupling constants and certain parameters of protons coupled to them that appear in less congested parts of the spectrum (chemical shifts and coupling constants). The new experiments permitted the isolation of specific peaks of phytosterol components in a sample obtained from a liquid nutraceutical recommended for lowering blood cholesterol levels in regions with complete overlap in the 1H spectrum.
Affiliation(s)
- Karen V Góñez
  - Centro Singular de Investigación en Química Biolóxica e Materiais Moleculares, (CIQUS), Universidade de Santiago de Compostela, A Coruña, Spain
- Juan Suárez García
  - Centro Singular de Investigación en Química Biolóxica e Materiais Moleculares, (CIQUS), Universidade de Santiago de Compostela, A Coruña, Spain
- F Javier Sardina
  - Centro Singular de Investigación en Química Biolóxica e Materiais Moleculares, (CIQUS), Universidade de Santiago de Compostela, A Coruña, Spain
- Yolanda Pazos
  - Grupo de Investigación Traslacional en Enfermedades del Aparato Digestivo, Instituto de Investigación Sanitaria de Santiago (IDIS), Complejo Hospitalario Universitario de Santiago (CHUS), Servicio Gallego de Salud (SERGAS), A Coruña, Spain
- Ángela Saá
  - Mestrelab Research S.L., A Coruña, Santiago de Compostela, Spain
- Manuel Martín Pastor
  - Unidade de Resonancia Magnética, Área de Infraestructuras de Investigación, CACTUS, Universidade de Santiago de Compostela, A Coruña, Spain
16
Cortés I, Sarotti AM. Road Map Toward Computer-Guided Total Synthesis of Natural Products. The Dysiherbol A Case Study: What if Serendipity Hadn't Intervened? J Org Chem 2023; 88:14156-14164. [PMID: 37728229 DOI: 10.1021/acs.joc.3c01738] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/21/2023]
Abstract
We present a computational study inspired by the story of dysiherbol A, a natural product whose putative structure was found incorrect through synthesis by a completely fortuitous event. While the carbon connectivity and chemical environment between both structures remain similar, the real dysiherbol A has a different molecular weight than that reported for the natural product. Had the synthesis groups not been favored by fortune, it could be speculated that a substantial amount of time and effort would have been required to solve the structural puzzle. Within the realm of computer-guided total synthesis of natural products, the question arose whether a synthesis group could have in silico reassigned the structure before embarking on the experimental adventure. To address this query, we evaluated some state-of-the-art computational procedures based on their computational demand and ease of implementation for nonexpert users with basic skills in computational chemistry (including HOSE, CASCADE, ANN-PRA, ML-J-DP4, DP4, and DP4+). While discussing the strengths and limitations of these methods, this case study provides a roadmap of what could be done before venturing into complex and time-demanding total synthesis projects.
Affiliation(s)
- Iván Cortés
  - Instituto de Química Rosario (CONICET), Facultad de Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario, Suipacha 531, 2000 Rosario, Argentina
- Ariel M Sarotti
  - Instituto de Química Rosario (CONICET), Facultad de Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario, Suipacha 531, 2000 Rosario, Argentina
17
Xue X, Sun H, Yang M, Liu X, Hu HY, Deng Y, Wang X. Advances in the Application of Artificial Intelligence-Based Spectral Data Interpretation: A Perspective. Anal Chem 2023; 95:13733-13745. [PMID: 37688541 DOI: 10.1021/acs.analchem.3c02540] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/11/2023]
Abstract
The interpretation of spectral data, including mass, nuclear magnetic resonance, infrared, and ultraviolet-visible spectra, is critical for obtaining molecular structural information. The development of advanced sensing technology has multiplied the amount of available spectral data. Chemical experts must use basic principles corresponding to the spectral information generated by molecular fragments and functional groups. This is a time-consuming process that requires a solid professional knowledge base. In recent years, the rapid development of computer science and its applications in cheminformatics and the emergence of computer-aided expert systems have greatly reduced the difficulty in analyzing large quantities of data. For expert systems, however, the problem-solving strategy must be known in advance or extracted by human experts and translated into algorithms. Gratifyingly, the development of artificial intelligence (AI) methods has shown great promise for solving such problems. Traditional algorithms, including the latest neural network algorithms, have shown great potential for both extracting useful information and processing massive quantities of data. This Perspective highlights recent innovations covering all of the emerging AI-based spectral interpretation techniques. In addition, the main limitations and current obstacles are presented, and the corresponding directions for further research are proposed. Moreover, this Perspective gives the authors' personal outlook on the development and future applications of spectral interpretation.
Affiliation(s)
- Xi Xue
  - State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
  - Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
- Hanyu Sun
  - State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
  - Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
- Minjian Yang
  - State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
  - Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
- Xue Liu
  - State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Hai-Yu Hu
  - State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Yafeng Deng
  - CarbonSilicon AI Technology Co., Ltd. Beijing 100080, China
  - Department of Automation, Tsinghua University, Beijing 100084, China
- Xiaojian Wang
  - State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
  - CarbonSilicon AI Technology Co., Ltd. Beijing 100080, China
18
Chhaganlal MN, Underhaug J, Mjøs SA. Evaluation of NMR predictors for accuracy and ability to reveal trends in 1H NMR spectra of fatty acids. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2023; 61:318-332. [PMID: 36759332 DOI: 10.1002/mrc.5336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2023] [Revised: 02/04/2023] [Accepted: 02/07/2023] [Indexed: 06/18/2023]
Abstract
Four different nuclear magnetic resonance (NMR) predictors have been evaluated for their ability to predict 600-MHz 1H spectra of free fatty acids and fatty acid methyl esters of 20 common fatty acids. The predictors were evaluated on two main criteria: (1) their accuracy in direct prediction of the spectra (absolute accuracy) and (2) the ability to reveal trends or predict the change that occurs in the spectra as a result of a change in the fatty acid carbon chain, or by esterification of the free fatty acids to methyl esters (relative accuracy). The absolute accuracy in chemical shift prediction for fatty acids was good, compared with previous reports on a broader range of compounds. All four predictors had median prediction errors for chemical shifts of the signals in fatty acid methyl esters well below 0.1 ppm and as low as 0.015 ppm for one of the predictors. However, all predictors also had outliers with errors far above the upper interquartile range. In general, they also fail to reproduce trends of diagnostic value that were observed in the experimental data or properly predict the result of a minor change in molecular structure. All four predictors depend on experimental data from different origins. This may be a limiting factor for the relative accuracy of the predictors.
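The absolute-accuracy criterion in this abstract (a median prediction error, plus outliers well above the interquartile range) can be illustrated with a short Python sketch. Nothing here is taken from the paper: the function name `summarize_errors`, the use of Tukey's 1.5 × IQR upper fence to flag outliers, and the shift values are all assumptions chosen for illustration.

```python
import statistics

def summarize_errors(predicted, observed):
    """Median absolute error and outlier indices for paired shifts (ppm).

    Outliers are flagged with Tukey's upper fence, Q3 + 1.5 * IQR.
    This is an assumed convention, not necessarily the one used in the study.
    """
    errors = [abs(p - o) for p, o in zip(predicted, observed)]
    q1, _, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    upper_fence = q3 + 1.5 * (q3 - q1)
    outliers = [i for i, e in enumerate(errors) if e > upper_fence]
    return {
        "median": statistics.median(errors),
        "upper_fence": upper_fence,
        "outliers": outliers,
    }

# Hypothetical 1H shifts in ppm; not data from the study.
observed = [1.00, 2.00, 3.00, 4.00, 5.00]
predicted = [1.00, 2.01, 3.02, 4.01, 5.50]
result = summarize_errors(predicted, observed)
```

With these placeholder values the last signal is flagged as an outlier even though the median error stays near 0.01 ppm, which mirrors the pattern the abstract reports: good typical accuracy alongside a few large misses.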
Affiliation(s)
- Jarl Underhaug
  - Department of Chemistry, University of Bergen, Bergen, Norway
- Svein A Mjøs
  - Department of Chemistry, University of Bergen, Bergen, Norway
19
Hoyt EM, Smith LO, Crittenden DL. Simple, accurate, adjustable-parameter-free prediction of NMR shifts for molecules in solution. Phys Chem Chem Phys 2023; 25:9952-9957. [PMID: 36951928 DOI: 10.1039/d3cp00721a] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/24/2023]
Abstract
Accurate prediction of NMR shifts is invaluable for interpreting and assigning NMR spectra, especially for complex applications such as determining the identity of unknown substances or resolving stereochemical assignments. Statistical linear regression models have proven effective for accurately correlating density functional theory predictions of chemical shieldings with experimentally-measured shifts, but lack transferability - they must be reparameterised using a reasonably extensive training set at each level of theory and for each choice of NMR solvent. We have previously introduced a novel two-point "shift-and-scale" correction procedure for gas phase shieldings that overcomes these limitations without significant loss of accuracy. In this work, we demonstrate that this approach is equally applicable for predicting solution-phase shifts from computed gas phase shieldings, using acetaldehyde as an experimentally and computationally convenient reference system. We also present all of the required experimental reference data to enable this approach to be used for any target analyte in a range of commonly used NMR solvents (chloroform, dichloromethane, acetonitrile, methanol, acetone, DMSO, D2O, benzene, pyridine).
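The conventional approach this abstract contrasts itself with, a linear regression that maps computed shieldings onto measured shifts, can be sketched in a few lines of Python. This is a generic least-squares illustration, not the authors' two-point "shift-and-scale" procedure. The function names and all numerical values are hypothetical placeholders.

```python
from typing import Sequence, Tuple

def fit_linear_scaling(shieldings: Sequence[float],
                       shifts: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of shift = slope * shielding + intercept."""
    n = len(shieldings)
    if n != len(shifts) or n < 2:
        raise ValueError("need at least two paired (shielding, shift) values")
    mean_x = sum(shieldings) / n
    mean_y = sum(shifts) / n
    sxx = sum((x - mean_x) ** 2 for x in shieldings)
    if sxx == 0:
        raise ValueError("shieldings must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(shieldings, shifts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_shift(shielding: float, slope: float, intercept: float) -> float:
    """Convert one computed shielding (ppm) into a predicted shift (ppm)."""
    return slope * shielding + intercept

# Synthetic training set built from an exact line (shift = 190 - shielding),
# used only to show the mechanics; these are not real DFT or NMR values.
train_shieldings = [170.0, 150.0, 60.0, 10.0]
train_shifts = [190.0 - s for s in train_shieldings]
slope, intercept = fit_linear_scaling(train_shieldings, train_shifts)
```

The fitted slope and intercept depend on the training set, the level of theory, and the solvent, which is the transferability limitation the abstract points out for regression-based corrections.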
Affiliation(s)
- Emlyn M Hoyt
  - School of Physical and Chemical Sciences, University of Canterbury, Christchurch 8140, New Zealand
- Lachlan O Smith
  - School of Physical and Chemical Sciences, University of Canterbury, Christchurch 8140, New Zealand
- Deborah L Crittenden
  - School of Physical and Chemical Sciences, University of Canterbury, Christchurch 8140, New Zealand
20
Gadikota V, Govindapur RR, Reddy DS, Roseman HJ, Williamson RT, Raab JG. Anomalous 1H NMR chemical shift behavior of substituted benzoic acid esters. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2023; 61:248-252. [PMID: 36416132 DOI: 10.1002/mrc.5326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 10/05/2022] [Accepted: 11/21/2022] [Indexed: 06/16/2023]
Abstract
Benzoic acid esters represent key building blocks for many drug discovery and development programs and have been advanced as potent PDE4 inhibitors for inhaled administration for treatment of respiratory diseases. This class of compounds has also been employed in myriad industrial processes and as common food preservatives. Recent work directed toward the synthesis of intermediates for a proprietary medicinal chemistry program led us to observe that the 1H NMR chemical shifts of substituents ortho to the benzoic acid ester moiety defied conventional iterative chemical shift prediction protocols. To explore these unexpected results, we initiated a detailed computational study employing density functional theory (DFT) calculations to better understand the unexpectedly large variance in expected versus experimental NMR chemical shifts.
Affiliation(s)
- Vidya Gadikota
  - A1 BioChem Labs LLC, Wilmington, North Carolina, 28409, USA
- R Thomas Williamson
  - Department of Chemistry and Biochemistry, University of North Carolina Wilmington, Wilmington, North Carolina, 28409, USA
- Jeffrey G Raab
  - Department of Chemistry and Biochemistry, University of North Carolina Wilmington, Wilmington, North Carolina, 28409, USA
21
Iuliucci RJ, Hartman JD, Beran GJO. Do Models beyond Hybrid Density Functionals Increase the Agreement with Experiment for Predicted NMR Chemical Shifts or Electric Field Gradient Tensors in Organic Solids? J Phys Chem A 2023; 127:2846-2858. [PMID: 36940431 DOI: 10.1021/acs.jpca.2c07657] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 03/22/2023]
Abstract
Ab initio predictions of chemical shifts and electric field gradient (EFG) tensor components are frequently used to help interpret solid-state nuclear magnetic resonance (NMR) experiments. Typically, these predictions employ density functional theory (DFT) with generalized gradient approximation (GGA) functionals, though hybrid functionals have been shown to improve accuracy relative to experiment. Here, the performance of a dozen models beyond the GGA is examined for the prediction of solid-state NMR observables, including meta-GGA, hybrid, and double-hybrid density functionals and second-order Møller-Plesset perturbation theory (MP2). These models are tested on organic molecular crystal data sets containing 169 experimental 13C and 15N chemical shifts and 114 17O and 14N EFG tensor components. To make these calculations affordable, gauge-including projector augmented wave (GIPAW) Perdew-Burke-Ernzerhof (PBE) calculations with periodic boundary conditions are combined with a local intramolecular correction computed at the higher level of theory. Within the context of typical NMR property calculations performed on a static, DFT-optimized crystal structure, the benchmarking finds that the double-hybrid DFT functionals produce errors versus experiment that are no smaller than those of hybrid functionals in the best cases, and they can be larger. MP2 errors versus experiment are even bigger. Overall, no practical advantages are found for using any of the tested double-hybrid functionals or MP2 to predict experimental solid-state NMR chemical shifts and EFG tensor components for routine organic crystals, especially given the higher computational cost of those methods. This finding likely reflects error cancellation benefiting the hybrid functionals. Improving the accuracy of the predicted chemical shifts and EFG tensors relative to experiment would probably require more robust treatments of the crystal structures, their dynamics, and other factors.
Affiliation(s)
- Robbie J Iuliucci
  - Department of Chemistry, Washington and Jefferson College, Washington, Pennsylvania 15301, United States
- Joshua D Hartman
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Gregory J O Beran
  - Department of Chemistry, University of California, Riverside, California 92521, United States
22
Fischetti G, Schmid N, Bruderer S, Caldarelli G, Scarso A, Henrici A, Wilhelm D. Automatic classification of signal regions in 1H Nuclear Magnetic Resonance spectra. Front Artif Intell 2023; 5:1116416. [PMID: 36714208 PMCID: PMC9874632 DOI: 10.3389/frai.2022.1116416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 12/22/2022] [Indexed: 01/12/2023] Open
Abstract
The identification and characterization of signal regions in Nuclear Magnetic Resonance (NMR) spectra is a challenging but crucial phase in the analysis and determination of complex chemical compounds. Here, we present a novel supervised deep learning approach to perform automatic detection and classification of multiplets in 1H NMR spectra. Our deep neural network was trained on a large number of synthetic spectra, with complete control over the features represented in the samples. We show that our model can detect signal regions effectively and minimize classification errors between different types of resonance patterns. We demonstrate that the network generalizes remarkably well on real experimental 1H NMR spectra.
Affiliation(s)
- Giulia Fischetti
  - Dipartimento di Scienze Molecolari e Nanosistemi, Ca' Foscari Università di Venezia, Venice, Italy
- Nicolas Schmid
  - Zürcher Hochschule für Angewandte Wissenschaften (ZHAW), Zurich, Switzerland
  - Institute for Computational Science, Universität Zürich (UZH), Zurich, Switzerland
- Guido Caldarelli
  - Dipartimento di Scienze Molecolari e Nanosistemi, Ca' Foscari Università di Venezia, Venice, Italy
- Alessandro Scarso
  - Dipartimento di Scienze Molecolari e Nanosistemi, Ca' Foscari Università di Venezia, Venice, Italy
- Andreas Henrici
  - Zürcher Hochschule für Angewandte Wissenschaften (ZHAW), Zurich, Switzerland
- Dirk Wilhelm
  - Zürcher Hochschule für Angewandte Wissenschaften (ZHAW), Zurich, Switzerland
23
Abstract
Glycans, carbohydrate molecules in the realm of biology, are present as biomedically important glycoconjugates and a characteristic aspect is that their structures in many instances are branched. In determining the primary structure of a glycan, the sugar components including the absolute configuration and ring form, anomeric configuration, linkage(s), sequence, and substituents should be elucidated. Solution state NMR spectroscopy offers a unique opportunity to resolve all these aspects at atomic resolution. During the last two decades, advances in both NMR experiments and spectrometer hardware have made it possible to unravel carbohydrate structure more efficiently. These developments applicable to glycans include, inter alia, NMR experiments that reduce spectral overlap, use selective excitations, record tilted projections of multidimensional spectra, acquire spectra by multiple receivers, utilize polarization by fast-pulsing techniques, concatenate pulse-sequence modules to acquire several spectra in a single measurement, acquire pure shift correlated spectra devoid of scalar couplings, employ stable isotope labeling to efficiently obtain homo- and/or heteronuclear correlations, as well as those that rely on dipolar cross-correlated interactions for sequential information. Refined computer programs for NMR spin simulation and chemical shift prediction aid the structural elucidation of glycans, which are notorious for their limited spectral dispersion. Hardware developments include cryogenically cold probes and dynamic nuclear polarization techniques, both resulting in enhanced sensitivity as well as ultrahigh field NMR spectrometers with a 1H NMR resonance frequency higher than 1 GHz, thus improving resolution of resonances. Taken together, the developments have made and will in the future make it possible to elucidate carbohydrate structure in great detail, thereby forming the basis for understanding of how glycans interact with other molecules.
Affiliation(s)
- Carolina Fontana
  - Departamento de Química del Litoral, CENUR Litoral Norte, Universidad de la República, Paysandú 60000, Uruguay
- Göran Widmalm
  - Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, S-106 91 Stockholm, Sweden
24
Cohen RD, Wang X, Sherer EC, Martin GE. Application of 1,1-ADEQUATE and DFT to correct 13C misassignments of carbonyl chemical shifts for carbapenem antibiotics. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2022; 60:963-969. [PMID: 35781893 DOI: 10.1002/mrc.5297] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/04/2022] [Revised: 06/28/2022] [Accepted: 06/30/2022] [Indexed: 06/15/2023]
Abstract
Prior to the development of sensitive proton-detected 2D NMR experiments, assigning 13C signals could be a significant challenge, and mistakes have occurred even for prominent compound classes. In this study, 1,1-ADEQUATE data were used to unambiguously reassign the 13C chemical shifts for the β-lactam carbonyl at the C-7 position and the proximal carboxylate at the C-10 position of the carbapenems, meropenem and imipenem. Density functional theory (DFT) was then investigated to provide sufficiently accurate 13C chemical shift predictions, allowing for the carbonyl signal reassignment of thienamycin.
Affiliation(s)
- Xiao Wang
  - Merck & Co., Inc., Rahway, New Jersey, USA
- Gary E Martin
  - Chemistry and Biochemistry, Seton Hall University, South Orange, New Jersey, USA
25
Sridharan B, Mehta S, Pathak Y, Priyakumar UD. Deep Reinforcement Learning for Molecular Inverse Problem of Nuclear Magnetic Resonance Spectra to Molecular Structure. J Phys Chem Lett 2022; 13:4924-4933. [PMID: 35635003 DOI: 10.1021/acs.jpclett.2c00624] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
Spectroscopy is the study of how matter interacts with electromagnetic radiation. The spectra of any molecule are highly information-rich, yet the inverse relation of spectra to the corresponding molecular structure is still an unsolved problem. Nuclear magnetic resonance (NMR) spectroscopy is one such critical technique in the scientists' toolkit to characterize molecules. In this work, a novel machine learning framework is proposed that attempts to solve this inverse problem by navigating the chemical space to find the correct structure given an NMR spectrum. The proposed framework uses a combination of online Monte Carlo tree search (MCTS) and a set of graph convolution networks to build a molecule iteratively. Our method can predict the structure of the molecule ∼80% of the time in its top 3 guesses for molecules with <10 heavy atoms. We believe that the proposed framework is a significant step in solving the inverse design problem of NMR spectra.
Affiliation(s)
- Bhuvanesh Sridharan
  - Centre for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
- Sarvesh Mehta
  - Centre for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
- Yashaswi Pathak
  - Centre for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
- U Deva Priyakumar
  - Centre for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
26
Regression Machine Learning Models Used to Predict DFT-Computed NMR Parameters of Zeolites. COMPUTATION 2022. [DOI: 10.3390/computation10050074] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/05/2023]
Abstract
Machine learning approaches can drastically decrease the computational time for the predictions of spectroscopic properties in materials, while preserving the quality of the computational approaches. We studied the performance of kernel-ridge regression (KRR) and gradient boosting regressor (GBR) models trained on the isotropic shielding values, computed with density-functional theory (DFT), in a series of different known zeolites containing out-of-frame metal cations or fluorine anion and organic structure-directing cations. The smooth overlap of atomic positions (SOAP) descriptors were computed from the DFT-optimised Cartesian coordinates of each atom in the zeolite crystal cells. The use of these descriptors as inputs in both machine learning regression methods led to the prediction of the DFT isotropic shielding values with mean errors within 0.6 ppm. The results showed that the GBR model scales better than the KRR model.
27
Kontogianni VG, Gerothanassis IP. Analytical and Structural Tools of Lipid Hydroperoxides: Present State and Future Perspectives. Molecules 2022; 27:2139. [PMID: 35408537 PMCID: PMC9000705 DOI: 10.3390/molecules27072139] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 03/03/2022] [Revised: 03/20/2022] [Accepted: 03/22/2022] [Indexed: 11/17/2022] Open
Abstract
Mono- and polyunsaturated lipids are particularly susceptible to peroxidation, which results in the formation of lipid hydroperoxides (LOOHs) as primary nonradical-reaction products. LOOHs may undergo degradation to various products that have been implicated in vital biological reactions, and thus in the pathogenesis of various diseases. The structure elucidation and qualitative and quantitative analysis of lipid hydroperoxides are therefore of great importance. The objectives of the present review are to provide a critical analysis of various methods that have been widely applied, and more specifically on volumetric methods, applications of UV-visible, infrared, Raman/surface-enhanced Raman, fluorescence and chemiluminescence spectroscopies, chromatographic methods, hyphenated MS techniques, NMR and chromatographic methods, NMR spectroscopy in mixture analysis, structural investigations based on quantum chemical calculations of NMR parameters, applications in living cells, and metabolomics. Emphasis will be given to analytical and structural methods that can contribute significantly to the molecular basis of the chemical process involved in the formation of lipid hydroperoxides without the need for the isolation of the individual components. Furthermore, future developments in the field will be discussed.
Affiliation(s)
- Vassiliki G. Kontogianni
  - Section of Organic Chemistry and Biochemistry, Department of Chemistry, University of Ioannina, GR-45110 Ioannina, Greece
- Ioannis P. Gerothanassis
  - Section of Organic Chemistry and Biochemistry, Department of Chemistry, University of Ioannina, GR-45110 Ioannina, Greece
  - International Center for Chemical and Biological Sciences, H.E.J. Research Institute of Chemistry, University of Karachi, Karachi 75270, Pakistan
|
28
|
Miaskiewicz S, Weibel JM, Pale P, Blanc A. A gold(I)-catalysed approach towards harmalidine an elusive alkaloid from Peganum harmala. RSC Adv 2022; 12:26966-26974. [PMID: 36275169 PMCID: PMC9490519 DOI: 10.1039/d2ra05685b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 09/13/2022] [Indexed: 11/21/2022]
Abstract
Upon gold catalysis, the 2,3-dihydropyrrolo[1,2-a]indole motif, encountered in few but interesting bioactive natural products, was efficiently obtained from N-aryl 2-alkynylazetidine derivatives. In an attempt to apply this methodology to the synthesis of harmalidine, isolated from the seeds of Peganum harmala, advanced amino 2,3-hydropyrrolo[1,2-a]indol(one) derivatives were readily obtained in only 11 steps from but-3-yn-1-ol. While the reported structure of harmalidine could not be reached from these intermediates, a surprising 12-membered diimino dimer was isolated. Extensive comparison of the reported harmalidine NMR data with the experimental and calculated data of our synthetic molecules, harmaline, or the synthesised N-methylharmaline shows discrepancies with the proposed natural product structure.
Affiliation(s)
- Solène Miaskiewicz
  - Laboratoire de Synthèse, Réactivité Organiques et Catalyse, Institut de Chimie, UMR 7177 - CNRS, Université de Strasbourg, 4 Rue Blaise Pascal, 67070 Strasbourg, France
- Jean-Marc Weibel
  - Laboratoire de Synthèse, Réactivité Organiques et Catalyse, Institut de Chimie, UMR 7177 - CNRS, Université de Strasbourg, 4 Rue Blaise Pascal, 67070 Strasbourg, France
- Patrick Pale
  - Laboratoire de Synthèse, Réactivité Organiques et Catalyse, Institut de Chimie, UMR 7177 - CNRS, Université de Strasbourg, 4 Rue Blaise Pascal, 67070 Strasbourg, France
- Aurélien Blanc
  - Laboratoire de Synthèse, Réactivité Organiques et Catalyse, Institut de Chimie, UMR 7177 - CNRS, Université de Strasbourg, 4 Rue Blaise Pascal, 67070 Strasbourg, France
|
29
|
Han J, Kang H, Kang S, Kwon Y, Lee D, Choi YS. Scalable graph neural network for NMR chemical shift prediction. Phys Chem Chem Phys 2022; 24:26870-26878. [DOI: 10.1039/d2cp04542g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
Abstract
We present a scalable graph neural network (GNN) with improved message passing and readout functions for the fast and accurate prediction of nuclear magnetic resonance (NMR) chemical shifts.
Affiliation(s)
- Jongmin Han
  - Department of Industrial Engineering, Sungkyunkwan University, 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Hyungu Kang
  - Department of Industrial Engineering, Sungkyunkwan University, 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Seokho Kang
  - Department of Industrial Engineering, Sungkyunkwan University, 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Youngchun Kwon
  - Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., Yeongtong-gu, Suwon 16678, Republic of Korea
  - Department of Computer Science and Engineering, Seoul National University, Gwanak-gu, Seoul 08826, Republic of Korea
- Dongseon Lee
  - Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., Yeongtong-gu, Suwon 16678, Republic of Korea
- Youn-Suk Choi
  - Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd., Yeongtong-gu, Suwon 16678, Republic of Korea
|