1
Judge MT, Ebbels TMD. Problems, principles and progress in computational annotation of NMR metabolomics data. Metabolomics 2022; 18:102. [PMID: 36469142 PMCID: PMC9722819 DOI: 10.1007/s11306-022-01962-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/04/2022] [Accepted: 11/18/2022] [Indexed: 12/08/2022]
Abstract
BACKGROUND Compound identification remains a critical bottleneck in the process of exploiting Nuclear Magnetic Resonance (NMR) metabolomics data, especially for 1H 1-dimensional (1H 1D) data. As databases of reference compound spectra have grown, workflows have evolved to rely heavily on their search functions to facilitate this process by generating lists of potential metabolites found in complex mixture data, facilitating annotation and identification. However, approaches for validating and communicating annotations are most often guided by expert knowledge, and therefore are highly variable despite repeated efforts to align practices and define community standards. AIM OF REVIEW This review is aimed at broadening the application of automated annotation tools by discussing the key ideas of spectral matching and beginning to describe a set of terms to classify this information, thus advancing standards for communicating annotation confidence. Additionally, we hope that this review will facilitate the growing collaboration between chemical data scientists, software developers and the NMR metabolomics community aiding development of long-term software solutions. KEY SCIENTIFIC CONCEPTS OF REVIEW We begin with a brief discussion of the typical untargeted NMR identification workflow. We differentiate between annotation (hypothesis generation, filtering), and identification (hypothesis testing, verification), and note the utility of different NMR data features for annotation. We then touch on three parts of annotation: (1) generation of queries, (2) matching queries to reference data, and (3) scoring and confidence estimation of potential matches for verification. In doing so, we highlight existing approaches to automated and semi-automated annotation from the perspective of the structural information they utilize, as well as how this information can be represented computationally.
Affiliation(s)
- Michael T Judge
- Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College, 131 Sir Alexander Fleming Building, South Kensington Campus, London, UK
- Timothy M D Ebbels
- Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College, 131 Sir Alexander Fleming Building, South Kensington Campus, London, UK.
2
Lumley JA, Sharman G, Wilkin T, Hirst M, Cobas C, Goebel M. A KNIME Workflow for Automated Structure Verification. SLAS DISCOVERY 2020; 25:950-956. [PMID: 32081066 DOI: 10.1177/2472555220907091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Adequate characterization of chemical entities made for biological screening in the drug discovery context is critical. Incorrectly characterized structures lead to mistakes in the interpretation of structure-activity relationships and confuse an already multidimensional optimization problem. Mistakes in the later use of these compounds waste money and valuable resources in a discovery process already under cost pressure. Left unidentified, these errors lead to problems in project data packages during quality review. At worst, they put intellectual property and patent integrity at risk. We describe a KNIME workflow for the early and automated identification of these errors during registration of a new chemical entity into the corporate screening catalog. This Automated Structure Verification workflow provides early identification (within 24 hours) of missing or inconsistent analytical data and therefore reduces any mistakes that inevitably get made. Automated identification removes the burden of work from the chemist submitting the compound into the registration system. No additional work is required unless a problem is identified and the submitter alerted. Before implementation, 14% of samples within the existing sample catalog were missing data on initial pass. A year after implementation, only 0.2% were missing data.
Affiliation(s)
- James A Lumley
- Research IT, Eli Lilly and Company, Windlesham, Surrey, UK
- Gary Sharman
- Analytical Technologies, Eli Lilly and Company, Windlesham, Surrey, UK
- Thomas Wilkin
- Research IT, Eli Lilly and Company, Windlesham, Surrey, UK
- Matthew Hirst
- Research IT, Eli Lilly and Company, Windlesham, Surrey, UK
- Carlos Cobas
- Mestrelab Research, S.L., Santiago de Compostela, Galicia, Spain
- Michael Goebel
- Mestrelab Research, S.L., Santiago de Compostela, Galicia, Spain
3
Richardson J, Sharman G, Martínez-Olid F, Cañellas S, Gomez JE. Unlocking the potential of late-stage functionalisation: an accurate and fully automated method for the rapid characterisation of multiple regioisomeric products. REACT CHEM ENG 2020. [DOI: 10.1039/c9re00431a] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/21/2022]
Abstract
An automated pipeline for structure determination is outlined that will help unlock the potential of late-stage functionalisation (LSF).
Affiliation(s)
- Gary Sharman
- Discovery Research and Technologies, Eli Lilly and Company, Surrey, UK
- Francisco Martínez-Olid
- Discovery Research and Technologies, Eli Lilly and Company, Centro de Investigación Lilly, 28108 Alcobendas-Madrid, Spain
- Santiago Cañellas
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology, E-43007 Tarragona, Spain
- Jose Enrique Gomez
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology, E-43007 Tarragona, Spain
4
Reibarkh M, Wyche TP, Saurí J, Bugni TS, Martin GE, Thomas Williamson R. Structure elucidation of uniformly (13)C labeled small molecule natural products. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2015; 53:996-1002. [PMID: 26768304 DOI: 10.1002/mrc.4333] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/15/2015] [Revised: 08/04/2015] [Accepted: 08/18/2015] [Indexed: 06/05/2023]
Abstract
Utilization of (2)H, (13)C, and (15)N isotopically labeled proteins and peptides is now routine in biomolecular NMR investigations. The widespread availability of inexpensive, uniformly (13)C enriched glucose now makes it possible to isolate uniformly (13)C labeled natural products from microbial fermentation. We now wish to describe an approach for the rapid structural characterization of uniformly (13)C labeled natural products that avoids the pitfalls of relying on parameters typically employed in biomolecular NMR studies.
Affiliation(s)
- Mikhail Reibarkh
- Process and Analytical Chemistry, NMR Structure Elucidation Group, Merck Research Laboratories, Rahway, NJ, United States
- Thomas P Wyche
- School of Pharmacy, University of Wisconsin-Madison, Madison, WI, United States
- Josep Saurí
- Process and Analytical Chemistry, NMR Structure Elucidation Group, Merck Research Laboratories, Rahway, NJ, United States
- Tim S Bugni
- School of Pharmacy, University of Wisconsin-Madison, Madison, WI, United States
- Gary E Martin
- Process and Analytical Chemistry, NMR Structure Elucidation Group, Merck Research Laboratories, Rahway, NJ, United States
- R Thomas Williamson
- Process and Analytical Chemistry, NMR Structure Elucidation Group, Merck Research Laboratories, Rahway, NJ, United States
5
Clark AM, Williams AJ, Ekins S. Machines first, humans second: on the importance of algorithmic interpretation of open chemistry data. J Cheminform 2015; 7:9. [PMID: 25798198 PMCID: PMC4369291 DOI: 10.1186/s13321-015-0057-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 11/24/2014] [Accepted: 02/23/2015] [Indexed: 11/12/2022]
Abstract
The current rise in the use of open lab notebook techniques means that there are an increasing number of scientists who make chemical information freely and openly available to the entire community as a series of micropublications that are released shortly after the conclusion of each experiment. We propose that this trend be accompanied by a thorough examination of data sharing priorities. We argue that the most significant immediate benefactor of open data is in fact chemical algorithms, which are capable of absorbing vast quantities of data, and using it to present concise insights to working chemists, on a scale that could not be achieved by traditional publication methods. Making this goal practically achievable will require a paradigm shift in the way individual scientists translate their data into digital form, since most contemporary methods of data entry are designed for presentation to humans rather than consumption by machine learning algorithms. We discuss some of the complex issues involved in fixing current methods, as well as some of the immediate benefits that can be gained when open data is published correctly using unambiguous machine readable formats. Lab notebook entries must target both visualisation by scientists and use by machine learning algorithms.
Affiliation(s)
- Alex M Clark
- Molecular Materials Informatics, 1900 St. Jacques #302, Montreal, H3J 2S1, QC Canada
- Antony J Williams
- Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, NC 27587 USA
- Sean Ekins
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526 USA; Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010 USA
6
Castillo AM, Bernal A, Patiny L, Wist J. A new method for the comparison of 1H NMR predictors based on tree-similarity of spectra. J Cheminform 2014; 6:9. [PMID: 24666427 PMCID: PMC3987679 DOI: 10.1186/1758-2946-6-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/19/2014] [Accepted: 03/11/2014] [Indexed: 11/10/2022]
Abstract
A methodology based on spectral similarity is presented that allows NMR predictors to be compared without recourse to assigned experimental spectra, thereby making the task of benchmarking NMR predictors less tedious, faster, and less prone to human error. This approach was used to compare four popular NMR predictors using a dataset of 1000 molecules and their corresponding experimental spectra. The results were consistent with those obtained by directly comparing deviations between predicted and experimental shifts.
Affiliation(s)
- Julien Wist
- Chemistry Department, Universidad del Valle, AA 25360 Cali, Valle, Colombia.
7
Gao J, Ma R, Wang W, Wang N, Sasaki R, Snyderman D, Wu J, Ruan K. Automated NMR fragment based screening identified a novel interface blocker to the LARG/RhoA complex. PLoS One 2014; 9:e88098. [PMID: 24505392 PMCID: PMC3914932 DOI: 10.1371/journal.pone.0088098] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 11/13/2013] [Accepted: 01/06/2014] [Indexed: 02/03/2023]
Abstract
The small GTPase cycles between the inactive GDP form and the activated GTP form, catalyzed by the upstream guanine exchange factors. The modulation of this process by small molecules has proven to be a fruitful route for therapeutic intervention to prevent the over-activation of the small GTPase. The fragment-based approach that emerged in the past decade has demonstrated its paramount potential in the discovery of inhibitors targeting such novel and challenging protein-protein interactions. The details of the procedure for NMR fragment screening from scratch have rarely been disclosed comprehensively, which restricts its wider application. To achieve a consistent screening applicable to a number of targets, we developed a highly automated protocol that covers as many aspects of NMR fragment screening as possible, including the construction of a small but diverse library, determination of the aqueous solubility by NMR, grouping compounds with mutual dispersity into a cocktail, and the automated processing and visualization of the ligand-based screening spectra. We exemplified our streamlined screening on RhoA alone and on the complex of the small GTPase RhoA and its upstream guanine exchange factor LARG. Two hits were confirmed from the primary screening in cocktail and secondary screening over individual hits for the LARG/RhoA complex, while one of them was also identified from the screening for RhoA alone. HSQC titration of the two hits over RhoA and LARG alone, respectively, identified one compound that binds to RhoA.GDP with a 0.11 mM affinity and perturbs the residues in the switch II region of RhoA. This hit blocked the formation of the LARG/RhoA complex, as validated by native gel electrophoresis and by the titration of RhoA to ¹⁵N-labeled LARG in the absence and presence of the compound, respectively. It therefore provides a starting point toward a more potent inhibitor of RhoA activation catalyzed by LARG.
Affiliation(s)
- Jia Gao
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Science, University of Science and Technology of China, Hefei, Anhui, China
- Rongsheng Ma
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Science, University of Science and Technology of China, Hefei, Anhui, China
- Wei Wang
- Pfizer Worldwide Research and Development, San Diego, California, United States of America
- Na Wang
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Science, University of Science and Technology of China, Hefei, Anhui, China
- Ryan Sasaki
- Advanced Chemistry Development Inc., Toronto, Ontario, Canada
- David Snyderman
- Advanced Chemistry Development Inc., Toronto, Ontario, Canada
- Jihui Wu
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Science, University of Science and Technology of China, Hefei, Anhui, China
- Ke Ruan
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Science, University of Science and Technology of China, Hefei, Anhui, China
8
Cobas C, Seoane F, Vaz E, Bernstein MA, Dominguez S, Pérez M, Sýkora S. Automatic assignment of 1H-NMR spectra of small molecules. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2013; 51:649-654. [PMID: 24038382 DOI: 10.1002/mrc.3995] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 04/15/2013] [Revised: 07/01/2013] [Accepted: 07/11/2013] [Indexed: 06/02/2023]
Abstract
A novel data-evaluation procedure for the automatic atom to peak or multiplet assignment of 1H-NMR spectra of small molecules has been developed using a fast and robust expert system. The applicability and reliability of the method are demonstrated by comparison of a manually assigned database of 1H-NMR spectra with the assignments produced by the automatic procedure. The results of this analysis show an excellent success ratio, indicating that this new algorithm can have a major impact as a time saving tool for the organic chemist. A new graphical feature used to illustrate both the stability and quality of the elementary assignments is also introduced.
9
Plainchont B, Nuzillard JM. Structure verification through computer-assisted spectral assignment of NMR spectra. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2013. [PMID: 23208516 DOI: 10.1002/mrc.3908] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 05/09/2023]
Abstract
The validation of a molecular organic structure on the basis of 1D and 2D HSQC, COSY and HMBC NMR spectra is proposed as an alternative to the methods that are mainly based on chemical shift prediction. The CCASA software was written for this purpose. It provides an updated and improved implementation of the preceding computer-assisted spectral assignment software. CCASA can be downloaded freely from http://www.univ-reims.fr/LSD/JmnSoft/CASA. Two bioactive natural products, a triterpene and a benzophenone, were selected from literature data as examples. The tentative matching between the structure and the NMR data interpretation of the triterpene unexpectedly leads to the hypothesis of an incorrect structure. The LSD software was used to find an alternative structure that improved the 2D NMR data interpretation and the carbon-13 chemical shift matching between experimental values and those produced by the nmrshiftdb2 prediction tool. The benzophenone example showed that signal assignment by means of chemical shift prediction can be replaced by elementary user-supplied chemical shift and multiplicity constraints.
Affiliation(s)
- Bertrand Plainchont
- Institut de Chimie Moléculaire de Reims, CNRS UMR 7312, Université de Reims-Champagne-Ardenne, BP 1039, 51687, Reims Cedex 2, France
10
Golotvin SS, Pol R, Sasaki RR, Nikitina A, Keyes P. Concurrent combined verification: reducing false positives in automated NMR structure verification through the evaluation of multiple challenge control structures. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2012; 50:429-435. [PMID: 22549844 DOI: 10.1002/mrc.3818] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/11/2012] [Revised: 02/20/2012] [Accepted: 03/17/2012] [Indexed: 05/31/2023]
Abstract
Automated structure verification using (1)H NMR data or a combination of (1)H and heteronuclear single-quantum correlation (HSQC) data is gaining more interest as a routine application for qualitative evaluation of large compound libraries produced by synthetic chemistry. The goal of this automated software method is to identify a manageable subset of compounds and data that require human review. In practice, the automated method will flag structure and data combinations that exhibit some inconsistency (i.e. strange chemical shifts, conflicts in multiplicity, or overestimated and underestimated integration values) and validate those that appear consistent. One drawback of this approach is that no automated system can guarantee that all passing structures are indeed correct structures. The major reason for this is that approaches using only (1)H or even (1)H and HSQC spectra often do not provide sufficient information to properly distinguish between similar structures. Therefore, current implementations of automated structure verification systems allow, in principle, false positive results. Presented in this work is a method that greatly reduces the probability of an automated validation system passing incorrect structures (i.e. false positives). This novel method was applied to automatically validate 127 non-proprietary compounds from several commercial sources. Presented also is the impact of this approach on false positive and false negative results.
Affiliation(s)
- Sergey S Golotvin
- Advanced Chemistry Development, Ltd, Moscow Department, Moscow, Russia
11
Ruan K, Yang S, Van Sant KA, Likos JJ. Application of Hadamard spectroscopy to automated structure verification in high-throughput NMR. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2009; 47:693-700. [PMID: 19496061 DOI: 10.1002/mrc.2459] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/27/2023]
Abstract
Combined verification using 1-D proton and HSQC has proved quite successful; the acquisition time of HSQC spectra, however, can be limiting in high-throughput applications. Replacing it with Hadamard HSQC can significantly enhance the throughput. We hereby propose a protocol to optimize the grouping of the predicted carbon chemical shifts from the proposed structure and the associated Hadamard frequencies and bandwidths. The resulting Hadamard HSQC spectra compare favorably with their Fourier-transformed counterparts and have been shown to perform equivalently in terms of combined verification, but with a several-fold enhancement in throughput, as illustrated for 21 commercially available molecules and 16 prototypical drug compounds. Verification accuracy can be further improved by cross-validation against Hadamard TOCSY, which can be acquired without much sacrifice in throughput.
Affiliation(s)
- Ke Ruan
- Pfizer Global Research & Development, 700 Chesterfield Parkway, St. Louis, MO 63017, USA
12
Keyes P, Hernandez G, Cianchetta G, Robinson J, Lefebvre B. Automated compound verification using 2D-NMR HSQC data in an open-access environment. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2009; 47:38-52. [PMID: 18991323 DOI: 10.1002/mrc.2347] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 05/27/2023]
Abstract
Since the introduction of NMR prediction software, medicinal chemists have imagined submitting their compounds to corporate compound registration systems that would ultimately display a simplified pass/fail result. We initially implemented such a system based on HPLC and liquid chromatography mass spectrometry (LCMS) data that is embedded within our industry standard sample submission and registration process. By using gradient-heteronuclear single quantum coherence (HSQC) experiments, we have extended this concept to NMR data through a comparison of experimentally acquired data against predicted (1)H and (13)C NMR data. Integration of our compound registration system with our analytical instruments now provides our chemists with unattended and automated NMR verification for collections of submitted compounds. The automated processing and interpretation of results enhanced confidence in our compound library and released the chemists from the tedium of manipulating large amounts of data. This allows scientists to focus more of their attention on the drug discovery process.
Affiliation(s)
- Philip Keyes
- Lexicon Pharmaceuticals, Analytical Chemistry, 350 Carter Road, Princeton, New Jersey, 08540, USA.
13
Griffiths L, Beeley HH, Horton R. Towards the automatic analysis of NMR spectra: part 7. Assignment of 1H by employing both 1H and 1H/13C correlation spectra. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2008; 46:818-827. [PMID: 18561211 DOI: 10.1002/mrc.2257] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/26/2023]
Abstract
A reliable method of automatically assigning one-dimensional proton spectra is described. The method relies on the alignment of the proton spectrum with an associated heteronuclear single-quantum coherence (HSQC) spectrum, transferring the stoichiometry and couplings to the HSQC. The HSQC spectrum is then assigned using a linear assignment procedure in which a fitness function incorporating (1)H chemical shifts, (1)H couplings and (13)C shifts is employed. The method uniquely employs a sequential procedure in which only correlations of like stoichiometry are assigned at the same time.
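The linear assignment step described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical fitness function, a weighted sum of absolute differences in (1)H shift, (13)C shift and coupling, and finds the lowest-cost one-to-one mapping by brute force. The weights, field names and example values are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the sequential, stoichiometry-grouped procedure the authors describe.

```python
from itertools import permutations

# Hypothetical weights; the abstract does not state the actual fitness function.
W_H, W_C, W_J = 1.0, 0.1, 0.05


def pair_cost(pred, obs):
    """Weighted mismatch between one predicted atom and one observed HSQC correlation."""
    return (W_H * abs(pred["h"] - obs["h"])
            + W_C * abs(pred["c"] - obs["c"])
            + W_J * abs(pred["j"] - obs["j"]))


def assign(predicted, observed):
    """Return (total_cost, mapping) for the cheapest one-to-one assignment.

    Brute force over all permutations, so only suitable for a few correlations.
    """
    if len(predicted) != len(observed):
        raise ValueError("sketch assumes equal numbers of atoms and correlations")
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(observed))):
        cost = sum(pair_cost(predicted[i], observed[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    # Map each predicted atom label to the index of its observed correlation.
    return best_cost, {predicted[i]["atom"]: j for i, j in enumerate(best_perm)}


# Illustrative values only (ppm for shifts, Hz for couplings).
predicted = [
    {"atom": "CH3", "h": 1.20, "c": 14.0, "j": 7.0},
    {"atom": "CH2", "h": 3.60, "c": 58.0, "j": 7.0},
]
observed = [
    {"h": 3.65, "c": 57.5, "j": 7.1},
    {"h": 1.18, "c": 14.3, "j": 7.0},
]
cost, mapping = assign(predicted, observed)
print(mapping, round(cost, 3))
```

For more than a handful of correlations, a dedicated linear assignment solver (for example the Hungarian algorithm) would replace the brute-force loop.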
Affiliation(s)
- Lee Griffiths
- AstraZeneca, Mereside, Alderley Park, Macclesfield, Cheshire, SK10 4TG, UK.
14
Binev Y, Marques MMB, Aires-de-Sousa J. Prediction of 1H NMR Coupling Constants with Associative Neural Networks Trained for Chemical Shifts. J Chem Inf Model 2007; 47:2089-97. [DOI: 10.1021/ci700172n] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Indexed: 11/28/2022]
Affiliation(s)
- Yuri Binev
- REQUIMTE and CQFB, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
- Maria M. B. Marques
- REQUIMTE and CQFB, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
- João Aires-de-Sousa
- REQUIMTE and CQFB, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
15
Golotvin SS, Vodopianov E, Pol R, Lefebvre BA, Williams AJ, Rutkowske RD, Spitzer TD. Automated structure verification based on a combination of 1D (1)H NMR and 2D (1)H - (13)C HSQC spectra. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2007; 45:803-13. [PMID: 17694570 DOI: 10.1002/mrc.2034] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 05/16/2023]
Abstract
A method for structure validation based on the simultaneous analysis of a 1D (1)H NMR and 2D (1)H - (13)C single-bond correlation spectrum such as HSQC or HMQC is presented here. When compared with the validation of a structure by a 1D (1)H NMR spectrum alone, the advantage of including a 2D HSQC spectrum in structure validation is that it adds not only the information of (13)C shifts, but also which proton shifts they are directly coupled to, and an indication of which methylene protons are diastereotopic. The absence from the 2D spectrum of peaks that appear in the 1D (1)H spectrum also gives a clear picture of which protons are attached to heteroatoms. Given all these benefits, combined NMR verification was expected, and found by all metrics, to be superior to validation by 1D (1)H NMR alone. Using multiple real-life data sets of chemical structures and the corresponding 1D and 2D data, it was possible to unambiguously identify at least 90% of the correct structures. As part of this test, challenging incorrect structures, mostly regioisomers, were also matched with each spectrum set. For these incorrect structures, the false positive rate was observed to be as low as 6%.
Affiliation(s)
- Sergey S Golotvin
- Advanced Chemistry Development, Inc., Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
16
Dunkel R, Wu X. Identification of organic molecules from a structure database using proton and carbon NMR analysis results. JOURNAL OF MAGNETIC RESONANCE (SAN DIEGO, CALIF. : 1997) 2007; 188:97-110. [PMID: 17631401 PMCID: PMC2096635 DOI: 10.1016/j.jmr.2007.06.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 01/15/2007] [Revised: 06/04/2007] [Accepted: 06/14/2007] [Indexed: 05/16/2023]
Abstract
A compound is identified by matching its proton and/or carbon NMR spectra to NIH PubChem molecular structures. The matching process involves analyzing 1D proton, 1D carbon, DEPT, and/or HSQC spectra, and comparing the number of NMR resonances, detected proton and carbon shifts, likely number of methyl- and methoxy-groups, and an optionally specified molecular formula to predicted proton and carbon shifts of PubChem structures. A structure verification module rates the consistency between experimental spectral analysis results and a proposed structure (not limited to PubChem structures) and assigns observed shifts to the proposed structure. The spectral analysis, structure identification, and structure verification are largely automated in a software package and can be performed in minutes.
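The database matching idea in this abstract, comparing the number of resonances and the detected shifts against predicted shifts for each candidate structure, can be sketched roughly as follows. This is an assumption-laden illustration rather than the authors' scoring scheme: the function names, the mean-absolute-deviation score, the candidate labels and all shift values are hypothetical.

```python
def shift_score(observed, predicted):
    """Mean absolute deviation (ppm) between sorted observed and predicted shifts.

    Returns infinity when the resonance counts differ, which mimics rejecting
    a candidate whose number of resonances is inconsistent with the spectrum.
    """
    if not observed or len(observed) != len(predicted):
        return float("inf")
    pairs = zip(sorted(observed), sorted(predicted))
    return sum(abs(o - p) for o, p in pairs) / len(observed)


def rank_candidates(observed, candidates):
    """Return candidate names ordered from best to worst agreement."""
    scored = [(shift_score(observed, shifts), name) for name, shifts in candidates.items()]
    return [name for _, name in sorted(scored)]


# Illustrative (13)C shifts in ppm; not taken from PubChem or the paper.
observed_c = [18.1, 58.0]
candidates = {
    "candidate_A": [18.3, 57.8],        # close match
    "candidate_B": [59.4, 59.4],        # right count, poor shift agreement
    "candidate_C": [15.9, 16.3, 15.9],  # wrong number of resonances
}
ranking = rank_candidates(observed_c, candidates)
print(ranking)
```

A fuller version would also weigh proton shifts, methyl and methoxy counts, and an optional molecular formula, as the abstract lists.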
Affiliation(s)
- Reinhard Dunkel
- ScienceSoft LLC, 9934 Pinehurst Drive, Sandy, UT 84092, USA.