1
Ratziu V, Hompesch M, Petitjean M, Serdjebi C, Iyer JS, Parwani AV, Tai D, Bugianesi E, Cusi K, Friedman SL, Lawitz E, Romero-Gómez M, Schuppan D, Loomba R, Paradis V, Behling C, Sanyal AJ. Artificial intelligence-assisted digital pathology for non-alcoholic steatohepatitis: current status and future directions. J Hepatol 2024; 80:335-351. [PMID: 37879461 DOI: 10.1016/j.jhep.2023.10.015] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 03/07/2023] [Revised: 08/28/2023] [Accepted: 10/09/2023] [Indexed: 10/27/2023]
Abstract
The worldwide prevalence of non-alcoholic steatohepatitis (NASH) is increasing, causing a significant medical burden, but no approved therapeutics are currently available. NASH drug development requires histological analysis of liver biopsies by expert pathologists for trial enrolment and efficacy assessment, which can be hindered by multiple issues including sample heterogeneity, inter-reader and intra-reader variability, and ordinal scoring systems. Consequently, there is a high unmet need for accurate, reproducible, quantitative, and automated methods to assist pathologists with histological analysis to improve the precision around treatment and efficacy assessment. Digital pathology (DP) workflows in combination with artificial intelligence (AI) have been established in other areas of medicine and are being actively investigated in NASH to assist pathologists in the evaluation and scoring of NASH histology. DP/AI models can be used to automatically detect, localise, quantify, and score histological parameters and have the potential to reduce the impact of scoring variability in NASH clinical trials. This narrative review provides an overview of DP/AI tools in development for NASH, highlights key regulatory considerations, and discusses how these advances may impact the future of NASH clinical management and drug development. This should be a high priority in the NASH field, particularly to improve the development of safe and effective therapeutics.
Affiliation(s)
- Vlad Ratziu
- Sorbonne Université, ICAN Institute for Cardiometabolism and Nutrition, Hospital Pitié-Salpêtrière, INSERM UMRS 1138 CRC, Paris, France.
- Anil V Parwani
- Department of Pathology, The Ohio State University, Columbus, OH, USA
- Kenneth Cusi
- Division of Endocrinology, Diabetes and Metabolism, University of Florida, Gainesville, FL, USA
- Scott L Friedman
- Division of Liver Diseases, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Eric Lawitz
- Texas Liver Institute, University of Texas Health San Antonio, San Antonio, TX, USA
- Manuel Romero-Gómez
- Hospital Universitario Virgen del Rocío, CiberEHD, Instituto de Biomedicina de Sevilla (HUVR/CSIC/US), Universidad de Sevilla, Seville, Spain
- Detlef Schuppan
- Institute of Translational Immunology and Department of Medicine, University Medical Center, Mainz, Germany; Department of Hepatology and Gastroenterology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Rohit Loomba
- NAFLD Research Center, University of California at San Diego, San Diego, CA, USA
- Valérie Paradis
- Université Paris Cité, Service d'Anatomie Pathologique, Hôpital Beaujon, Paris, France
- Arun J Sanyal
- Division of Gastroenterology, Hepatology and Nutrition, Virginia Commonwealth University, Richmond, VA, USA
2
ACD/Structure Elucidator: 20 Years in the History of Development. Molecules 2021; 26:6623. [PMID: 34771032 PMCID: PMC8588187 DOI: 10.3390/molecules26216623] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/19/2021] [Revised: 10/19/2021] [Accepted: 10/28/2021] [Indexed: 12/04/2022] [Open Access]
Abstract
The first methods associated with the Computer-Assisted Structure Elucidation (CASE) of small molecules were published over fifty years ago when spectroscopy and computer science were both in their infancy. The incredible leaps in both areas of technology could not have been envisaged at that time, but both have enabled CASE expert systems to achieve performance levels that in their present state can outperform many scientists in terms of speed to solution. The computer-assisted analysis of enormous matrices of data, exemplified by 1D and 2D high-resolution NMR spectroscopy datasets, can easily solve what just a few years ago would have been deemed to be complex structures. While not a panacea, the application of such tools can provide support to even the most skilled spectroscopist. By this point the structures of a great number of molecular skeletons, including hundreds of complex natural products, have been elucidated using such programs. At this juncture, the expert system ACD/Structure Elucidator is likely the most advanced CASE system available and, being a commercial software product, is installed and used in many organizations. This article will provide an overview of the research and development required to pursue the lofty goals set almost two decades ago to facilitate highly automated approaches to solving complex structures from analytical spectroscopy data, using NMR as the primary data-type.
3
Valli M, Russo HM, Pilon AC, Pinto MEF, Dias NB, Freire RT, Castro-Gamboa I, Bolzani VDS. Computational methods for NMR and MS for structure elucidation I: software for basic NMR. Physical Sciences Reviews 2019. [DOI: 10.1515/psr-2018-0108] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/30/2023]
Abstract
Structure elucidation is an important and sometimes time-consuming step for natural products research. This step has evolved in the past few years to a faster and more automated process due to the development of several computational programs and analytical techniques. In this paper, the topics of NMR prediction and CASE programs are addressed. Furthermore, the elucidation of natural peptides is discussed.
4
Wolfender JL, Nuzillard JM, van der Hooft JJJ, Renault JH, Bertrand S. Accelerating Metabolite Identification in Natural Product Research: Toward an Ideal Combination of Liquid Chromatography–High-Resolution Tandem Mass Spectrometry and NMR Profiling, in Silico Databases, and Chemometrics. Anal Chem 2018; 91:704-742. [DOI: 10.1021/acs.analchem.8b05112] [Citation(s) in RCA: 113] [Impact Index Per Article: 16.1] [Indexed: 02/07/2023]
Affiliation(s)
- Jean-Luc Wolfender
- School of Pharmaceutical Sciences, EPGL, University of Geneva, University of Lausanne, CMU, 1 Rue Michel Servet, 1211 Geneva 4, Switzerland
- Jean-Marc Nuzillard
- Institut de Chimie Moléculaire de Reims, UMR CNRS 7312, Université de Reims Champagne Ardenne, 51687 Reims Cedex 2, France
- Jean-Hugues Renault
- Institut de Chimie Moléculaire de Reims, UMR CNRS 7312, Université de Reims Champagne Ardenne, 51687 Reims Cedex 2, France
- Samuel Bertrand
- Groupe Mer, Molécules, Santé-EA 2160, UFR des Sciences Pharmaceutiques et Biologiques, Université de Nantes, 44035 Nantes, France
- ThalassOMICS Metabolomics Facility, Plateforme Corsaire, Biogenouest, 44035 Nantes, France
5
Nuzillard JM, Plainchont B. Tutorial for the structure elucidation of small molecules by means of the LSD software. Magnetic Resonance in Chemistry 2018; 56:458-468. [PMID: 28543725 DOI: 10.1002/mrc.4612] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/03/2017] [Revised: 05/09/2017] [Accepted: 05/12/2017] [Indexed: 05/12/2023]
Abstract
Automatic structure elucidation of small molecules by means of the "logic for structure elucidation" (LSD) software is introduced in the context of the automatic exploitation of chemical shift correlation data and with minimal input from chemical shift values. The first step in solving a structural problem by means of LSD is the extraction of pertinent data from the 1D and 2D spectra. This operation requires the labeling of the resonances and of their correlations; its reliability highly depends on the quality of the spectra. The combination of COSY, HSQC, and HMBC spectra results in proximity relationships between nonhydrogen atoms that are associated in order to build the possible solutions of a problem. A simple molecule, camphor, serves as an example for the writing of an LSD input file and to show how solution structures are obtained. An input file for LSD must contain a nonambiguous description of each atom, or atom status, which includes the chemical element symbol, the hybridization state, the number of bound hydrogen atoms and the formal electric charge. In case of atom status ambiguity, the pyLSD program performs clarification by systematically generating the status of the atoms. PyLSD also proposes the use of the nmrshiftdb algorithm in order to rank the solutions of a problem according to the quality of the fit between the experimental carbon-13 chemical shifts, and the ones predicted from the proposed structures. To conclude, some hints toward future uses and developments of computer-assisted structure elucidation by LSD are proposed.
6
Perez M. Autonomous driving in NMR. Magnetic Resonance in Chemistry 2017; 55:15-21. [PMID: 27785822 DOI: 10.1002/mrc.4546] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/22/2016] [Revised: 10/20/2016] [Accepted: 10/24/2016] [Indexed: 06/06/2023]
Abstract
The automatic analysis of NMR data has been a much-desired endeavour for the last six decades, as is the case with any other analytical technique. This need for automation has only grown as advances in hardware, pulse sequences and automation have opened new research areas to NMR and increased the throughput of data. Full automatic analysis is a worthy, albeit hard, challenge, but in a world of artificial intelligence, instant communication and big data, it seems that this particular fight is happening with only one technique at a time (be it NMR, MS, IR, UV or any other), when the reality of most laboratories is that there are several types of analytical instrumentation present. Data aggregation, verification and elucidation by using complementary techniques (e.g. MS and NMR) is a desirable outcome to pursue, although a time-consuming one if performed manually; hence, the use of automation to perform the heavy lifting for users is required to make the approach attractive for scientists. Many of the decisions and workflows that could be implemented under automation will depend on the two-way communication with databases that understand analytical data, because it is desirable not only to query these databases but also to grow them in as much of an automatic manner as possible. How these databases are designed and set up, and how the data inside them are classified, will determine what workflows can be implemented. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Manuel Perez
- Mestrelab Research, S.L. Feliciano Barrera 9B-Baixo, Santiago de Compostela, Spain
7
Buevich AV, Elyashberg ME. Synergistic Combination of CASE Algorithms and DFT Chemical Shift Predictions: A Powerful Approach for Structure Elucidation, Verification, and Revision. Journal of Natural Products 2016; 79:3105-3116. [PMID: 28006916 DOI: 10.1021/acs.jnatprod.6b00799] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Indexed: 06/06/2023]
Abstract
Structure elucidation of complex natural products and new organic compounds remains a challenging problem. To support this endeavor, CASE (computer-assisted structure elucidation) expert systems were developed. These systems are capable of generating a set of all possible structures consistent with an ensemble of 2D NMR data followed by selection of the most probable structure on the basis of empirical NMR chemical shift prediction. However, in some cases, empirical chemical shift prediction is incapable of distinguishing the correct structure. Herein, we demonstrate for the first time that the combination of CASE and density functional theory (DFT) methods for NMR chemical shift prediction allows the determination of the correct structure even in difficult situations. An expert system, ACD/Structure Elucidator, was used for the CASE analysis. This approach has been tested on three challenging natural products: aquatolide, coniothyrione, and chiral epoxyroussoenone. This work has demonstrated that the proposed synergistic approach is an unbiased, reliable, and very efficient structure verification and de novo structure elucidation method that can be applied to difficult structural problems when other experimental methods would be difficult or impossible to use.
Affiliation(s)
- Alexei V Buevich
- Department of Discovery and Preclinical Sciences, Process Research and Development, NMR Structure Elucidation, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- Mikhail E Elyashberg
- Advanced Chemistry Development (ACD/Laboratories), Akademik Bakulev Street 6, 117513 Moscow, Russian Federation
8
Gaudêncio SP, Pereira F. Dereplication: racing to speed up the natural products discovery process. Nat Prod Rep 2015; 32:779-810. [PMID: 25850681 DOI: 10.1039/c4np00134f] [Citation(s) in RCA: 167] [Impact Index Per Article: 16.7] [Indexed: 12/23/2022]
Abstract
Covering: 1993-2014 (July). To alleviate the dereplication holdup, which is a major bottleneck in natural products discovery, scientists have been directing their research efforts to add tools to their "bag of tricks", aiming to achieve faster, more accurate and efficient ways to accelerate the pace of the drug discovery process. Consequently, dereplication has become a hot topic presenting a huge publication boom since 2012, blending multidisciplinary fields in new ways that provide important conceptual and/or methodological advances, opening up pioneering research prospects in this field.
Affiliation(s)
- Susana P Gaudêncio
- LAQV, REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal.
9
Metabolomics for unknown plant metabolites. Anal Bioanal Chem 2013; 405:5005-11. [DOI: 10.1007/s00216-013-6869-2] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.8] [Received: 01/30/2013] [Revised: 02/20/2013] [Accepted: 02/25/2013] [Indexed: 12/29/2022]
10
Moser A, Elyashberg ME, Williams AJ, Blinov KA, Dimartino JC. Blind trials of computer-assisted structure elucidation software. J Cheminform 2012; 4:5. [PMID: 22321892 PMCID: PMC3349476 DOI: 10.1186/1758-2946-4-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 11/29/2011] [Accepted: 02/09/2012] [Indexed: 11/15/2022] [Open Access]
Abstract
Background: One of the largest challenges in chemistry today remains that of efficiently mining through vast amounts of data in order to elucidate the chemical structure for an unknown compound. The elucidated candidate compound must be fully consistent with the data and any other competing candidates efficiently eliminated without doubt by using additional data if necessary. It has become increasingly necessary to incorporate an in silico structure generation and verification tool to facilitate this elucidation process. An effective structure elucidation software technology aims to mimic the skills of a human in interpreting the complex nature of spectral data while producing a solution within a reasonable amount of time. This type of software is known as computer-assisted structure elucidation or CASE software. A systematic trial of the ACD/Structure Elucidator CASE software was conducted over an extended period of time by analysing a set of single and double-blind trials submitted by a global audience of scientists. The purpose of the blind trials was to reduce subjective bias. Double-blind trials comprised data where the candidate compound was unknown to both the submitting scientist and the analyst. The level of expertise of the submitting scientist ranged from novice to expert structure elucidation specialists with experience in pharmaceutical, industrial, government and academic environments. Results: Beginning in 2003, and for the following nine years, the algorithms and software technology contained within ACD/Structure Elucidator have been tested against 112 data sets; many of these were unique challenges. Of these challenges 9% were double-blind trials. The results of eighteen of the single-blind trials were investigated in detail and included problems of a diverse nature with many of the specific challenges associated with algorithmic structure elucidation such as deficiency in protons, structure symmetry, a large number of heteroatoms and poor quality spectral data. Conclusion: When applied to a complex set of blind trials, ACD/Structure Elucidator was shown to be a very useful tool in advancing the computer's contribution to elucidating a candidate structure from a set of spectral data (NMR and MS) for an unknown. The synergistic interaction between humans and computers can be highly beneficial in terms of less biased approaches to elucidation as well as dramatic improvements in speed and throughput. In those cases where multiple candidate structures exist, ACD/Structure Elucidator is equipped to validate the correct structure and eliminate inconsistent candidates. Full elucidation can generally be performed in less than two hours; this includes the average spectral data processing time and data input.
Affiliation(s)
- Arvin Moser
- Advanced Chemistry Development, Toronto Department, 110 Yonge Street, 14th floor, Toronto, Ontario, M5C 1T4, Canada.
11
Elyashberg M, Blinov K, Molodtsov S, Williams A. Elucidating 'undecipherable' chemical structures using computer-assisted structure elucidation approaches. Magnetic Resonance in Chemistry 2012; 50:22-27. [PMID: 22259196 DOI: 10.1002/mrc.2849] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 07/23/2011] [Revised: 09/14/2011] [Accepted: 10/13/2011] [Indexed: 05/31/2023]
Abstract
Structure elucidation using 2D NMR data and application of traditional methods of structure elucidation are known to fail for certain problems. In this work, it is shown that computer-assisted structure elucidation methods are capable of solving such problems. We conclude that it is now impossible to evaluate the capabilities of novel NMR experimental techniques in isolation from expert systems developed for processing fuzzy, incomplete and contradictory information obtained from 2D NMR spectra.
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow, 117513, Russia
12
13
Elyashberg ME, Blinov KA, Molodtsov SG, Smurnyi ED. New computer-assisted methods for the elucidation of molecular structure from 2-D spectra. Journal of Analytical Chemistry 2011. [DOI: 10.1134/s1061934808010036] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
14
Elyashberg M, Blinov K, Smurnyy Y, Churanova T, Williams A. Empirical and DFT GIAO quantum-mechanical methods of (13)C chemical shifts prediction: competitors or collaborators? Magnetic Resonance in Chemistry 2010; 48:219-229. [PMID: 20108257 DOI: 10.1002/mrc.2571] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 05/28/2023]
Abstract
The accuracy of (13)C chemical shift prediction by both DFT GIAO quantum-mechanical (QM) and empirical methods was compared using 205 structures for which experimental and QM-calculated chemical shifts were published in the literature. For these structures, (13)C chemical shifts were calculated using HOSE code and neural network (NN) algorithms developed within our laboratory. In total, 2531 chemical shifts were analyzed and statistically processed. It has been shown that, in general, QM methods are capable of providing similar but inferior accuracy to the empirical approaches, but quite frequently they give larger mean average error values. For the structural set examined in this work, the following mean absolute errors (MAEs) were found: MAE(HOSE) = 1.58 ppm, MAE(NN) = 1.91 ppm and MAE(QM) = 3.29 ppm. A strategy of combined application of both the empirical and DFT GIAO approaches is suggested. The strategy could provide a synergistic effect if the advantages intrinsic to each method are exploited.
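For readers unfamiliar with the metric, the mean absolute error quoted above can be sketched in a few lines of Python. This is only an illustration: the function name `mean_absolute_error` and all shift values below are hypothetical and are not taken from the paper or from any prediction software.

```python
from statistics import mean


def mean_absolute_error(predicted, experimental):
    """Return the mean absolute deviation (ppm) between paired shift lists."""
    if len(predicted) != len(experimental):
        raise ValueError("shift lists must have the same length")
    if not predicted:
        raise ValueError("at least one chemical shift is required")
    return mean(abs(p - e) for p, e in zip(predicted, experimental))


# Hypothetical 13C shifts (ppm) for a five-carbon fragment; NOT data from the paper.
experimental = [14.1, 22.7, 31.9, 128.5, 170.2]
predicted_a = [13.6, 23.9, 30.4, 130.0, 172.0]  # stand-in for one prediction method
predicted_b = [15.5, 20.1, 35.0, 125.0, 166.0]  # stand-in for a second method

mae_a = mean_absolute_error(predicted_a, experimental)
mae_b = mean_absolute_error(predicted_b, experimental)

# The method with the smaller MAE reproduces the experimental shifts more closely.
print(f"MAE(a) = {mae_a:.2f} ppm, MAE(b) = {mae_b:.2f} ppm")
```

A lower MAE only says that a method fits this particular set of shifts better on average; it does not show how individual outliers are distributed.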
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev St, 117513 Moscow, Russian Federation
15
Elyashberg M, Williams AJ, Blinov K. Structural revisions of natural products by Computer-Assisted Structure Elucidation (CASE) systems. Nat Prod Rep 2010; 27:1296-328. [DOI: 10.1039/c002332a] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.4] [Indexed: 11/21/2022]
16
Elyashberg M, Blinov K, Williams A. A systematic approach for the generation and verification of structural hypotheses. Magnetic Resonance in Chemistry 2009; 47:371-389. [PMID: 19197914 DOI: 10.1002/mrc.2397] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 05/27/2023]
Abstract
During the process of molecular structure elucidation the selection of the most probable structural hypothesis may be based on chemical shift prediction. The prediction is carried out using either empirical or quantum-mechanical (QM) methods. When QM methods are used, NMR prediction commonly utilizes the GIAO option of the DFT approximation. In this approach the structural hypotheses are expected to be investigated by a scientist. In this article we hope to show that the most rational manner by which to create structural hypotheses is actually by the application of an expert system capable of deducing all potential structures consistent with the experimental spectral data and specifically using 2D NMR data. When an expert system is used the best structure(s) can be distinguished using chemical shift prediction, which is best performed either by an incremental or neural net algorithm. The time-consuming QM calculations can then be applied, if necessary, to one or more of the 'best' structures to confirm the suggested solution.
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
17
Elyashberg ME, Blinov KA, Williams AJ. The application of empirical methods of (13)C NMR chemical shift prediction as a filter for determining possible relative stereochemistry. Magnetic Resonance in Chemistry 2009; 47:333-341. [PMID: 19206140 DOI: 10.1002/mrc.2396] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/27/2023]
Abstract
The reliable determination of stereocenters contained within chemical structures usually requires utilization of NMR data, chemical derivatization, molecular modeling, quantum-mechanical (QM) calculations and, if available, X-ray analysis. In this article, we show that the number of stereoisomers which need to be thoroughly verified, can be significantly reduced by the application of NMR chemical shift calculation to the full stereoisomer set of possibilities using a fragmental approach based on HOSE codes. The applicability of this suggested method is illustrated using experimental data published for a series of complex chemical structures.
Affiliation(s)
- Mikhail E Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
18
Elyashberg M, Blinov K, Molodtsov S, Smurnyy Y, Williams AJ, Churanova T. Computer-assisted methods for molecular structure elucidation: realizing a spectroscopist's dream. J Cheminform 2009; 1:3. [PMID: 20142986 PMCID: PMC2816863 DOI: 10.1186/1758-2946-1-3] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Received: 01/09/2009] [Accepted: 03/17/2009] [Indexed: 11/10/2022] [Open Access]
Abstract
Background: This article coincides with the 40 year anniversary of the first published works devoted to the creation of algorithms for computer-aided structure elucidation (CASE). The general principles on which CASE methods are based will be reviewed and the present state of the art in this field will be described using, as an example, the expert system Structure Elucidator. Results: The developers of CASE systems have been forced to overcome many obstacles hindering the development of a software application capable of drastically reducing the time and effort required to determine the structures of newly isolated organic compounds. Large complex molecules of up to 100 or more skeletal atoms with topological peculiarity can be quickly identified using the expert system Structure Elucidator based on spectral data. Logical analysis of 2D NMR data frequently allows for the detection of the presence of COSY and HMBC correlations of "nonstandard" length. Fuzzy structure generation provides a possibility to obtain the correct solution even in those cases when an unknown number of nonstandard correlations of unknown length are present in the spectra. The relative stereochemistry of big rigid molecules containing many stereocenters can be determined using the StrucEluc system and NOESY/ROESY 2D NMR data for this purpose. Conclusion: The StrucEluc system continues to be developed in order to expand the general applicability, provide improved workflows, usability of the system and increased reliability of the results. It is expected that expert systems similar to that described in this paper will receive increasing acceptance in the next decade and will ultimately be integrated directly to analytical instruments for the purpose of organic analysis. Work in this direction is in progress. In spite of the fact that many difficulties have already been overcome to deliver on the spectroscopist's dream of "fully automated structure elucidation" there is still work to do. Nevertheless, as the efficiency of expert systems is enhanced the solution of increasingly complex structural problems will be achievable.
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
19
Williams AJ, Elyashberg ME, Blinov KA, Lankin DC, Martin GE, Reynolds WF, Porco JA, Singleton CA, Su S. Applying computer-assisted structure elucidation algorithms for the purpose of structure validation: revisiting the NMR assignments of hexacyclinol. Journal of Natural Products 2008; 71:581-588. [PMID: 18257535 DOI: 10.1021/np070557t] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 05/25/2023]
Abstract
Computer-assisted structure elucidation (CASE) using a combination of 1D and 2D NMR data has been available for a number of years. These algorithms can be considered as "logic machines" capable of deriving all plausible structures from a set of structural constraints or "axioms", defined by the spectroscopic data and associated chemical information or prior knowledge. CASE programs allow the spectroscopist not only to determine structures from spectroscopic data but also to study the dependence of the proposed structure on changes to the set of axioms. In this article, we describe the application of the ACD/Structure Elucidator expert system to help resolve the conflict between two different hypothetical hexacyclinol structures derived by different researchers from the NMR spectra of this complex natural product. It has been shown that the combination of algorithms for both structure elucidation and structure validation delivered by the expert system enables the identification of the most probable structure as well as the associated chemical shift assignments.
Affiliation(s)
- A J Williams
- ChemZoo, 904 Tamaras Circle, Wake Forest, North Carolina 27587, USA.
20
Araya-Maturana R, Pessoa-Mahana H, Weiss-López B. Very Long-Range Correlations (ⁿJ(C,H), n > 3) in HMBC Spectra. Nat Prod Commun 2008. [DOI: 10.1177/1934578x0800300321] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022] [Open Access]
Abstract
The structural elucidation of natural products and complex organic molecules relies heavily on the application of proton-detected heteronuclear NMR. Among these techniques, the HMBC NMR experiment remains the most popular of the methods that sample long-range coupling constants. The HMBC (C-H) experiment allows the assignment of structural fragments through correlations between protons and carbons separated by more than one bond, usually two or three (²J(C,H) and ³J(C,H)). It is also possible to obtain valuable, sometimes crucial, information through very long-range, or nonstandard, correlations (ⁿJ(C,H), n > 3); they can, surprisingly, appear in standard HMBC spectra, or can be sought by performing several HMBC experiments with different long-range delays and using a deeper threshold in the contour plot.
Affiliation(s)
- Ramiro Araya-Maturana
- Department of Organic and Physical Chemistry, Faculty of Chemical and Pharmaceutical Sciences, University of Chile, P.O. Box 233, Santiago 1, Chile
- Hernán Pessoa-Mahana
- Department of Organic and Physical Chemistry, Faculty of Chemical and Pharmaceutical Sciences, University of Chile, P.O. Box 233, Santiago 1, Chile
- Boris Weiss-López
- Department of Chemistry, Faculty of Sciences, University of Chile, Santiago 1, Chile
21
|
Blinov KA, Smurnyy YD, Elyashberg ME, Churanova TS, Kvasha M, Steinbeck C, Lefebvre BA, Williams AJ. Performance validation of neural network based 13C NMR prediction using a publicly available data source. J Chem Inf Model 2008; 48:550-5. [PMID: 18293952 DOI: 10.1021/ci700363r] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
The validation of the performance of a neural network based 13C NMR prediction algorithm using a test set available from an open source publicly available database, NMRShiftDB, is described. The validation was performed using a version of the database containing ca. 214,000 chemical shifts as well as for two subsets of the database to compare performance when overlap with the training set is taken into account. The first subset, the "excluded shift set", contained ca. 93,000 chemical shifts that were absent from the ACD\CNMR DB used for training of the neural network and the ACD\CNMR prediction algorithm, while the second, the "included shift set", contained ca. 121,000 shifts that were present in the ACD\CNMR DB training set. This work has shown that the mean error between experimental and predicted shifts for the entire database is 1.59 ppm, while the mean deviation is 1.47 ppm for the subset with included shifts and 1.74 ppm for the subset with excluded shifts. Since similar work has been reported online for another algorithm, we compared the results with the errors determined using Robien's CNMR Neural Network Predictor using the entire NMRShiftDB for program validation.
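The figures quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes that the reported "mean error" is an average over individual shifts, so that the two subset means combine into the whole-database mean when weighted by shift count. The counts are the approximate ("ca.") values from the abstract, and the function name is hypothetical.

```python
# Approximate shift counts and mean errors (ppm) as quoted in the abstract.
SUBSETS = {
    "excluded": (93_000, 1.74),
    "included": (121_000, 1.47),
}


def pooled_mean_error(parts):
    """Combine per-subset mean errors into one mean, weighting by shift count."""
    total = sum(count for count, _ in parts.values())
    weighted_sum = sum(count * error for count, error in parts.values())
    return total, weighted_sum / total


total_shifts, pooled = pooled_mean_error(SUBSETS)
print(f"total shifts: {total_shifts}")          # 214000
print(f"pooled mean error: {pooled:.2f} ppm")   # 1.59 ppm
```

Under that assumption, the subset sizes sum to the stated ca. 214,000 shifts and the pooled error matches the reported 1.59 ppm.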
Affiliation(s)
- K A Blinov
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
22
|
Smurnyy YD, Blinov KA, Churanova TS, Elyashberg ME, Williams AJ. Toward more reliable 13C and 1H chemical shift prediction: a systematic comparison of neural-network and least-squares regression based approaches. J Chem Inf Model 2007; 48:128-34. [PMID: 18052244 DOI: 10.1021/ci700256n] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
Abstract
The efficacy of neural network (NN) and partial least-squares (PLS) methods is compared for the prediction of NMR chemical shifts for both 1H and 13C nuclei using very large databases containing millions of chemical shifts. The chemical structure description scheme used in this work is based on individual atoms rather than functional groups. The performance of each method was optimized in a systematic manner described in this work. Both methods, least-squares and neural network analyses, produce results of very similar quality, but the least-squares algorithm is approximately 2-3 times faster.
Affiliation(s)
- Yegor D Smurnyy
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation, and Advanced Chemistry Development, Inc., 110 Yonge Street, 14th Floor, Toronto, Ontario, Canada M5C 1T4
|
23
|
Elyashberg ME, Blinov KA, Molodtsov SG, Williams AJ, Martin GE. Fuzzy Structure Generation: A New Efficient Tool for Computer-Aided Structure Elucidation (CASE). J Chem Inf Model 2007; 47:1053-66. [PMID: 17385849 DOI: 10.1021/ci600528g] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Abstract
Contemporary Computer-Aided Structure Elucidation (CASE) systems are heavily based on the utilization of 2D NMR spectra. The utilization of HMBC/GHMBC and COSY/GCOSY correlations generally assumes that these correlations result from ²⁻³JCH and ²⁻³JHH spin-spin couplings, respectively, and consequently these values are used as the default setting in these systems. Our previous studies [1,2] have shown that about half of the problems studied actually contain some correlations of 4-6 bonds, so-called "nonstandard" correlations. In such cases the initial 2D NMR data are contradictory, and the correct solution is therefore not directly attainable. Unfortunately, nonstandard correlations and the number of intervening bonds usually cannot be identified experimentally. In this work we suggest a new approach that we term Fuzzy Structure Generation. This allows the solution of structural problems whose 2D NMR data contain an unknown number of nonstandard correlations having different and unknown lengths. Suggested methods for the application of Fuzzy Structure Generation are described, and their application is illustrated by a series of real-world examples. We conclude that Fuzzy Structure Generation is efficient, and there is no real alternative at present in terms of a universal practical method for the structure elucidation of organic molecules from 2D NMR data.
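The idea of tolerating an unknown number of nonstandard correlations can be illustrated with a toy consistency check. This is a minimal sketch and not the published Fuzzy Structure Generation algorithm, which operates during structure generation rather than as a filter applied afterwards. It assumes a candidate structure given as an adjacency map of carbon atoms and HMBC correlations given as pairs (carbon bearing the proton, correlated carbon). All names and the example chain are hypothetical.

```python
from collections import deque


def bond_distance(graph, start, goal):
    """Shortest path length in bonds between two atoms (breadth-first search)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        atom, dist = queue.popleft()
        for neighbour in graph[atom]:
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # the atoms are not connected


def count_nonstandard(graph, hmbc_pairs, max_standard_bonds=3):
    """Count correlations whose H-to-C path exceeds max_standard_bonds.

    The H-to-C path is one bond longer than the carbon-to-carbon path.
    Pairs with no connecting path are also counted as violations.
    """
    violations = 0
    for proton_carbon, carbon in hmbc_pairs:
        dist = bond_distance(graph, proton_carbon, carbon)
        if dist is None or dist + 1 > max_standard_bonds:
            violations += 1
    return violations


def is_acceptable(graph, hmbc_pairs, max_nonstandard):
    """Accept a candidate if at most max_nonstandard correlations are nonstandard."""
    return count_nonstandard(graph, hmbc_pairs) <= max_nonstandard


# Hypothetical linear chain C1-C2-C3-C4-C5.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
# Protons on C1 correlating to C2, C3 and C5: 2, 3 and 5 bonds respectively.
pairs = [(1, 2), (1, 3), (1, 5)]

print(is_acceptable(chain, pairs, max_nonstandard=0))  # False: strict 2-3 bond rule
print(is_acceptable(chain, pairs, max_nonstandard=1))  # True: one long correlation tolerated
```

With the strict default the candidate is rejected because of a single five-bond correlation, which mirrors the situation the abstract describes, where contradictory data make the correct structure unattainable. Allowing one nonstandard correlation admits it.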
Affiliation(s)
- Mikhail E Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|