1. Maglangit F, Wang S, Moser A, Kyeremeh K, Trembleau L, Zhou Y, Clark DJ, Tabudravu J, Deng H. Accraspiroketides A-B, Phenylnaphthacenoid-Derived Polyketides with Unprecedented [6 + 6 + 6 + 6] + [5 + 5] Spiro-Architecture. J Nat Prod 2024; 87:831-836. [PMID: 38551509 DOI: 10.1021/acs.jnatprod.3c01012] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/15/2024]
Abstract
Two novel polyketides, accraspiroketides A (1) and B (2), which feature unprecedented [6 + 6 + 6 + 6] + [5 + 5] spiro chemical architectures, were isolated from the Streptomyces sp. MA37 ΔaccJ mutant strain. Compounds 1 and 2 exhibit excellent activity against Gram-positive bacteria (MIC = 1.5-6.3 μg/mL). Notably, 1 and 2 are more active against clinically isolated Enterococcus faecium K60-39 (MIC = 4.0 μg/mL and 4.7 μg/mL, respectively) than ampicillin (MIC = 25 μg/mL).
Affiliation(s)
- Fleurdeliz Maglangit
  - Department of Biology and Environmental Science, College of Science, University of the Philippines Cebu, Gorordo Ave., Lahug, Cebu City, 6000 Philippines
- Shan Wang
  - State Key Laboratory of Microbial Technology, Shandong University, Qingdao 266237, People's Republic of China
- Arvin Moser
  - ACD/Laboratories, Advanced Chemistry Development, Toronto Department, 8 King Street East, Suite 107, Toronto, Ontario M5C 1B5, Canada
- Kwaku Kyeremeh
  - Department of Chemistry, University of Ghana, Accra LG56, Ghana
- Laurent Trembleau
  - Organic and Medicinal Chemistry, Marine Biodiscovery Centre and Laboratory of Supramolecular Chemistry, School of Natural and Computing Sciences, Aberdeen AB24 3UE, Scotland, U.K.
- Yongjun Zhou
  - Research Center for Marine Drugs, State Key Laboratory of Oncogenes and Related Genes, Department of Pharmacy, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200127, People's Republic of China
- David James Clark
  - EastChem, School of Chemistry, University of Edinburgh, Edinburgh EH9 3FJ, Scotland, U.K.
- Jioji Tabudravu
  - School of Pharmacy and Biomedical Sciences, University of Central Lancashire, Preston, Lancashire PR1 2HE, England, U.K.
- Hai Deng
  - Department of Chemistry, University of Aberdeen, Aberdeen AB24 3UE, Scotland, U.K.
2. Gaudêncio SP, Bayram E, Lukić Bilela L, Cueto M, Díaz-Marrero AR, Haznedaroglu BZ, Jimenez C, Mandalakis M, Pereira F, Reyes F, Tasdemir D. Advanced Methods for Natural Products Discovery: Bioactivity Screening, Dereplication, Metabolomics Profiling, Genomic Sequencing, Databases and Informatic Tools, and Structure Elucidation. Mar Drugs 2023; 21:md21050308. [PMID: 37233502 DOI: 10.3390/md21050308] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 03/23/2023] [Revised: 05/11/2023] [Accepted: 05/12/2023] [Indexed: 05/27/2023] [Open Access]
Abstract
Natural Products (NP) are essential for the discovery of novel drugs and products for numerous biotechnological applications. The NP discovery process is expensive and time-consuming, having as major hurdles dereplication (early identification of known compounds) and structure elucidation, particularly the determination of the absolute configuration of metabolites with stereogenic centers. This review comprehensively focuses on recent technological and instrumental advances, highlighting the development of methods that alleviate these obstacles, paving the way for accelerating NP discovery towards biotechnological applications. Herein, we emphasize the most innovative high-throughput tools and methods for advancing bioactivity screening, NP chemical analysis, dereplication, metabolite profiling, metabolomics, genome sequencing and/or genomics approaches, databases, bioinformatics, chemoinformatics, and three-dimensional NP structure elucidation.
Affiliation(s)
- Susana P Gaudêncio
  - Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
  - UCIBIO-Applied Molecular Biosciences Unit, Chemistry Department, NOVA School of Science and Technology, NOVA University of Lisbon, 2819-516 Caparica, Portugal
- Engin Bayram
  - Institute of Environmental Sciences, Room HKC-202, Hisar Campus, Bogazici University, Bebek, Istanbul 34342, Turkey
- Lada Lukić Bilela
  - Department of Biology, Faculty of Science, University of Sarajevo, 71000 Sarajevo, Bosnia and Herzegovina
- Mercedes Cueto
  - Instituto de Productos Naturales y Agrobiología-CSIC, 38206 La Laguna, Spain
- Ana R Díaz-Marrero
  - Instituto de Productos Naturales y Agrobiología-CSIC, 38206 La Laguna, Spain
  - Instituto Universitario de Bio-Orgánica (IUBO), Universidad de La Laguna, 38206 La Laguna, Spain
- Berat Z Haznedaroglu
  - Institute of Environmental Sciences, Room HKC-202, Hisar Campus, Bogazici University, Bebek, Istanbul 34342, Turkey
- Carlos Jimenez
  - CICA- Centro Interdisciplinar de Química e Bioloxía, Departamento de Química, Facultade de Ciencias, Universidade da Coruña, 15071 A Coruña, Spain
- Manolis Mandalakis
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, HCMR Thalassocosmos, 71500 Gournes, Crete, Greece
- Florbela Pereira
  - LAQV, REQUIMTE, Chemistry Department, NOVA School of Science and Technology, NOVA University of Lisbon, 2819-516 Caparica, Portugal
- Fernando Reyes
  - Fundación MEDINA, Avda. del Conocimiento 34, 18016 Armilla, Spain
- Deniz Tasdemir
  - GEOMAR Centre for Marine Biotechnology (GEOMAR-Biotech), Research Unit Marine Natural Products Chemistry, GEOMAR Helmholtz Centre for Ocean Research Kiel, Am Kiel-Kanal 44, 24106 Kiel, Germany
  - Faculty of Mathematics and Natural Science, Kiel University, Christian-Albrechts-Platz 4, 24118 Kiel, Germany
3. Sahayasheela VJ, Lankadasari MB, Dan VM, Dastager SG, Pandian GN, Sugiyama H. Artificial intelligence in microbial natural product drug discovery: current and emerging role. Nat Prod Rep 2022; 39:2215-2230. [PMID: 36017693 PMCID: PMC9931531 DOI: 10.1039/d2np00035k] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/30/2022]
Abstract
Covering: up to the end of 2022. Microorganisms are exceptional sources of a wide array of unique natural products and play a significant role in drug discovery. During the golden era, several life-saving antibiotics and anticancer agents were isolated from microbes; moreover, they are still widely used. However, difficulties in the isolation methods and repeated discoveries of the same molecules have caused a setback in the past. Artificial intelligence (AI) has had a profound impact on various research fields, and its application allows the effective performance of data analyses and predictions. With the advances in omics, it is possible to obtain a wealth of information for the identification, isolation, and target prediction of secondary metabolites. In this review, we discuss drug discovery based on natural products from microorganisms with the help of AI and machine learning.
Affiliation(s)
- Vinodh J Sahayasheela
  - Department of Chemistry, Graduate School of Science, Kyoto University, Kitashirakawa-Oiwakecho, Sakyo-Ku, Kyoto 606-8502, Japan
- Manendra B Lankadasari
  - Thoracic Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Vipin Mohan Dan
  - Microbiology Division, Jawaharlal Nehru Tropical Botanic Garden and Research Institute, Thiruvananthapuram, Kerala, India
- Syed G Dastager
  - NCIM Resource Centre, Division of Biochemical Sciences, CSIR - National Chemical Laboratory, Pune, Maharashtra, India
- Ganesh N Pandian
  - Institute for Integrated Cell-Material Sciences (WPI-iCeMS), Kyoto University, Yoshida-Ushinomaecho, Sakyo-Ku, Kyoto 606-8501, Japan
- Hiroshi Sugiyama
  - Department of Chemistry, Graduate School of Science, Kyoto University, Kitashirakawa-Oiwakecho, Sakyo-Ku, Kyoto 606-8502, Japan
  - Institute for Integrated Cell-Material Sciences (WPI-iCeMS), Kyoto University, Yoshida-Ushinomaecho, Sakyo-Ku, Kyoto 606-8501, Japan
4. Moreira LMG, Junker J. Sampling CASE Application for the Quality Control of Published Natural Product Structures. Molecules 2021; 26:molecules26247543. [PMID: 34946623 PMCID: PMC8708086 DOI: 10.3390/molecules26247543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2021] [Revised: 09/06/2021] [Accepted: 10/19/2021] [Indexed: 12/03/2022] [Open Access]
Abstract
Structure elucidation with NMR correlation data is dicey, as there is no way to tell how ambiguous the data set is and how reliably it will define a constitution. Many different software tools for computer-assisted structure elucidation (CASE) have become available over the past decades, all of which could ensure a better quality of the elucidation process, but their use is still not common. Since 2011, WebCocon has integrated the possibility to generate theoretical NMR correlation data, starting from an existing structural proposal, allowing this theoretical data then to be used for CASE. Now, WebCocon can also read the recently presented NMReDATA format, allowing for uncomplicated access to CASE with experimental data. With these capabilities, WebCocon presents itself as an easily accessible web tool for the quality control of proposed new natural products. Results of this application to several molecules from the literature are shown and demonstrate how CASE can contribute to improving the reliability of structure elucidation with NMR correlation data.
5. ACD/Structure Elucidator: 20 Years in the History of Development. Molecules 2021; 26:molecules26216623. [PMID: 34771032 PMCID: PMC8588187 DOI: 10.3390/molecules26216623] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/19/2021] [Revised: 10/19/2021] [Accepted: 10/28/2021] [Indexed: 12/04/2022] [Open Access]
Abstract
The first methods associated with the Computer-Assisted Structure Elucidation (CASE) of small molecules were published over fifty years ago when spectroscopy and computer science were both in their infancy. The incredible leaps in both areas of technology could not have been envisaged at that time, but both have enabled CASE expert systems to achieve performance levels that in their present state can outperform many scientists in terms of speed to solution. The computer-assisted analysis of enormous matrices of data, exemplified by 1D and 2D high-resolution NMR spectroscopy datasets, can easily solve what just a few years ago would have been deemed to be complex structures. While not a panacea, the application of such tools can provide support to even the most skilled spectroscopist. By this point the structures of a great number of molecular skeletons, including hundreds of complex natural products, have been elucidated using such programs. At this juncture, the expert system ACD/Structure Elucidator is likely the most advanced CASE system available and, being a commercial software product, is installed and used in many organizations. This article will provide an overview of the research and development required to pursue the lofty goals set almost two decades ago to facilitate highly automated approaches to solving complex structures from analytical spectroscopy data, using NMR as the primary data-type.
6. Köck M, Lindel T, Junker J. Incorporation of 4J-HMBC and NOE Data into Computer-Assisted Structure Elucidation with WebCocon. Molecules 2021; 26:molecules26164846. [PMID: 34443433 PMCID: PMC8398166 DOI: 10.3390/molecules26164846] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/22/2021] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 01/13/2023] [Open Access]
Abstract
Over the past decades, different software programs have been developed for Computer-Assisted Structure Elucidation (CASE) with NMR data using various approaches. WebCocon is one of them and has been continuously improved over the past 20 years. Here, we present the inclusion of 4JCH correlations (4J-HMBC) in the HMBC interpretation of Cocon and NOE data in WebCocon. The 4J-HMBC data is used during the structure generation process, while the NOE data is used in post-processing of the results. The marine natural product oxocyclostylidol was selected to demonstrate WebCocon's enhanced HMBC data processing capabilities. A systematic study of the 4JCH correlations of oxocyclostylidol was performed. The application of NOEs in CASE is demonstrated using the NOE correlations of the diterpene pyrone asperginol A known from the literature. As a result, we obtained a conformation that corresponds very well to the existing X-ray structure.
Affiliation(s)
- Matthias Köck
  - Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, 27570 Bremerhaven, Germany
  - Correspondence: (M.K.); (J.J.)
- Thomas Lindel
  - Institute of Organic Chemistry, Technical University of Braunschweig, 38106 Braunschweig, Germany
- Jochen Junker
  - Oswaldo Cruz Foundation–CDTS, Rio de Janeiro 21040-900, Brazil
  - Correspondence: (M.K.); (J.J.)
7. Elyashberg M, Argyropoulos D. Computer Assisted Structure Elucidation (CASE): Current and future perspectives. Magn Reson Chem 2021; 59:669-690. [PMID: 33197069 DOI: 10.1002/mrc.5115] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 08/20/2020] [Revised: 10/31/2020] [Accepted: 11/08/2020] [Indexed: 06/11/2023]
Abstract
The first efforts for the development of methods for Computer-Assisted Structure Elucidation (CASE) were published more than 50 years ago. CASE expert systems based on one-dimensional (1D) and two-dimensional (2D) Nuclear Magnetic Resonance (NMR) data have matured considerably by now. The structures of a great number of complex natural products have been elucidated and/or revised using such programs. In this article, we discuss the most likely directions in which CASE will evolve. We act on the premise that a synergistic interaction exists between CASE, new NMR experiments, and methods of computational chemistry, which are continuously being improved. The new developments in NMR experiments (long-range correlation experiments, pure-shift methods, coupling constants measurement and prediction, residual dipolar couplings [RDCs], and residual chemical shift anisotropies [RCSAs]), evolution of density functional theory (DFT), and machine learning algorithms will have an influence on CASE systems and vice versa. This is true also for new techniques for chemical analysis (Atomic Force Microscopy [AFM], "crystalline sponge" X-ray analysis, and micro-Electron Diffraction [micro-ED]), which will be used in combination with expert systems. We foresee that CASE will be utilized widely and become a routine tool for NMR spectroscopists and analysts in academic and industrial laboratories. We believe that the "golden age" of CASE is still in the future.
8. Stonik VA, Kolesnikova SA. Malabaricane and Isomalabaricane Triterpenoids, Including Their Glycoconjugated Forms. Mar Drugs 2021; 19:327. [PMID: 34198756 PMCID: PMC8228503 DOI: 10.3390/md19060327] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/01/2021] [Revised: 05/25/2021] [Accepted: 06/03/2021] [Indexed: 12/22/2022] [Open Access]
Abstract
In this review, we discuss the structural diversity, taxonomic distribution, biological activities, biogenesis, and synthesis of a rare group of terpenoids, the so-called malabaricane and isomalabaricane triterpenoids, as well as some compounds derived from them. Representatives of these groups were found in some higher and lower terrestrial plants, as well as in some fungi, and in a relatively small group of marine sponges. The skeletal systems of malabaricanes and isomalabaricanes are similar to each other, but differ principally in the stereochemistry of their tricyclic core fragments, consisting of two six-membered rings and one five-membered ring. Evolution of these triterpenoids provides a variety of rearranged, oxidized, and glycoconjugated products. These natural compounds have attracted a lot of attention for their biosynthetic origin and biological activity, especially for their extremely high cytotoxicity against tumor cells as well as promising neuroprotective properties at nanomolar concentrations.
Affiliation(s)
- Valentin A. Stonik
  - G.B. Elyakov Pacific Institute of Bioorganic Chemistry, Far Eastern Branch of Russian Academy of Sciences, Pr. 100-let Vladivostoku 159, 690022 Vladivostok, Russia
  - School of Natural Sciences, Far Eastern Federal University, Sukhanova Str. 8, 690000 Vladivostok, Russia
- Sophia A. Kolesnikova
  - G.B. Elyakov Pacific Institute of Bioorganic Chemistry, Far Eastern Branch of Russian Academy of Sciences, Pr. 100-let Vladivostoku 159, 690022 Vladivostok, Russia
9. Burns DC, Reynolds WF. Minimizing the risk of deducing wrong natural product structures from NMR data. Magn Reson Chem 2021; 59:500-533. [PMID: 33855734 DOI: 10.1002/mrc.4933] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/27/2019] [Revised: 07/31/2019] [Accepted: 08/01/2019] [Indexed: 06/12/2023]
Abstract
There continues to be a disturbing number of natural products reported in the literature whose structures are incorrect. At least in part, this reflects the fact that many natural product chemists have limited formal nuclear magnetic resonance training. Gaps in training and lack of awareness regarding the challenges and ambiguities associated with two-dimensional nuclear magnetic resonance data interpretation can easily lead to errors in structure elucidation. The purpose of this tutorial is to point out some of these issues, highlight the kinds of errors that have been made and provide specific advice on how to avoid these missteps such that the risk of reporting a wrong structure is minimized.
Affiliation(s)
- Darcy C Burns
  - Department of Chemistry, University of Toronto, Toronto, Ontario, Canada
- William F Reynolds
  - Department of Chemistry, University of Toronto, Toronto, Ontario, Canada
10. Xiao J, Wang Y, Yang Y, Liu J, Chen G, Lin B, Hou Y, Li N. Natural potential neuroinflammatory inhibitors from Stephania epigaea H.S. Lo. Bioorg Chem 2020; 107:104597. [PMID: 33450546 DOI: 10.1016/j.bioorg.2020.104597] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/10/2020] [Revised: 12/22/2020] [Accepted: 12/22/2020] [Indexed: 12/23/2022]
Abstract
Stephania epigaea H. S. Lo is a folk medicine widely distributed in the south of China, especially in Yunnan and Guangxi provinces. An in vitro anti-neuroinflammatory study showed that its total alkaloids potently inhibit LPS-induced NO release in BV2 cells with an IC50 value of 10.05 ± 2.03 μg/mL (minocycline as the positive control, IC50 15.49 ± 2.14 μM). The phytochemical investigation of the total alkaloids afforded three new phenanthrenes (1-3), two lactams (4a, 4b), and nine aporphine derivatives (5-13). Because many substitution patterns were possible, the final structure of 1 was identified by computer-assisted structure elucidation (ACD/Structure Elucidator software and 13C NMR calculation with the GIAO method). All isolates were evaluated for their anti-neuroinflammatory effects, and 5, 8, 10, and 11 exhibited stronger inhibitory activities than minocycline. The results suggest that S. epigaea could provide potential therapeutic agents for neurodegenerative diseases.
Affiliation(s)
- Jiao Xiao
  - School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Wenhua Road 103, Shenyang 110016, People's Republic of China
- Yingjie Wang
  - School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Wenhua Road 103, Shenyang 110016, People's Republic of China
- Yanqiu Yang
  - College of Life and Health Sciences, Northeastern University, Shenyang 110004, People's Republic of China
- Jingyu Liu
  - College of Life and Health Sciences, Northeastern University, Shenyang 110004, People's Republic of China
- Gang Chen
  - School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Wenhua Road 103, Shenyang 110016, People's Republic of China
- Bin Lin
  - School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Yue Hou
  - College of Life and Health Sciences, Northeastern University, Shenyang 110004, People's Republic of China
- Ning Li
  - School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Wenhua Road 103, Shenyang 110016, People's Republic of China
11. Duan ZK, Lv TM, Song GS, Wang YX, Lin B, Huang XX. Structure reassignment of two triterpenes with CASE algorithms and DFT chemical shift predictions. Nat Prod Res 2020; 36:229-236. [PMID: 32524840 DOI: 10.1080/14786419.2020.1777122] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/29/2023]
Abstract
Two triterpenes with a cyclobutane ring, (14S,17S,20S,24R)-25-hydroxy-14,17-cyclo-20,24-epoxy-malabarican-3-one (CEM, 1a) and (14S,17S,20S,24R)-20,24,25-trihydroxy-14,17-cyclomalabarican-3-one (CM, 2a), were reported to have the same NMR data as ocotillone (1b) and gardaubryone C (2b), respectively, which suggests that incorrect structures may have been reported. Therefore, the structures of these triterpenes were reanalyzed using a CASE algorithm and DFT chemical shift predictions, and the results showed that the structures of CEM and CM might be incorrect. To further verify the structure of compound 1, the HMBC, 1H-1H COSY and HSQC-TOCSY spectra were employed. Herein, we revise the structures of CEM and CM, and our study also shows that a CASE algorithm combined with DFT chemical shift predictions can serve as an effective structure reassignment method.
Affiliation(s)
- Zhi-Kang Duan
  - Key Laboratory of Computational Chemistry-Based Natural Antitumor Drug Research & Development, Liaoning Province, School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Shenyang, People's Republic of China
- Tian-Ming Lv
  - Key Laboratory of Computational Chemistry-Based Natural Antitumor Drug Research & Development, Liaoning Province, School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Shenyang, People's Republic of China
- Guan-Shan Song
  - School of Pharmacy, China Pharmaceutical University, Nanjing, People's Republic of China
- Yu-Xi Wang
  - Key Laboratory of Computational Chemistry-Based Natural Antitumor Drug Research & Development, Liaoning Province, School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Shenyang, People's Republic of China
- Bin Lin
  - School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang, People's Republic of China
- Xiao-Xiao Huang
  - Key Laboratory of Computational Chemistry-Based Natural Antitumor Drug Research & Development, Liaoning Province, School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Shenyang, People's Republic of China
12. Buevich AV, Elyashberg ME. Enhancing computer-assisted structure elucidation with DFT analysis of J-couplings. Magn Reson Chem 2020; 58:594-606. [PMID: 31916609 DOI: 10.1002/mrc.4996] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/27/2019] [Revised: 01/06/2020] [Accepted: 01/07/2020] [Indexed: 06/10/2023]
Abstract
Computer-assisted structure elucidation (CASE) is the class of expert systems that derives molecular structures primarily from one-dimensional and two-dimensional nuclear magnetic resonance data. Contemporary CASE systems, including Advanced Chemistry Development/Structure Elucidator (ACD/SE), consider cross-peaks in heteronuclear multiple bond coherence (HMBC) and correlation spectroscopy (COSY) spectra as two- or three-bond correlations by default. However, four and more bond correlations (nonstandard correlations [NSCs]) could be present in these spectra too. The indiscriminate addition of NSCs to the CASE computations is prohibitively expensive. To address this problem, the ACD/SE program performs a logical analysis of observed correlations and determines the minimum number of NSCs. Guided by this information, a more efficient fuzzy structure generation (FSG) algorithm is subsequently applied. Until now, the FSG algorithm was utilized without any verification of the reliability of found NSCs. Here, we report a verification method for NSCs based on the relationship between NSCs and J-couplings computed with high accuracy density functional theory (DFT) methods. We used the example of strychnine to show that 41 (32%) of 8-Hz HMBC cross-peaks were NSCs and were consistent with 4-6 JCH couplings greater than 0.3 Hz. This cutoff value was largely confirmed by the analysis of NSCs in 11 real-world natural products elucidated by ACD/SE. Additionally, utilizing the example of the CASE study of cleospinol A, we showed that the DFT-computed J-couplings of NSCs can distinctively differentiate the correct structure among six proposed isomers. The proposed approach of NSC verification should further improve the robustness of CASE analysis and can help reveal potential problems with reported experimental data.
Affiliation(s)
- Alexei V Buevich
  - Department of Discovery and Preclinical Sciences, Process Research and Development, NMR Structure Elucidation, Merck & Co., Inc, Kenilworth, NJ
- Mikhail E Elyashberg
  - Moscow Department, Advanced Chemistry Development (ACD/Laboratories), Moscow, Russia
13. Burns DC, Mazzola EP, Reynolds WF. The role of computer-assisted structure elucidation (CASE) programs in the structure elucidation of complex natural products. Nat Prod Rep 2019; 36:919-933. [DOI: 10.1039/c9np00007k] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Indexed: 01/11/2023]
Abstract
Computer-assisted structure elucidation can help to determine the structures of complex natural products while minimizing the risk of structure errors.
Affiliation(s)
- Darcy C. Burns
  - Department of Chemistry, University of Toronto, Toronto, Canada
- Eugene P. Mazzola
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, USA
14. Pereira F, Aires-de-Sousa J. Computational Methodologies in the Exploration of Marine Natural Product Leads. Mar Drugs 2018; 16:md16070236. [PMID: 30011882 PMCID: PMC6070892 DOI: 10.3390/md16070236] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 06/14/2018] [Revised: 07/02/2018] [Accepted: 07/06/2018] [Indexed: 12/18/2022] [Open Access]
Abstract
Computational methodologies are assisting the exploration of marine natural products (MNPs) to make the discovery of new leads more efficient, to repurpose known MNPs, to target new metabolites on the basis of genome analysis, to reveal mechanisms of action, and to optimize leads. In silico efforts in drug discovery of NPs have mainly focused on two tasks: dereplication and prediction of bioactivities. The exploration of new chemical spaces and the application of predicted spectral data must be included in new approaches to select species, extracts, and growth conditions with maximum probabilities of medicinal chemistry novelty. In this review, the most relevant current computational dereplication methodologies are highlighted. Structure-based (SB) and ligand-based (LB) chemoinformatics approaches have become essential tools for the virtual screening of NPs either in small datasets of isolated compounds or in large-scale databases. The most common LB techniques include Quantitative Structure–Activity Relationships (QSAR), estimation of drug likeness, prediction of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties, similarity searching, and pharmacophore identification. Analogously, molecular dynamics, docking and binding cavity analysis have been used in SB approaches. Their significance and achievements are the main focus of this review.
Affiliation(s)
- Florbela Pereira
  - LAQV and REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
- Joao Aires-de-Sousa
  - LAQV and REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
15. Buevich AV, Elyashberg ME. Towards unbiased and more versatile NMR-based structure elucidation: A powerful combination of CASE algorithms and DFT calculations. Magn Reson Chem 2018; 56:493-504. [PMID: 28833470 DOI: 10.1002/mrc.4645] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 06/30/2017] [Revised: 08/08/2017] [Accepted: 08/13/2017] [Indexed: 06/07/2023]
Abstract
Computer-assisted structure elucidation (CASE) is composed of two steps: (a) generation of all possible structural isomers for a given molecular formula and 2D NMR data (COSY, HSQC, and HMBC) and (b) selection of the correct isomer based on empirical chemical shift predictions. This method has been very successful in solving structural problems of small organic molecules and natural products. However, CASE applications are generally limited to structural isomer problems and can sometimes be inconclusive due to insufficient accuracy of empirical shift predictions. Here, we report a synergistic combination of a CASE algorithm and density functional theory calculations that broadens the range of amenable structural problems to encompass proton-deficient molecules, molecules with heavy elements (e.g., halogens), conformationally flexible molecules, and configurational isomers.
Collapse
Affiliation(s)
- Alexei V Buevich
- Discovery and Preclinical Sciences, Process and Analytical Chemistry, NMR Structure Elucidation, Merck & Co., Inc., Kenilworth, NJ, 07033, USA
|
16
|
Kikuchi J, Ito K, Date Y. Environmental metabolomics with data science for investigating ecosystem homeostasis. PROGRESS IN NUCLEAR MAGNETIC RESONANCE SPECTROSCOPY 2018; 104:56-88. [PMID: 29405981 DOI: 10.1016/j.pnmrs.2017.11.003] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Academic Contribution Register] [Received: 11/08/2017] [Revised: 11/19/2017] [Accepted: 11/19/2017] [Indexed: 05/08/2023]
Abstract
A natural ecosystem can be viewed as the interconnections between complex metabolic reactions and environments. Humans, a part of these ecosystems, and their activities strongly affect the environments. To account for human effects within ecosystems, understanding what benefits humans receive by facilitating the maintenance of environmental homeostasis is important. This review describes recent applications of several NMR approaches to the evaluation of environmental homeostasis by metabolic profiling and data science. The basic NMR strategy used to evaluate homeostasis using big data collection is similar to that used in human health studies. Sophisticated metabolomic approaches (metabolic profiling) are widely reported in the literature. Further challenges include the analysis of complex macromolecular structures, and of the compositions and interactions of plant biomass, soil humic substances, and aqueous particulate organic matter. To support the study of these topics, we also discuss sample preparation techniques and solid-state NMR approaches. Because NMR approaches can produce a number of data with high reproducibility and inter-institution compatibility, further analysis of such data using machine learning approaches is often worthwhile. We also describe methods for data pretreatment in solid-state NMR and for environmental feature extraction from heterogeneously-measured spectroscopic data by machine learning approaches.
Affiliation(s)
- Jun Kikuchi
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Graduate School of Bioagricultural Sciences, Nagoya University, 1 Furo-cho, Chikusa-ku, Nagoya, Aichi 464-0810, Japan.
| | - Kengo Ito
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Yasuhiro Date
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
17
|
Navarro-Vázquez A, Gil RR, Blinov K. Computer-Assisted 3D Structure Elucidation (CASE-3D) of Natural Products Combining Isotropic and Anisotropic NMR Parameters. JOURNAL OF NATURAL PRODUCTS 2018; 81:203-210. [PMID: 29323895 DOI: 10.1021/acs.jnatprod.7b00926] [Citation(s) in RCA: 97] [Impact Index Per Article: 13.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Academic Contribution Register] [Indexed: 05/26/2023]
Abstract
A computer-assisted structural elucidation (CASE-3D) strategy based on the use of isotropic and/or anisotropic NMR data is proposed to elucidate relative configuration and preferred conformation in complex natural products. The methodology involves the selection of conformational models through the use of the Akaike Information Criterion and scoring of the different configurations. As illustrative examples, the methodology furnished the correct configuration of the already known compounds artemisinin (1) and homodimericin A (2). Revised structures (5 and 6), including their absolute configuration, for the recently reported curcusones I (3) and J (4) are proposed.
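The model-selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common Gaussian-error form of the Akaike Information Criterion, AIC = χ² + 2k, where k is the number of fitted parameters; the abstract does not state the exact scoring used, and the model names and χ² values below are hypothetical.

```python
def aic_from_chi2(chi2: float, n_params: int) -> float:
    """AIC under Gaussian errors: chi-squared misfit plus 2 per fitted parameter."""
    return chi2 + 2 * n_params


def select_model(models):
    """Return the (name, chi2, n_params) tuple with the lowest AIC."""
    return min(models, key=lambda m: aic_from_chi2(m[1], m[2]))


# Hypothetical conformational models: (label, chi-squared fit, parameter count).
models = [
    ("one conformer", 12.4, 1),
    ("two conformers", 6.1, 2),
    ("three conformers", 5.9, 3),
]

best = select_model(models)
print(best[0], aic_from_chi2(best[1], best[2]))
```

The penalty term is what keeps the three-conformer model from winning here: its fit is only marginally better, so the extra parameter is not justified.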
Affiliation(s)
- Armando Navarro-Vázquez
- Departamento de Química Fundamental, Universidade Federal de Pernambuco, Avenida Professor Moraes Rego, 1235, Cidade Universitária, 50670-901 Recife, PE, Brazil
| | - Roberto R Gil
- Department of Chemistry, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, Pennsylvania 15213, United States
| | - Kirill Blinov
- MestReLab Research S. L. Feliciano Barrera, 9 Baixo, Santiago de Compostela, A Coruña, 15706 Spain
|
18
|
Harn YC, Su BH, Ku YL, Lin OA, Chou CF, Tseng YJ. NP-StructurePredictor: Prediction of Unknown Natural Products in Plant Mixtures. J Chem Inf Model 2017; 57:3138-3148. [PMID: 29131618 DOI: 10.1021/acs.jcim.7b00565] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Academic Contribution Register] [Indexed: 01/22/2023]
Abstract
Identification of the individual chemical constituents of a mixture, especially solutions extracted from medicinal plants, is a time-consuming task. The identification results are often limited by challenges such as the development of separation methods and the availability of known reference standards. A novel structure elucidation system, NP-StructurePredictor, is presented and used to accelerate the process of identifying chemical structures in a mixture based on a branch and bound algorithm combined with a large collection of natural product databases. NP-StructurePredictor requires only targeted molecular weights calculated from a list of m/z values from liquid chromatography-mass spectrometry (LC-MS) experiments as input information to predict the chemical structures of individual components matching the weights in a mixture. NP-StructurePredictor also provides the predicted structures with statistically calculated probabilities so that the most likely chemical structures of the natural products and their analogs can be proposed accordingly. Four data sets consisting of different Chinese herbs with mixtures containing known compounds were selected for validation studies, and all their components were correctly identified and highly predicted using NP-StructurePredictor. NP-StructurePredictor demonstrated its applicability for predicting the chemical structures of novel compounds by returning highly accurate results from four different validation case studies.
Affiliation(s)
- Yeu-Chern Harn
- Graduate Institute of Networking and Multimedia, National Taiwan University, No. 1 Roosevelt Road Section 4, Taipei 10617, Taiwan; The Metabolomics Core Laboratory, NTU Center of Genomic Medicine, 7F, No. 2, Syujhou Road, Taipei 10055, Taiwan
| | - Bo-Han Su
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Roosevelt Road Section 4, Taipei 10617, Taiwan
| | - Yuan-Ling Ku
- Medical and Pharmaceutical Industry Technology and Development Center, 7F, No. 9, Wuquan Road, Wugu District, New Taipei City 24886, Taiwan
| | - Olivia A Lin
- Graduate Institute of Biomedical Electronic and Bioinformatics, National Taiwan University, No. 1 Roosevelt Road Section 4, Taipei 10617, Taiwan
| | - Cheng-Fu Chou
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Roosevelt Road Section 4, Taipei 10617, Taiwan
| | - Y Jane Tseng
- The Metabolomics Core Laboratory, NTU Center of Genomic Medicine, 7F, No. 2, Syujhou Road, Taipei 10055, Taiwan; Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Roosevelt Road Section 4, Taipei 10617, Taiwan; Graduate Institute of Biomedical Electronic and Bioinformatics, National Taiwan University, No. 1 Roosevelt Road Section 4, Taipei 10617, Taiwan; Drug Research Center, National Taiwan University College of Medicine, No. 1 Jen Ai Road Section 1, Taipei 10051, Taiwan
|
19
|
Su BH, Shen MY, Harn YC, Wang SY, Schurz A, Lin C, Lin OA, Tseng YJ. An efficient computer-aided structural elucidation strategy for mixtures using an iterative dynamic programming algorithm. J Cheminform 2017; 9:57. [PMID: 29143270 PMCID: PMC5688056 DOI: 10.1186/s13321-017-0244-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Academic Contribution Register] [Received: 08/21/2017] [Accepted: 11/01/2017] [Indexed: 11/25/2022] Open
Abstract
The identification of chemical structures in natural product mixtures is an important task in drug discovery but is still a challenging problem, as structural elucidation is a time-consuming process and is limited by the available mass spectra of known natural products. Computer-aided structure elucidation (CASE) strategies seek to automatically propose a list of possible chemical structures in mixtures by utilizing chromatographic and spectroscopic methods. However, current CASE tools still cannot automatically solve structures for experienced natural product chemists. Here, we formulated the structural elucidation of natural products in a mixture as a computational problem by extending a list of scaffolds using a weighted side chain list after analyzing a collection of 243,130 natural products and designed an efficient algorithm to precisely identify the chemical structures. The complexity of such a problem is NP-complete. A dynamic programming (DP) algorithm can solve this NP-complete problem in pseudo-polynomial time after converting floating point molecular weights into integers. However, the running time of the DP algorithm degrades exponentially as the precision of the mass spectrometry experiment grows. To ideally solve in polynomial time, we proposed a novel iterative DP algorithm that can quickly recognize the chemical structures of natural products. By utilizing this algorithm to elucidate the structures of four natural products that were experimentally and structurally determined, the algorithm can search the exact solutions, and the time performance was shown to be in polynomial time for average cases. The proposed method improved the speed of the structural elucidation of natural products and helped broaden the spectrum of available compounds that could be applied as new drug candidates. A web service built for structural elucidation studies is freely accessible via the following link (http://csccp.cmdm.tw/).
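The pseudo-polynomial dynamic programming idea described in this abstract can be sketched as follows. This is an illustrative unbounded subset-sum over integerized masses, not the authors' iterative algorithm; the function name, the scaffold mass, and the side-chain masses are hypothetical.

```python
def match_side_chains(target_mass, scaffold_mass, side_chains, decimals=2):
    """Find one multiset of side chains whose masses, added to the scaffold,
    reproduce the target mass at the given number of decimal digits."""
    scale = 10 ** decimals
    # Convert floating-point masses to integers so they can index a DP table.
    residual = round((target_mass - scaffold_mass) * scale)
    if residual < 0:
        return None
    chains = [(name, round(mass * scale)) for name, mass in side_chains]

    reachable = [False] * (residual + 1)
    prev = [None] * (residual + 1)  # last side chain used to reach each mass
    reachable[0] = True
    for t in range(1, residual + 1):
        for name, w in chains:
            if 0 < w <= t and reachable[t - w]:
                reachable[t] = True
                prev[t] = (name, w)
                break

    if not reachable[residual]:
        return None
    combo = []
    t = residual
    while t > 0:  # walk back through the table to recover the combination
        name, w = prev[t]
        combo.append(name)
        t -= w
    return sorted(combo)


# Hypothetical scaffold (100.00 Da) and side-chain masses, for illustration only.
side_chains = [("CH2", 14.02), ("O", 15.99)]
print(match_side_chains(144.03, 100.00, side_chains))
```

The table length is proportional to `10 ** decimals`, which is why the cost of this kind of DP grows quickly as mass precision increases, as the abstract notes.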
Affiliation(s)
- Bo-Han Su
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Meng-Yu Shen
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Yeu-Chern Harn
- Graduate Institute of Networking and Multimedia, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - San-Yuan Wang
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Alioune Schurz
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Chieh Lin
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Olivia A Lin
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Yufeng J Tseng
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan.
|
20
|
Schurz A, Su BH, Tu YS, Lu TTY, Lin OA, Tseng YJ. G.A.M.E.: GPU-accelerated mixture elucidator. J Cheminform 2017; 9:50. [PMID: 29086161 PMCID: PMC5602814 DOI: 10.1186/s13321-017-0238-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Academic Contribution Register] [Received: 01/12/2017] [Accepted: 09/05/2017] [Indexed: 11/23/2022] Open
Abstract
GPU acceleration is useful in solving complex chemical information problems. Identifying unknown structures from the mass spectra of natural product mixtures has been a desirable yet unresolved issue in metabolomics. However, this elucidation process has been hampered by complex experimental data and the inability of instruments to completely separate different compounds. Fortunately, with current high-resolution mass spectrometry, one feasible strategy is to define this problem as extending a scaffold database with sidechains of different probabilities to match the high-resolution mass obtained from a high-resolution mass spectrum. By introducing a dynamic programming (DP) algorithm, it is possible to solve this NP-complete problem in pseudo-polynomial time. However, the running time of the DP algorithm grows by orders of magnitude as the number of mass decimal digits increases, thus limiting the boost in structural prediction capabilities. By harnessing the heavily parallel architecture of modern GPUs, we designed a “compute unified device architecture” (CUDA)-based GPU-accelerated mixture elucidator (G.A.M.E.) that considerably improves the performance of the DP, allowing up to five decimal digits for input mass data. As exemplified by four testing datasets with verified constitutions from natural products, G.A.M.E. allows for efficient and automatic structural elucidation of unknown mixtures for practical procedures.
Affiliation(s)
- Alioune Schurz
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Bo-Han Su
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Yi-Shu Tu
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Tony Tsung-Yu Lu
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Olivia A Lin
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Yufeng J Tseng
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan; Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan; Drug Research Center, National Taiwan University College of Medicine, No. 1 Sec. 1, Jen Ai Road, Taipei, 106, Taiwan.
|
21
|
Troche‐Pesqueira E, Anklin C, Gil RR, Navarro‐Vázquez A. Computer‐Assisted 3D Structure Elucidation of Natural Products using Residual Dipolar Couplings. Angew Chem Int Ed Engl 2017. [DOI: 10.1002/ange.201612454] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Academic Contribution Register] [Indexed: 11/11/2022]
Affiliation(s)
| | - Clemens Anklin
- Bruker BioSpin Corp. 15 Fortune Dr. Billerica MA 01821 USA
| | - Roberto R. Gil
- Department of Chemistry Carnegie Mellon University 4400 Fifth Ave Pittsburgh PA 15213 USA
| | - Armando Navarro‐Vázquez
- Departamento de Química Fundamental, CCEN Universidade Federal de Pernambuco Brazil
- Departamento de Química Orgánica Universidade de Vigo 36310 Vigo Spain
|
22
|
Troche‐Pesqueira E, Anklin C, Gil RR, Navarro‐Vázquez A. Computer‐Assisted 3D Structure Elucidation of Natural Products using Residual Dipolar Couplings. Angew Chem Int Ed Engl 2017; 56:3660-3664. [DOI: 10.1002/anie.201612454] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Academic Contribution Register] [Received: 12/22/2016] [Indexed: 11/06/2022]
Affiliation(s)
| | - Clemens Anklin
- Bruker BioSpin Corp. 15 Fortune Dr. Billerica MA 01821 USA
| | - Roberto R. Gil
- Department of Chemistry Carnegie Mellon University 4400 Fifth Ave Pittsburgh PA 15213 USA
| | - Armando Navarro‐Vázquez
- Departamento de Química Fundamental, CCEN Universidade Federal de Pernambuco Brazil
- Departamento de Química Orgánica Universidade de Vigo 36310 Vigo Spain
|
23
|
Buevich AV, Elyashberg ME. Synergistic Combination of CASE Algorithms and DFT Chemical Shift Predictions: A Powerful Approach for Structure Elucidation, Verification, and Revision. JOURNAL OF NATURAL PRODUCTS 2016; 79:3105-3116. [PMID: 28006916 DOI: 10.1021/acs.jnatprod.6b00799] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Academic Contribution Register] [Indexed: 06/06/2023]
Abstract
Structure elucidation of complex natural products and new organic compounds remains a challenging problem. To support this endeavor, CASE (computer-assisted structure elucidation) expert systems were developed. These systems are capable of generating a set of all possible structures consistent with an ensemble of 2D NMR data followed by selection of the most probable structure on the basis of empirical NMR chemical shift prediction. However, in some cases, empirical chemical shift prediction is incapable of distinguishing the correct structure. Herein, we demonstrate for the first time that the combination of CASE and density functional theory (DFT) methods for NMR chemical shift prediction allows the determination of the correct structure even in difficult situations. An expert system, ACD/Structure Elucidator, was used for the CASE analysis. This approach has been tested on three challenging natural products: aquatolide, coniothyrione, and chiral epoxyroussoenone. This work has demonstrated that the proposed synergistic approach is an unbiased, reliable, and very efficient structure verification and de novo structure elucidation method that can be applied to difficult structural problems when other experimental methods would be difficult or impossible to use.
Affiliation(s)
- Alexei V Buevich
- Department of Discovery and Preclinical Sciences, Process Research and Development, NMR Structure Elucidation, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
| | - Mikhail E Elyashberg
- Advanced Chemistry Development (ACD/Laboratories), Akademik Bakulev Street 6, 117513 Moscow, Russian Federation
|
24
|
Grimblat N, Sarotti AM. Computational Chemistry to the Rescue: Modern Toolboxes for the Assignment of Complex Molecules by GIAO NMR Calculations. Chemistry 2016; 22:12246-61. [DOI: 10.1002/chem.201601150] [Citation(s) in RCA: 144] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Academic Contribution Register] [Received: 03/10/2016] [Indexed: 12/14/2022]
Affiliation(s)
- Nicolas Grimblat
- Instituto de Química Rosario CONICET Facultad de Ciencias Bioquímicas y Farmacéuticas; Universidad Nacional de Rosario; Suipacha 531 Rosario (2000) Argentina
| | - Ariel M. Sarotti
- Instituto de Química Rosario CONICET Facultad de Ciencias Bioquímicas y Farmacéuticas; Universidad Nacional de Rosario; Suipacha 531 Rosario (2000) Argentina
|
25
|
Gaudêncio SP, Pereira F. Dereplication: racing to speed up the natural products discovery process. Nat Prod Rep 2015; 32:779-810. [PMID: 25850681 DOI: 10.1039/c4np00134f] [Citation(s) in RCA: 167] [Impact Index Per Article: 16.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Academic Contribution Register] [Indexed: 12/23/2022]
Abstract
Covering: 1993-2014 (July). To alleviate the dereplication holdup, which is a major bottleneck in natural products discovery, scientists have been conducting their research efforts to add tools to their "bag of tricks" aiming to achieve faster, more accurate and efficient ways to accelerate the pace of the drug discovery process. Consequently dereplication has become a hot topic presenting a huge publication boom since 2012, blending multidisciplinary fields in new ways that provide important conceptual and/or methodological advances, opening up pioneering research prospects in this field.
Affiliation(s)
- Susana P Gaudêncio
- LAQV, REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal.
|
26
|
|
27
|
Reynolds WF, Mazzola EP. Nuclear magnetic resonance in the structural elucidation of natural products. PROGRESS IN THE CHEMISTRY OF ORGANIC NATURAL PRODUCTS 2015; 100:223-309. [PMID: 25632562 DOI: 10.1007/978-3-319-05275-5_3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Academic Contribution Register] [Indexed: 06/04/2023]
|
28
|
Abstract
Over the past 28 years there have been several thousand publications describing the use of 2D NMR to identify and characterize natural products. During this time period, the amount of sample needed for this purpose has decreased from the 20-50 mg range to under 1 mg. This has been due to improvements in both NMR hardware and methodology. This review will focus mainly on methodology improvements, particularly in pulse sequences, acquisition and processing methods which are particularly relevant to natural product research, with lesser discussion of hardware improvements.
|
29
|
Elyashberg M, Blinov K, Molodtsov S, Williams AJ. Structure revision of asperjinone using computer-assisted structure elucidation methods. JOURNAL OF NATURAL PRODUCTS 2013; 76:113-116. [PMID: 23289877 DOI: 10.1021/np300218g] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Academic Contribution Register] [Indexed: 06/01/2023]
Abstract
The elucidated structure of asperjinone (1), a natural product isolated from thermophilic Aspergillus terreus, was revised using the expert system Structure Elucidator. The reliability of the revised structure (2) was confirmed using 180 structures containing the (3,3-dimethyloxiran-2-yl)methyl fragment (3) as a basis for comparison and whose chemical shifts contradict the suggested structure (1).
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
30
|
Pletnev I, Erin A, McNaught A, Blinov K, Tchekhovskoi D, Heller S. InChIKey collision resistance: an experimental testing. J Cheminform 2012; 4:39. [PMID: 23256896 PMCID: PMC3558395 DOI: 10.1186/1758-2946-4-39] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Academic Contribution Register] [Received: 10/19/2012] [Accepted: 12/11/2012] [Indexed: 11/12/2022] Open
Abstract
InChIKey is a 27-character compacted (hashed) version of InChI which is intended for Internet and database searching/indexing and is based on an SHA-256 hash of the InChI character string. The first block of InChIKey encodes molecular skeleton while the second block represents various kinds of isomerism (stereo, tautomeric, etc.). InChIKey is designed to be a nearly unique substitute for the parent InChI. However, a single InChIKey may occasionally map to two or more InChI strings (collision). The appearance of collision itself does not compromise the signature as collision-free hashing is impossible; the only viable approach is to set and keep a reasonable level of collision resistance which is sufficient for typical applications. We tested, in computational experiments, how well the real-life InChIKey collision resistance corresponds to the theoretical estimates expected by design. For this purpose, we analyzed the statistical characteristics of InChIKey for datasets of variable size in comparison to the theoretical statistical frequencies. For the relatively short second block, an exhaustive direct testing was performed. We computed and compared to theory the numbers of collisions for the stereoisomers of Spongistatin I (using the whole set of 67,108,864 isomers and its subsets). For the longer first block, we generated, using custom-made software, InChIKeys for more than 3 × 10^10 chemical structures. The statistical behavior of this block was tested by comparison of experimental and theoretical frequencies for the various four-letter sequences which may appear in the first block body. From the results of our computational experiments we conclude that the observed characteristics of InChIKey collision resistance are in good agreement with theoretical expectations.
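A rough sketch of the kind of theoretical estimate this abstract compares against: the birthday-bound expectation of colliding pairs among n uniformly distributed hashes. The function names are hypothetical, and the 37-bit size used for the second-block hash is an assumption that is not stated in the abstract; the figure printed is an illustration, not the paper's result. The example key is the InChIKey of ethanol.

```python
def split_inchikey(key: str):
    """Split a 27-character InChIKey into its hyphen-separated blocks."""
    if len(key) != 27:
        raise ValueError("an InChIKey has 27 characters")
    first, second, last = key.split("-")
    return first, second, last


def expected_collisions(n_items: int, hash_bits: int) -> float:
    """Birthday-bound estimate: expected number of colliding pairs when
    n_items values are hashed uniformly into 2**hash_bits buckets."""
    space = 2 ** hash_bits
    return n_items * (n_items - 1) / (2 * space)


print(split_inchikey("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"))

# 67,108,864 stereoisomers is 2**26; 37 bits is an assumed second-block hash size.
print(expected_collisions(67_108_864, 37))
```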
Affiliation(s)
- Igor Pletnev
- Department of Chemistry, Lomonosov Moscow State University, 119991, Moscow, Russia.
|
31
|
Dufour C, Wink J, Kurz M, Kogler H, Olivan H, Sablé S, Heyse W, Gerlitz M, Toti L, Nußer A, Rey A, Couturier C, Bauer A, Brönstrup M. Isolation and Structural Elucidation of Armeniaspirols A-C: Potent Antibiotics against Gram-Positive Pathogens. Chemistry 2012; 18:16123-8. [DOI: 10.1002/chem.201201635] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Academic Contribution Register] [Received: 05/09/2012] [Indexed: 02/04/2023]
|
32
|
Moser A, Elyashberg ME, Williams AJ, Blinov KA, Dimartino JC. Blind trials of computer-assisted structure elucidation software. J Cheminform 2012; 4:5. [PMID: 22321892 PMCID: PMC3349476 DOI: 10.1186/1758-2946-4-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Academic Contribution Register] [Received: 11/29/2011] [Accepted: 02/09/2012] [Indexed: 11/15/2022] Open
Abstract
Background: One of the largest challenges in chemistry today remains that of efficiently mining through vast amounts of data in order to elucidate the chemical structure for an unknown compound. The elucidated candidate compound must be fully consistent with the data and any other competing candidates efficiently eliminated without doubt by using additional data if necessary. It has become increasingly necessary to incorporate an in silico structure generation and verification tool to facilitate this elucidation process. An effective structure elucidation software technology aims to mimic the skills of a human in interpreting the complex nature of spectral data while producing a solution within a reasonable amount of time. This type of software is known as computer-assisted structure elucidation or CASE software. A systematic trial of the ACD/Structure Elucidator CASE software was conducted over an extended period of time by analysing a set of single and double-blind trials submitted by a global audience of scientists. The purpose of the blind trials was to reduce subjective bias. Double-blind trials comprised data where the candidate compound was unknown to both the submitting scientist and the analyst. The level of expertise of the submitting scientist ranged from novice to expert structure elucidation specialists with experience in pharmaceutical, industrial, government and academic environments. Results: Beginning in 2003, and for the following nine years, the algorithms and software technology contained within ACD/Structure Elucidator have been tested against 112 data sets; many of these were unique challenges. Of these challenges 9% were double-blind trials. The results of eighteen of the single-blind trials were investigated in detail and included problems of a diverse nature with many of the specific challenges associated with algorithmic structure elucidation such as deficiency in protons, structure symmetry, a large number of heteroatoms and poor quality spectral data. Conclusion: When applied to a complex set of blind trials, ACD/Structure Elucidator was shown to be a very useful tool in advancing the computer's contribution to elucidating a candidate structure from a set of spectral data (NMR and MS) for an unknown. The synergistic interaction between humans and computers can be highly beneficial in terms of less biased approaches to elucidation as well as dramatic improvements in speed and throughput. In those cases where multiple candidate structures exist, ACD/Structure Elucidator is equipped to validate the correct structure and eliminate inconsistent candidates. Full elucidation can generally be performed in less than two hours; this includes the average spectral data processing time and data input.
Affiliation(s)
- Arvin Moser
- Advanced Chemistry Development, Toronto Department, 110 Yonge Street, 14th floor, Toronto, Ontario, M5C 1T4, Canada.
|
33
|
Elyashberg M, Blinov K, Molodtsov S, Williams A. Elucidating 'undecipherable' chemical structures using computer-assisted structure elucidation approaches. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2012; 50:22-27. [PMID: 22259196 DOI: 10.1002/mrc.2849] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Academic Contribution Register] [Received: 07/23/2011] [Revised: 09/14/2011] [Accepted: 10/13/2011] [Indexed: 05/31/2023]
Abstract
Structure elucidation using 2D NMR data and application of traditional methods of structure elucidation are known to fail for certain problems. In this work, it is shown that computer-assisted structure elucidation methods are capable of solving such problems. We conclude that it is now impossible to evaluate the capabilities of novel NMR experimental techniques in isolation from expert systems developed for processing fuzzy, incomplete and contradictory information obtained from 2D NMR spectra.
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow, 117513, Russia
|
34
|
|
35
|
Elyashberg ME, Blinov KA, Molodtsov SG, Smurnyi ED. New computer-assisted methods for the elucidation of molecular structure from 2-D spectra. JOURNAL OF ANALYTICAL CHEMISTRY 2011. [DOI: 10.1134/s1061934808010036] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Academic Contribution Register] [Indexed: 11/23/2022]
|
36
|
Elyashberg M, Blinov K, Smurnyy Y, Churanova T, Williams A. Empirical and DFT GIAO quantum-mechanical methods of (13)C chemical shifts prediction: competitors or collaborators? MAGNETIC RESONANCE IN CHEMISTRY : MRC 2010; 48:219-229. [PMID: 20108257 DOI: 10.1002/mrc.2571] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Academic Contribution Register] [Indexed: 05/28/2023]
Abstract
The accuracy of (13)C chemical shift prediction by both DFT GIAO quantum-mechanical (QM) and empirical methods was compared using 205 structures for which experimental and QM-calculated chemical shifts were published in the literature. For these structures, (13)C chemical shifts were calculated using HOSE code and neural network (NN) algorithms developed within our laboratory. In total, 2531 chemical shifts were analyzed and statistically processed. It has been shown that, in general, QM methods are capable of providing similar but inferior accuracy to the empirical approaches, but quite frequently they give larger mean average error values. For the structural set examined in this work, the following mean absolute errors (MAEs) were found: MAE(HOSE) = 1.58 ppm, MAE(NN) = 1.91 ppm and MAE(QM) = 3.29 ppm. A strategy of combined application of both the empirical and DFT GIAO approaches is suggested. The strategy could provide a synergistic effect if the advantages intrinsic to each method are exploited.
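The mean absolute error statistic quoted in this abstract can be illustrated with a short sketch that also ranks candidate structures by how closely their predicted shifts match experiment. The shift values and candidate labels are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(predicted, experimental):
    """Mean absolute deviation (ppm) between predicted and experimental shifts."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("shift lists must be non-empty and of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def rank_candidates(candidates, experimental):
    """Sort (label, predicted_shifts) pairs from lowest to highest MAE."""
    return sorted(candidates, key=lambda c: mean_absolute_error(c[1], experimental))


# Hypothetical 13C shifts in ppm, for illustration only.
experimental = [170.1, 128.5, 55.3]
candidates = [
    ("candidate B", [165.0, 131.0, 60.2]),
    ("candidate A", [171.0, 127.9, 56.0]),
]

for label, shifts in rank_candidates(candidates, experimental):
    print(label, round(mean_absolute_error(shifts, experimental), 2))
```

The same function applies whether the predicted shifts come from an empirical method or from a quantum-mechanical calculation, which is what makes MAE a convenient common yardstick for comparing them.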
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev St, 117513 Moscow, Russian Federation
|
37
|
Elyashberg M, Williams AJ, Blinov K. Structural revisions of natural products by Computer-Assisted Structure Elucidation (CASE) systems. Nat Prod Rep 2010; 27:1296-328. [DOI: 10.1039/c002332a] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.4] [Indexed: 11/21/2022]
|
38
|
Elyashberg M, Blinov K, Williams A. A systematic approach for the generation and verification of structural hypotheses. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2009; 47:371-389. [PMID: 19197914 DOI: 10.1002/mrc.2397] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 05/27/2023]
Abstract
During the process of molecular structure elucidation the selection of the most probable structural hypothesis may be based on chemical shift prediction. The prediction is carried out using either empirical or quantum-mechanical (QM) methods. When QM methods are used, NMR prediction commonly utilizes the GIAO option of the DFT approximation. In this approach the structural hypotheses are expected to be investigated by the scientist. In this article we hope to show that the most rational manner by which to create structural hypotheses is actually by the application of an expert system capable of deducing all potential structures consistent with the experimental spectral data, specifically using 2D NMR data. When an expert system is used the best structure(s) can be distinguished using chemical shift prediction, which is best performed either by an incremental or neural net algorithm. The time-consuming QM calculations can then be applied, if necessary, to one or more of the 'best' structures to confirm the suggested solution.
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
39
|
Elyashberg ME, Blinov KA, Williams AJ. The application of empirical methods of (13)C NMR chemical shift prediction as a filter for determining possible relative stereochemistry. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2009; 47:333-341. [PMID: 19206140 DOI: 10.1002/mrc.2396] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/27/2023]
Abstract
The reliable determination of stereocenters contained within chemical structures usually requires utilization of NMR data, chemical derivatization, molecular modeling, quantum-mechanical (QM) calculations and, if available, X-ray analysis. In this article, we show that the number of stereoisomers that need to be thoroughly verified can be significantly reduced by applying NMR chemical shift calculation to the full set of possible stereoisomers using a fragmental approach based on HOSE codes. The applicability of this suggested method is illustrated using experimental data published for a series of complex chemical structures.
Affiliation(s)
- Mikhail E Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
40
|
Elyashberg M, Blinov K, Molodtsov S, Smurnyy Y, Williams AJ, Churanova T. Computer-assisted methods for molecular structure elucidation: realizing a spectroscopist's dream. J Cheminform 2009; 1:3. [PMID: 20142986 PMCID: PMC2816863 DOI: 10.1186/1758-2946-1-3] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Received: 01/09/2009] [Accepted: 03/17/2009] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND: This article coincides with the 40 year anniversary of the first published works devoted to the creation of algorithms for computer-aided structure elucidation (CASE). The general principles on which CASE methods are based will be reviewed and the present state of the art in this field will be described using, as an example, the expert system Structure Elucidator. RESULTS: The developers of CASE systems have been forced to overcome many obstacles hindering the development of a software application capable of drastically reducing the time and effort required to determine the structures of newly isolated organic compounds. Large complex molecules of up to 100 or more skeletal atoms with topological peculiarity can be quickly identified using the expert system Structure Elucidator based on spectral data. Logical analysis of 2D NMR data frequently allows for the detection of the presence of COSY and HMBC correlations of "nonstandard" length. Fuzzy structure generation provides a possibility to obtain the correct solution even in those cases when an unknown number of nonstandard correlations of unknown length are present in the spectra. The relative stereochemistry of big rigid molecules containing many stereocenters can be determined using the StrucEluc system and NOESY/ROESY 2D NMR data for this purpose. CONCLUSION: The StrucEluc system continues to be developed in order to expand the general applicability, provide improved workflows, usability of the system and increased reliability of the results. It is expected that expert systems similar to that described in this paper will receive increasing acceptance in the next decade and will ultimately be integrated directly to analytical instruments for the purpose of organic analysis. Work in this direction is in progress. In spite of the fact that many difficulties have already been overcome to deliver on the spectroscopist's dream of "fully automated structure elucidation" there is still work to do. Nevertheless, as the efficiency of expert systems is enhanced the solution of increasingly complex structural problems will be achievable.
Affiliation(s)
- Mikhail Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
41
|
Williams AJ, Elyashberg ME, Blinov KA, Lankin DC, Martin GE, Reynolds WF, Porco JA, Singleton CA, Su S. Applying computer-assisted structure elucidation algorithms for the purpose of structure validation: revisiting the NMR assignments of hexacyclinol. JOURNAL OF NATURAL PRODUCTS 2008; 71:581-588. [PMID: 18257535 DOI: 10.1021/np070557t] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 05/25/2023]
Abstract
Computer-assisted structure elucidation (CASE) using a combination of 1D and 2D NMR data has been available for a number of years. These algorithms can be considered as "logic machines" capable of deriving all plausible structures from a set of structural constraints or "axioms", defined by the spectroscopic data and associated chemical information or prior knowledge. CASE programs allow the spectroscopist not only to determine structures from spectroscopic data but also to study the dependence of the proposed structure on changes to the set of axioms. In this article, we describe the application of the ACD/Structure Elucidator expert system to help resolve the conflict between two different hypothetical hexacyclinol structures derived by different researchers from the NMR spectra of this complex natural product. It has been shown that the combination of algorithms for both structure elucidation and structure validation delivered by the expert system enables the identification of the most probable structure as well as the associated chemical shift assignments.
Affiliation(s)
- A J Williams
- ChemZoo, 904 Tamaras Circle, Wake Forest, North Carolina 27587, USA.
|
42
|
Blinov KA, Smurnyy YD, Elyashberg ME, Churanova TS, Kvasha M, Steinbeck C, Lefebvre BA, Williams AJ. Performance validation of neural network based (13)C NMR prediction using a publicly available data source. J Chem Inf Model 2008; 48:550-5. [PMID: 18293952 DOI: 10.1021/ci700363r] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
The validation of the performance of a neural network based 13C NMR prediction algorithm using a test set available from an open source publicly available database, NMRShiftDB, is described. The validation was performed using a version of the database containing ca. 214,000 chemical shifts as well as for two subsets of the database to compare performance when overlap with the training set is taken into account. The first subset contained ca. 93,000 chemical shifts that were absent from the ACD\CNMR DB, the "excluded shift set" used for training of the neural network and the ACD\CNMR prediction algorithm, while the second contained ca. 121,000 shifts that were present in the ACD\CNMR DB training set, the "included shift set". This work has shown that the mean error between experimental and predicted shifts for the entire database is 1.59 ppm, while the mean deviation for the subset with included shifts is 1.47 and 1.74 ppm for excluded shifts. Since similar work has been reported online for another algorithm we compared the results with the errors determined using Robien's CNMR Neural Network Predictor using the entire NMRShiftDB for program validation.
Affiliation(s)
- K A Blinov
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
43
|
Smurnyy YD, Blinov KA, Churanova TS, Elyashberg ME, Williams AJ. Toward more reliable 13C and 1H chemical shift prediction: a systematic comparison of neural-network and least-squares regression based approaches. J Chem Inf Model 2007; 48:128-34. [PMID: 18052244 DOI: 10.1021/ci700256n] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
Abstract
The efficacy of neural network (NN) and partial least-squares (PLS) methods is compared for the prediction of NMR chemical shifts for both 1H and 13C nuclei using very large databases containing millions of chemical shifts. The chemical structure description scheme used in this work is based on individual atoms rather than functional groups. The performances of each of the methods were optimized in a systematic manner described in this work. Both of the methods, least-squares and neural network analyses, produce results of a very similar quality, but the least-squares algorithm is approximately 2-3 times faster.
Affiliation(s)
- Yegor D Smurnyy
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation, and Advanced Chemistry Development, Inc., 110 Yonge Street, 14th Floor, Toronto, Ontario, Canada M5C 1T4
|
44
|
Elyashberg ME, Blinov KA, Molodtsov SG, Williams AJ, Martin GE. Fuzzy Structure Generation: A New Efficient Tool for Computer-Aided Structure Elucidation (CASE). J Chem Inf Model 2007; 47:1053-66. [PMID: 17385849 DOI: 10.1021/ci600528g] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Abstract
Contemporary Computer-Aided Structure Elucidation (CASE) systems are heavily based on the utilization of 2D NMR spectra. The utilization of HMBC/GHMBC and COSY/GCOSY correlations generally assumes that these correlations result from (2-3)JCH and (2-3)JHH spin-spin couplings, respectively, and consequently these values are used as the default setting in these systems. Our previous studies1,2 have shown that about half of the problems studied actually contain some correlations of 4-6 bonds, so-called "nonstandard" correlations. In such cases the initial 2D NMR data are contradictory, and the correct solution is therefore not directly attainable. Unfortunately nonstandard correlations and the number of intervening bonds usually cannot be identified experimentally. In this work we suggest a new approach that we term Fuzzy Structure Generation. This allows the solution of structural problems whose 2D NMR data contain an unknown number of nonstandard correlations having different and unknown lengths. Suggested methods for the application of Fuzzy Structure Generation are described, and their application is illustrated by a series of real-world examples. We conclude that Fuzzy Structure Generation is efficient, and there is no real alternative at present in terms of a universal practical method for the structure elucidation of organic molecules from 2D NMR data.
Affiliation(s)
- Mikhail E Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
45
|
Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinformatics 2007; 8:105. [PMID: 17389044 PMCID: PMC1851972 DOI: 10.1186/1471-2105-8-105] [Citation(s) in RCA: 732] [Impact Index Per Article: 40.7] [Received: 12/16/2006] [Accepted: 03/27/2007] [Indexed: 11/25/2022] [Open Access]
Abstract
Background: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas. Results: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reduce the complement of all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Third, 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80–99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries. Conclusion: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain an unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65–81%. Corresponding software and supplemental data are available for download from the authors' website.
|
46
|
Masui H, Hong H. Spec2D: a structure elucidation system based on 1H NMR and H-H COSY spectra in organic chemistry. J Chem Inf Model 2006; 46:775-87. [PMID: 16563009 DOI: 10.1021/ci0502810] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
A system for structure elucidation based on proton NMR spectra has been developed. The system, named Spec2D (system for spectra from 2D-NMR), incorporates 1H NMR and H-H correlation spectroscopy (COSY) spectral information obtained from 2D-NMR experiments. 2D-NMR is important for the structure elucidation because it provides information about the relationships among differently situated protons in the structures of unknown compounds. The system uses the concepts of molecular graphs. The improved representation of substructures as well as several novel algorithms for structure generation have been devised to solve the combinatorial problem and to reduce the processing time. Spec2D consists of a knowledge base, an analysis module, and a candidate structure generator module. Spec2D proposes candidate structures from only 1H NMR and H-H COSY spectral information of an unknown compound without any 13C NMR spectral or structural information, such as molecular formulas. Spec2D has the capability to propose the "new" structure of an unknown compound, if the corresponding substructures are included in the knowledge base.
Affiliation(s)
- Hideyuki Masui
- Organic Synthesis Research Laboratory, Sumitomo Chemical Co., Ltd., Osaka 554-8558, Japan.
|
47
|
Abstract
The history of chemoinformatics is reviewed in a decade-by-decade manner from the 1940s to the present. The focus is placed on four traditional research areas: chemical database systems, computer-assisted structure elucidation systems, computer-assisted synthesis design systems, and 3D structure builders. Considering the fact that computer technology has been one of the major driving forces of the development of chemoinformatics, each section will start from a brief description of the new advances in computer technology of each decade. The summary and future prospects are given in the last section.
|
48
|
Elyashberg ME, Blinov KA, Williams AJ, Molodtsov SG, Martin GE. Are Deterministic Expert Systems for Computer-Assisted Structure Elucidation Obsolete? J Chem Inf Model 2006; 46:1643-56. [PMID: 16859296 DOI: 10.1021/ci050469j] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/30/2022]
Abstract
Expert systems for spectroscopic molecular structure elucidation have been developed since the mid-1960s. Algorithms associated with the structure generation process within these systems are deterministic; that is, they are based on graph theory and combinatorial analysis. A series of expert systems utilizing 2D NMR spectra have been described in the literature and are capable of determining the molecular structures of large organic molecules including complex natural products. Recently, an opinion was expressed in the literature that these systems would fail when elucidating structures containing more than 30 heavy atoms. A suggestion was put forward that stochastic algorithms for structure generation would be necessary to overcome this shortcoming. In this article, we describe a comprehensive investigation of the capabilities of the deterministic expert system Structure Elucidator. The results of performing the structure elucidation of 250 complex natural products with this program were studied and generalized. The conclusion is that 2D NMR deterministic expert systems are certainly capable of elucidating large structures (up to about 100 heavy atoms) and can deal with the complexities associated with both poor and contradictory spectral data.
Affiliation(s)
- Mikhail E Elyashberg
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
49
|
Golotvin SS, Vodopianov E, Lefebvre BA, Williams AJ, Spitzer TD. Automated structure verification based on 1H NMR prediction. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2006; 44:524-38. [PMID: 16489552 DOI: 10.1002/mrc.1781] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/06/2023]
Abstract
A unique opportunity exists when an experimental NMR spectrum is obtained for which a specific chemical structure is anticipated. A process of Verification (the confirmation of a postulated structure) is now possible, as opposed to Elucidation (the de novo determination of a structure). A method for automated structure verification is suggested, which compares the chemical shifts, intensities and multiplicities of signals in an experimental 1H NMR spectrum with those from a predicted spectrum for the proposed structure. A match factor (MF) is produced and used to classify the spectrum-structure match into one of three categories: correct, ambiguous, or incorrect. The verification result is also augmented by the spectrum assignment obtained as part of the verification process. This method was tested on a set of synthetic spectra and several sets of experimental spectra, all of which were automatically prepared from raw data. Taking into account even the most problematic structures, with many labile protons present and poor prediction accuracy, 50% of all spectra can still be automatically verified without any false positives or negatives. In a blind test on a typical set of data, it is shown that fewer than 31% of the structures would need manual evaluation. This means that a system is possible whereby 69% of the spectra are prepared and evaluated automatically, and never need to be seen or evaluated by a human.
Affiliation(s)
- Sergey S Golotvin
- Advanced Chemistry Development Inc., Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
|
50
|
Blinov KA, Larin NI, Kvasha MP, Moser A, Williams AJ, Martin GE. Analysis and elimination of artifacts in indirect covariance NMR spectra via unsymmetrical processing. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2005; 43:999-1007. [PMID: 16144032 DOI: 10.1002/mrc.1674] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/04/2023]
Abstract
Indirect covariance NMR offers an alternative method of extracting spin-spin connectivity information via the conversion of an indirect-detection heteronuclear shift-correlation data matrix to a homonuclear data matrix. Using an IDR (inverted direct response)-HSQC-TOCSY spectrum as a starting point for the indirect covariance processing, a spectrum that can be described as a carbon-carbon COSY experiment is obtained. These data are analogous to the autocorrelated 13C-13C double quantum INADEQUATE experiment except that the indirect covariance NMR spectrum establishes carbon-carbon connectivities only between contiguous protonated carbons. Cyclopentafuranone and the complex polynuclear heteroaromatic naphtho[2',1':5,6]-naphtho[2',1':4,5]thieno[2,3-c]quinoline are used as model compounds. The former is a straightforward example because of its well-resolved proton spectrum, while the latter, which has considerable resonance overlap in its congested proton spectrum, gives rise to two types of artifact responses that must be considered when using the indirect covariance NMR method.
Affiliation(s)
- Kirill A Blinov
- Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117512, Russian Federation
|