Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Le T, Winter R, Noé F, Clevert DA. Neuraldecipher - reverse-engineering extended-connectivity fingerprints (ECFPs) to their molecular structures. Chem Sci 2020;11:10378-10389. [PMID: 34094299 PMCID: PMC8162443 DOI: 10.1039/d0sc03115a] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2020] [Accepted: 09/10/2020] [Indexed: 12/22/2022] Open

For:	Le T, Winter R, Noé F, Clevert DA. Neuraldecipher - reverse-engineering extended-connectivity fingerprints (ECFPs) to their molecular structures. Chem Sci 2020;11:10378-10389. [PMID: 34094299 PMCID: PMC8162443 DOI: 10.1039/d0sc03115a] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2020] [Accepted: 09/10/2020] [Indexed: 12/22/2022] Open

Number

Cited by Other Article(s)

Liu Q, He D, Fan M, Wang J, Cui Z, Wang H, Mi Y, Li N, Meng Q, Hou Y. Prediction and Interpretation Microglia Cytotoxicity by Machine Learning. J Chem Inf Model 2024. [PMID: 38949724 DOI: 10.1021/acs.jcim.4c00366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/02/2024]

Abstract

Ameliorating microglia-mediated neuroinflammation is a crucial strategy in developing new drugs for neurodegenerative diseases. Plant compounds are an important screening target for the discovery of drugs for the treatment of neurodegenerative diseases. However, due to the spatial complexity of phytochemicals, it becomes particularly important to evaluate the effectiveness of compounds while avoiding the mixing of cytotoxic substances in the early stages of compound screening. Traditional high-throughput screening methods suffer from high cost and low efficiency. A computational model based on machine learning provides a novel avenue for cytotoxicity determination. In this study, a microglia cytotoxicity classifier was developed using a machine learning approach. First, we proposed a data splitting strategy based on the molecule murcko generic scaffold, under this condition, three machine learning approaches were coupled with three kinds of molecular representation methods to construct microglia cytotoxicity classifier, which were then compared and assessed by the predictive accuracy, balanced accuracy, F1-score, and Matthews Correlation Coefficient. Then, the recursive feature elimination integrated with support vector machine (RFE-SVC) dimension reduction method was introduced to molecular fingerprints with high dimensions to further improve the model performance. Among all the microglial cytotoxicity classifiers, the SVM coupled with ECFP4 fingerprint after feature selection (ECFP4-RFE-SVM) obtained the most accurate classification for the test set (ACC of 0.99, BA of 0.99, F1-score of 0.99, MCC of 0.97). Finally, the Shapley additive explanations (SHAP) method was used in interpreting the microglia cytotoxicity classifier and key substructure smart identified as structural alerts. Experimental results show that ECFP4-RFE-SVM have reliable classification capability for microglia cytotoxicity, and SHAP can not only provide a rational explanation for microglia cytotoxicity predictions, but also offer a guideline for subsequent molecular cytotoxicity modifications.

Collapse

Affiliation(s)

Qing Liu College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Dakuo He College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Mengmeng Fan College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Jinpeng Wang College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Zeyu Cui College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Hao Wang College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
Yan Mi Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
Ning Li School of Traditional Chinese Materia Medica, Key Laboratory for TCM Material Basis Study and Innovative Drug Development of Shenyang City, Shenyang Pharmaceutical University, Shenyang 110016, P. R. China
Qingqi Meng Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
Yue Hou Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China

Collapse

Lin J, He Y, Ru C, Long W, Li M, Wen Z. Advancing Adverse Drug Reaction Prediction with Deep Chemical Language Model for Drug Safety Evaluation. Int J Mol Sci 2024;25:4516. [PMID: 38674100 PMCID: PMC11050562 DOI: 10.3390/ijms25084516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2024] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 04/28/2024] Open

Liang X, Liu S, Li Z, Deng Y, Jiang Y, Yang H. Efficient cocrystal coformer screening based on a Machine learning Strategy: A case study for the preparation of imatinib cocrystal with enhanced physicochemical properties. Eur J Pharm Biopharm 2024;196:114201. [PMID: 38309538 DOI: 10.1016/j.ejpb.2024.114201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2023] [Revised: 01/18/2024] [Accepted: 01/29/2024] [Indexed: 02/05/2024]

Yoshikai Y, Mizuno T, Nemoto S, Kusuhara H. Difficulty in chirality recognition for Transformer architectures learning chemical structures from string representations. Nat Commun 2024;15:1197. [PMID: 38365821 PMCID: PMC10873378 DOI: 10.1038/s41467-024-45102-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2023] [Accepted: 01/11/2024] [Indexed: 02/18/2024] Open

Mizuno T, Kusuhara H. Investigation of normalization procedures for transcriptome profiles of compounds oriented toward practical study design. J Toxicol Sci 2024;49:249-259. [PMID: 38825484 DOI: 10.2131/jts.49.249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/04/2024]

Ilnicka A, Schneider G. Designing molecules with autoencoder networks. NATURE COMPUTATIONAL SCIENCE 2023;3:922-933. [PMID: 38177601 DOI: 10.1038/s43588-023-00548-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/01/2023] [Accepted: 10/03/2023] [Indexed: 01/06/2024]

Cremer J, Medrano Sandonas L, Tkatchenko A, Clevert DA, De Fabritiis G. Equivariant Graph Neural Networks for Toxicity Prediction. Chem Res Toxicol 2023;36. [PMID: 37690056 PMCID: PMC10583285 DOI: 10.1021/acs.chemrestox.3c00032] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2023] [Indexed: 09/12/2023]

Abstract

Predictive modeling of toxicity is a crucial step in the drug discovery pipeline. It can help filter out molecules with a high probability of failing in the early stages of de novo drug design. Thus, several machine learning (ML) models have been developed to predict the toxicity of molecules by combining classical ML techniques or deep neural networks with well-known molecular representations such as fingerprints or 2D graphs. But the more natural, accurate representation of molecules is expected to be defined in physical 3D space like in ab initio methods. Recent studies successfully used equivariant graph neural networks (EGNNs) for representation learning based on 3D structures to predict quantum-mechanical properties of molecules. Inspired by this, we investigated the performance of EGNNs to construct reliable ML models for toxicity prediction. We used the equivariant transformer (ET) model in TorchMD-NET for this. Eleven toxicity data sets taken from MoleculeNet, TDCommons, and ToxBenchmark have been considered to evaluate the capability of ET for toxicity prediction. Our results show that ET adequately learns 3D representations of molecules that can successfully correlate with toxicity activity, achieving good accuracies on most data sets comparable to state-of-the-art models. We also test a physicochemical property, namely, the total energy of a molecule, to inform the toxicity prediction with a physical prior. However, our work suggests that these two properties can not be related. We also provide an attention weight analysis for helping to understand the toxicity prediction in 3D space and thus increase the explainability of the ML model. In summary, our findings offer promising insights considering 3D geometry information via EGNNs and provide a straightforward way to integrate molecular conformers into ML-based pipelines for predicting and investigating toxicity prediction in physical space. We expect that in the future, especially for larger, more diverse data sets, EGNNs will be an essential tool in this domain.

Collapse

Nemoto S, Mizuno T, Kusuhara H. Investigation of chemical structure recognition by encoder-decoder models in learning progress. J Cheminform 2023;15:45. [PMID: 37046349 PMCID: PMC10100163 DOI: 10.1186/s13321-023-00713-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2022] [Accepted: 03/18/2023] [Indexed: 04/14/2023] Open

Ucak UV, Ashyrmamatov I, Lee J. Reconstruction of lossless molecular representations from fingerprints. J Cheminform 2023;15:26. [PMID: 36823647 PMCID: PMC9948316 DOI: 10.1186/s13321-023-00693-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2022] [Accepted: 02/04/2023] [Indexed: 02/25/2023] Open

Muegge I, Hu Y. How do we further enhance 2D fingerprint similarity searching for novel drug discovery? Expert Opin Drug Discov 2022;17:1173-1176. [PMID: 36150044 DOI: 10.1080/17460441.2022.2128332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]

He Y, Liu K, Han L, Han W. Clustering Analysis, Structure Fingerprint Analysis, and Quantum Chemical Calculations of Compounds from Essential Oils of Sunflower (Helianthus annuus L.) Receptacles. Int J Mol Sci 2022;23:ijms231710169. [PMID: 36077567 PMCID: PMC9456235 DOI: 10.3390/ijms231710169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 08/25/2022] [Accepted: 08/30/2022] [Indexed: 11/25/2022] Open

Shen Y, Zhao E, Zhang W, Baccarelli AA, Gao F. Predicting pesticide dissipation half-life intervals in plants with machine learning models. JOURNAL OF HAZARDOUS MATERIALS 2022;436:129177. [PMID: 35643003 DOI: 10.1016/j.jhazmat.2022.129177] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/07/2022] [Revised: 05/04/2022] [Accepted: 05/15/2022] [Indexed: 06/15/2023]

Machine learning to empower electrohydrodynamic processing. MATERIALS SCIENCE & ENGINEERING. C, MATERIALS FOR BIOLOGICAL APPLICATIONS 2022;132:112553. [DOI: 10.1016/j.msec.2021.112553] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/08/2021] [Revised: 11/09/2021] [Accepted: 11/11/2021] [Indexed: 01/13/2023]

Gao F, Shen Y, Sallach JB, Li H, Liu C, Li Y. Direct Prediction of Bioaccumulation of Organic Contaminants in Plant Roots from Soils with Machine Learning Models Based on Molecular Structures. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021;55:16358-16368. [PMID: 34859664 DOI: 10.1021/acs.est.1c02376] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]

Clevert DA, Le T, Winter R, Montanari F. Img2Mol - accurate SMILES recognition from molecular graphical depictions. Chem Sci 2021;12:14174-14181. [PMID: 34760202 PMCID: PMC8565361 DOI: 10.1039/d1sc01839f] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2021] [Accepted: 09/22/2021] [Indexed: 11/25/2022] Open

Kwon Y, Kang S, Choi YS, Kim I. Evolutionary design of molecules based on deep learning and a genetic algorithm. Sci Rep 2021;11:17304. [PMID: 34453086 PMCID: PMC8397714 DOI: 10.1038/s41598-021-96812-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2020] [Accepted: 08/17/2021] [Indexed: 11/09/2022] Open

Rajan K, Zielesny A, Steinbeck C. DECIMER 1.0: deep learning for chemical image recognition using transformers. J Cheminform 2021;13:61. [PMID: 34404468 PMCID: PMC8369700 DOI: 10.1186/s13321-021-00538-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2021] [Accepted: 07/25/2021] [Indexed: 11/29/2022] Open

Liu Y, De Vijlder T, Bittremieux W, Laukens K, Heyndrickx W. Current and future deep learning algorithms for tandem mass spectrometry (MS/MS)-based small molecule structure elucidation. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2021:e9120. [PMID: 33955607 DOI: 10.1002/rcm.9120] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/13/2021] [Revised: 04/13/2021] [Accepted: 04/29/2021] [Indexed: 06/12/2023]

Abstract

RATIONALE

Structure elucidation of small molecules has been one of the cornerstone applications of mass spectrometry for decades. Despite the increasing availability of software tools, structure elucidation from tandem mass spectrometry (MS/MS) data remains a challenging task, leaving many spectra unidentified. However, as an increasing number of reference MS/MS spectra are being curated at a repository scale and shared on public servers, there is an exciting opportunity to develop powerful new deep learning (DL) models for automated structure elucidation.

ARCHITECTURES

Recent early-stage DL frameworks mostly follow a "two-step approach" that translates MS/MS spectra to database structures after first predicting molecular descriptors. The related architectures could suffer from: (1) computational complexity because of the separate training of descriptor-specific classifiers, (2) the high dimensional nature of mass spectral data and information loss due to data preprocessing, (3) low substructure coverage and class imbalance problem of predefined molecular fingerprints. Inspired by successful DL frameworks employed in drug discovery fields, we have conceptualized and designed hypothetical DL architectures to tackle the above issues. For (1), we recommend multitask learning to achieve better performance with fewer classifiers by grouping structurally related descriptors. For (2) and (3), we introduce feature engineering to extract condensed and higher-order information from spectra and structure data. For instance, encoding spectra with subtrees and pre-calculated spectral patterns add peak interactions to the model input. Encoding structures with graph convolutional networks incorporates connectivity within a molecule. The joint embedding of spectra and structures can enable simultaneous spectral library and molecular database search.

CONCLUSIONS

In principle, given enough training data, adapted DL architectures, optimal hyperparameters and computing power, DL frameworks can predict small molecule structures, completely or at least partially, from MS/MS spectra. However, their performance and general applicability should be fairly evaluated against classical machine learning frameworks.

Collapse

Jiménez-Luna J, Grisoni F, Weskamp N, Schneider G. Artificial intelligence in drug discovery: recent advances and future perspectives. Expert Opin Drug Discov 2021;16:949-959. [PMID: 33779453 DOI: 10.1080/17460441.2021.1909567] [Citation(s) in RCA: 83] [Impact Index Per Article: 27.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]

Ma H, An W, Wang Y, Sun H, Huang R, Huang J. Deep Graph Learning with Property Augmentation for Predicting Drug-Induced Liver Injury. Chem Res Toxicol 2021;34:495-506. [PMID: 33347312 PMCID: PMC9887540 DOI: 10.1021/acs.chemrestox.0c00322] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023]