Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang XC, Wu CK, Yang ZJ, Wu ZX, Yi JC, Hsieh CY, Hou TJ, Cao DS. MG-BERT: leveraging unsupervised atomic representation learning for molecular property prediction. Brief Bioinform 2021;22:6265201. [PMID: 33951729 DOI: 10.1093/bib/bbab152] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 01/27/2021] [Revised: 03/11/2021] [Accepted: 04/01/2021] [Indexed: 11/12/2022] Open

Cited by Other Article(s)

Zhai S, Tan Y, Zhu C, Zhang C, Gao Y, Mao Q, Zhang Y, Duan H, Yin Y. PepExplainer: An explainable deep learning model for selection-based macrocyclic peptide bioactivity prediction and optimization. Eur J Med Chem 2024;275:116628. [PMID: 38944933 DOI: 10.1016/j.ejmech.2024.116628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 06/21/2024] [Accepted: 06/24/2024] [Indexed: 07/02/2024]

Zhu Y, Zhang Y, Li X, Wang L. 3MTox: A motif-level graph-based multi-view chemical language model for toxicity identification with deep interpretation. J Hazard Mater 2024;476:135114. [PMID: 38986414 DOI: 10.1016/j.jhazmat.2024.135114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 06/24/2024] [Accepted: 07/04/2024] [Indexed: 07/12/2024]

Wei L, Li Q, Song Y, Stefanov S, Dong R, Fu N, Siriwardane EMD, Chen F, Hu J. Crystal Composition Transformer: Self-Learning Neural Language Model for Generative and Tinkering Design of Materials. Adv Sci (Weinh) 2024:e2304305. [PMID: 39101275 DOI: 10.1002/advs.202304305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 07/09/2024] [Indexed: 08/06/2024]

Aksamit N, Tchagang A, Li Y, Ombuki-Berman B. Hybrid fragment-SMILES tokenization for ADMET prediction in drug discovery. BMC Bioinformatics 2024;25:255. [PMID: 39090573 PMCID: PMC11295479 DOI: 10.1186/s12859-024-05861-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Accepted: 07/10/2024] [Indexed: 08/04/2024] Open

Lavecchia A. Advancing drug discovery with deep attention neural networks. Drug Discov Today 2024;29:104067. [PMID: 38925473 DOI: 10.1016/j.drudis.2024.104067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Revised: 06/10/2024] [Accepted: 06/19/2024] [Indexed: 06/28/2024]

Tan Z, Zhao Y, Lin K, Zhou T. Multi-task pretrained language model with novel application domains enables more comprehensive health and ecological toxicity prediction. J Hazard Mater 2024;477:135265. [PMID: 39038381 DOI: 10.1016/j.jhazmat.2024.135265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2024] [Revised: 06/29/2024] [Accepted: 07/18/2024] [Indexed: 07/24/2024]

Kim J, Chang W, Ji H, Joung I. Quantum-Informed Molecular Representation Learning Enhancing ADMET Property Prediction. J Chem Inf Model 2024;64:5028-5040. [PMID: 38916580 DOI: 10.1021/acs.jcim.4c00772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]

Sadeghi S, Bui A, Forooghi A, Lu J, Ngom A. Can large language models understand molecules? BMC Bioinformatics 2024;25:225. [PMID: 38926641 DOI: 10.1186/s12859-024-05847-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Accepted: 06/18/2024] [Indexed: 06/28/2024] Open

Duan Y, Yang X, Zeng X, Wang W, Deng Y, Cao D. Enhancing Molecular Property Prediction through Task-Oriented Transfer Learning: Integrating Universal Structural Insights and Domain-Specific Knowledge. J Med Chem 2024;67:9575-9586. [PMID: 38748846 DOI: 10.1021/acs.jmedchem.4c00692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2024]

De Carlo A, Ronchi D, Piastra M, Tosca EM, Magni P. Predicting ADMET Properties from Molecule SMILE: A Bottom-Up Approach Using Attention-Based Graph Neural Networks. Pharmaceutics 2024;16:776. [PMID: 38931898 PMCID: PMC11207804 DOI: 10.3390/pharmaceutics16060776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 05/08/2024] [Accepted: 05/30/2024] [Indexed: 06/28/2024] Open

Yang Z, Liu J, Yang F, Zhang X, Zhang Q, Zhu X, Jiang P. Advancing Drug-Target Interaction prediction with BERT and subsequence embedding. Comput Biol Chem 2024;110:108058. [PMID: 38593480 DOI: 10.1016/j.compbiolchem.2024.108058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 02/01/2024] [Accepted: 03/12/2024] [Indexed: 04/11/2024]

Abstract

Exploring the relationship between proteins and drugs plays a significant role in discovering new synthetic drugs. Drug-Target Interaction (DTI) prediction is a fundamental task in this area. Rather than encoding proteins residue by residue, we encode them as amino acid subsequences, which better reflects the biological process of DTI. To this end, we propose a novel deep learning framework based on Bidirectional Encoder Representations from Transformers (BERT) that integrates high-frequency subsequence embedding and transfer learning to perform DTI prediction. As the first key module, subsequence embedding makes it possible to explore functional interaction units in drug and protein sequences, which in turn helps identify DTI modules. As the second key module, transfer learning helps the model learn common DTI features from protein and drug sequences in a large dataset. Overall, the BERT-based model learns two kinds of features through the multi-head self-attention mechanism: internal features of each sequence and interaction features between proteins and drugs. Compared with other methods, BERT-based methods allow more DTI-related features to be discovered through the attention scores associated with tokenized protein/drug subsequences. We conducted extensive experiments on the DTI prediction task using three benchmark datasets. The results show that the model achieves average prediction metrics higher than those of most baseline methods. To verify the importance of transfer learning, we conducted an ablation study, and the results show the benefit of transfer learning. In addition, we tested the scalability of the model on data containing unseen drugs and proteins, and the results show acceptable scalability.
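
The abstract does not specify how high-frequency subsequences are extracted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a vocabulary built from k-mers that occur at least `min_count` times, followed by greedy longest-match tokenization. The function names, the k-mer lengths, the threshold, and the toy protein strings are all hypothetical.

```python
from collections import Counter


def frequent_subsequences(sequences, k_values=(2, 3), min_count=2):
    """Collect k-mers that occur at least min_count times across all sequences."""
    counts = Counter()
    for seq in sequences:
        for k in k_values:
            for i in range(len(seq) - k + 1):
                counts[seq[i:i + k]] += 1
    return {sub for sub, n in counts.items() if n >= min_count}


def tokenize(seq, vocab, max_len=3):
    """Greedy longest-match tokenization; unknown regions fall back to single characters."""
    tokens, i = [], 0
    while i < len(seq):
        for k in range(min(max_len, len(seq) - i), 0, -1):
            piece = seq[i:i + k]
            if k == 1 or piece in vocab:
                tokens.append(piece)
                i += k
                break
    return tokens


# Toy sequences (hypothetical, for illustration only).
proteins = ["MKTAYIAK", "MKTLLIAK", "GGMKTAY"]
vocab = frequent_subsequences(proteins)
tokens = tokenize("MKTAYIAK", vocab)
print(tokens)
```

The resulting tokens, rather than single residues, would then be the units passed to an embedding layer. Practical systems often use a learned subword vocabulary instead of a fixed count threshold.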

Telenti A, Auli M, Hie BL, Maher C, Saria S, Ioannidis JPA. Large language models for science and medicine. Eur J Clin Invest 2024;54:e14183. [PMID: 38381530 DOI: 10.1111/eci.14183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 02/06/2024] [Accepted: 02/10/2024] [Indexed: 02/23/2024]

Shen A, Yuan M, Ma Y, Du J, Wang M. Complementary multi-modality molecular self-supervised learning via non-overlapping masking for property prediction. Brief Bioinform 2024;25:bbae256. [PMID: 38801702 PMCID: PMC11129775 DOI: 10.1093/bib/bbae256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 04/25/2024] [Accepted: 05/15/2024] [Indexed: 05/29/2024] Open

Xiang W, Zhong F, Ni L, Zheng M, Li X, Shi Q, Wang D. Gram matrix: an efficient representation of molecular conformation and learning objective for molecular pretraining. Brief Bioinform 2024;25:bbae340. [PMID: 38990515 PMCID: PMC11238115 DOI: 10.1093/bib/bbae340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 06/05/2024] [Accepted: 06/28/2024] [Indexed: 07/12/2024] Open

Kumar N, Acharya V. Advances in machine intelligence-driven virtual screening approaches for big-data. Med Res Rev 2024;44:939-974. [PMID: 38129992 DOI: 10.1002/med.21995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 07/15/2023] [Accepted: 10/29/2023] [Indexed: 12/23/2023]

Jiang J, Li Y, Zhang R, Liu Y. INTransformer: Data augmentation-based contrastive learning by injecting noise into transformer for molecular property prediction. J Mol Graph Model 2024;128:108703. [PMID: 38228013 DOI: 10.1016/j.jmgm.2024.108703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 12/05/2023] [Accepted: 01/02/2024] [Indexed: 01/18/2024]

Li Y, Wang W, Liu J, Wu C. Pre-training molecular representation model with spatial geometry for property prediction. Comput Biol Chem 2024;109:108023. [PMID: 38335852 DOI: 10.1016/j.compbiolchem.2024.108023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 01/22/2024] [Accepted: 02/01/2024] [Indexed: 02/12/2024]

Ma M, Lei X. A deep learning framework for predicting molecular property based on multi-type features fusion. Comput Biol Med 2024;169:107911. [PMID: 38160501 DOI: 10.1016/j.compbiomed.2023.107911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 12/18/2023] [Accepted: 12/24/2023] [Indexed: 01/03/2024]

Yi JC, Yang ZY, Zhao WT, Yang ZJ, Zhang XC, Wu CK, Lu AP, Cao DS. ChemMORT: an automatic ADMET optimization platform using deep learning and multi-objective particle swarm optimization. Brief Bioinform 2024;25:bbae008. [PMID: 38385872 PMCID: PMC10883642 DOI: 10.1093/bib/bbae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 12/17/2023] [Accepted: 01/02/2024] [Indexed: 02/23/2024] Open

Zhang Y, Liu C, Liu M, Liu T, Lin H, Huang CB, Ning L. Attention is all you need: utilizing attention in AI-enabled drug discovery. Brief Bioinform 2023;25:bbad467. [PMID: 38189543 PMCID: PMC10772984 DOI: 10.1093/bib/bbad467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 11/03/2023] [Accepted: 11/25/2023] [Indexed: 01/09/2024] Open

Shilpa S, Kashyap G, Sunoj RB. Recent Applications of Machine Learning in Molecular Property and Chemical Reaction Outcome Predictions. J Phys Chem A 2023;127:8253-8271. [PMID: 37769193 DOI: 10.1021/acs.jpca.3c04779] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 09/30/2023]

Xu C, Liu R, Huang S, Li W, Li Z, Luo HB. 3D-SMGE: a pipeline for scaffold-based molecular generation and evaluation. Brief Bioinform 2023;24:bbad327. [PMID: 37756591 DOI: 10.1093/bib/bbad327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 08/19/2023] [Accepted: 08/30/2023] [Indexed: 09/29/2023] Open

Li B, Lin M, Chen T, Wang L. FG-BERT: a generalized and self-supervised functional group-based molecular representation learning framework for properties prediction. Brief Bioinform 2023;24:bbad398. [PMID: 37930026 DOI: 10.1093/bib/bbad398] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/30/2023] [Revised: 09/25/2023] [Accepted: 10/14/2023] [Indexed: 11/07/2023] Open

Gao J, Shen Z, Xie Y, Lu J, Lu Y, Chen S, Bian Q, Guo Y, Shen L, Wu J, Zhou B, Hou T, He Q, Che J, Dong X. TransFoxMol: predicting molecular property with focused attention. Brief Bioinform 2023;24:bbad306. [PMID: 37605947 DOI: 10.1093/bib/bbad306] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/07/2023] [Revised: 07/17/2023] [Accepted: 08/04/2023] [Indexed: 08/23/2023] Open

Affiliation(s)

Jian Gao Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
Zheyuan Shen Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
Yufeng Xie School of Software Technology, Zhejiang University, Hangzhou, China
Jialiang Lu Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
Yang Lu Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
Sikang Chen Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
Qingyu Bian Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
Yue Guo Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University, Hangzhou, China
Liteng Shen Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
Jian Wu School of Software Technology, Zhejiang University, Hangzhou, China
Binbin Zhou Department of Computer Science and Computing, Zhejiang University City College, Hangzhou, China
Tingjun Hou State Key Lab of CAD&CG, College of Pharmaceutical Sciences, Zhejiang University, Zhejiang, China; Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University, Hangzhou, China
Qiaojun He Institute of Pharmacology & Toxicology, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, PR China; Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University, Hangzhou, China; Centre for Drug Safety Evaluation and Research of ZJU, Hangzhou, 310058, PR China; Cancer Center of Zhejiang University, Hangzhou, China
Jinxin Che Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
Xiaowu Dong Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China; Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University, Hangzhou, China; Cancer Center of Zhejiang University, Hangzhou, China; Department of Pharmacy, Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China

Zhang Y, Ge F, Li F, Yang X, Song J, Yu DJ. Prediction of Multiple Types of RNA Modifications via Biological Language Model. IEEE/ACM Trans Comput Biol Bioinform 2023;20:3205-3214. [PMID: 37289599 DOI: 10.1109/tcbb.2023.3283985] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/10/2023]

Zhang Y, Menke J, He J, Nittinger E, Tyrchan C, Koch O, Zhao H. Similarity-based pairing improves efficiency of siamese neural networks for regression tasks and uncertainty quantification. J Cheminform 2023;15:75. [PMID: 37649050 PMCID: PMC10469421 DOI: 10.1186/s13321-023-00744-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 08/10/2023] [Indexed: 09/01/2023] Open

Zhai S, Tan Y, Zhang C, Hipolito CJ, Song L, Zhu C, Zhang Y, Duan H, Yin Y. PepScaf: Harnessing Machine Learning with In Vitro Selection toward De Novo Macrocyclic Peptides against IL-17C/IL-17RE Interaction. J Med Chem 2023;66:11187-11200. [PMID: 37480587 DOI: 10.1021/acs.jmedchem.3c00627] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/24/2023]

Wang R, Feng Y, Sun M, Jiang Y, Li Z, Cui L, Wei L. MVIL6: Accurate identification of IL-6-induced peptides using multi-view feature learning. Int J Biol Macromol 2023;246:125412. [PMID: 37327922 DOI: 10.1016/j.ijbiomac.2023.125412] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/24/2023] [Revised: 06/11/2023] [Accepted: 06/13/2023] [Indexed: 06/18/2023]

Zhang J, Du W, Yang X, Wu D, Li J, Wang K, Wang Y. SMG-BERT: integrating stereoscopic information and chemical representation for molecular property prediction. Front Mol Biosci 2023;10:1216765. [PMID: 37457837 PMCID: PMC10348360 DOI: 10.3389/fmolb.2023.1216765] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Accepted: 06/15/2023] [Indexed: 07/18/2023] Open

ValizadehAslani T, Shi Y, Ren P, Wang J, Zhang Y, Hu M, Zhao L, Liang H. PharmBERT: a domain-specific BERT model for drug labels. Brief Bioinform 2023:bbad226. [PMID: 37317617 DOI: 10.1093/bib/bbad226] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/28/2023] [Revised: 05/10/2023] [Accepted: 05/26/2023] [Indexed: 06/16/2023] Open

Jiang J, Zhang R, Yuan Y, Li T, Li G, Zhao Z, Yu Z. NoiseMol: A noise-robusted data augmentation via perturbing noise for molecular property prediction. J Mol Graph Model 2023;121:108454. [PMID: 36963306 DOI: 10.1016/j.jmgm.2023.108454] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/10/2023] [Revised: 03/05/2023] [Accepted: 03/13/2023] [Indexed: 03/17/2023]

Abstract

The Simplified Molecular-Input Line-Entry System (SMILES) is one of the most widely used molecular representations for molecular property prediction. We conjecture that all the characters in the SMILES string of a molecule are essential for making up the molecule, but that most of them contribute little to determining any particular property. We verified this conjecture in a preliminary experiment. Motivated by the result, we propose injecting suitable noise into SMILES strings to augment the training data by increasing the diversity of the labeled molecules. To this end, we explore injecting perturbing noise into the original labeled SMILES strings to construct augmented data, which alleviates the scarcity of labeled compound data and helps the model extract more useful molecular representations for property prediction. Specifically, we apply mask, swap, deletion, and fusion operations directly to SMILES strings, randomly masking, swapping, and deleting atoms. The augmented data is then used in one of two ways: the original and noisy molecules are fed alternately either epoch by epoch or batch by batch. We conduct experiments on both Transformer and BiGRU models to validate the approach, using widely used datasets from MoleculeNet and ZINC. Experimental results demonstrate that the proposed method outperforms strong baselines on all the datasets. NoiseMol obtains the best performance on BBBP and FDA when compared with state-of-the-art methods, and it also achieves the best accuracy on LogP. Injecting perturbing noise into labeled SMILES strings is therefore an effective and efficient method that improves the prediction performance, generalization, and robustness of deep learning models.
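
The mask, swap, and deletion operations named in the abstract can be illustrated with a short sketch. This is a simplified illustration under stated assumptions, not the NoiseMol implementation: the regex tokenizer, the `[MASK]` token, and all function names are hypothetical, and the fusion operation is omitted because the abstract does not define it. The perturbed strings are deliberately noisy and are not guaranteed to be valid SMILES.

```python
import random
import re

# Simplified tokenizer: bracket atoms, two-letter halogens, then any single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|.")
# Tokens treated as atoms in this sketch (organic subset plus bracket atoms).
ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_PATTERN.findall(smiles)


def atom_positions(tokens):
    """Indices of the tokens that represent atoms."""
    return [i for i, tok in enumerate(tokens) if ATOM_PATTERN.fullmatch(tok)]


def mask_atom(tokens, rng, mask_token="[MASK]"):
    """Replace one randomly chosen atom with a mask token."""
    out = list(tokens)
    out[rng.choice(atom_positions(out))] = mask_token
    return out


def swap_atoms(tokens, rng):
    """Exchange two randomly chosen atoms (needs at least two atoms)."""
    out = list(tokens)
    i, j = rng.sample(atom_positions(out), 2)
    out[i], out[j] = out[j], out[i]
    return out


def delete_atom(tokens, rng):
    """Remove one randomly chosen atom."""
    out = list(tokens)
    del out[rng.choice(atom_positions(out))]
    return out


rng = random.Random(0)  # fixed seed so the example is reproducible
tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
masked = mask_atom(tokens, rng)
swapped = swap_atoms(tokens, rng)
deleted = delete_atom(tokens, rng)
print("".join(masked), "".join(swapped), "".join(deleted))
```

For the two feeding strategies described in the abstract, a training loop would alternate between `tokens` and one of the perturbed variants, either per epoch or per batch.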

Xie D, Huang Q, Zhou P. Drug Discovery Targeting Post-Translational Modifications in Response to DNA Damages Induced by Space Radiation. Int J Mol Sci 2023;24:ijms24087656. [PMID: 37108815 PMCID: PMC10142602 DOI: 10.3390/ijms24087656] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/26/2023] [Revised: 04/07/2023] [Accepted: 04/14/2023] [Indexed: 04/29/2023] Open

Liu J, Lei X, Zhang Y, Pan Y. The prediction of molecular toxicity based on BiGRU and GraphSAGE. Comput Biol Med 2023;153:106524. [PMID: 36623439 DOI: 10.1016/j.compbiomed.2022.106524] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/21/2022] [Revised: 12/10/2022] [Accepted: 12/31/2022] [Indexed: 01/04/2023]

Wang D, Wu Z, Shen C, Bao L, Luo H, Wang Z, Yao H, Kong DX, Luo C, Hou T. Learning with uncertainty to accelerate the discovery of histone lysine-specific demethylase 1A (KDM1A/LSD1) inhibitors. Brief Bioinform 2023;24:6961473. [PMID: 36573494 DOI: 10.1093/bib/bbac592] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/05/2022] [Revised: 12/01/2022] [Accepted: 12/02/2022] [Indexed: 12/28/2022] Open

Affiliation(s)

Dong Wang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China
Zhenxing Wu Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China
Chao Shen Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China; CarbonSilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
Lingjie Bao Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China
Hao Luo Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China
Zhe Wang Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China
Hucheng Yao State Key Laboratory of Agricultural Microbiology, Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
De-Xin Kong State Key Laboratory of Agricultural Microbiology, Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
Cheng Luo The Center for Chemical Biology, Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
Tingjun Hou Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, China

Zheng Z, Tan Y, Wang H, Yu S, Liu T, Liang C. CasANGCL: pre-training and fine-tuning model based on cascaded attention network and graph contrastive learning for molecular property prediction. Brief Bioinform 2023;24:6966532. [PMID: 36592051 DOI: 10.1093/bib/bbac566] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/16/2022] [Revised: 10/18/2022] [Accepted: 11/20/2022] [Indexed: 01/03/2023] Open

Yang L, Jin C, Yang G, Bing Z, Huang L, Niu Y, Yang L. Transformer-based deep learning method for optimizing ADMET properties of lead compounds. Phys Chem Chem Phys 2023;25:2377-2385. [PMID: 36597997 DOI: 10.1039/d2cp05332b] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/05/2023]

Bao L, Wang Z, Wu Z, Luo H, Yu J, Kang Y, Cao D, Hou T. Kinome-wide polypharmacology profiling of small molecules by multi-task graph isomorphism network approach. Acta Pharm Sin B 2023;13:54-67. [PMID: 36815050 PMCID: PMC9939366 DOI: 10.1016/j.apsb.2022.05.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/21/2022] [Revised: 04/15/2022] [Accepted: 04/30/2022] [Indexed: 11/18/2022] Open

Traditional Machine and Deep Learning for Predicting Toxicity Endpoints. Molecules 2022;28:molecules28010217. [PMID: 36615411 PMCID: PMC9822478 DOI: 10.3390/molecules28010217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 12/16/2022] [Accepted: 12/21/2022] [Indexed: 12/28/2022]

Du W, Yang X, Wu D, Ma F, Zhang B, Bao C, Huo Y, Jiang J, Chen X, Wang Y. Fusing 2D and 3D molecular graphs as unambiguous molecular descriptors for conformational and chiral stereoisomers. Brief Bioinform 2022;24:6931719. [PMID: 36528804 PMCID: PMC9851338 DOI: 10.1093/bib/bbac560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 10/28/2022] [Accepted: 11/15/2022] [Indexed: 12/23/2022] Open

Lee M, Kim PJ, Joe H, Kim HG. Gene-centric multi-omics integration with convolutional encoders for cancer drug response prediction. Comput Biol Med 2022;151:106192. [PMID: 36327883 DOI: 10.1016/j.compbiomed.2022.106192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 08/26/2022] [Accepted: 10/08/2022] [Indexed: 12/27/2022]

Zeng X, Xiang H, Yu L, Wang J, Li K, Nussinov R, Cheng F. Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00557-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]

Jiang J, Zhang R, Zhao Z, Ma J, Liu Y, Yuan Y, Niu B. MultiGran-SMILES: multi-granularity SMILES learning for molecular property prediction. Bioinformatics 2022;38:4573-4580. [PMID: 35961025 DOI: 10.1093/bioinformatics/btac550] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/24/2022] [Revised: 07/07/2022] [Accepted: 08/10/2022] [Indexed: 11/14/2022] Open

Deep learning methods for molecular representation and property prediction. Drug Discov Today 2022;27:103373. [PMID: 36167282 DOI: 10.1016/j.drudis.2022.103373] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 07/07/2022] [Revised: 08/22/2022] [Accepted: 09/21/2022] [Indexed: 01/11/2023]

Ren S, Yu L, Gao L. Multidrug representation learning based on pretraining model and molecular graph for drug interaction and combination prediction. Bioinformatics 2022;38:4387-4394. [PMID: 35904544 DOI: 10.1093/bioinformatics/btac538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 07/06/2022] [Accepted: 07/27/2022] [Indexed: 12/24/2022] Open

Liu H, Huang Y, Liu X, Deng L. Attention-wise masked graph contrastive learning for predicting molecular property. Brief Bioinform 2022;23:6657662. [PMID: 35940592 DOI: 10.1093/bib/bbac303] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2022] [Revised: 06/17/2022] [Accepted: 07/04/2022] [Indexed: 11/14/2022] Open

Yu J, Wang J, Zhao H, Gao J, Kang Y, Cao D, Wang Z, Hou T. Organic Compound Synthetic Accessibility Prediction Based on the Graph Attention Mechanism. J Chem Inf Model 2022;62:2973-2986. [PMID: 35675668 DOI: 10.1021/acs.jcim.2c00038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Abstract

Accurate estimation of the synthetic accessibility of small molecules is needed in many phases of drug discovery. Several expert-crafted scoring methods and descriptor-based quantitative structure-activity relationship (QSAR) models have been developed for synthetic accessibility assessment, but their practical applications in drug discovery are still quite limited because of relatively low prediction accuracy and poor model interpretability. In this study, we proposed a data-driven interpretable prediction framework called GASA (Graph Attention-based assessment of Synthetic Accessibility) to evaluate the synthetic accessibility of small molecules by distinguishing compounds to be easy- (ES) or hard-to-synthesize (HS). GASA is a graph neural network (GNN) architecture that makes self-feature deduction by applying an attention mechanism to automatically capture the most important structural features related to synthetic accessibility. The sampling around the hypothetical classification boundary was used to improve the ability of GASA to distinguish structurally similar molecules. GASA was extensively evaluated and compared with two descriptor-based machine learning methods (random forest, RF; eXtreme gradient boosting, XGBoost) and four existing scores (SYBA: SYnthetic Bayesian Accessibility; SCScore: Synthetic Complexity score; RAscore: Retrosynthetic Accessibility score; SAscore: Synthetic Accessibility score). Our analysis demonstrates that GASA achieved remarkable performance in distinguishing similar molecules compared with other methods and had a broader applicability domain. In addition, we show how GASA learns the important features that affect molecular synthetic accessibility by assigning attention weights to different atoms. An online prediction service for GASA was offered at http://cadd.zju.edu.cn/gasa/.

Xie D, He S, Han L, Wu L, Huang H, Tao H, Zhou P, Shi X, Bai H, Bo X. Systematic optimization of host-directed therapeutic targets and preclinical validation of repositioned antiviral drugs. Brief Bioinform 2022;23:bbac047. [PMID: 35238349 PMCID: PMC9116211 DOI: 10.1093/bib/bbac047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Revised: 01/26/2022] [Accepted: 01/28/2022] [Indexed: 11/12/2022] Open

Gu Y, Zheng S, Xu Z, Yin Q, Li L, Li J. An efficient curriculum learning-based strategy for molecular graph learning. Brief Bioinform 2022;23:6562682. [DOI: 10.1093/bib/bbac099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2021] [Revised: 01/18/2022] [Accepted: 02/27/2022] [Indexed: 12/14/2022] Open

Irwin R, Dimitriadis S, He J, Bjerrum EJ. Chemformer: a pre-trained transformer for computational chemistry. MACHINE LEARNING: SCIENCE AND TECHNOLOGY 2022. [DOI: 10.1088/2632-2153/ac3ffb] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Jiang D, Sun H, Wang J, Hsieh CY, Li Y, Wu Z, Cao D, Wu J, Hou T. Out-of-the-box deep learning prediction of quantum-mechanical partial charges by graph representation and transfer learning. Brief Bioinform 2022;23:6513729. [DOI: 10.1093/bib/bbab597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Revised: 12/14/2021] [Accepted: 12/23/2021] [Indexed: 11/14/2022] Open

Abstract

Accurate prediction of atomic partial charges with high-level quantum mechanics (QM) methods suffers from high computational cost. Numerous feature-engineered machine learning (ML)-based predictors with favorable computability and reliability have been developed as alternatives. However, extensive expertise effort was needed for feature engineering of atom chemical environment, which may consequently introduce domain bias. In this study, SuperAtomicCharge, a data-driven deep graph learning framework, was proposed to predict three important types of partial charges (i.e. RESP, DDEC4 and DDEC78) derived from high-level QM calculations based on the structures of molecules. SuperAtomicCharge was designed to simultaneously exploit the 2D and 3D structural information of molecules, which was proved to be an effective way to improve the prediction accuracy of the model. Moreover, a simple transfer learning strategy and a multitask learning strategy based on self-supervised descriptors were also employed to further improve the prediction accuracy of the proposed model. Compared with the latest baselines, including one GNN-based predictor and two ML-based predictors, SuperAtomicCharge showed better performance on all the three external test sets and had better usability and portability. Furthermore, the QM partial charges of new molecules predicted by SuperAtomicCharge can be efficiently used in drug design applications such as structure-based virtual screening, where the predicted RESP and DDEC4 charges of new molecules showed more robust scoring and screening power than the commonly used partial charges. Finally, two tools including an online server (http://cadd.zju.edu.cn/deepchargepredictor) and the source code command lines (https://github.com/zjujdj/SuperAtomicCharge) were developed for the easy access of the SuperAtomicCharge services.