1
Chen L, Xu J, Zhou Y. PDATC-NCPMKL: Predicting drug's Anatomical Therapeutic Chemical (ATC) codes based on network consistency projection and multiple kernel learning. Comput Biol Med 2024; 169:107862. [PMID: 38150886 DOI: 10.1016/j.compbiomed.2023.107862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/19/2023] [Accepted: 12/17/2023] [Indexed: 12/29/2023]
Abstract
The development and discovery of new drugs is time-consuming and requires substantial human and material resources. Therefore, discovering novel effects of existing drugs is an important alternative that can accelerate the process of designing "new" drugs. The Anatomical Therapeutic Chemical (ATC) classification system recommended by the World Health Organization (WHO) is a basic research area in this regard. A novel ATC code of an existing drug suggests its novel effects. Some computational models have been proposed to predict drug-ATC code associations. However, their performance is not very high, and there is still room for improvement. In this study, a new recommendation system (named PDATC-NCPMKL), which incorporates network consistency projection and multiple kernel learning, was designed to identify drug-ATC code associations. For both drugs and ATC codes, several kernels were constructed, which were fused by a multiple kernel learning method and an additional kernel integration scheme. To enhance the performance, the drug-ATC code association adjacency matrix was reformulated by a variant of weighted K nearest known neighbors (WKNKN). The reformulated adjacency matrix and the drug and ATC code kernels were fed into network consistency projection to generate the association score matrix. The proposed recommendation system was tested on the ATC codes at the second, third and fourth levels of the drug ATC classification system using ten-fold cross-validation. The results indicated that all AUROC and AUPR values were close to or exceeded 0.96. Such performance was higher than that of some existing computational models. Some additional tests were conducted to demonstrate the utility of adjacency matrix reformulation and to analyze the importance of drug and ATC code kernels.
Affiliation(s)
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China.
- Jing Xu
- College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China.
- Yubin Zhou
- Department of Thoracic Surgery, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 610072, China.
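As an illustration of the adjacency-matrix reformulation mentioned in the abstract of entry 1, the following is a minimal sketch of the general weighted K nearest neighbors idea: sparse rows and columns of the association matrix are filled in from their most similar neighbors. It is not the variant used in PDATC-NCPMKL, whose details are not given here. The function names, the decay parameter `eta`, the choice to exclude each item from its own neighbor list, and the toy matrices are all assumptions made for this example.

```python
def knn_profile(Y, S, k, eta):
    """Rebuild each row of Y from its k most similar rows (similarity matrix S).

    Assumption: an item is never its own neighbour, and neighbours are not
    filtered by whether they have known associations.
    """
    n = len(Y)
    out = []
    for p in range(n):
        # Rank the other rows by similarity to row p, most similar first.
        neighbours = sorted((q for q in range(n) if q != p),
                            key=lambda q: S[p][q], reverse=True)[:k]
        norm = sum(S[p][q] for q in neighbours)
        row = [0.0] * len(Y[0])
        if norm > 0:
            for rank, q in enumerate(neighbours):
                # Weight decays with neighbour rank: eta^0, eta^1, ...
                w = (eta ** rank) * S[p][q]
                for j, y in enumerate(Y[q]):
                    row[j] += w * y / norm
        out.append(row)
    return out


def transpose(M):
    return [list(col) for col in zip(*M)]


def wknkn(Y, S_drug, S_atc, k=2, eta=0.7):
    """Average the drug-side and ATC-side estimates, never lowering known entries."""
    Yd = knn_profile(Y, S_drug, k, eta)
    Ya = transpose(knn_profile(transpose(Y), S_atc, k, eta))
    return [[max(Y[i][j], (Yd[i][j] + Ya[i][j]) / 2) for j in range(len(Y[0]))]
            for i in range(len(Y))]


# Toy data (hypothetical): 3 drugs x 2 ATC codes; drug 1 has no known code.
Y = [[1, 0], [0, 0], [0, 1]]
S_drug = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
S_atc = [[1.0, 0.5], [0.5, 1.0]]
R = wknkn(Y, S_drug, S_atc, k=1)
```

In this toy case the drug with no known ATC code inherits a nonzero score for the code of its most similar drug, which is the effect the reformulation step is meant to achieve before scores are computed.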
2
Kim Y, Cho YR. Predicting Drug-Gene-Disease Associations by Tensor Decomposition for Network-Based Computational Drug Repositioning. Biomedicines 2023; 11:1998. [PMID: 37509637 PMCID: PMC10377142 DOI: 10.3390/biomedicines11071998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 07/07/2023] [Accepted: 07/12/2023] [Indexed: 07/30/2023] Open
Abstract
Drug repositioning offers the significant advantage of greatly reducing the cost and time of drug discovery by identifying new therapeutic indications for existing drugs. In particular, computational approaches using networks in drug repositioning have attracted attention for inferring potential associations between drugs and diseases efficiently based on the network connectivity. In this article, we proposed a network-based drug repositioning method to construct a drug-gene-disease tensor by integrating drug-disease, drug-gene, and disease-gene associations and predict drug-gene-disease triple associations through tensor decomposition. The proposed method, which ensembles generalized tensor decomposition (GTD) and multi-layer perceptron (MLP), models drug-gene-disease associations through GTD and learns the features of drugs, genes, and diseases through MLP, providing more flexibility and non-linearity than conventional tensor decomposition. We experimented with drug-gene-disease association prediction using two distinct networks created by chemical structures and ATC codes as drug features. Moreover, we leveraged drug, gene, and disease latent vectors obtained from the predicted triple associations to predict drug-disease, drug-gene, and disease-gene pairwise associations. Our experimental results revealed that the proposed ensemble method was superior for triple association prediction. The ensemble model achieved an AUC of 0.96 in predicting triple associations for new drugs, resulting in an approximately 7% improvement over the performance of existing models. It also showed competitive accuracy for pairwise association prediction compared with previous methods. This study demonstrated that incorporating genetic information leads to notable advancements in drug repositioning.
Affiliation(s)
- Yoonbee Kim
- Division of Software, Yonsei University Mirae Campus, Wonju-si 26493, Gangwon-do, Republic of Korea
- Young-Rae Cho
- Division of Software, Yonsei University Mirae Campus, Wonju-si 26493, Gangwon-do, Republic of Korea
- Division of Digital Healthcare, Yonsei University Mirae Campus, Wonju-si 26493, Gangwon-do, Republic of Korea
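To make the tensor-decomposition idea in the abstract of entry 2 concrete, here is a small sketch of plain CP (CANDECOMP/PARAFAC) scoring, in which a drug-gene-disease triple is scored from three latent vectors. This is the conventional decomposition only; the paper's generalized tensor decomposition and MLP ensemble are not reproduced. The function names and the factor values are hypothetical.

```python
def cp_score(drug_vec, gene_vec, disease_vec):
    """CP score of one triple: sum over latent dimensions of the three factors' product."""
    return sum(d * g * s for d, g, s in zip(drug_vec, gene_vec, disease_vec))


def reconstruct(D, G, S):
    """Rebuild the full drug x gene x disease score tensor from factor matrices.

    D, G and S hold one latent vector per drug, gene and disease respectively.
    """
    return [[[cp_score(d, g, s) for s in S] for g in G] for d in D]


# Hypothetical rank-2 factors: 2 drugs, 1 gene, 2 diseases.
D = [[1, 0], [0, 1]]
G = [[1, 1]]
S = [[2, 0], [0, 3]]
T = reconstruct(D, G, S)
```

The same latent vectors can be reused to score pairs, which is in the spirit of the pairwise prediction the abstract describes, though how the paper does so is not specified here.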
3
Drug-Disease Association Prediction Using Heterogeneous Networks for Computational Drug Repositioning. Biomolecules 2022; 12:1497. [PMID: 36291706 PMCID: PMC9599692 DOI: 10.3390/biom12101497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Revised: 10/10/2022] [Accepted: 10/13/2022] [Indexed: 11/18/2022] Open
Abstract
Drug repositioning, which involves the identification of new therapeutic indications for approved drugs, considerably reduces the time and cost of developing new drugs. Recent computational drug repositioning methods use heterogeneous networks to identify drug–disease associations. This review surveys existing network-based approaches for predicting drug–disease associations, grouped into three major categories: graph mining, matrix factorization or completion, and deep learning. We selected eleven methods from the three categories to compare their predictive performances. The experiments were conducted using two uniform datasets on the drug and disease sides, separately. We constructed heterogeneous networks using drug–drug similarities based on chemical structures and ATC codes, ontology-based disease–disease similarities, and drug–disease associations. An improved evaluation metric was used to reflect data imbalance, as positive associations are typically sparse. The prediction results demonstrated that methods in the graph mining and matrix factorization or completion categories performed well in the overall assessment. Furthermore, prediction on the drug side had higher accuracy than on the disease side. Selecting and integrating informative drug features in drug–drug similarity measurement are crucial for improving disease-side prediction.
4
Gallo K, Goede A, Preissner R, Gohlke BO. SuperPred 3.0: drug classification and target prediction-a machine learning approach. Nucleic Acids Res 2022; 50:W726-W731. [PMID: 35524552 PMCID: PMC9252837 DOI: 10.1093/nar/gkac297] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Received: 03/07/2022] [Revised: 04/01/2022] [Accepted: 04/13/2022] [Indexed: 11/21/2022] Open
Abstract
Since the last published update in 2014, the SuperPred webserver has been continuously developed to offer state-of-the-art models for drug classification according to ATC classes and target prediction. For the first time, a thoroughly filtered ATC dataset that is suitable for accurate predictions is provided, along with detailed information on the achieved predictions. This aims to overcome the challenges in comparing different published prediction methods, since performance can vary greatly depending on the training dataset used. Additionally, both ATC and target prediction have been reworked and are now based on machine learning models instead of overall structural similarity, stressing the importance of functional groups for the mechanism of action of small molecule substances. The dataset for the target prediction has also been extensively filtered and is no longer based only on confirmed binders but also includes non-binding substances to reduce false positives. Using these methods, accuracy for the ATC prediction could be increased by almost 5% to 80.5% compared to the previous version, and the scoring function now offers values which are easily assessable at first glance. SuperPred 3.0 is publicly available without the need for registration at: https://prediction.charite.de/index.php.
Affiliation(s)
- Kathleen Gallo
- Charité - Universitätsmedizin Berlin, Institute of Physiology and Science IT, Corporate Member of Freie Universität Berlin, Berlin Institute of Health, Humboldt-Universität zu Berlin, 10117 Berlin, Germany
- Andrean Goede
- Charité - Universitätsmedizin Berlin, Institute of Physiology and Science IT, Corporate Member of Freie Universität Berlin, Berlin Institute of Health, Humboldt-Universität zu Berlin, 10117 Berlin, Germany
- Robert Preissner
- Charité - Universitätsmedizin Berlin, Institute of Physiology and Science IT, Corporate Member of Freie Universität Berlin, Berlin Institute of Health, Humboldt-Universität zu Berlin, 10117 Berlin, Germany
- Bjoern-Oliver Gohlke
- Charité - Universitätsmedizin Berlin, Institute of Physiology and Science IT, Corporate Member of Freie Universität Berlin, Berlin Institute of Health, Humboldt-Universität zu Berlin, 10117 Berlin, Germany
5
Carracedo-Reboredo P, Liñares-Blanco J, Rodríguez-Fernández N, Cedrón F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C. A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 2021; 19:4538-4558. [PMID: 34471498 PMCID: PMC8387781 DOI: 10.1016/j.csbj.2021.08.011] [Citation(s) in RCA: 116] [Impact Index Per Article: 38.7] [Received: 03/01/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 12/30/2022] Open
Abstract
Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In recent years, the approach used in this search has acquired an important computer science component, with machine learning techniques skyrocketing as they have become widely accessible. With the objectives set by the Precision Medicine initiative and the new challenges generated, it is necessary to establish robust, standard and reproducible computational methodologies to achieve those objectives. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage manages to drastically reduce costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies have been used in recent years of research. Analyzing the state of the art in this field will give us an idea of where cheminformatics will be developed in the short term, the limitations it presents and the positive results it has achieved. This review will focus mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.
Key Words
- ADMET, Absorption, distribution, metabolism, elimination and toxicity
- ADR, Adverse Drug Reaction
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APFP, Atom Pairs 2d FingerPrint
- AUC, Area under the Curve
- BBB, Blood–Brain barrier
- CDK, Chemistry Development Kit
- CNN, Convolutional Neural Networks
- CNS, Central Nervous System
- CPI, Compound-protein interaction
- CV, Cross Validation
- Cheminformatics
- DL, Deep Learning
- DNA, Deoxyribonucleic acid
- Deep Learning
- Drug Discovery
- ECFP, Extended Connectivity Fingerprints
- FDA, Food and Drug Administration
- FNN, Fully Connected Neural Networks
- FP, Fingerprints
- FS, Feature Selection
- GCN, Graph Convolutional Networks
- GEO, Gene Expression Omnibus
- GNN, Graph Neural Networks
- GO, Gene Ontology
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- MACCS, Molecular ACCess System
- MCC, Matthews correlation coefficient
- MD, Molecular Descriptors
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- Machine Learning
- Molecular Descriptors
- NB, Naive Bayes
- OOB, Out of Bag
- PCA, Principal Component Analysis
- QSAR
- QSAR, Quantitative structure–activity relationship
- RF, Random Forest
- RNA, Ribonucleic Acid
- SMILES, simplified molecular-input line-entry system
- SVM, Support Vector Machines
- TCGA, The Cancer Genome Atlas
- WHO, World Health Organization
- t-SNE, t-Distributed Stochastic Neighbor Embedding
Affiliation(s)
- Paula Carracedo-Reboredo
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Jose Liñares-Blanco
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Nereida Rodríguez-Fernández
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco Cedrón
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco J. Novoa
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Adrian Carballal
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Victor Maojo
- Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, Madrid 28660, Spain
- Alejandro Pazos
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
- Carlos Fernandez-Lozano
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
6
Wang X, Liu M, Zhang Y, He S, Qin C, Li Y, Lu T. Deep fusion learning facilitates anatomical therapeutic chemical recognition in drug repurposing and discovery. Brief Bioinform 2021; 22:6342939. [PMID: 34368838 DOI: 10.1093/bib/bbab289] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/21/2021] [Revised: 07/03/2021] [Accepted: 07/06/2021] [Indexed: 01/17/2023] Open
Abstract
The advent of large-scale biomedical data and computational algorithms provides new opportunities for drug repurposing and discovery. It is of great interest to find an appropriate data representation and modeling method to facilitate these studies. The anatomical therapeutic chemical (ATC) classification system, proposed by the World Health Organization (WHO), is an essential source of information for drug repurposing and discovery, and computational methods have been applied to predict drug ATC classification. We conducted a systematic review of ATC computational prediction studies and revealed the differences in data sets, data representation, algorithm approaches, and evaluation metrics. We then proposed a deep fusion learning (DFL) framework to optimize the ATC prediction model, namely DeepATC. The methods based on graph convolutional network, inferring biological network and multimodel attentive fusion network were applied in DeepATC to extract the molecular topological information and low-dimensional representation from the molecular graph and heterogeneous biological networks. The results indicated that DeepATC achieved superior model performance, with an area under the curve (AUC) value of 0.968. Furthermore, the DFL framework was applied to transcriptome data-based ATC prediction, as well as to another independent task that is highly relevant to drug discovery, namely drug-target interaction. The DFL-based model achieved excellent performance in these extended validation tasks, suggesting that the idea of aggregating the heterogeneous biological network and each node's (molecule or protein) own topological features may inspire broader drug repurposing and discovery research.
Affiliation(s)
- Xiting Wang
- Life Science School, Beijing University of Chinese Medicine, Beijing, China
- Meng Liu
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Yiling Zhang
- Beijing University of Chinese Medicine, Beijing, China
- Shuangshuang He
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Caimeng Qin
- School of Life Sciences, Beijing University of Chinese Medicine and Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Yu Li
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Tao Lu
- Integrative Medicine Center in School of Life Sciences, Beijing University of Chinese Medicine, Beijing, China
7
Kanza S, Graham Frey J. Semantic Technologies in Drug Discovery. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11520-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022] Open
8
Zhou JP, Chen L, Guo ZH. iATC-NRAKEL: an efficient multi-label classifier for recognizing anatomical therapeutic chemical classes of drugs. Bioinformatics 2020; 36:1391-1396. [PMID: 31593226 DOI: 10.1093/bioinformatics/btz757] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 06/20/2019] [Revised: 09/10/2019] [Accepted: 10/01/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: The anatomical therapeutic chemical (ATC) classification system plays an increasingly important role in drug repositioning and discovery. Correctly identifying the classes at each level of this system to which a given drug may belong is an essential problem. Several multi-label classifiers have been proposed in this regard. Although they provided satisfactory performance, their feature extraction procedures were still rough. More refined features may further improve prediction quality. RESULTS: In this article, we provide a novel multi-label classifier, called iATC-NRAKEL, to predict drug ATC classes in the first level. To obtain more informative drug features, we employed the drug association information in STITCH and KEGG, which was organized into seven drug networks. The powerful network embedding algorithm, Mashup, was adopted to extract informative drug features. The obtained features were fed into the RAndom k-labELsets (RAKEL) algorithm, with support vector machine as the basic classification algorithm, to construct the classifier. The 10-fold cross-validation on the benchmark dataset with 3883 drugs showed that the accuracy and absolute true were 76.56% and 74.51%, respectively. The comparison results indicated that iATC-NRAKEL was much superior to all previously reported classifiers. Finally, the contribution of each network was analyzed. AVAILABILITY AND IMPLEMENTATION: The codes of iATC-NRAKEL are available at https://github.com/zhou256/iATC-NRAKEL.
Affiliation(s)
- Jian-Peng Zhou
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China; Shanghai Key Laboratory of PMMP, East China Normal University, Shanghai 200241, People's Republic of China
- Zi-Han Guo
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
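The abstract of entry 8 reports "accuracy" and "absolute true" for a multi-label classifier. The sketch below shows how these two measures are commonly defined in the multi-label literature: accuracy as the mean overlap ratio between true and predicted label sets, and absolute true as the fraction of exact matches. It is assumed, not confirmed by the abstract, that the paper uses these definitions. The function names and the toy label sets are hypothetical.

```python
def accuracy(true_sets, pred_sets):
    """Mean of |true AND pred| / |true OR pred| over all samples."""
    total = 0.0
    for y, z in zip(true_sets, pred_sets):
        union = y | z
        # Two empty sets are treated as a perfect match (an assumption).
        total += len(y & z) / len(union) if union else 1.0
    return total / len(true_sets)


def absolute_true(true_sets, pred_sets):
    """Fraction of samples whose predicted label set equals the true set exactly."""
    exact = sum(1 for y, z in zip(true_sets, pred_sets) if y == z)
    return exact / len(true_sets)


# Toy example: three drugs with first-level ATC classes as labels.
true_sets = [{"A"}, {"A", "B"}, {"C"}]
pred_sets = [{"A"}, {"A"}, {"D"}]
acc = accuracy(true_sets, pred_sets)
abs_true = absolute_true(true_sets, pred_sets)
```

Absolute true is the stricter of the two because a partially correct prediction, such as the second drug above, contributes nothing to it while still contributing to accuracy.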
9
Peng Y, Wang M, Xu Y, Wu Z, Wang J, Zhang C, Liu G, Li W, Li J, Tang Y. Drug repositioning by prediction of drug's anatomical therapeutic chemical code via network-based inference approaches. Brief Bioinform 2020; 22:2058-2072. [PMID: 32221552 DOI: 10.1093/bib/bbaa027] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 01/02/2020] [Revised: 02/05/2020] [Accepted: 02/17/2020] [Indexed: 12/17/2022] Open
Abstract
Drug discovery and development is a time-consuming and costly process. Therefore, drug repositioning has become an effective approach to address these issues by identifying new therapeutic or pharmacological actions for existing drugs. The drug's anatomical therapeutic chemical (ATC) code is part of a hierarchical classification system organized into five levels according to the organs or systems on which drugs act and the pharmacological, therapeutic and chemical properties of drugs. The 2nd-, 3rd- and 4th-level ATC codes retain the therapeutic and pharmacological information of drugs. With the hypothesis that drugs with similar structures or targets would possess similar ATC codes, we exploited a network-based approach to predict the 2nd-, 3rd- and 4th-level ATC codes by constructing substructure drug-ATC (SD-ATC), target drug-ATC (TD-ATC) and Substructure&Target drug-ATC (STD-ATC) networks. After 10-fold cross validation and two external validations, the STD-ATC models outperformed the SD-ATC and TD-ATC ones. Furthermore, with KR as fingerprint, the STD-ATC model was identified as the optimal model with AUC values of 0.899 ± 0.015, 0.916 and 0.893 for 10-fold cross validation, external validation set 1 and external validation set 2, respectively. To illustrate the predictive capability of the STD-ATC model with KR fingerprint, as a case study, we used that model to predict 25 FDA-approved drugs (22 of which were actually purchased) to have potential activities on heart failure. In vitro experiments confirmed that 8 of the 22 old drugs showed mild to potent cardioprotective activities in both a hypoxia model and an oxygen-glucose deprivation model, which demonstrated that our STD-ATC prediction model would be an effective tool for drug repositioning.
Affiliation(s)
- Yayuan Peng
- East China University of Science and Technology, Shanghai, China
- Manjiong Wang
- East China University of Science and Technology, Shanghai, China
- Yixiang Xu
- East China University of Science and Technology, Shanghai, China
- Zengrui Wu
- East China University of Science and Technology, Shanghai, China
- Jiye Wang
- East China University of Science and Technology, Shanghai, China
- Chao Zhang
- East China University of Science and Technology, Shanghai, China
- Guixia Liu
- East China University of Science and Technology, Shanghai, China
- Weihua Li
- East China University of Science and Technology, Shanghai, China
- Jian Li
- East China University of Science and Technology, Shanghai, China
- Yun Tang
- East China University of Science and Technology, Shanghai, China
10
Chen L, Liu T, Zhao X. Inferring anatomical therapeutic chemical (ATC) class of drugs using shortest path and random walk with restart algorithms. Biochim Biophys Acta Mol Basis Dis 2017; 1864:2228-2240. [PMID: 29247833 DOI: 10.1016/j.bbadis.2017.12.019] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 10/07/2017] [Revised: 12/01/2017] [Accepted: 12/12/2017] [Indexed: 01/02/2023]
Abstract
The anatomical therapeutic chemical (ATC) classification system is a widely accepted drug classification scheme. This system comprises five levels and includes several classes in each level. Drugs are classified into classes according to their therapeutic effects and characteristics. The first level includes 14 main classes. In this study, we proposed two network-based models to infer novel potential chemicals deemed to belong in the first level of ATC classification. To build these models, two large chemical networks were constructed using the chemical-chemical interaction information retrieved from the Search Tool for Interactions of Chemicals (STITCH). Two classic network algorithms, shortest path (SP) and random walk with restart (RWR) algorithms, were executed on the corresponding network to mine novel chemicals for each ATC class using the validated drugs in a class as seed nodes. Then, the obtained chemicals yielded by these two algorithms were further evaluated by a permutation test and an association test. The former can exclude chemicals produced by the structure of the network, i.e., false positive discoveries. By contrast, the latter identifies the most important chemicals that have strong associations with the ATC class. Comparisons indicated that the two models can provide quite dissimilar results, suggesting that the results yielded by one model can be essential supplements for those obtained by the other model. In addition, several representative inferred chemicals were analyzed to confirm the reliability of the results generated by the two models. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang.
Affiliation(s)
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China.
- Tao Liu
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China.
- Xian Zhao
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
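The random walk with restart (RWR) algorithm named in the abstract of entry 10 can be sketched as follows: starting from seed nodes (the validated drugs of one ATC class), probability mass spreads along network edges and is pulled back to the seeds at every step, so nodes close to the seeds end up with higher scores. This is a generic textbook formulation on a toy graph, not the authors' implementation. The restart probability of 0.8, the tolerance, the function name and the four-node network are all assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-6, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the change is below tol.

    adj is a symmetric adjacency matrix (list of lists); seeds is a set of node
    indices. W is adj with each column scaled to sum to 1.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    # Seeds share the initial probability equally.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy chemical network (hypothetical): a path 0 - 1 - 2 - 3 with node 0 as the seed.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adj, seeds={0})
```

On this path graph the scores fall off with distance from the seed, so ranking non-seed nodes by score gives the candidate ordering that a later statistical filter, such as the permutation test the abstract mentions, would then prune.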