1
Li L, Huang F, Zhang YH, Cai YD. Identifying allergic-rhinitis-associated genes with random-walk-based method in PPI network. Comput Biol Med 2024; 175:108495. [PMID: 38697003 DOI: 10.1016/j.compbiomed.2024.108495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/21/2024] [Accepted: 04/21/2024] [Indexed: 05/04/2024]
Abstract
Allergic rhinitis is a common allergic disease with a complex pathogenesis and many unresolved issues. Studies have shown that the incidence of allergic rhinitis is closely related to genetic factors, and research on the related genes could help further understand its pathogenesis and develop new treatment methods. In this study, 446 allergic rhinitis-related genes were obtained from the DisGeNET database. The protein-protein interaction network was searched using the random-walk-with-restart algorithm with these 446 genes as seed nodes to assess the linkages between other genes and allergic rhinitis. The result was then examined by three screening tests (permutation, interaction, and enrichment tests), which aimed to pick out genes that have strong and special associations with allergic rhinitis. Finally, 52 novel genes were obtained. The functional enrichment test confirmed their relationships to the biological processes and pathways related to allergic rhinitis. Furthermore, some genes were extensively analyzed to uncover their special or latent associations with allergic rhinitis, including IRAK2 and MAPK, which are involved in the pathogenesis of allergic rhinitis and the inhibition of allergic inflammation via the p38-MAPK pathway, respectively. The newly found genes may help subsequent investigations into the underlying molecular mechanisms of allergic rhinitis and the development of effective treatments.
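The random-walk-with-restart step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `random_walk_with_restart`, the restart probability of 0.8, the convergence tolerance, and the four-node toy network are all assumptions made for the example.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-6, max_iter=1000):
    """Return a proximity score to the seed set for every node.

    adj     : symmetric adjacency matrix of the network (toy stand-in for a PPI network)
    seeds   : indices of the seed nodes (known disease genes)
    restart : probability of jumping back to the seeds (0.8 is an assumed value)
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # isolated nodes: avoid division by zero
    W = adj / col_sums                 # column-normalized transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds) # walker starts uniformly on the seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 change below tolerance
            return p_next
        p = p_next
    return p

# Hypothetical 4-node chain 0 - 1 - 2 - 3 with node 0 as the only seed.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
scores = random_walk_with_restart(adj, seeds=[0])
ranking = np.argsort(-scores)          # candidate nodes ordered by proximity
```

Nodes that score highly but are not seeds would be the raw candidates; the permutation, interaction, and enrichment tests mentioned in the abstract are separate filters and are not shown here.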
Affiliation(s)
- Lin Li
  - Department of Otolaryngology and Head&neck, The Affiliated Wuxi People's Hospital of Nanjing Medical University, Wuxi Medical Center, Nanjing Medical University, Wuxi, 214023, China; Department of Otolaryngology and Head&neck, China-Japan Union Hospital, Jilin University, Changchun, 130033, China.
- FeiMing Huang
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China.
- Yu-Hang Zhang
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA.
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China.
2
Abdallah RM, Hasan HE, Hammad A. Predictive modeling of skin permeability for molecules: Investigating FDA-approved drug permeability with various AI algorithms. PLOS Digital Health 2024; 3:e0000483. [PMID: 38568888 PMCID: PMC10990209 DOI: 10.1371/journal.pdig.0000483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 03/05/2024] [Indexed: 04/05/2024]
Abstract
The transdermal route of drug administration has gained popularity for its convenience and bypassing the first-pass metabolism. Accurate skin permeability prediction is crucial for successful transdermal drug delivery (TDD). In this study, we address this critical need to enhance TDD. A dataset comprising 441 records for 140 molecules with diverse LogKp values was characterized. The descriptor calculation yielded 145 relevant descriptors. Machine learning models, including MLR, RF, XGBoost, CatBoost, LGBM, and ANN, were employed for regression analysis. Notably, LGBM, XGBoost, and gradient boosting models outperformed others, demonstrating superior predictive accuracy. Key descriptors influencing skin permeability, such as hydrophobicity, hydrogen bond donors, hydrogen bond acceptors, and topological polar surface area, were identified and visualized. Cluster analysis applied to the FDA-approved drug dataset (2326 compounds) revealed four distinct clusters with significant differences in molecular characteristics. Predicted LogKp values for these clusters offered insights into the permeability variations among FDA-approved drugs. Furthermore, an investigation into skin permeability patterns across 83 classes of FDA-approved drugs based on the ATC code showcased significant differences, providing valuable information for drug development strategies. The study underscores the importance of accurate skin permeability prediction for TDD, emphasizing the superior performance of nonlinear machine learning models. The identified key descriptors and clusters contribute to a nuanced understanding of permeability characteristics among FDA-approved drugs. These findings offer actionable insights for drug design, formulation, and prioritization of molecules with optimum properties, potentially reducing reliance on costly experimental testing. 
Future research directions include offering promising applications in pharmaceutical research and formulation within the burgeoning field of computer-aided drug design.
Affiliation(s)
- Rami M. Abdallah
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, Zarqa University, Zarqa, Jordan
- Hisham E. Hasan
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, Zarqa University, Zarqa, Jordan
- Ahmad Hammad
  - Department of Artificial Intelligence, Faculty of Information Technology, Middle East University, Amman, Jordan
3
Chen L, Xu J, Zhou Y. PDATC-NCPMKL: Predicting drug's Anatomical Therapeutic Chemical (ATC) codes based on network consistency projection and multiple kernel learning. Comput Biol Med 2024; 169:107862. [PMID: 38150886 DOI: 10.1016/j.compbiomed.2023.107862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/19/2023] [Accepted: 12/17/2023] [Indexed: 12/29/2023]
Abstract
The development and discovery of new drugs is time-consuming and requires substantial human and material resources. Therefore, discovering novel effects of existing drugs is an important alternative that can accelerate the process of designing "new" drugs. The Anatomical Therapeutic Chemical (ATC) classification system recommended by the World Health Organization (WHO) is a basic research area in this regard. A novel ATC code of an existing drug suggests its novel effects. Some computational models have been proposed to predict drug-ATC code associations. However, their performance is not very high, and there is still room for improvement. In this study, a new recommendation system (named PDATC-NCPMKL), which incorporated network consistency projection and multi-kernel learning, was designed to identify drug-ATC code associations. For drugs or ATC codes, several kernels were constructed, which were fused by a multiple kernel learning method and an additional kernel integration scheme. To enhance the performance, the drug-ATC code association adjacency matrix was reformulated by a variant of weighted K nearest known neighbors (WKNKN). The reformulated adjacency matrix and the drug and ATC code kernels were fed into network consistency projection to generate the association score matrix. The proposed recommendation system was tested on the ATC codes at the second, third and fourth levels of the drug ATC classification system using ten-fold cross-validation. The results indicated that all AUROC and AUPR values were close to or exceeded 0.96. Such performance was higher than that of some existing computational models. Some additional tests were conducted to prove the utility of adjacency matrix reformulation and to analyze the importance of drug and ATC code kernels.
Affiliation(s)
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China.
- Jing Xu
  - College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China.
- Yubin Zhou
  - Department of Thoracic Surgery, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 610072, China.
4
Yu Z, Wu Z, Wang Z, Wang Y, Zhou M, Li W, Liu G, Tang Y. Network-Based Methods and Their Applications in Drug Discovery. J Chem Inf Model 2024; 64:57-75. [PMID: 38150548 DOI: 10.1021/acs.jcim.3c01613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2023]
Abstract
Drug discovery is time-consuming, expensive, and predominantly follows the "one drug → one target → one disease" paradigm. With the rapid development of systems biology and network pharmacology, a novel drug discovery paradigm, "multidrug → multitarget → multidisease", has emerged. This new holistic paradigm of drug discovery aligns well with the essence of networks, leading to the emergence of network-based methods in the field of drug discovery. In this Perspective, we initially introduce the concept and data sources of networks and highlight classical methodologies employed in network-based methods. Subsequently, we focus on the practical applications of network-based methods across various areas of drug discovery, such as target prediction, virtual screening, prediction of drug therapeutic effects or adverse drug events, and elucidation of molecular mechanisms. In addition, we provide representative web servers for researchers to use network-based methods in specific applications. Finally, we discuss several challenges of network-based methods and the directions for future development. In a word, network-based methods could serve as powerful tools to accelerate drug discovery.
Affiliation(s)
- Zhuohang Yu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Zengrui Wu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Ze Wang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yimeng Wang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Moran Zhou
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Weihua Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Guixia Liu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yun Tang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
5
Zhao H, Duan G, Ni P, Yan C, Li Y, Wang J. RNPredATC: A Deep Residual Learning-Based Model With Applications to the Prediction of Drug-ATC Code Association. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:2712-2723. [PMID: 34110998 DOI: 10.1109/tcbb.2021.3088256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
The Anatomical Therapeutic Chemical (ATC) classification system, designated by the World Health Organization Collaborating Center (WHOCC), has been widely used in drug screening, repositioning, and similarity research. The ATC classification system assigns different codes to drugs according to the organ or system on which they act and/or their therapeutic and chemical characteristics. Correctly identifying the potential ATC codes for drugs can accelerate drug development and reduce the cost of experiments. Several classifiers have been proposed in this regard. However, they lack the ability to learn basic features from sparsely known drug-ATC code associations. Therefore, there is an urgent need for novel computational methods to precisely predict potential drug-ATC code associations at multiple levels of the ATC classification system based on known associations between drugs and ATC codes. In this paper, we provide a novel end-to-end model, called RNPredATC, to predict potential drug-ATC code associations at five ATC classification levels. RNPredATC can extract dense feature vectors from sparsely known drug-ATC code associations and reduce the impact of the degradation problem through novel deep residual learning. We extensively compare our method with some state-of-the-art methods, including NetPredATC, SPACE, and some multi-label-based methods. Our experimental results show that RNPredATC achieves better performance in five-fold and ten-fold cross-validation. Furthermore, the visualization analysis of hidden layers and case studies of predicted associations at the fifth ATC classification level confirm that RNPredATC can effectively identify the potential ATC codes of drugs.
6
Cao Y, Yang ZQ, Zhang XL, Fan W, Wang Y, Shen J, Wei DQ, Li Q, Wei XY. Identifying the kind behind SMILES-anatomical therapeutic chemical classification using structure-only representations. Brief Bioinform 2022; 23:6677124. [PMID: 36027578 DOI: 10.1093/bib/bbac346] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/06/2022] [Revised: 07/11/2022] [Accepted: 07/26/2022] [Indexed: 01/25/2023]
Abstract
Anatomical Therapeutic Chemical (ATC) classification for compounds/drugs plays an important role in drug development and basic research. However, previous methods depend on interactions extracted from the STITCH dataset, which may make them dependent on lab experiments. We present a pilot study to explore the possibility of conducting ATC prediction based solely on molecular structures. The motivation is to eliminate the reliance on costly lab experiments so that the characteristics of a drug can be pre-assessed for better decision-making and effort-saving before the actual development. To this end, we construct a new benchmark consisting of 4545 compounds, which is larger than the one used in a previous study. A light-weight prediction model is proposed. The model offers better explainability in the sense that it consists of a straightforward tokenization that extracts and embeds statistically and physicochemically meaningful tokens, and a deep network backed by a set of pyramid kernels to capture multi-resolution chemical structural characteristics. Its efficacy has been validated in experiments where it outperforms the state-of-the-art methods by 15.53% in accuracy and by 69.66% in terms of efficiency. We make the benchmark dataset, source code and web server open to ease the reproduction of this study.
Affiliation(s)
- Yi Cao
  - Department of Computer Science, Sichuan University, 610065, Chengdu, China
- Zhen-Qun Yang
  - Department of Biomedical Engineering, Chinese University of Hong Kong, Street, Shatin, Hong Kong
- Xu-Lu Zhang
  - Department of Computer Science, Sichuan University, 610065, Chengdu, China
- Wenqi Fan
  - Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong
- Yaowei Wang
  - Peng Cheng Laboratory, 518000, Shenzhen, China
- Dong-Qing Wei
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Qing Li
  - Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong
- Xiao-Yong Wei
  - Department of Computer Science, Sichuan University, 610065, Chengdu, China
  - Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong
7
Jiang M, Zhou B, Chen L. Identification of drug side effects with a path-based method. Mathematical Biosciences and Engineering: MBE 2022; 19:5754-5771. [PMID: 35603377 DOI: 10.3934/mbe.2022269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The study of drug side effects is a significant task in drug discovery. Candidate drugs with unacceptable side effects must be eliminated to prevent risks for both patients and pharmaceutical companies. Thus, all side effects for any candidate drug should be determined. However, this task, which is carried out through traditional experiments, is time-consuming and expensive. Computational methods have been increasingly used for the identification of drug side effects. In the present study, a new path-based method was proposed to determine drug side effects. To implement the method, a heterogeneous network was built in which drugs and side effects were defined as nodes. For any drug and side effect, the proposed path-based method determined all paths of limited length that connect them and further evaluated the association between them based on these paths. A strong association indicates that the drug has the side effect with a high probability. In two types of jackknife test, the method yielded good performance and was superior to some other network-based methods. Furthermore, the effects of one parameter in the method and of the heterogeneous network were analyzed.
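The idea of scoring a drug and side-effect pair from all bounded-length paths between them can be sketched as below. The abstract does not state how paths are combined into an association score, so the length-decayed sum used here (each path contributes `decay` raised to its number of edges) is purely an assumption for illustration, as are the function names, the parameter values, and the toy network.

```python
def bounded_paths(graph, source, target, max_len):
    """Yield every simple path from source to target with at most max_len edges."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        if len(path) - 1 >= max_len:   # path already uses max_len edges
            continue
        for nxt in graph.get(node, ()):
            if nxt not in path:        # keep paths simple (no repeated nodes)
                stack.append((nxt, path + [nxt]))

def association_score(graph, drug, side_effect, max_len=3, decay=0.5):
    """Assumed scoring rule: sum of decay**(number of edges) over all bounded paths."""
    return sum(decay ** (len(p) - 1)
               for p in bounded_paths(graph, drug, side_effect, max_len))

# Hypothetical heterogeneous network: d* are drugs, s* are side effects.
graph = {
    "d1": ["d2", "s2"],   # d1 is similar to d2 and known to cause s2
    "d2": ["d1", "s1"],   # d2 is known to cause s1
    "s1": ["d2", "s2"],   # s1 is similar to s2
    "s2": ["d1", "s1"],
}
score = association_score(graph, "d1", "s1")
```

In this toy network `d1` reaches `s1` through two two-edge paths, one via the similar drug `d2` and one via the similar side effect `s2`; `max_len` plays the role of the path-length limit mentioned in the abstract.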
Affiliation(s)
- Meng Jiang
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Bo Zhou
  - Shanghai University of Medicine & Health Sciences, Shanghai 201318, China
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
8
Similarity-Based Method with Multiple-Feature Sampling for Predicting Drug Side Effects. Computational and Mathematical Methods in Medicine 2022; 2022:9547317. [PMID: 35401786 PMCID: PMC8993545 DOI: 10.1155/2022/9547317] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 07/20/2021] [Revised: 09/18/2021] [Accepted: 03/15/2022] [Indexed: 12/23/2022]
Abstract
Drugs can treat different diseases but also bring side effects. Undetected and unaccepted side effects for approved drugs can greatly harm the human body and bring huge risks for pharmaceutical companies. Traditional experimental methods used to determine the side effects have several drawbacks, such as low efficiency and high cost. One alternative to achieve this purpose is to design computational methods. Previous studies modeled a binary classification problem by pairing drugs and side effects; however, their classifiers can only extract one feature from each type of drug association. The present work proposed a novel multiple-feature sampling scheme that can extract several features from one type of drug association. Thirteen classification algorithms were employed to construct classifiers with features yielded by such scheme. Their performance was greatly improved compared with that of the classifiers that use the features yielded by the original scheme. Best performance was observed for the classifier based on random forest with MCC of 0.8661, AUROC of 0.969, and AUPR of 0.977. Finally, one key parameter in the multiple-feature sampling scheme was analyzed.
9
Yu Z, Wu Z, Li W, Liu G, Tang Y. ADENet: a novel network-based inference method for prediction of drug adverse events. Brief Bioinform 2022; 23:6510157. [PMID: 35039845 DOI: 10.1093/bib/bbab580] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/12/2021] [Revised: 12/02/2021] [Accepted: 12/19/2021] [Indexed: 11/13/2022]
Abstract
Identification of adverse drug events (ADEs) is crucial to reduce human health risks and improve drug safety assessment. With an increasing number of biological and medical data, computational methods such as network-based methods were proposed for ADE prediction with high efficiency and low cost. However, previous network-based methods rely on the topological information of known drug-ADE networks, and hence cannot make predictions for novel compounds without any known ADE. In this study, we introduced chemical substructures to bridge the gap between the drug-ADE network and novel compounds, and developed a novel network-based method named ADENet, which can predict potential ADEs for not only drugs within the drug-ADE network, but also novel compounds outside the network. To show the performance of ADENet, we collected drug-ADE associations from a comprehensive database named MetaADEDB and constructed a series of network-based prediction models. These models obtained high area under the receiver operating characteristic curve values ranging from 0.871 to 0.947 in 10-fold cross-validation. The best model further showed high performance in external validation, which outperformed a previous network-based and a recent deep learning-based method. Using several approved drugs as case studies, we found that 32-54% of the predicted ADEs can be validated by the literature, indicating the practical value of ADENet. Moreover, ADENet is freely available at our web server named NetInfer (http://lmmd.ecust.edu.cn/netinfer). In summary, our method would provide a promising tool for ADE prediction and drug safety assessment in drug discovery and development.
Affiliation(s)
- Zhuohang Yu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Zengrui Wu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Weihua Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Guixia Liu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yun Tang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
10
Sheng M, Cai H, Yang Q, Li J, Zhang J, Liu L. A Random Walk-Based Method to Identify Candidate Genes Associated With Lymphoma. Front Genet 2021; 12:792754. [PMID: 34899868 PMCID: PMC8655984 DOI: 10.3389/fgene.2021.792754] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/11/2021] [Accepted: 11/02/2021] [Indexed: 11/16/2022]
Abstract
Lymphoma is a serious type of cancer, especially for adolescents and elder adults, although this malignancy is quite rare compared with other types of cancer. The cause of this malignancy remains ambiguous. Genetic factor is deemed to be highly associated with the initiation and progression of lymphoma, and several genes have been related to this disease. Determining the pathogeny of lymphoma by identifying the related genes is important. In this study, we presented a random walk-based method to infer the novel lymphoma-associated genes. From the reported 1,458 lymphoma-associated genes and protein–protein interaction network, raw candidate genes were mined by using the random walk with restart algorithm. The determined raw genes were further filtered by using three screening tests (i.e., permutation, linkage, and enrichment tests). These tests could control false-positive genes and screen out essential candidate genes with strong linkages to validate the lymphoma-associated genes. A total of 108 inferred genes were obtained. Analytical results indicated that some inferred genes, such as RAC3, TEC, IRAK2/3/4, PRKCE, SMAD3, BLK, TXK, PRKCQ, were associated with the initiation and progression of lymphoma.
Affiliation(s)
- Minjie Sheng
  - Department of Ophthalmology, Yangpu Hospital, School of Medicine, Tongji University, Shanghai, China
- Haiying Cai
  - Department of Ophthalmology, Yangpu Hospital, School of Medicine, Tongji University, Shanghai, China
- Qin Yang
  - Department of Ophthalmology, Yangpu Hospital, School of Medicine, Tongji University, Shanghai, China
- Jing Li
  - Department of Ophthalmology, Yangpu Hospital, School of Medicine, Tongji University, Shanghai, China
- Jian Zhang
  - Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai, China
  - Shanghai Engineering Center for Visual Science and Photomedicine, Shanghai, China
  - National Clinical Research Center for Eye Diseases, Shanghai, China
  - Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, China
- Lihua Liu
  - Department of Ophthalmology, Yangpu Hospital, School of Medicine, Tongji University, Shanghai, China
11
Wang X, Liu M, Zhang Y, He S, Qin C, Li Y, Lu T. Deep fusion learning facilitates anatomical therapeutic chemical recognition in drug repurposing and discovery. Brief Bioinform 2021; 22:6342939. [PMID: 34368838 DOI: 10.1093/bib/bbab289] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/21/2021] [Revised: 07/03/2021] [Accepted: 07/06/2021] [Indexed: 01/17/2023]
Abstract
The advent of large-scale biomedical data and computational algorithms provides new opportunities for drug repurposing and discovery. It is of great interest to find an appropriate data representation and modeling method to facilitate these studies. The anatomical therapeutic chemical (ATC) classification system, proposed by the World Health Organization (WHO), is an essential source of information for drug repurposing and discovery. Besides, computational methods are applied to predict drug ATC classification. We conducted a systematic review of ATC computational prediction studies and revealed the differences in data sets, data representation, algorithm approaches, and evaluation metrics. We then proposed a deep fusion learning (DFL) framework to optimize the ATC prediction model, namely DeepATC. The methods based on graph convolutional network, inferring biological network and multimodel attentive fusion network were applied in DeepATC to extract the molecular topological information and low-dimensional representation from the molecular graph and heterogeneous biological networks. The results indicated that DeepATC achieved superior model performance with area under the curve (AUC) value at 0.968. Furthermore, the DFL framework was performed for the transcriptome data-based ATC prediction, as well as another independent task that is significantly relevant to drug discovery, namely drug-target interaction. The DFL-based model achieved excellent performance in the above-extended validation task, suggesting that the idea of aggregating the heterogeneous biological network and node's (molecule or protein) self-topological features will bring inspiration for broader drug repurposing and discovery research.
Affiliation(s)
- Xiting Wang
  - Life Science School, Beijing University of Chinese Medicine, Beijing, China
- Meng Liu
  - Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Yiling Zhang
  - Beijing University of Chinese Medicine, Beijing, China
- Shuangshuang He
  - Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Caimeng Qin
  - School of Life Sciences, Beijing University of Chinese Medicine and Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Yu Li
  - Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Tao Lu
  - Integrative Medicine Center in School of Life Sciences, Beijing University of Chinese Medicine, Beijing, China
12
Aryanfar A, Medlej S, Tarhini A, Tehrani B AR. Elliptic percolation model for predicting the electrical conductivity of graphene-polymer composites. Soft Matter 2021; 17:2081-2089. [PMID: 33439207 DOI: 10.1039/d0sm01950j] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Graphene-based polymers exhibit a conductive microstructure formed by aggregates in a matrix which drastically enhances their transmitting properties. We develop a new numerical framework for predicting the electrical conductivity based on continuum percolation theory in a two dimensional stochastically-generated medium. We analyze the role of the flake shape and its aspect ratio and consequently predict the onset of percolation based on the particle density and the domain scale. Simultaneously, we have performed experiments and have achieved very high electrical conductivity for such composites compared to other film fabrication techniques, which have verified the results of computing the homogenized electrical conductivity. As well, the proximity to and a comparison with other analytical models and other experimental techniques are presented. The numerical model can predict the composite transmitting conductivity in a larger range of particle geometry. Such quantification is exceedingly useful for effective utilization and optimization of graphene filler densities and their spatial distribution during manufacturing.
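A continuum percolation check of the kind this abstract refers to can be sketched as follows. To keep the overlap test simple, the example swaps the paper's elliptic flakes for circular discs, so it does not capture the aspect-ratio effects the authors study; the function name `percolates`, the left-to-right spanning criterion, and the sample coordinates are assumptions for illustration only.

```python
import math

def percolates(centers, radius, width=1.0):
    """True if overlapping discs form a connected chain from x = 0 to x = width.

    Simplification (assumption): particles are discs of equal radius, not ellipses.
    """
    n = len(centers)
    parent = list(range(n + 2))        # extra nodes: n = left wall, n + 1 = right wall

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for i, (xi, yi) in enumerate(centers):
        if xi - radius <= 0:
            union(i, n)                # disc touches the left boundary
        if xi + radius >= width:
            union(i, n + 1)            # disc touches the right boundary
        for j in range(i):
            xj, yj = centers[j]
            if math.hypot(xi - xj, yi - yj) <= 2 * radius:
                union(i, j)            # two discs overlap
    return find(n) == find(n + 1)

# Hypothetical particle layout: five discs in a row across a unit-width domain.
chain = [(0.1, 0.5), (0.3, 0.5), (0.5, 0.5), (0.7, 0.5), (0.9, 0.5)]
spans = percolates(chain, radius=0.12)
broken = percolates(chain[:2] + chain[3:], radius=0.12)  # middle disc removed
```

Repeating this check over many randomly generated layouts at increasing particle density would give an estimate of the percolation onset for the chosen domain size.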
Affiliation(s)
- Asghar Aryanfar
  - American University of Beirut, Riad El-Solh 1107, Lebanon.
  - Bahçesehir University, 4 Çırağan Cad, Besiktas, Istanbul 34353, Turkey
- Sajed Medlej
  - American University of Beirut, Riad El-Solh 1107, Lebanon.
- Ali Tarhini
  - American University of Beirut, Riad El-Solh 1107, Lebanon.
- Ali R Tehrani B
  - American University of Beirut, Riad El-Solh 1107, Lebanon.
  - Aalto University, Chemical Engineering, Espoo 02150, Finland
13
Liang H, Hu B, Chen L, Wang S, Aorigele. Recognizing novel chemicals/drugs for anatomical therapeutic chemical classes with a heat diffusion algorithm. Biochim Biophys Acta Mol Basis Dis 2020; 1866:165910. [DOI: 10.1016/j.bbadis.2020.165910] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/28/2020] [Revised: 07/20/2020] [Accepted: 08/03/2020] [Indexed: 12/14/2022]
|
14
|
Identification of Latent Oncogenes with a Network Embedding Method and Random Forest. BIOMED RESEARCH INTERNATIONAL 2020; 2020:5160396. [PMID: 33029511 PMCID: PMC7530476 DOI: 10.1155/2020/5160396] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/20/2020] [Revised: 09/09/2020] [Accepted: 09/14/2020] [Indexed: 12/29/2022]
Abstract
Oncogenes are a special type of gene that can promote tumor initiation. Studying oncogenes helps in understanding the causes of cancer. Experimental techniques were long the popular way to detect oncogenes. However, their drawbacks, such as high cost and long duration, have become increasingly evident in recent years. Newly proposed computational methods provide an alternative way to study oncogenes and can offer useful clues for further investigation of candidate genes. Considering the limitations of some previous computational methods, such as the lack of learning procedures and the treatment of genes as individual subjects, a novel computational method was proposed in this study. The method adopted features derived from multiple protein networks, viewing proteins at a system level. A classic machine learning algorithm, random forest, was applied to these features to capture the essential characteristics of oncogenes, thereby building the prediction model. All genes except validated oncogenes were ranked by a measurement yielded by the prediction model. The top genes were quite different from the potential oncogenes discovered by previous methods, and they could be confirmed as novel oncogenes. The results indicate that the newly identified genes are an essential supplement to previous results.
|
15
|
Zhou JP, Chen L, Guo ZH. iATC-NRAKEL: an efficient multi-label classifier for recognizing anatomical therapeutic chemical classes of drugs. Bioinformatics 2020; 36:1391-1396. [PMID: 31593226 DOI: 10.1093/bioinformatics/btz757] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 06/20/2019] [Revised: 09/10/2019] [Accepted: 10/01/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION The anatomical therapeutic chemical (ATC) classification system plays an increasingly important role in drug repositioning and discovery. Correctly identifying the classes at each level of this system to which a given drug may belong is an essential problem. Several multi-label classifiers have been proposed in this regard. Although they provided satisfactory performance, the feature extraction procedures were still rough. More refined features may further improve the prediction quality. RESULTS In this article, we provide a novel multi-label classifier, called iATC-NRAKEL, to predict drug ATC classes in the first level. To obtain more informative drug features, we employed the drug association information in STITCH and KEGG, which was organized into seven drug networks. The powerful network embedding algorithm, Mashup, was adopted to extract informative drug features. The obtained features were fed into the RAndom k-labELsets (RAKEL) algorithm, with support vector machine as the basic classification algorithm, to construct the classifier. The 10-fold cross-validation on the benchmark dataset with 3883 drugs showed that the accuracy and absolute true were 76.56% and 74.51%, respectively. The comparison results indicated that iATC-NRAKEL was far superior to all previously reported classifiers. Finally, the contribution of each network was analyzed. AVAILABILITY AND IMPLEMENTATION The codes of iATC-NRAKEL are available at https://github.com/zhou256/iATC-NRAKEL.
Affiliation(s)
- Jian-Peng Zhou
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China.,Shanghai Key Laboratory of PMMP, East China Normal University, Shanghai 200241, People's Republic of China
| | - Zi-Han Guo
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
| |
|
16
|
Zhang X, Chen L. Prediction of membrane protein types by fusing protein-protein interaction and protein sequence information. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2020; 1868:140524. [PMID: 32858174 DOI: 10.1016/j.bbapap.2020.140524] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/26/2020] [Revised: 07/17/2020] [Accepted: 07/30/2020] [Indexed: 11/30/2022]
Abstract
Membrane proteins are gatekeepers to the cell and essential for determining the function of cells. Identification of the types of membrane proteins is an essential problem in cell biology. It is time-consuming and expensive to identify the type of membrane proteins with traditional experimental methods. The alternative is to design effective computational methods, which can provide quick and reliable predictions. To date, several computational methods have been proposed in this regard. Several of them used features extracted from the sequence information of individual proteins. Recently, networks have become increasingly popular for tackling protein-related problems because they organize proteins at a system level and give an overview of all proteins. However, such a form weakens the essential properties of proteins, such as their sequence information. In this study, a novel feature fusion scheme was proposed, which integrated the information of protein sequences and a protein-protein interaction network. The fused features of a protein were defined as the linear combination of the sequence features of all proteins in the network, where the combination coefficients were the probabilities yielded by the random walk with restart algorithm with the protein as the seed node. Several models with such fused features and different classification algorithms were built and evaluated. Their performance in predicting the type of membrane proteins was improved compared with models using only the sequence features or only the network information.
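The fusion step described in this abstract can be illustrated with a short sketch. It assumes the random-walk-with-restart probabilities for one seed protein are already available; the protein names, probabilities, and two-dimensional sequence features are hypothetical toy values, and `fuse_features` is an illustrative name rather than the authors' code.

```python
def fuse_features(rwr_probs, seq_features):
    """Weighted sum of every protein's sequence features.

    rwr_probs    -- dict protein -> RWR probability obtained with one seed protein
    seq_features -- dict protein -> list of sequence-derived feature values
    """
    dim = len(next(iter(seq_features.values())))
    fused = [0.0] * dim
    for protein, prob in rwr_probs.items():
        for k, value in enumerate(seq_features[protein]):
            fused[k] += prob * value   # coefficient = RWR probability of this protein
    return fused

# Hypothetical toy inputs: probabilities from an RWR run seeded at P1
rwr_probs = {"P1": 0.7, "P2": 0.2, "P3": 0.1}
seq_features = {"P1": [1.0, 0.0], "P2": [0.0, 1.0], "P3": [1.0, 1.0]}
fused_p1 = fuse_features(rwr_probs, seq_features)
```

Because the seed usually keeps the largest probability, the fused vector stays close to the seed's own sequence features while borrowing information from its network neighbors.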
Affiliation(s)
- Xiaolin Zhang
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China.
| |
|
17
|
Identification of COVID-19 Infection-Related Human Genes Based on a Random Walk Model in a Virus-Human Protein Interaction Network. BIOMED RESEARCH INTERNATIONAL 2020; 2020:4256301. [PMID: 32685484 PMCID: PMC7345912 DOI: 10.1155/2020/4256301] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/24/2020] [Accepted: 06/26/2020] [Indexed: 12/15/2022]
Abstract
Coronaviruses are specific crown-shaped viruses that were first identified in the 1960s, and three typical examples of the most recent coronavirus disease outbreaks include severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS), and COVID-19. Particularly, COVID-19 is currently causing a worldwide pandemic, threatening the health of human beings globally. The identification of viral pathogenic mechanisms is important for further developing effective drugs and targeted clinical treatment methods. The delayed revelation of viral infectious mechanisms is currently one of the technical obstacles in the prevention and treatment of infectious diseases. In this study, we proposed a random walk model to identify the potential pathological mechanisms of COVID-19 on a virus–human protein interaction network, and we effectively identified a group of proteins that have already been determined to be potentially important for COVID-19 infection and for similar SARS infections, which helps in further developing drugs and targeted therapeutic methods against COVID-19. Moreover, we constructed a standard computational workflow for predicting the pathological biomarkers and related pharmacological targets of infectious diseases.
|
18
|
Zhou B, Zhao X, Lu J, Sun Z, Liu M, Zhou Y, Liu R, Wang Y. Relating Substructures and Side Effects of Drugs with Chemical-chemical Interactions. Comb Chem High Throughput Screen 2020; 23:285-294. [DOI: 10.2174/1386207322666190702102752] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/16/2018] [Revised: 03/11/2019] [Accepted: 04/16/2019] [Indexed: 12/17/2022]
Abstract
Background: Drugs are very important for human life because they can provide treatment, cure, prevention, or diagnosis of different diseases. However, they also cause side effects, which can increase the risks for humans and pharmaceutical companies. It is essential to identify drug side effects in drug discovery. To date, many computational methods have been proposed to predict the side effects of drugs, and most of them were based on the premise that similar drugs have similar side effects. However, previous studies did not analyze which substructures are highly related to which kind of side effect. Method: In this study, we conducted a computational investigation. We extracted a drug set for each side effect, which consisted of drugs having the side effect. Also, for each substructure, a set was constructed by picking up drugs containing that substructure. The relationship between one side effect and one substructure was evaluated based on linkages between drugs in their corresponding drug sets, resulting in an Es value. Then, the statistical significance of the Es value was measured by a permutation test. Results and Conclusion: A number of highly related pairs of side effects and substructures were obtained, and some were extensively analyzed to confirm the reliability of the results reported in this study.
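The permutation test mentioned in this abstract can be sketched as follows. The abstract does not define the Es value, so the score used here (mean interaction weight over all cross pairs of the two drug sets) is only a hypothetical stand-in, and the function names, the number of permutations, and the add-one p-value estimate are assumptions as well.

```python
import random

def linkage_score(set_a, set_b, interaction):
    """Mean interaction weight over cross pairs (hypothetical stand-in for Es)."""
    pairs = [(a, b) for a in set_a for b in set_b if a != b]
    if not pairs:
        return 0.0
    return sum(interaction.get(frozenset(p), 0.0) for p in pairs) / len(pairs)

def permutation_p_value(set_a, set_b, all_drugs, interaction, n_perm=1000, seed=0):
    """Compare the observed score with scores for random drug sets of the same size."""
    rng = random.Random(seed)
    observed = linkage_score(set_a, set_b, interaction)
    hits = 0
    for _ in range(n_perm):
        random_b = rng.sample(all_drugs, len(set_b))   # all_drugs must be a list
        if linkage_score(set_a, random_b, interaction) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive
    return observed, (hits + 1) / (n_perm + 1)
```

Here `interaction` maps a `frozenset` of two drug identifiers to an interaction weight, and a small p-value suggests the observed linkage is unlikely to arise from a randomly chosen drug set.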
Affiliation(s)
- Bo Zhou
- Shanghai University of Medicine and Health Sciences, Shanghai 201318, China
| | - Xian Zhao
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Jing Lu
- School of Pharmacy, Key Laboratory of Molecular Pharmacology and Drug Evaluation (Yantai University), Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, Yantai University, Yantai 264005, China
| | - Zuntao Sun
- Informatization Office, Shanghai Maritime University, Shanghai 201306, China
| | - Min Liu
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Yilu Zhou
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
| | - Rongzhi Liu
- Center for Medical Device Evaluation, China Drug Administration, State Administration for Market Regulation, Beijing 100081, China
| | - Yihua Wang
- Biological Sciences, Faculty of Environmental and Life Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
| |
|
19
|
Che J, Chen L, Guo ZH, Wang S, Aorigele. Drug Target Group Prediction with Multiple Drug Networks. Comb Chem High Throughput Screen 2020; 23:274-284. [DOI: 10.2174/1386207322666190702103927] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 12/29/2018] [Revised: 03/11/2019] [Accepted: 04/15/2019] [Indexed: 02/07/2023]
Abstract
Background: Identification of drug-target interaction is essential in drug discovery. It is beneficial to predict unexpected therapeutic or adverse side effects of drugs. To date, several computational methods have been proposed to predict drug-target interactions because they are prompt and low-cost compared with traditional wet experiments.
Methods: In this study, we investigated this problem in a different way. According to KEGG, drugs were classified into several groups based on their target proteins. A multi-label classification model was presented to assign drugs into correct target groups. To make full use of the known drug properties, five networks were constructed, each of which represented drug associations in one property. A powerful network embedding method, Mashup, was adopted to extract drug features from above-mentioned networks, based on which several machine learning algorithms, including RAndom k-labELsets (RAKEL) algorithm, Label Powerset (LP) algorithm and Support Vector Machine (SVM), were used to build the classification model.
Results and Conclusion: Tenfold cross-validation yielded the accuracy of 0.839, exact match of 0.816 and hamming loss of 0.037, indicating good performance of the model. The contribution of each network was also analyzed. Furthermore, the network model with multiple networks was found to be superior to the one with a single network and classic model, indicating the superiority of the proposed model.
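The three multi-label measures reported above can be computed as in the sketch below. It uses the common example-based definitions (accuracy as the mean Jaccard overlap of true and predicted label sets, exact match as the fraction of perfectly predicted samples, Hamming loss as the mean fraction of mislabeled positions); whether the paper uses exactly these formulas is an assumption, and the function name is illustrative.

```python
def multilabel_metrics(true_sets, pred_sets, n_labels):
    """Return (accuracy, exact_match, hamming_loss) averaged over all samples."""
    n = len(true_sets)
    accuracy = exact = hamming = 0.0
    for y, z in zip(true_sets, pred_sets):
        y, z = set(y), set(z)
        union = y | z
        # Jaccard overlap; two empty sets count as a perfect prediction
        accuracy += len(y & z) / len(union) if union else 1.0
        exact += 1.0 if y == z else 0.0
        # Symmetric difference = labels that are wrongly added or missed
        hamming += len(y ^ z) / n_labels
    return accuracy / n, exact / n, hamming / n
```

Higher accuracy and exact match are better, whereas a lower Hamming loss is better, which is why a small value such as 0.037 indicates good performance.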
Affiliation(s)
- Jingang Che
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Zi-Han Guo
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Shuaiqun Wang
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Aorigele
- Faculty of Engineering, University of Toyama, Toyama, Japan
| |
|
20
|
Prediction of Drug Side Effects with a Refined Negative Sample Selection Strategy. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2020; 2020:1573543. [PMID: 32454877 PMCID: PMC7232712 DOI: 10.1155/2020/1573543] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 01/28/2020] [Revised: 04/14/2020] [Accepted: 04/23/2020] [Indexed: 01/07/2023]
Abstract
Drugs are an important way to treat various diseases. However, they inevitably produce side effects, bringing great risks to human bodies and pharmaceutical companies. How to predict the side effects of drugs has become one of the essential problems in drug research. Designing efficient computational methods is an alternative way. Some studies paired the drug and side effect as a sample, thereby modeling the problem as a binary classification problem. However, the selection of negative samples is a key problem in this case. In this study, a novel negative sample selection strategy was designed for obtaining high-quality negative samples. This strategy applied the random walk with restart (RWR) algorithm on a chemical-chemical interaction network to select, as negative samples, pairs of drugs and side effects in which the drug was less likely to have the corresponding side effect. Through several tests with a fixed feature extraction scheme and different machine-learning algorithms, models with the selected negative samples produced high performance. The best model even yielded nearly perfect performance. These models had much higher performance than those without such a strategy or with another selection strategy. Furthermore, it is not necessary to consider the balance of positive and negative samples under such a strategy.
|
21
|
Inferring novel genes related to oral cancer with a network embedding method and one-class learning algorithms. Gene Ther 2019; 26:465-478. [PMID: 31455874 DOI: 10.1038/s41434-019-0099-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/28/2019] [Revised: 06/18/2019] [Accepted: 07/15/2019] [Indexed: 12/14/2022]
Abstract
Oral cancer (OC) is one of the most common cancers threatening human lives. However, OC pathogenesis has yet to be fully uncovered, and thus designing effective treatments remains difficult. Identifying genes related to OC is an important way to achieve this purpose. In this study, we proposed three computational models for inferring novel OC-related genes. In contrast to previously proposed computational methods, which lacked learning procedures, each proposed model adopted a one-class learning algorithm, which can provide a deep insight into the features of validated OC-related genes. A network embedding algorithm (i.e., node2vec) was applied to the protein-protein interaction network to produce the representation of genes. The features of the OC-related genes were used in the training of the one-class algorithm, and the performance of the final inferring model was improved through a feature selection procedure. Then, candidate genes were produced by applying the trained inferring model to other genes. Three tests were performed to screen out the important candidate genes. Accordingly, we obtained three inferred gene sets, any two of which were different. The inferred genes were also different from previously reported genes, and some of them have been included in the public Oral Cancer Gene Database. Finally, we analyzed several inferred genes to confirm whether they are novel OC-related genes.
|
22
|
Lu S, Zhu ZG, Lu WC. Inferring novel genes related to colorectal cancer via random walk with restart algorithm. Gene Ther 2019; 26:373-385. [PMID: 31308477 DOI: 10.1038/s41434-019-0090-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/25/2018] [Revised: 05/20/2019] [Accepted: 06/11/2019] [Indexed: 12/12/2022]
Abstract
Colorectal cancer (CRC) is the third most common type of cancer. In recent decades, genomic analysis has played an increasingly important role in understanding the molecular mechanisms of CRC. However, its pathogenesis has not been fully uncovered. Identifying the genes related to CRC as completely as possible is an important way to investigate its pathogenesis. Therefore, we proposed a new computational method for the identification of novel CRC-associated genes. The proposed method is based on existing proven CRC-associated genes, human protein-protein interaction networks, and the random walk with restart algorithm. The utility of the method is indicated by comparing it to methods based on guilt-by-association or the shortest path algorithm. Using the proposed method, we successfully identified 298 novel CRC-associated genes. Previous studies have validated the involvement of the majority of these 298 novel genes in CRC-associated biological processes, thus suggesting the efficacy and accuracy of our method.
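As an illustration of how random walk with restart scores genes from a set of seeds, here is a minimal pure-Python sketch of the standard iteration p ← r·p0 + (1 − r)·W·p. The restart probability of 0.8, the tolerance, the dictionary-based graph format, and the function name are assumptions for this sketch and are not taken from the paper.

```python
def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-6, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adj     -- dict node -> dict of neighbor -> edge weight (undirected graph)
    seeds   -- iterable of seed nodes, e.g. validated disease genes
    restart -- probability of jumping back to the seeds at each step (assumed)
    """
    nodes = sorted(adj)
    seeds = set(seeds)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    out_weight = {u: sum(adj[u].values()) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}      # restart term
        for u in nodes:
            if out_weight[u] == 0:                      # isolated node: nothing to spread
                continue
            for v, w in adj[u].items():                 # spread mass along weighted edges
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p
```

Non-seed nodes are then ranked by their probability; nodes close to many seeds receive higher scores and become candidate genes for further screening.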
Affiliation(s)
- Sheng Lu
- Department of General Surgery, Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Institute of Digestive Surgery, Shanghai, 200025, China
| | - Zheng-Gang Zhu
- Department of General Surgery, Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Institute of Digestive Surgery, Shanghai, 200025, China
| | - Wen-Cong Lu
- Department of Chemistry, College of Sciences, Shanghai University, Shanghai, 200444, China.
| |
|
23
|
Wang T, Chen L, Zhao X. Prediction of Drug Combinations with a Network Embedding Method. Comb Chem High Throughput Screen 2019; 21:789-797. [DOI: 10.2174/1386207322666181226170140] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/26/2018] [Revised: 11/02/2018] [Accepted: 11/28/2018] [Indexed: 01/10/2023]
Abstract
Aim and Objective: There are several diseases having a complicated mechanism. For such complicated diseases, a single drug cannot treat them very well because these diseases always involve several targets and single targeted drugs cannot modulate these targets simultaneously. Drug combination is an effective way to treat such diseases. However, determination of effective drug combinations is time- and cost-consuming via traditional methods. It is urgent to build quick and cheap methods in this regard. Designing effective computational methods incorporating advanced computational techniques to predict drug combinations is an alternative and feasible way.
Method: In this study, we proposed a novel network embedding method, which can extract topological features of each drug combination from a drug network that was constructed using chemical-chemical interaction information retrieved from STITCH. These topological features were combined with individual features of drug combination reported in one previous study. Several advanced computational methods were employed to construct an effective prediction model, such as synthetic minority oversampling technique (SMOTE) that was used to tackle imbalanced dataset, minimum redundancy maximum relevance (mRMR) and incremental feature selection (IFS) methods that were adopted to analyze features and extract optimal features for building an optimal support vector machine (SVM) classifier.
Results and Conclusion: The constructed optimal SVM classifier yielded an MCC of 0.806, which is superior to the classifier only using individual features with or without SMOTE. The performance of the classifier can be improved by combining the topological features and essential features of a drug combination.
Affiliation(s)
- Tianyun Wang
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Xian Zhao
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| |
|
24
|
Lu S, Zhao K, Wang X, Liu H, Ainiwaer X, Xu Y, Ye M. Use of Laplacian Heat Diffusion Algorithm to Infer Novel Genes With Functions Related to Uveitis. Front Genet 2018; 9:425. [PMID: 30349554 PMCID: PMC6186792 DOI: 10.3389/fgene.2018.00425] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/03/2018] [Accepted: 09/10/2018] [Indexed: 12/17/2022] Open
Abstract
Uveitis is the inflammation of the uvea and is a serious eye disease that can cause blindness in middle-aged and young people. However, the pathogenesis of this disease has not been fully uncovered, which makes designing effective treatments difficult. Completely identifying the genes related to this disease can help improve and accelerate the comprehension of uveitis. In this study, a new computational method was developed to infer potential related genes based on validated ones. We employed a large protein–protein interaction network reported in STRING, in which the Laplacian heat diffusion algorithm was applied using validated genes as seed nodes. Except for the validated ones, all genes in the network were filtered by three tests, namely, permutation, association, and function tests, which evaluated the genes based on their specialties and associations to uveitis. As a result, 59 inferred genes were obtained, several of which were confirmed to be highly related to uveitis by literature review. In addition, the inferred genes were compared with those reported in a previous study, indicating that our reported genes are necessary supplements.
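The core of a Laplacian heat diffusion step can be sketched as follows: seed nodes start with heat, and heat flows along edges according to dh/dt = −L·h, where L = D − A is the graph Laplacian. This sketch integrates the equation with simple forward Euler steps instead of a matrix exponential, and it assumes the unnormalized Laplacian, a diffusion time of 1.0, and an illustrative function name; none of these choices come from the paper.

```python
def heat_diffusion(adj, seeds, t=1.0, steps=1000):
    """Approximate h(t) = exp(-t L) h(0) on a graph with forward Euler steps.

    adj   -- dict node -> dict of neighbor -> edge weight (symmetric)
    seeds -- nodes that start with one unit of heat each
    """
    nodes = sorted(adj)
    seed_set = set(seeds)
    heat = {n: (1.0 if n in seed_set else 0.0) for n in nodes}
    dt = t / steps   # must be small relative to the largest node degree for stability
    for _ in range(steps):
        # (L h)[u] = sum over neighbors v of w_uv * (h[u] - h[v])
        flow = {u: sum(w * (heat[u] - heat[v]) for v, w in adj[u].items()) for u in nodes}
        heat = {u: heat[u] - dt * flow[u] for u in nodes}
    return heat
```

On a symmetric network the total heat is conserved, so the final values can be read as how much of the seeds' heat each gene has received; non-seed genes with high heat are the candidates passed on to the screening tests.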
Affiliation(s)
- Shiheng Lu
- Department of Ophthalmology, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Pudong, China
| | - Ke Zhao
- Department of Ophthalmology, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Pudong, China
| | - Xuefei Wang
- Department of Ophthalmology, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Pudong, China
| | - Hui Liu
- Department of Ophthalmology, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Pudong, China
| | - Xiamuxiya Ainiwaer
- Department of Ophthalmology, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Pudong, China
| | - Yan Xu
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Min Ye
- Department of Ophthalmology, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Pudong, China
| |
|
25
|
Zhao X, Chen L, Lu J. A similarity-based method for prediction of drug side effects with heterogeneous information. Math Biosci 2018; 306:136-144. [PMID: 30296417 DOI: 10.1016/j.mbs.2018.09.010] [Citation(s) in RCA: 111] [Impact Index Per Article: 18.5] [Received: 06/14/2018] [Revised: 09/22/2018] [Accepted: 09/25/2018] [Indexed: 12/25/2022]
Abstract
Drugs can produce intended therapeutic effects to treat different diseases. However, they may also cause side effects at the same time. For an approved drug, it is best to detect all side effects it can produce. Otherwise, it may bring great risks for pharmaceutical companies as well as harm the human body. It is urgent to design quick and reliable methods to detect the side effects of a given drug. In this study, a binary classification model was proposed to predict drug side effects. Unlike most previous methods, our model treated each pair of drug and side effect as a sample and converted the original problem into a binary classification problem. Based on the similarity idea, each pair was represented by five features, each of which was derived from a type of drug property. The strong machine learning algorithm, random forest, was adopted as the prediction engine. Ten-fold cross-validation on five datasets with different negative samples indicated that the proposed model yielded good performance, with a Matthews correlation coefficient around 0.550 and an AUC around 0.8492. In addition, we analyzed the contribution of each drug property to the construction of the model. The results indicated that drug similarity in fingerprint was most related to the prediction of drug side effects and that all drug properties contributed to some degree.
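For readers unfamiliar with the Matthews correlation coefficient (MCC) reported above, the sketch below computes it from the four confusion-matrix counts of a binary classifier. The function name and the example counts are illustrative only and are not data from the study.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical confusion matrix: 40 TP, 35 TN, 15 FP, 10 FN
example_mcc = matthews_corrcoef(40, 35, 15, 10)
```

MCC ranges from −1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it remains informative when positive and negative samples are imbalanced.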
Affiliation(s)
- Xian Zhao
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China; Shanghai Key Laboratory of PMMP, East China Normal University, Shanghai 200241, People's Republic of China.
| | - Jing Lu
- School of Pharmacy, Key Laboratory of Molecular Pharmacology and Drug Evaluation (Yantai University), Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, Yantai University, Yantai 264005, People's Republic of China
| |
|
26
|
Chen L, Zhang YH, Zhang Z, Huang T, Cai YD. Inferring Novel Tumor Suppressor Genes with a Protein-Protein Interaction Network and Network Diffusion Algorithms. MOLECULAR THERAPY-METHODS & CLINICAL DEVELOPMENT 2018; 10:57-67. [PMID: 30069494 PMCID: PMC6068090 DOI: 10.1016/j.omtm.2018.06.007] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 03/04/2018] [Accepted: 06/19/2018] [Indexed: 02/07/2023]
Abstract
Extensive studies on tumor suppressor genes (TSGs) are helpful to understand the pathogenesis of cancer and design effective treatments. However, identifying TSGs using traditional experiments is quite difficult and time consuming. Developing computational methods to identify possible TSGs is an alternative way. In this study, we proposed two computational methods that integrated two network diffusion algorithms, including Laplacian heat diffusion (LHD) and random walk with restart (RWR), to search possible genes in the whole network. These two computational methods were LHD-based and RWR-based methods. To increase the reliability of the putative genes, three strict screening tests followed to filter genes obtained by these two algorithms. After comparing the putative genes obtained by the two methods, we designated twelve genes (e.g., MAP3K10, RND1, and OTX2) as common genes, 29 genes (e.g., RFC2 and GUCY2F) as genes that were identified only by the LHD-based method, and 128 genes (e.g., SNAI2 and FGF4) as genes that were inferred only by the RWR-based method. Some obtained genes can be confirmed as novel TSGs according to recent publications, suggesting the utility of our two proposed methods. In addition, the reported genes in this study were quite different from those reported in a previous one.
Affiliation(s)
- Lei Chen
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, People’s Republic of China
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People’s Republic of China
| | - Yu-Hang Zhang
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, People’s Republic of China
| | - Zhenghua Zhang
- Department of Clinical Oncology, Jing’an District Centre Hospital of Shanghai (Huashan Hospital Fudan University Jing’An Branch), Shanghai 200040, People’s Republic of China
| | - Tao Huang
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, People’s Republic of China
- Corresponding author: Tao Huang, Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, People’s Republic of China.
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai 200444, People’s Republic of China
- Corresponding author: Yu-Dong Cai, School of Life Sciences, Shanghai University, Shanghai 200444, People’s Republic of China.
| |
|