1
Das P, Mazumder DH. An extensive survey on the use of supervised machine learning techniques in the past two decades for prediction of drug side effects. Artif Intell Rev 2023; 56:1-28. [PMID: 36819660 PMCID: PMC9930028 DOI: 10.1007/s10462-023-10413-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/01/2023] [Indexed: 02/19/2023]
Abstract
Drugs approved for sale must be effective and safe, meaning that a drug's benefits outweigh its known harmful side effects. Side effects (SE) are one of the most common reasons for drug failure and may halt the whole drug discovery pipeline. They vary from minor concerns like a runny nose to potentially life-threatening issues like liver damage, heart attack, and death. Therefore, predicting the side effects of a drug is vital in drug development, discovery, and design. Supervised machine learning-based side effect prediction has recently received much attention because it reduces time, chemical waste, design complexity, risk of failure, and cost. Supervised learning approaches for predicting side effects have emerged as essential computational tools: they provide early information on drug side effects, based on drug properties, that helps in developing an effective drug. Still, several challenges remain in predicting drug side effects. This paper therefore presents a near-exhaustive survey of the supervised machine learning approaches employed in drug side effect prediction over the past two decades. It also summarizes the drug descriptors required for the side effect prediction task, commonly used sources of drug properties, computational models, and their performance. Finally, it discusses the research gaps, open problems, and challenges for further supervised learning-based side effect prediction.
Affiliation(s)
- Pranab Das
- Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103 India
- Dilwar Hussain Mazumder
- Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103 India
2
Villalobos-Alva J, Ochoa-Toledo L, Villalobos-Alva MJ, Aliseda A, Pérez-Escamirosa F, Altamirano-Bustamante NF, Ochoa-Fernández F, Zamora-Solís R, Villalobos-Alva S, Revilla-Monsalve C, Kemper-Valverde N, Altamirano-Bustamante MM. Protein Science Meets Artificial Intelligence: A Systematic Review and a Biochemical Meta-Analysis of an Inter-Field. Front Bioeng Biotechnol 2022; 10:788300. [PMID: 35875501 PMCID: PMC9301016 DOI: 10.3389/fbioe.2022.788300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2021] [Accepted: 05/25/2022] [Indexed: 11/23/2022]
Abstract
Proteins are some of the most fascinating and challenging molecules in the universe, and they pose a big challenge for artificial intelligence. The implementation of machine learning/AI in protein science gives rise to a world of knowledge adventures in the workhorse of the cell and proteome homeostasis, which are essential for making life possible. This opens up epistemic horizons thanks to a coupling of human tacit–explicit knowledge with machine learning power, the benefits of which are already tangible, such as important advances in protein structure prediction. Moreover, the driving force behind the protein processes of self-organization, adjustment, and fitness requires a space corresponding to gigabytes of life data in its order of magnitude. There are many tasks such as novel protein design, protein folding pathways, and synthetic metabolic routes, as well as protein-aggregation mechanisms, pathogenesis of protein misfolding and disease, and proteostasis networks that are currently unexplored or unrevealed. In this systematic review and biochemical meta-analysis, we aim to contribute to bridging the gap between what we call binomial artificial intelligence (AI) and protein science (PS), a growing research enterprise with exciting and promising biotechnological and biomedical applications. We undertake our task by exploring “the state of the art” in AI and machine learning (ML) applications to protein science in the scientific literature to address some critical research questions in this domain, including What kind of tasks are already explored by ML approaches to protein sciences? What are the most common ML algorithms and databases used? What is the situational diagnostic of the AI–PS inter-field? What do ML processing steps have in common? We also formulate novel questions such as Is it possible to discover what the rules of protein evolution are with the binomial AI–PS? How do protein folding pathways evolve? What are the rules that dictate the folds? 
What are the minimal nuclear protein structures? How do protein aggregates form and why do they exhibit different toxicities? What are the structural properties of amyloid proteins? How can we design an effective proteostasis network to deal with misfolded proteins? We are a cross-functional group of scientists from several academic disciplines, and we have conducted the systematic review using a variant of the PICO and PRISMA approaches. The search was carried out in four databases (PubMed, Bireme, OVID, and EBSCO Web of Science), resulting in 144 research articles. After three rounds of quality screening, 93 articles were finally selected for further analysis. A summary of our findings is as follows: regarding AI applications, there are mainly four types: 1) genomics, 2) protein structure and function, 3) protein design and evolution, and 4) drug design. In terms of the ML algorithms and databases used, supervised learning was the most common approach (85%). As for the databases used for the ML models, PDB and UniprotKB/Swissprot were the most common ones (21 and 8%, respectively). Moreover, we identified that approximately 63% of the articles organized their results into three steps, which we labeled pre-process, process, and post-process. A few studies combined data from several databases or created their own databases after the pre-process. Our main finding is that, as of today, there are no research road maps serving as guides to address gaps in our knowledge of the AI–PS binomial. All research efforts to collect, integrate multidimensional data features, and then analyze and validate them are, so far, uncoordinated and scattered throughout the scientific literature without a clear epistemic goal or connection between the studies. 
Therefore, our main contribution to the scientific literature is to offer a road map to help solve problems in drug design, protein structures, design, and function prediction while also presenting the “state of the art” on research in the AI–PS binomial until February 2021. Thus, we pave the way toward future advances in the synthetic redesign of novel proteins and protein networks and artificial metabolic pathways, learning lessons from nature for the welfare of humankind. Many of the novel proteins and metabolic pathways are currently non-existent in nature, nor are they used in the chemical industry or biomedical field.
Affiliation(s)
- Jalil Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Luis Ochoa-Toledo
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Mario Javier Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Atocha Aliseda
- Instituto de Investigaciones Filosóficas, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Fernando Pérez-Escamirosa
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Francine Ochoa-Fernández
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Ricardo Zamora-Solís
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Sebastián Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Cristina Revilla-Monsalve
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Nicolás Kemper-Valverde
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Myriam M. Altamirano-Bustamante
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- *Correspondence: Myriam M. Altamirano-Bustamante,
3
Qiu Y, Zhang Y, Deng Y, Liu S, Zhang W. A Comprehensive Review of Computational Methods For Drug-Drug Interaction Detection. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1968-1985. [PMID: 34003753 DOI: 10.1109/tcbb.2021.3081268] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 06/12/2023]
Abstract
The detection of drug-drug interactions (DDIs) is a crucial task for drug safety surveillance, which supports effective and safe co-prescription of multiple drugs. Since laboratory research is often complicated, costly and time-consuming, there is an urgent need for computational approaches to detect drug-drug interactions. In this paper, we conduct a comprehensive review of state-of-the-art computational methods falling into three categories: literature-based extraction methods, machine learning-based prediction methods and pharmacovigilance-based data mining methods. Literature-based extraction methods detect DDIs from published literature using natural language processing techniques; machine learning-based prediction methods build prediction models based on the known DDIs in databases and predict novel ones; pharmacovigilance-based data mining methods usually apply statistical techniques to various electronic data to detect drug-drug interaction signals. We first present the taxonomy of drug-drug interaction detection methods and outline the three categories of methods. Afterwards, we introduce the research background and data sources of each category, and illustrate their representative approaches as well as evaluation metrics. Finally, we discuss the current challenges of existing methods and highlight potential opportunities for future directions.
4
Joshi P, Masilamani V, Mukherjee A. A Knowledge Graph Embedding Based Approach to Predict the Adverse Drug Reactions Using a Deep Neural Network. J Biomed Inform 2022; 132:104122. [PMID: 35753606 DOI: 10.1016/j.jbi.2022.104122] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/04/2021] [Revised: 06/14/2022] [Accepted: 06/20/2022] [Indexed: 12/27/2022]
Abstract
Recently, Artificial Intelligence (AI) has been used not only to diagnose disease but also to cure it, and researchers have started using AI for drug discovery. Predicting the Adverse Drug Reactions (ADRs) caused by a drug in the manufacturing stage or in the clinical trial stage is a very important problem in drug discovery. ADRs have become a major concern because they result in injuries and are sometimes fatal. Drug safety has gained much importance over the years, propelling ADR prediction to the forefront of investigation. Although prior studies have explored diverse approaches to predict ADRs, very few were found to be effective. The scarcity of reports also makes the prediction of ADRs more difficult. To tackle this problem effectively, a novel method is proposed in this paper. The proposed method is based on Knowledge Graph (KG) embedding. Using the KG embedding, we designed and trained a custom-made Deep Neural Network (DNN) called KGDNN (Knowledge Graph DNN) for predicting ADRs. A KG has been constructed with 6 types of entities: drugs, ADRs, target proteins, indications, pathways, and genes. Using the Node2Vec algorithm, each node has been embedded into a feature space. Using those embeddings, the ADRs are classified by the KGDNN model. The proposed method obtained an AUROC score of 0.917 and significantly outperformed the existing methods. Two case studies, on drugs causing liver injury and on COVID-19 recommended drugs, have been performed to illustrate the model's efficacy.
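The pipeline in this abstract (build a knowledge graph, embed its nodes with Node2Vec, classify with a DNN) begins with random walks over the graph. The following is a minimal sketch of that first step only. It assumes a toy graph with hypothetical node names and uses uniform walks, which correspond to the special case of Node2Vec where the return and in-out parameters both equal 1; it is not the authors' implementation.

```python
import random

# Toy knowledge graph as an undirected edge list.
# All node names are hypothetical placeholders, not data from the paper.
EDGES = [
    ("drug:A", "target:P1"),
    ("drug:A", "adr:nausea"),
    ("drug:B", "target:P1"),
    ("drug:B", "pathway:W1"),
    ("target:P1", "gene:G1"),
    ("drug:A", "indication:I1"),
]


def build_graph(edges):
    """Return a dict mapping each node to a sorted list of its neighbours."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return {node: sorted(nbrs) for node, nbrs in graph.items()}


def random_walk(graph, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph[walk[-1]]
        if not neighbours:
            break
        walk.append(rng.choice(neighbours))
    return walk


def generate_walks(graph, walks_per_node, length, seed=0):
    """Generate `walks_per_node` walks from every node in the graph."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for node in sorted(graph):
            walks.append(random_walk(graph, node, length, rng))
    return walks


graph = build_graph(EDGES)
walks = generate_walks(graph, walks_per_node=2, length=5)
```

In a full pipeline, walks like these would be treated as sentences for a skip-gram model, and the resulting node vectors would become the input features of the classifier.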
Affiliation(s)
- Pratik Joshi
- Department of Computer Science and Engineering, Indian Institute of Information Technology Design & Manufacturing, Kancheepuram, Chennai - 600127, India
- V Masilamani
- Department of Computer Science and Engineering, Indian Institute of Information Technology Design & Manufacturing, Kancheepuram, Chennai - 600127, India
5
Chen M, Jiang W, Pan Y, Dai J, Lei Y, Ji C. SGFNNs: Signed Graph Filtering-based Neural Networks for Predicting Drug-Drug Interactions. J Comput Biol 2022; 29:1104-1116. [PMID: 35723646 DOI: 10.1089/cmb.2022.0113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Capturing comprehensive information about drug-drug interactions (DDIs) is one of the key tasks in public health and drug development. Recently, graph neural networks (GNNs) have received increasing attention in the drug discovery domain due to their capability of integrating drug profiles and the network structure into a low-dimensional feature space for link prediction and classification. Most GNN models for DDI prediction are built on an unsigned graph, which tends to give associated nodes similar embeddings. However, semantic correlations between drugs, such as degressive effects or even adverse side reactions, should be disassortative. In this study, we put forward signed GNNs to model assortative and disassortative relationships within drug pairs. Since negative links preclude a direct generalization of spectral filters from unsigned graphs, we divide the signed graph into two unsigned subgraphs, each with a dedicated spectral filter, which together capture both the commonality and the difference of drug pairs. For drug representations, we derive two signed graph filtering-based neural networks (SGFNNs), which integrate signed graph structures and drug node attributes. Moreover, we use an end-to-end framework for learning DDIs, where an SGFNN and a discriminator are jointly trained under a problem-specific loss function. The experimental results on two prediction problems show that our framework can obtain significant improvements compared with baselines. The case study further verifies the validity of our method.
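The idea of dividing a signed graph into two unsigned subgraphs can be sketched as follows. This is an illustration on a hypothetical four-drug signed adjacency matrix. The helper names are invented for the example, and plain neighbour averaging stands in for the spectral filters of the paper, which the abstract does not specify.

```python
# Signed adjacency matrix for four hypothetical drugs:
# +1 marks an assortative (similar) pair, -1 a disassortative pair, 0 no link.
SIGNED = [
    [0, 1, -1, 0],
    [1, 0, 0, -1],
    [-1, 0, 0, 1],
    [0, -1, 1, 0],
]


def split_signed(adj):
    """Split a signed adjacency matrix into two unsigned 0/1 matrices."""
    pos = [[1 if w > 0 else 0 for w in row] for row in adj]
    neg = [[1 if w < 0 else 0 for w in row] for row in adj]
    return pos, neg


def mean_aggregate(adj, features):
    """Average each node's neighbour features over one unsigned subgraph.

    Assumption: simple averaging replaces the paper's spectral filter.
    """
    out = []
    for row in adj:
        nbrs = [features[j] for j, w in enumerate(row) if w]
        if nbrs:
            out.append([sum(col) / len(nbrs) for col in zip(*nbrs)])
        else:
            out.append([0.0] * len(features[0]))
    return out


# One hypothetical 2-dimensional attribute vector per drug.
FEATURES = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

pos_adj, neg_adj = split_signed(SIGNED)
pos_repr = mean_aggregate(pos_adj, FEATURES)  # signal from positively linked drugs
neg_repr = mean_aggregate(neg_adj, FEATURES)  # signal from negatively linked drugs
```

Keeping the two aggregated signals separate lets a downstream model treat "similar to" and "opposed to" neighbours differently instead of blending them into one embedding.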
Affiliation(s)
- Ming Chen
- Department of Artificial Intelligence, College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Wei Jiang
- Department of Artificial Intelligence, College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Yi Pan
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Jianhua Dai
- Department of Artificial Intelligence, College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Yunwen Lei
- School of Computer Science, University of Birmingham, Birmingham, United Kingdom
- Chunyan Ji
- Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
6
Zhao L, Zhu Y, Wang J, Wen N, Wang C, Cheng L. A brief review of protein-ligand interaction prediction. Comput Struct Biotechnol J 2022; 20:2831-2838. [PMID: 35765652 PMCID: PMC9189993 DOI: 10.1016/j.csbj.2022.06.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/14/2022] [Revised: 05/30/2022] [Accepted: 06/01/2022] [Indexed: 01/21/2023]
Abstract
The task of identifying protein–ligand interactions (PLIs) plays a prominent role in the field of drug discovery. However, it is infeasible to identify potential PLIs via costly and laborious in vitro experiments. There is a need to develop computational PLI prediction approaches to speed up the drug discovery process. In this review, we give a brief introduction to various computational approaches to PLI prediction. We discuss these approaches, in particular machine learning-based methods, and illustrate their different emphases based on mainstream trends. Moreover, we analyze three research directions that can be further explored in future studies.
Affiliation(s)
- Lingling Zhao
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Yan Zhu
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Junjie Wang
- Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Naifeng Wen
- School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian, China
- Chunyu Wang
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Corresponding authors.
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, China
- Corresponding authors.
7
Mobility in Unsupervised Word Embeddings for Knowledge Extraction—The Scholars’ Trajectories across Research Topics. Future Internet 2022. [DOI: 10.3390/fi14010025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
In the knowledge discovery field of the Big Data domain, the analysis of geographic positioning and mobility information plays a key role. At the same time, in the Natural Language Processing (NLP) domain, pre-trained models such as BERT and word embedding algorithms such as Word2Vec have enabled a rich encoding of words that allows mapping textual data into points of an arbitrary multi-dimensional space, in which the notion of proximity reflects an association among terms or topics. The main contribution of this paper is to show how analytical tools, traditionally adopted to deal with geographic data to measure the mobility of an agent in a time interval, can also be effectively applied to extract knowledge in a semantic realm, such as a semantic space of words and topics, looking for latent trajectories that can benefit the properties of neural network latent representations. As a case study, the Scopus database was queried about works of highly cited researchers in recent years. On this basis, we performed a dynamic analysis, measuring the Radius of Gyration as an index of the mobility of researchers across scientific topics. The semantic space is built from the automatic analysis of the paper abstracts of each author. In particular, we evaluated two different methodologies to build the semantic space and found that Word2Vec embeddings perform better than the BERT ones for this task. Finally, the scholars’ trajectories show some latent properties of this model, which also represent new scientific contributions of this work. These properties include (i) the correlation between scientific mobility and the achievement of scientific results, measured through the H-index; (ii) differences in the behavior of researchers working in different countries and subjects; and (iii) some interesting similarities between mobility patterns in this semantic realm and those typically observed in the case of human mobility.
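The Radius of Gyration used here as a mobility index is, in its standard form, the root-mean-square distance of a set of visited points from their centroid. The sketch below applies that standard definition to hypothetical 2-D embedding points; the abstract does not state whether the paper weights points (for example, by time), so the unweighted form is an assumption.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of the points from their centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    mean_sq = sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points
    ) / n
    return math.sqrt(mean_sq)


# Hypothetical 2-D embeddings of two authors' paper abstracts.
focused = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
mobile = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]

rg_focused = radius_of_gyration(focused)
rg_mobile = radius_of_gyration(mobile)
```

An author whose abstracts cluster tightly in the semantic space gets a small radius, while one who moves across distant topics gets a large one.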
8
Carracedo-Reboredo P, Liñares-Blanco J, Rodríguez-Fernández N, Cedrón F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C. A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 2021; 19:4538-4558. [PMID: 34471498 PMCID: PMC8387781 DOI: 10.1016/j.csbj.2021.08.011] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Received: 03/01/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 12/30/2022]
Abstract
Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In recent years, this search has come to rely heavily on computer science, with the use of machine learning techniques skyrocketing as they became widely accessible. Given the objectives set by the Precision Medicine initiative and the new challenges they generate, it is necessary to establish robust, standard and reproducible computational methodologies to achieve them. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage manages to drastically reduce costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies have been used in recent years of research. Analyzing the state of the art in this field gives an idea of where cheminformatics will develop in the short term, the limitations it presents and the positive results it has achieved. The review focuses mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.
Key Words
- ADMET, Absorption, distribution, metabolism, elimination and toxicity
- ADR, Adverse Drug Reaction
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APFP, Atom Pairs 2d FingerPrint
- AUC, Area under the Curve
- BBB, Blood–Brain barrier
- CDK, Chemistry Development Kit
- CNN, Convolutional Neural Networks
- CNS, Central Nervous System
- CPI, Compound-protein interaction
- CV, Cross Validation
- Cheminformatics
- DL, Deep Learning
- DNA, Deoxyribonucleic acid
- Deep Learning
- Drug Discovery
- ECFP, Extended Connectivity Fingerprints
- FDA, Food and Drug Administration
- FNN, Fully Connected Neural Networks
- FP, Fingerprints
- FS, Feature Selection
- GCN, Graph Convolutional Networks
- GEO, Gene Expression Omnibus
- GNN, Graph Neural Networks
- GO, Gene Ontology
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- MACCS, Molecular ACCess System
- MCC, Matthews correlation coefficient
- MD, Molecular Descriptors
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- Machine Learning
- Molecular Descriptors
- NB, Naive Bayes
- OOB, Out of Bag
- PCA, Principal Component Analysis
- QSAR
- QSAR, Quantitative structure–activity relationship
- RF, Random Forest
- RNA, Ribonucleic Acid
- SMILES, simplified molecular-input line-entry system
- SVM, Support Vector Machines
- TCGA, The Cancer Genome Atlas
- WHO, World Health Organization
- t-SNE, t-Distributed Stochastic Neighbor Embedding
Affiliation(s)
- Paula Carracedo-Reboredo
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Jose Liñares-Blanco
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Nereida Rodríguez-Fernández
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco Cedrón
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco J. Novoa
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Adrian Carballal
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Victor Maojo
- Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, Madrid 28660, Spain
- Alejandro Pazos
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
- Carlos Fernandez-Lozano
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
9
Patrick MT, Bardhi R, Raja K, He K, Tsoi LC. Advancement in predicting interactions between drugs used to treat psoriasis and its comorbidities by integrating molecular and clinical resources. J Am Med Inform Assoc 2021; 28:1159-1167. [PMID: 33544847 DOI: 10.1093/jamia/ocaa335] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/04/2020] [Accepted: 12/14/2020] [Indexed: 12/18/2022]
Abstract
OBJECTIVE Drug-drug interactions (DDIs) can result in adverse and potentially life-threatening health consequences; however, it is challenging to predict potential DDIs in advance. We introduce a new computational approach to comprehensively assess the drug pairs which may be involved in specific DDI types by combining information from large-scale gene expression (984 transcriptomic datasets), molecular structure (2159 drugs), and medical claims (150 million patients). MATERIALS AND METHODS Features were integrated using ensemble machine learning techniques, and we evaluated the DDIs predicted with a large hospital-based medical records dataset. Our pipeline integrates information from >30 different resources, including >10 000 drugs and >1.7 million drug-gene pairs. We applied our technique to predict interactions between 37 611 drug pairs used to treat psoriasis and its comorbidities. RESULTS Our approach achieves >0.9 area under the receiver operating characteristic curve (AUROC) for differentiating 11 861 known DDIs from 25 750 non-DDI drug pairs. Significantly, we demonstrate that the novel DDIs we predict can be confirmed through independent data sources and supported using clinical medical records. CONCLUSIONS By applying machine learning and taking advantage of molecular, genomic, and health record data, we are able to accurately predict potential new DDIs that can have an impact on public health.
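The AUROC reported above measures how well scores separate known DDIs from non-DDI pairs (the 11 861 and 25 750 pairs together make up the 37 611 pairs evaluated). As a minimal sketch, AUROC can be computed as the fraction of positive-negative pairs in which the positive receives the higher score, with ties counted as one half. The scores and labels below are hypothetical, not data from the study.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half (Mann-Whitney U statistic)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores for six drug pairs (1 = known DDI, 0 = non-DDI).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
value = auroc(scores, labels)
```

The pairwise loop is quadratic in the number of examples, so for a dataset of the size described in the abstract a rank-based or library implementation would be the practical choice.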
Affiliation(s)
- Matthew T Patrick
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Redina Bardhi
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, Michigan, USA
- School of Medicine, Wayne State University, Detroit, Michigan, USA
- Kalpana Raja
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Morgridge Institute for Research, Madison, Wisconsin, USA
- Kevin He
- Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA
- Lam C Tsoi
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA
- Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
10
Liu T, Cui J, Zhuang H, Wang H. Modeling polypharmacy effects with heterogeneous signed graph convolutional networks. Appl Intell 2021. [DOI: 10.1007/s10489-021-02296-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]
11
Zhang F, Sun B, Diao X, Zhao W, Shu T. Prediction of adverse drug reactions based on knowledge graph embedding. BMC Med Inform Decis Mak 2021; 21:38. [PMID: 33541342 PMCID: PMC7863488 DOI: 10.1186/s12911-021-01402-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/16/2020] [Accepted: 01/19/2021] [Indexed: 11/12/2022]
Abstract
BACKGROUND Adverse drug reactions (ADRs) are an important concern in the medication process and can pose a substantial economic burden for patients and hospitals. Because of the limitations of clinical trials, it is difficult to identify all possible ADRs of a drug before it is marketed. We developed a new model based on data mining technology to predict potential ADRs based on available drug data. METHOD Based on the Word2Vec model in Natural Language Processing, we propose a new knowledge graph embedding method that embeds drugs and ADRs into their respective vectors and builds a logistic regression classification model to predict whether a given drug will have ADRs. RESULT First, a new knowledge graph embedding method was proposed, and comparison with similar studies showed that our model not only had high prediction accuracy but also had a simpler model structure. In our experiments, the AUC of the classification model reached a maximum of 0.87, and the mean AUC was 0.863. CONCLUSION In this paper, we introduce a new method that embeds a knowledge graph to vectorize drugs and ADRs, then uses a logistic regression classification model to predict whether there is a causal relationship between them. The experiment showed that knowledge graph embedding can effectively encode drugs and ADRs, and that the proposed ADR prediction system is also very effective.
Affiliation(s)
- Fei Zhang
- Department of Information Center, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 167 North Lishi Road, Xicheng District, Beijing, 100037 China
- Bo Sun
- Department of Information Center, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 167 North Lishi Road, Xicheng District, Beijing, 100037 China
- Xiaolin Diao
- Department of Information Center, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 167 North Lishi Road, Xicheng District, Beijing, 100037 China
- Wei Zhao
- Department of Information Center, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 167 North Lishi Road, Xicheng District, Beijing, 100037 China
- Ting Shu
- National Institute of Hospital Administration, National Health Commission, Building 3, Yard 6, Shouti South Road, Haidian, Beijing, 100044 China
|
12
|
Mantripragada AS, Teja SP, Katasani RR, Joshi P, V M, Ramesh R. Prediction of adverse drug reactions using drug convolutional neural networks. J Bioinform Comput Biol 2021; 19:2050046. [PMID: 33472571 DOI: 10.1142/s0219720020500468] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/06/2023]
Abstract
Prediction of Adverse Drug Reactions (ADRs) has been an important aspect of pharmacovigilance because of its impact on the pharma industry. The standard process of introducing a new drug into a market involves many clinical trials and tests. This process is tedious, time consuming, and expensive. Faster approval of a drug helps the patients who are in need of it, and in silico prediction of ADRs can help speed up this process. The challenges involved are the lack of negative data and the need to predict ADRs from the chemical structure alone. Although many models are already available to predict ADRs, most of them use biological activity identifiers and chemical and physical properties in addition to the chemical structures of the drugs. For most new drugs to be tested, however, only chemical structures will be available, and existing models that predict ADRs using only chemical structures do not perform well. Therefore, an efficient prediction of ADRs from the chemical structure alone is proposed in this paper. The proposed method uses a separate model for each ADR, making it a binary classification problem. This paper presents a novel CNN model called Drug Convolutional Neural Network (DCNN) to predict ADRs using the chemical structures of the drugs. The performance is measured using metrics such as Accuracy, Recall, Precision, Specificity, F1 score, AUROC and MCC. The results obtained by the proposed DCNN model outperform the competing models on the SIDER4.1 database in terms of all the metrics. A case study has been performed on COVID-19 recommended drugs, where the proposed model predicted ADRs that are well aligned with the observations made by medical professionals using conventional methods.
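Because each ADR is treated as its own binary classification problem, the threshold-based metrics named in this abstract can all be derived from one confusion matrix per ADR. The sketch below shows the standard definitions. The labels and predictions are invented toy values, not results from the paper, and AUROC is left out because it needs ranked scores rather than hard 0/1 predictions.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Compute the threshold-based metrics for a single ADR classifier."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Toy example: 1 = drug has the ADR, 0 = drug does not have it.
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 0]
Y_PRED = [1, 1, 0, 1, 0, 0, 0, 0]
METRICS = binary_metrics(Y_TRUE, Y_PRED)
```

On this toy input the counts are 2 true positives, 1 false positive, 4 true negatives and 1 false negative, so accuracy is 0.75 and specificity is 0.8. MCC is useful alongside accuracy when positive and negative examples are heavily imbalanced, which is a plausible situation for rare ADRs.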
Affiliation(s)
- Sai Phani Teja
- Department of Computer Science and Engineering, IIITDM Kancheepuram, Chennai 600127, India
- Rohith Reddy Katasani
- Department of Computer Science and Engineering, IIITDM Kancheepuram, Chennai 600127, India
- Pratik Joshi
- Department of Computer Science and Engineering, IIITDM Kancheepuram, Chennai 600127, India
- Masilamani V
- Department of Computer Science and Engineering, IIITDM Kancheepuram, Chennai 600127, India
|
13
|
Liu Z, Chen Q, Lan W, Liang J, Chen YPP, Chen B. A Survey of Network Embedding for Drug Analysis and Prediction. Curr Protein Pept Sci 2020; 22:CPPS-EPUB-107859. [PMID: 32614745 DOI: 10.2174/1389203721666200702145701] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/01/2020] [Revised: 04/05/2020] [Accepted: 05/21/2020] [Indexed: 11/22/2022]
Abstract
Traditional network-based computational methods have shown good results in drug analysis and prediction. However, these methods are time consuming and lack universality, and it is difficult for them to exploit the auxiliary information of nodes and edges. Network embedding provides a promising way of alleviating these problems by transforming a network into a low-dimensional space while preserving network structure and auxiliary information, which facilitates the application of machine learning algorithms in subsequent processing. Network embedding has been introduced into drug analysis and prediction in the last few years and has shown superior performance over traditional methods. However, there is no systematic review of this topic. This article offers a comprehensive survey of the primary network embedding methods and their applications in drug analysis and prediction. The network embedding technologies applied in homogeneous and heterogeneous networks are investigated and compared, including matrix decomposition, random walk, and deep learning. In particular, graph neural network (GNN) methods in deep learning are highlighted. Further, the applications of network embedding in drug similarity estimation, drug-target interaction prediction, adverse drug reaction prediction, and protein function and therapeutic peptide prediction are discussed. Several potential future research directions are also discussed.
Affiliation(s)
- Zhixian Liu
- School of Medical, Guangxi University, Nanning, China
- Qingfeng Chen
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Wei Lan
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Jiahai Liang
- School of Electronics and Information Engineering, Beibu Gulf University, Qinzhou, China
- Yi-Ping Phoebe Chen
- Department of Computer Science and Information Technology, La Trobe University, Melbourne, Australia
- Baoshan Chen
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangxi University, Nanning, China
|
14
|
Drug Side-Effect Prediction Via Random Walk on the Signed Heterogeneous Drug Network. Molecules (Basel, Switzerland) 2019; 24:molecules24203668. [PMID: 31614686 PMCID: PMC6832386 DOI: 10.3390/molecules24203668] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/13/2019] [Revised: 10/08/2019] [Accepted: 10/10/2019] [Indexed: 11/30/2022]
Abstract
Drug side-effects have become a major public health concern as they are the underlying cause of over a million serious injuries and deaths each year. Therefore, it is of critical importance to detect side-effects as early as possible. Existing computational methods mainly utilize the drug chemical profile and the drug biological profile to predict the side-effects of a drug. Within the drug biological profile, they focus only on drug–target interactions and neglect the modes of action of drugs on target proteins. In this paper, we develop a new method for predicting potential side-effects of drugs based on more comprehensive drug information in which the modes of action of drugs on target proteins are integrated. Drug information of multiple types is modeled as a signed heterogeneous information network. We propose a signed heterogeneous information network embedding framework for learning drug embeddings and predicting side-effects of drugs. We use two biased random walk procedures to obtain drug sequences and train a Skip-gram model to learn drug embeddings. We experimentally demonstrate the performance of the proposed method by comparison with state-of-the-art methods. Furthermore, the results of a case study support our hypothesis that modes of action of drugs on target proteins are meaningful in side-effect prediction.
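The pipeline sketched in this abstract (biased random walks that yield drug sequences, followed by Skip-gram training) can be illustrated up to the point where training pairs are produced. Everything concrete below is an assumption for illustration: the toy network, the node names, the single `positive_bias` parameter, and the rule of keeping only drug nodes. The abstract does not specify how the paper's two walk procedures are biased, so this sketch shows only one plausible bias, favouring positively signed edges.

```python
import random

# Hypothetical signed heterogeneous network: node -> [(neighbor, edge sign)].
# Signs stand in for modes of action (e.g. +1 activation, -1 inhibition).
GRAPH = {
    "drug_1": [("protein_x", +1), ("protein_y", -1), ("effect_p", +1)],
    "drug_2": [("protein_x", +1), ("effect_p", +1)],
    "protein_x": [("drug_1", +1), ("drug_2", +1)],
    "protein_y": [("drug_1", -1)],
    "effect_p": [("drug_1", +1), ("drug_2", +1)],
}


def biased_walk(graph, start, length, positive_bias, rng):
    """Random walk in which positive edges are `positive_bias` times as likely."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        weights = [positive_bias if sign > 0 else 1.0 for _, sign in neighbors]
        next_node, _ = rng.choices(neighbors, weights=weights, k=1)[0]
        walk.append(next_node)
    return walk


def drug_sequence(walk):
    """Keep only the drug nodes visited by a walk (assumed naming scheme)."""
    return [node for node in walk if node.startswith("drug_")]


def skipgram_pairs(sequence, window):
    """Emit (center, context) pairs, the training examples for Skip-gram."""
    pairs = []
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


rng = random.Random(0)  # fixed seed so the walks are reproducible
WALKS = [
    biased_walk(GRAPH, drug, length=9, positive_bias=2.0, rng=rng)
    for drug in ("drug_1", "drug_2")
    for _ in range(3)
]
TRAINING_PAIRS = [
    pair for walk in WALKS for pair in skipgram_pairs(drug_sequence(walk), window=1)
]
```

The resulting `TRAINING_PAIRS` would then be fed to a Skip-gram model to learn one embedding per drug. Drugs that are reached through the same proteins and side-effects tend to co-occur in sequences and therefore end up with similar embeddings.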
|
15
|
Liang X, Zhu W, Lv Z, Zou Q. Molecular Computing and Bioinformatics. Molecules 2019; 24:E2358. [PMID: 31247973 PMCID: PMC6651761 DOI: 10.3390/molecules24132358] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/24/2019] [Accepted: 06/25/2019] [Indexed: 02/06/2023]
Abstract
Molecular computing and bioinformatics are two important interdisciplinary sciences that study molecules and computers. Molecular computing is a branch of computing that uses DNA, biochemistry, and molecular biology hardware instead of traditional silicon-based computer technologies. Research and development in this area concerns the theory, experiments, and applications of molecular computing. The core advantage of molecular computing is its potential to pack vastly more circuitry onto a microchip than silicon will ever be capable of, and to do it cheaply. Molecules are only a few nanometers in size, making it possible to manufacture chips that contain billions, even trillions, of switches and components. To develop molecular computers, computer scientists must draw on expertise in subjects not usually associated with their field, including organic chemistry, molecular biology, bioengineering, and smart materials. Bioinformatics works in the opposite direction: bioinformatics researchers develop novel algorithms or software tools for computing or predicting molecular structure or function. Molecular computing and bioinformatics study the same object and are closely related, but work in different directions.
Affiliation(s)
- Xin Liang
- School of Mathematics and Statistics, Hainan Normal University, Haikou 570100, China
- Wen Zhu
- School of Mathematics and Statistics, Hainan Normal University, Haikou 570100, China
- Zhibin Lv
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
|