1
Singh DP, Kaushik B. A systematic literature review for the prediction of anticancer drug response using various machine-learning and deep-learning techniques. Chem Biol Drug Des 2023; 101:175-194. [PMID: 36303299 DOI: 10.1111/cbdd.14164] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/02/2022] [Revised: 10/13/2022] [Accepted: 10/24/2022] [Indexed: 12/24/2022]
Abstract
Computational methods have gained prominence in healthcare research. The accessibility of healthcare data has strongly motivated academics and researchers to develop implementations that help in the prognosis of cancer drug response. Among various computational methods, machine-learning (ML) and deep-learning (DL) methods provide the most consistent and effective approaches to handle the serious consequences of the deadly disease and of the drugs administered to patients. Hence, this systematic literature review examines studies that have investigated drug discovery and prognosis of anticancer drug response using ML and DL algorithms. For this purpose, PRISMA guidelines were followed to choose research papers from Google Scholar, PubMed, and ScienceDirect. A total of 105 papers that align with the context of this review were chosen. Further, the review presents the accuracy of the existing ML and DL methods in the prediction of anticancer drug response. The review finds that, despite the availability of various studies, there are certain challenges associated with each method. Future researchers can therefore consider these limitations and challenges to develop an improved anticancer drug response prediction method, which would be greatly beneficial to medical professionals in administering non-invasive treatment to patients.
Affiliation(s)
- Davinder Paul Singh
- School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir, India
- Baijnath Kaushik
- School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir, India
2
Singh N, Bhatnagar S. Machine Learning for Prediction of Drug Targets in Microbe Associated Cardiovascular Diseases by Incorporating Host-pathogen Interaction Network Parameters. Mol Inform 2021; 41:e2100115. [PMID: 34676983 DOI: 10.1002/minf.202100115] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/17/2021] [Accepted: 10/01/2021] [Indexed: 12/20/2022]
Abstract
Host-pathogen interactions play a crucial role in invasion, infection, and induction of immune response in humans. In this work, four machine learning algorithms, namely Logistic Regression, K-Nearest Neighbor, Support Vector Machine, and Random Forest, were implemented for the classification of drug targets. The algorithms were trained using 3400 host and 3800 pathogen drug and non-drug target proteins as learning instances. For each protein, 68 pathogen and 73 host features were computed that included sequence, structure, biological and host-pathogen network centrality characteristics. The Random Forest classifier model achieved the best accuracy after 10-fold cross-validation: 99% accuracy with a ROC-AUC score of 0.99±0.01 for both pathogen and host training sets. The Eigenvector Centrality of host-pathogen interactions and host-host interactions was the top feature in performing classification of pathogen and host targets, respectively. Other features important for classification were the presence of catalytic and binding sites, low instability/aliphatic index, and cellular location. The Random Forest classifier was then used for prediction of drug targets involved in Microbe Associated Cardiovascular Diseases. In total, 331 host and 743 pathogen proteins were predicted as drug targets by the Random Forest model and can be validated experimentally for therapeutic intervention in Microbe Associated Cardiovascular Diseases.
Affiliation(s)
- Nirupma Singh
- Department of Biotechnology, Netaji Subhas Institute of Technology, Dwarka, New Delhi, 110078, India
- Sonika Bhatnagar
- Department of Biotechnology, Netaji Subhas Institute of Technology, Dwarka, New Delhi, 110078, India; Computational and Structural Biology Laboratory, Department of Biological Sciences and Engineering, Netaji Subhas University of Technology, Dwarka, New Delhi, 110078, India
3
Madaj R, Geoffrey B, Sanker A, Valluri PP. Target2DeNovoDrug: a novel programmatic tool for in silico-deep learning based de novo drug design for any target of interest. J Biomol Struct Dyn 2021; 40:7511-7516. [PMID: 33703998 DOI: 10.1080/07391102.2021.1898474] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/30/2022]
Abstract
The ongoing data-science and Artificial Intelligence (AI) revolution offers researchers a fresh set of tools to approach structure-based drug design problems in the computer-aided drug design space. A novel programmatic tool that incorporates in silico and deep learning based approaches for de novo drug design for any target of interest has been reported. Once the user specifies the target of interest in the form of a representative amino acid sequence or corresponding nucleotide sequence, the programmatic workflow of the tool generates compounds from the PubChem ligand library and novel SMILES of compounds that are not present in any ligand library but are likely to be active against the target. Following this, the tool performs computationally efficient in silico modeling of the target and the newly generated compounds and stores the results of the protein-ligand interaction in the working folder of the user. Further, for the protein-ligand complex associated with the best protein-ligand interaction, the tool performs an automated Molecular Dynamics (MD) protocol and generates plots such as RMSD (Root Mean Square Deviation), which reveal the stability of the complex. A demonstrated use of the tool has been shown with the target signatures of Tumor Necrosis Factor-Alpha, an important therapeutic target in the case of anti-inflammatory treatment. The future scope of the tool involves running the tool on a High-Performance Cluster for all known target signatures to generate data that will be useful to drive AI and Big data driven drug discovery. The code is hosted, maintained, and supported at the GitHub repository https://github.com/bengeof/Target2DeNovoDrug. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Rafal Madaj
- Centre of Molecular and Macromolecular Studies, Polish Academy of Sciences, Poland
- Akhil Sanker
- Department of Computer Science, SRM University, Chennai, India
- Pavan Preetham Valluri
- Department of Applied Mathematics and Computational Science, PSG College of Technology, Coimbatore, India
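The RMSD metric named in the abstract above can be illustrated with a short calculation. This is a minimal sketch of the general RMSD formula only, not the tool's own code: it assumes the two coordinate sets are already superimposed and listed in the same atom order, and the function name `rmsd` and the sample coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized sets of 3D points.

    Assumption: the structures are already aligned; no superposition is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and the same length")
    # Sum of squared Euclidean distances between corresponding atoms.
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates: every atom in `frame` is shifted by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(reference, frame))  # 1.0
```

In an MD trajectory, a value like this would be computed for each frame against a reference structure; a curve that levels off is usually read as a sign that the complex is stable.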
4
Sadeghi SS, Keyvanpour MR. An Analytical Review of Computational Drug Repurposing. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:472-488. [PMID: 31403439 DOI: 10.1109/tcbb.2019.2933825] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 06/10/2023]
Abstract
Drug repurposing is a vital function in pharmaceutical fields and has gained popularity in recent years in both the pharmaceutical industry and research community. It refers to the process of discovering new uses and indications for existing or failed drugs. It is cost-effective and reliable in contrast to experimental drug discovery, which is a costly, time-consuming, and risky process and limited to a relatively small number of targets. Accordingly, a plethora of computational methodologies have been propounded to repurpose drugs on a large scale by utilizing available high throughput data. The available literature, however, lacks a contemporary and comprehensive analysis of the current computational drug repurposing methodologies. In this paper, we presented a systematic analysis of computational drug repurposing which consists of three main sections: Initially, we categorize the computational drug repurposing methods based on their technical approach and artificial intelligence perspective and discuss the strengths and weaknesses of various methods. Secondly, some general criteria are recommended to analyze our proposed categorization. In the third and final section, a qualitative comparison is made between each approach which is a guide to understanding their preference to one another. Further, this systematic analysis can help in the efficient selection and improvement of drug repurposing techniques based on the nature of computational methods implemented on biological resources.
5
Aafjes-van Doorn K, Kamsteeg C, Bate J, Aafjes M. A scoping review of machine learning in psychotherapy research. Psychother Res 2020; 31:92-116. [PMID: 32862761 DOI: 10.1080/10503307.2020.1808729] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Indexed: 10/23/2022] Open
Abstract
Machine learning (ML) offers robust statistical and probabilistic techniques that can help to make sense of large amounts of data. This scoping review paper aims to broadly explore the nature of research activity using ML in the context of psychological talk therapies, highlighting the scope of current methods and considerations for clinical practice and directions for future research. Using a systematic search methodology, fifty-one studies were identified. A narrative synthesis indicates two types of studies: those that developed and tested an ML model (k=44), and those that reported on the feasibility of a particular treatment tool that uses an ML algorithm (k=7). Most model development studies used supervised learning techniques to classify or predict labeled treatment process or outcome data, whereas others used unsupervised techniques to identify clusters in the unlabeled patient or treatment data. Overall, the current applications of ML in psychotherapy research demonstrated a range of possible benefits for indications of treatment process, adherence, therapist skills and treatment response prediction, as well as ways to accelerate research through automated behavioral or linguistic process coding. Given the novelty and potential of this research field, these proof-of-concept studies are encouraging; however, they do not necessarily translate to improved clinical practice (yet).
Affiliation(s)
- Jordan Bate
- Ferkauf Graduate School of Psychology, Yeshiva University, Bronx, NY, USA
6
Ma R, Li Y, Li C, Wan F, Hu H, Xu W, Zeng J. Secure multiparty computation for privacy-preserving drug discovery. Bioinformatics 2020; 36:2872-2880. [DOI: 10.1093/bioinformatics/btaa038] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/05/2019] [Revised: 01/08/2020] [Accepted: 01/15/2020] [Indexed: 01/24/2023] Open
Abstract
Motivation
Quantitative structure–activity relationship (QSAR) and drug–target interaction (DTI) prediction are both commonly used in drug discovery. Collaboration among pharmaceutical institutions can lead to better performance in both QSAR and DTI prediction. However, the drug-related data privacy and intellectual property issues have become a noticeable hindrance for inter-institutional collaboration in drug discovery.
Results
We have developed two novel algorithms under secure multiparty computation (MPC), including QSARMPC and DTIMPC, which enable pharmaceutical institutions to achieve high-quality collaboration to advance drug discovery without divulging private drug-related information. QSARMPC, a neural network model under MPC, displays good scalability and performance and is feasible for privacy-preserving collaboration on large-scale QSAR prediction. DTIMPC integrates drug-related heterogeneous network data and accurately predicts novel DTIs, while keeping the drug information confidential. Under several experimental settings that reflect the situations in real drug discovery scenarios, we have demonstrated that DTIMPC possesses significant performance improvement over the baseline methods, generates novel DTI predictions with supporting evidence from the literature and shows the feasible scalability to handle growing DTI data. All these results indicate that QSARMPC and DTIMPC can provide practically useful tools for advancing privacy-preserving drug discovery.
Availability and implementation
The source codes of QSARMPC and DTIMPC are available on the GitHub: https://github.com/rongma6/QSARMPC_DTIMPC.git.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rong Ma
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Yi Li
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Chenxing Li
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Fangping Wan
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Hailin Hu
- School of Medicine, Tsinghua University, Beijing 100084, China
- Wei Xu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
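The general idea behind secure multiparty computation, as described in the abstract above, can be sketched with additive secret sharing. This is an illustrative toy, not the QSARMPC or DTIMPC protocol: the modulus `PRIME`, the function names, and the sample values are all assumptions made for the example, and inputs are assumed to be non-negative integers whose sum stays below the modulus.

```python
import secrets

# Field modulus for the shares (2**61 - 1 is a Mersenne prime); an illustrative choice.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to `value` modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine additive shares into the original value."""
    return sum(shares) % PRIME


def secure_sum(private_values):
    """Compute the sum of all parties' inputs without revealing any single input."""
    n = len(private_values)
    # Party i splits its value into n shares and sends share j to party j.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds up the shares it received; each partial sum alone looks random.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    # Publishing only the partial sums reveals the total, not the individual inputs.
    return reconstruct(partial_sums)


print(secure_sum([12, 30, 8]))  # 50
```

Each institution only ever sees random-looking shares of the others' data, which is the property that lets collaborators compute a joint result while keeping their inputs private.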
7
Rathi M, Grover V, Kheterpal T. Dr. Query. International Journal of Swarm Intelligence Research 2020. [DOI: 10.4018/ijsir.2020010103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Drugs can help treat disease, but medication can sometimes cause severe side effects. With a little knowledge, one can choose drugs that prevent or avoid adverse outcomes. Recognizing potential drugs enhances the quality of the healthcare system and reduces the risk associated with drug intake. Several factors, such as drug-drug interactions and side effects, should be known before taking a drug. The authors' aim is therefore to develop a predictive mobile-based healthcare tool that helps drug consumers find the drugs that suit them best. As an outcome, the tool will provide the names of the top 10 medicines that are best for the specified indications, do not cause the specified side effects, and interact minimally or not at all with the mentioned drugs. The proposed mobile-based drug query tool will provide drugs that exactly match the query as well as close matches by leveraging machine learning.
Affiliation(s)
- Megha Rathi
- Jaypee Institute of Information Technology, Noida, India
- Vaibhav Grover
- Jaypee Institute of Information Technology, Noida, India
8
Rumpf RW, Wolock SL, Ray WC. StickWRLD as an Interactive Visual Pre-Filter for Canceromics-Centric Expression Quantitative Trait Locus Data. Cancer Inform 2014; 13:63-9. [PMID: 25368511 PMCID: PMC4214597 DOI: 10.4137/cin.s14024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2014] [Revised: 07/30/2014] [Accepted: 07/30/2014] [Indexed: 11/17/2022] Open
Abstract
As datasets increase in complexity, the time required for analysis (both computational and human domain-expert) increases. One of the significant impediments introduced by such burgeoning data is the difficulty in knowing what features to include or exclude from statistical models. Simple tables of summary statistics rarely provide an adequate picture of the patterns and details of the dataset to enable researchers to make well-informed decisions about the adequacy of the models they are constructing. We have developed a tool, StickWRLD, which allows the user to visually browse through their data, displaying all possible correlations. By allowing the user to dynamically modify the retention parameters (both P and the residual, r), StickWRLD allows the user to identify significant correlations and disregard potential correlations that do not meet those same criteria – effectively filtering through all possible correlations quickly and identifying possible relationships of interest for further analysis. In this study, we applied StickWRLD to a semi-synthetic dataset constructed from two published human datasets. In addition to detecting high-probability correlations in this dataset, we were able to quickly identify gene–SNP correlations that would have gone undetected using more traditional approaches due to issues of low penetrance.
Affiliation(s)
- Robert Wolfgang Rumpf
- The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Samuel L Wolock
- The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- William C Ray
- The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
9
Kandel DD, Raychaudhury C, Pal D. Two new atom centered fragment descriptors and scoring function enhance classification of antibacterial activity. J Mol Model 2014; 20:2164. [PMID: 24664120 DOI: 10.1007/s00894-014-2164-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/08/2013] [Accepted: 01/30/2014] [Indexed: 11/26/2022]
Abstract
Classification of pharmacologic activity of a chemical compound is an essential step in any drug discovery process. We develop two new atom-centered fragment descriptors (vertex indices)--one based solely on topological considerations without discriminating atom or bond types, and another based on topological and electronic features. We also assess their usefulness by devising a method to rank and classify molecules with regard to their antibacterial activity. Classification performances of our method are found to be superior compared to two previous studies on large heterogeneous data sets for hit finding and hit-to-lead studies even though we use much fewer parameters. It is found that for hit finding studies topological features (simple graph) alone provide significant discriminating power, and for hit-to-lead process small but consistent improvement can be made by additionally including electronic features (colored graph). Our approach is simple, interpretable, and suitable for design of molecules as we do not use any physicochemical properties. The singular use of vertex index as descriptor, novel range based feature extraction, and rigorous statistical validation are the key elements of this study.
10
Wang Y, Zhou Q, Dai H, Zhang T, Wei DQ. Prediction of the functional consequences of single amino acid substitution in human cytochrome P450. Molecular Simulation 2012. [DOI: 10.1080/08927022.2012.708415] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/15/2022]
11
Abstract
There is a general agreement that the development of metabolomics depends not only on advances in chemical analysis techniques but also on advances in computing and data analysis methods. Metabolomics data usually requires intensive pre-processing, analysis, and mining procedures. Selecting and applying such procedures requires attention to issues including justification, traceability, and reproducibility. We describe a strategy for selecting data mining techniques which takes into consideration the goals of data mining techniques on the one hand, and the goals of metabolomics investigations and the nature of the data on the other. The strategy aims to ensure the validity and soundness of results and promote the achievement of the investigation goals.
12
Comparative Study of Classification Algorithms Using Molecular Descriptors in Toxicological DataBases. ACTA ACUST UNITED AC 2009. [DOI: 10.1007/978-3-642-03223-3_11] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/12/2023]
13
Dearden JC. In silico prediction of ADMET properties: how far have we come? Expert Opin Drug Metab Toxicol 2008; 3:635-9. [PMID: 17916052 DOI: 10.1517/17425255.3.5.635] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Indexed: 11/05/2022]
Abstract
There have been considerable advances in the last few years in both the quantity and the quality of in silico ADMET property predictions. Most ADMET properties are now computable, and the accuracy of some of the software predictions for physicochemical properties in particular is close to that of measured data. There is, however, universal agreement that more good experimental ADMET data are needed for use in in silico model development, for models are only as good as the data on which they are based. Many data remain confidential but it is to be hoped that, with projects such as the Vitic toxicity database, being developed by Lhasa Limited, pharmaceutical companies will be prepared to release data to an 'honest broker' on a confidential basis, so that better in silico models can be developed. Incorporation of calculated ADMET properties into drug discovery and development is a multi-factorial problem and really needs a multi-factorial solution. Some progress is being made in this direction and it is hoped that within the foreseeable future software will be available for this purpose.
Affiliation(s)
- John C Dearden
- Liverpool John Moores University, School of Pharmacy and Chemistry, Byrom Street, Liverpool, L3 3AF, UK.
14
Genetic Programming: An Introduction and Tutorial, with a Survey of Techniques and Applications. Studies in Computational Intelligence 2008. [DOI: 10.1007/978-3-540-78293-3_22] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Indexed: 01/09/2023]