1
Vishwakarma S, Hernandez-Hernandez S, Ballester PJ. Graph neural networks are promising for phenotypic virtual screening on cancer cell lines. Biol Methods Protoc 2024; 9:bpae065. [PMID: 39502795 PMCID: PMC11537795 DOI: 10.1093/biomethods/bpae065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 08/20/2024] [Accepted: 09/02/2024] [Indexed: 11/08/2024]
Abstract
Artificial intelligence is increasingly driving early drug design, offering novel approaches to virtual screening. Phenotypic virtual screening (PVS) aims to predict how cancer cell lines respond to different compounds by focusing on observable characteristics rather than specific molecular targets. Some studies have suggested that deep learning may not be the best approach for PVS. However, these studies are limited by the small number of tested molecules, by the lack of suitable performance metrics, and by not using dissimilar-molecules splits, which better mimic the challenging chemical diversity of real-world screening libraries. Here we prepared 60 datasets, each containing approximately 30 000-50 000 molecules tested for their growth inhibitory activities on one of the NCI-60 cancer cell lines. We conducted multiple performance evaluations of each of the five machine learning algorithms for PVS on these 60 problem instances. To provide an even more comprehensive evaluation, we used two model validation types: the random split and the dissimilar-molecules split. Overall, about 14 440 training runs across datasets were carried out per algorithm. The models were primarily evaluated using hit rate, a more suitable metric in virtual screening (VS) contexts. The results show that all models are more challenged by test molecules that are substantially different from those in the training data. In both validation types, the D-MPNN algorithm, a graph-based deep neural network, was found to be the most suitable for building predictive models for this PVS problem.
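As a rough illustration of the hit-rate metric mentioned in this abstract, the sketch below assumes that hit rate means the fraction of truly active molecules among the top-ranked predictions. The abstract does not state the exact definition or cutoff used in the paper, so the function name `hit_rate`, the `top_k` parameter, and the toy data are all hypothetical.

```python
def hit_rate(scores, is_active, top_k):
    """Fraction of true actives among the top_k highest-scored molecules.

    Assumed definition for illustration only; the cutoff used in the
    paper is not given in the abstract.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    if top_k <= 0 or top_k > len(scores):
        raise ValueError("top_k must be between 1 and the number of molecules")
    # Rank molecules from highest to lowest predicted score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    # Count how many of the selected molecules are truly active.
    hits = sum(1 for _, active in ranked[:top_k] if active)
    return hits / top_k


# Toy data (hypothetical): predicted scores and measured activity labels.
scores = [0.9, 0.1, 0.8, 0.4, 0.7]
labels = [True, False, False, True, True]
example_hit_rate = hit_rate(scores, labels, top_k=3)
```

With these toy values, the three top-ranked molecules contain two actives. Unlike a global correlation measure, a metric of this kind only rewards getting the highest-ranked molecules right, which is what matters when only a small fraction of a library can be tested.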
Affiliation(s)
- Sachin Vishwakarma
- Evotec SAS (France), Toulouse, France
- Centre de Recherche en Cancérologie de Marseille, Marseille 13009, France
- Pedro J Ballester
- Department of Bioengineering, Imperial College London, London SW7 2AZ, United Kingdom
2
Ogunleye A, Piyawajanusorn C, Ghislat G, Ballester PJ. Large-Scale Machine Learning Analysis Reveals DNA Methylation and Gene Expression Response Signatures for Gemcitabine-Treated Pancreatic Cancer. Health Data Science 2024; 4:0108. [PMID: 38486621 PMCID: PMC10904073 DOI: 10.34133/hds.0108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Accepted: 12/08/2023] [Indexed: 03/17/2024]
Abstract
Background: Gemcitabine is a first-line chemotherapy for pancreatic adenocarcinoma (PAAD), but many PAAD patients do not respond to gemcitabine-containing treatments. Being able to predict such nonresponders would hence permit the prompt administration of more promising treatments while sparing those patients gemcitabine's life-threatening side effects. Unfortunately, the few predictors of PAAD patient response to this drug are weak, and none of them yet exploits the power of machine learning (ML). Methods: Here, we applied ML to predict the response of PAAD patients to gemcitabine from the molecular profiles of their tumors. More concretely, we collected diverse molecular profiles of PAAD patient tumors along with the corresponding clinical data (gemcitabine responses and clinical features) from the Genomic Data Commons resource. From systematically combining 8 tumor profiles with 16 classification algorithms, each of the resulting 128 ML models was evaluated by multiple 10-fold cross-validations. Results: Only 7 of these 128 models were predictive, which underlines the importance of carrying out such a large-scale analysis to avoid missing the most predictive models. The most predictive were random forest using 4 selected mRNAs [0.44 Matthews correlation coefficient (MCC), 0.785 receiver operating characteristic-area under the curve (ROC-AUC)] and XGBoost combining 12 DNA methylation probes (0.32 MCC, 0.697 ROC-AUC). By contrast, the hENT1 marker obtained much worse, random-level performance (practically 0 MCC, 0.5 ROC-AUC). Despite not being trained to predict prognosis (overall and progression-free survival), these ML models were also able to anticipate this patient outcome. Conclusions: We release these promising ML models so that they can be evaluated prospectively on other gemcitabine-treated PAAD patients.
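The arithmetic behind the 128 models in this abstract (8 tumor profiles combined with 16 classification algorithms) can be sketched as a simple Cartesian product. The profile and algorithm names below are placeholders, not the ones used in the study, and the sketch only enumerates the combinations rather than training anything.

```python
from itertools import product

# Placeholder names (hypothetical): the abstract gives only the counts.
profiles = [f"profile_{i}" for i in range(1, 9)]        # 8 tumor profiles
algorithms = [f"algorithm_{j}" for j in range(1, 17)]   # 16 classifiers

# Every (profile, algorithm) pair defines one model to be cross-validated.
model_grid = list(product(profiles, algorithms))
n_models = len(model_grid)
```

In a real pipeline, each pair in `model_grid` would be passed to a cross-validation routine. Enumerating the full grid up front makes it harder to overlook a combination, which matters when only a handful of the models turn out to be predictive.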
Affiliation(s)
- Adeolu Ogunleye
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Ghita Ghislat
- Department of Life Sciences, Imperial College London, London, UK
3
Partin A, Brettin TS, Zhu Y, Narykov O, Clyde A, Overbeek J, Stevens RL. Deep learning methods for drug response prediction in cancer: Predominant and emerging trends. Front Med (Lausanne) 2023; 10:1086097. [PMID: 36873878 PMCID: PMC9975164 DOI: 10.3389/fmed.2023.1086097] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 11/01/2022] [Accepted: 01/23/2023] [Indexed: 02/17/2023]
Abstract
Cancer claims millions of lives yearly worldwide. While many therapies have been made available in recent years, by and large cancer remains unsolved. Exploiting computational predictive models to study and treat cancer holds great promise in improving drug development and personalized design of treatment plans, ultimately suppressing tumors, alleviating suffering, and prolonging lives of patients. A wave of recent papers demonstrates promising results in predicting cancer response to drug treatments while utilizing deep learning methods. These papers investigate diverse data representations, neural network architectures, learning methodologies, and evaluation schemes. However, deciphering promising predominant and emerging trends is difficult due to the variety of explored methods and the lack of a standardized framework for comparing drug response prediction models. To obtain a comprehensive landscape of deep learning methods, we conducted an extensive search and analysis of deep learning models that predict the response to single drug treatments. A total of 61 deep learning-based models have been curated, and summary plots were generated. Based on the analysis, observable patterns and prevalence of methods have been revealed. This review provides a better understanding of the current state of the field and identifies major challenges and promising solution paths.
Affiliation(s)
- Alexander Partin
- Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Thomas S. Brettin
- Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Yitan Zhu
- Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Oleksandr Narykov
- Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Austin Clyde
- Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Jamie Overbeek
- Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Rick L. Stevens
- Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Department of Computer Science, The University of Chicago, Chicago, IL, United States
4
Wang XC, Zhou H, Jiang WJ, Jiang P, Sun YC, Ni WJ. Effect of CX3CL1/CX3CR1 gene polymorphisms on the clinical efficacy of carboplatin therapy in Han patients with ovarian cancer. Front Genet 2023; 13:1065213. [PMID: 36685881 PMCID: PMC9852718 DOI: 10.3389/fgene.2022.1065213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Accepted: 11/28/2022] [Indexed: 01/07/2023]
Abstract
Gene polymorphisms are closely related to the clinical effects of carboplatin in ovarian cancer. Here, we investigated the relationship between CX3CL1 and CX3CR1 genotypes and the clinical efficacy of carboplatin in ovarian cancer, aiming to clarify previously unidentified genetic factors that influence this efficacy. To this end, we used Sequenom MassARRAY technology to detect CX3CL1 and CX3CR1 gene polymorphisms in 127 patients with carboplatin-treated ovarian cancer. We performed various statistical analyses to evaluate the effects of CX3CL1 and CX3CR1 genetic variants, demographic data, and clinical characteristics on the effect of carboplatin therapy. The results show that the CX3CL1 genotypes rs223815 (G>C) and rs682082 (G>A) significantly affect the clinical efficacy of carboplatin for ovarian cancer (p < 0.05), while the other six genotypes and all CX3CR1 genotypes have no significant effect (p > 0.05). In addition, only one demographic factor, age, had a significant effect on the clinical efficacy of carboplatin in ovarian cancer (p < 0.05). Based on these results, we concluded that the clinical efficacy of carboplatin in ovarian cancer patients was significantly correlated with age and CX3CL1 polymorphisms; however, the underlying effects and mechanisms need to be explored in greater depth by large-scale, multicenter studies.
Affiliation(s)
- Xin-Chen Wang
- Department of Pharmacy, Anhui Provincial Cancer Hospital, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Hong Zhou
- Department of Pharmacy, Anhui Provincial Cancer Hospital, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Wen-Jing Jiang
- Department of Gynecological Oncology, Anhui Provincial Cancer Hospital, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Peng Jiang
- Department of Pharmacy, Anhui Provincial Cancer Hospital, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Yan-Cai Sun
- Department of Pharmacy, Anhui Provincial Cancer Hospital, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Wei-Jian Ni
- Department of Pharmacy, Anhui Provincial Hospital, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, the Key Laboratory of Anti-inflammatory of Immune Medicines, Ministry of Education, Anhui Institute of Innovative Drugs, School of Pharmacy, Anhui Medical University, Hefei, Anhui, China
- Correspondence: Wei-Jian Ni
5
Ogunleye AZ, Piyawajanusorn C, Gonçalves A, Ghislat G, Ballester PJ. Interpretable Machine Learning Models to Predict the Resistance of Breast Cancer Patients to Doxorubicin from Their microRNA Profiles. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2022; 9:e2201501. [PMID: 35785523 PMCID: PMC9403644 DOI: 10.1002/advs.202201501] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/15/2022] [Revised: 06/02/2022] [Indexed: 05/05/2023]
Abstract
Doxorubicin is a common treatment for breast cancer. However, not all patients respond to this drug, which sometimes causes life-threatening side effects. Accurately anticipating doxorubicin-resistant patients would therefore spare them this risk while allowing alternative treatments to be considered without delay. Stratifying patients based on molecular markers in their pretreatment tumors is a promising approach to advance toward this ambitious goal, but single-gene markers such as HER2 expression have not been shown to be sufficiently predictive. The recent availability of matched doxorubicin-response and diverse molecular profiles across breast cancer patients now permits analysis at a much larger scale. 16 machine learning algorithms and 8 molecular profiles are systematically evaluated on the same cohort of patients. Only 2 of the 128 resulting models are substantially predictive, showing that they can be easily missed by a standard-scale analysis. The best model is a classification and regression tree (CART) nonlinearly combining 4 selected miRNA isoforms to predict doxorubicin response (median Matthews correlation coefficient (MCC) and area under the curve (AUC) of 0.56 and 0.80, respectively). By contrast, HER2 expression is significantly less predictive (median MCC and AUC of 0.14 and 0.57, respectively). As the predictive accuracy of this CART model increases with larger training sets, its update with future data should result in even better accuracy.
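For readers unfamiliar with the MCC values quoted in this abstract, the following minimal sketch computes the Matthews correlation coefficient from the four cells of a binary confusion matrix using its standard formula. The function name and the example counts are illustrative and are not taken from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (a common convention),
    e.g. when a classifier predicts a single class for every sample.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Illustrative counts (hypothetical): 100 patients, balanced errors.
example_mcc = matthews_corrcoef(tp=40, tn=40, fp=10, fn=10)
```

MCC ranges from -1 to 1, with 0 corresponding to random-level prediction, which is why a median of 0.14 for a single marker reads as weakly predictive next to 0.56 for the multi-variable model.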
Affiliation(s)
- Adeolu Z. Ogunleye
- Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
- Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Chayanit Piyawajanusorn
- Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
- Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Anthony Gonçalves
- Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
- Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Ghita Ghislat
- Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
- Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Pedro J. Ballester
- Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
- Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
- Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
6
Nguyen LC, Naulaerts S, Bruna A, Ghislat G, Ballester PJ. Predicting Cancer Drug Response In Vivo by Learning an Optimal Feature Selection of Tumour Molecular Profiles. Biomedicines 2021; 9:1319. [PMID: 34680436 PMCID: PMC8533095 DOI: 10.3390/biomedicines9101319] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/01/2021] [Revised: 09/22/2021] [Accepted: 09/23/2021] [Indexed: 12/17/2022]
Abstract
(1) Background: Inter-tumour heterogeneity is one of cancer’s most fundamental features. Patient stratification based on drug response prediction is hence needed for effective anti-cancer therapy. However, single-gene markers of response are rare and/or may fail to achieve a significant impact in the clinic. Machine Learning (ML) is emerging as a particularly promising complementary approach to precision oncology. (2) Methods: Here we apply dimensionality-reducing ML algorithms to comprehensive Patient-Derived Xenograft (PDX) pharmacogenomic data sets for this purpose. (3) Results: Combining multiple gene alterations via ML leads to better discrimination between sensitive and resistant PDXs in 19 of the 26 analysed cases. Highly predictive ML models employing concise gene lists were found for three cases: paclitaxel (breast cancer), binimetinib (breast cancer) and cetuximab (colorectal cancer). Interestingly, each of these multi-gene ML models identifies some treatment-responsive PDXs not harbouring the best actionable mutation for that case. Thus, ML multi-gene predictors generally have far fewer false negatives than the corresponding single-gene marker. (4) Conclusions: As PDXs often recapitulate clinical outcomes, these results suggest that many more patients could benefit from precision oncology if ML algorithms were also applied to existing clinical pharmacogenomics data, especially those algorithms generating classifiers combining data-selected gene alterations.
Affiliation(s)
- Linh C. Nguyen
- Cancer Research Center of Marseille, INSERM U1068, F-13009 Marseille, France
- Institut Paoli-Calmettes, F-13009 Marseille, France
- Aix-Marseille Université UM105, F-13009 Marseille, France
- CNRS UMR7258, F-13009 Marseille, France
- Department of Life Sciences, University of Science and Technology of Hanoi, Vietnam Academy of Science and Technology, Hanoi 100803, Vietnam
- Stefan Naulaerts
- Ludwig Institute for Cancer Research, 1200 Brussels, Belgium
- Duve Institute, UCLouvain, 1200 Brussels, Belgium
- Ghita Ghislat
- Centre d’Immunologie de Marseille-Luminy, INSERM U1104, CNRS UMR7280, F-13009 Marseille, France
- Pedro J. Ballester
- Cancer Research Center of Marseille, INSERM U1068, F-13009 Marseille, France
- Institut Paoli-Calmettes, F-13009 Marseille, France
- Aix-Marseille Université UM105, F-13009 Marseille, France
- CNRS UMR7258, F-13009 Marseille, France
- Correspondence: Tel.: +33-(0)4-8697-7201