1. Nguyen T, Campbell A, Kumar A, Amponsah E, Fiterau M, Shahriyari L. Optimal fusion of genotype and drug embeddings in predicting cancer drug response. Brief Bioinform 2024; 25:bbae227. [PMID: 38754407 PMCID: PMC11097979 DOI: 10.1093/bib/bbae227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 04/14/2024] [Accepted: 04/25/2024] [Indexed: 05/18/2024]
Abstract
Predicting cancer drug response using both genomics and drug features has shown some success compared to using genomics features alone. However, there has been limited research on how best to combine, or fuse, the two types of features. Using a visible neural network with two deep learning branches for gene and drug features as the base architecture, we experimented with different fusion functions and fusion points. Our experiments show that injecting multiplicative relationships between gene and drug latent features into the original concatenation-based architecture, DrugCell, significantly improved the overall predictive performance and outperformed other baseline models. We also show that different fusion methods respond differently to different fusion points, indicating that the relationship between drug features and the different hierarchical biological levels of gene features is optimally captured using different methods. Considering both predictive performance and runtime speed, tensor product partial is the best-performing fusion function for combining late-stage representations of drug and gene features to predict cancer drug response.
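The contrast the abstract draws between concatenation-based and multiplicative fusion can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the function names, the embedding width of 6, and the random inputs are hypothetical and are not taken from the paper or from DrugCell. The paper's "tensor product partial" variant is not defined in the abstract, so it is not reproduced; only plain concatenation, an element-wise product, and a full outer (tensor) product are shown.

```python
import numpy as np


def fuse_concat(gene, drug):
    """Concatenation fusion: stack the two embeddings side by side."""
    return np.concatenate([gene, drug], axis=-1)


def fuse_elementwise_product(gene, drug):
    """Multiplicative fusion: element-wise product (needs equal widths)."""
    return gene * drug


def fuse_tensor_product(gene, drug):
    """Full tensor-product fusion: every gene feature times every drug feature."""
    outer = np.einsum("bi,bj->bij", gene, drug)  # shape (batch, d_gene, d_drug)
    return outer.reshape(gene.shape[0], -1)      # flatten to (batch, d_gene * d_drug)


# Hypothetical latent features: a batch of 4 samples, 6 dimensions per branch.
rng = np.random.default_rng(0)
gene = rng.normal(size=(4, 6))
drug = rng.normal(size=(4, 6))

for name, fuse in [
    ("concat", fuse_concat),
    ("elementwise", fuse_elementwise_product),
    ("tensor", fuse_tensor_product),
]:
    print(name, fuse(gene, drug).shape)
```

The output width grows from `d_gene + d_drug` for concatenation to `d_gene * d_drug` for the full tensor product, which is one plausible reason a study would weigh runtime alongside predictive performance.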
Affiliation(s)
- Trang Nguyen
  - Department of Computer Science, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Anthony Campbell
  - Department of Computer Science, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Ankit Kumar
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Edwin Amponsah
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Madalina Fiterau
  - Department of Computer Science, University of Massachusetts Amherst, Amherst 01002, MA, United States
- Leili Shahriyari
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst 01002, MA, United States
2. Kim J, Park SH, Lee H. PANCDR: precise medicine prediction using an adversarial network for cancer drug response. Brief Bioinform 2024; 25:bbae088. [PMID: 38487849 PMCID: PMC10940842 DOI: 10.1093/bib/bbae088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 01/09/2024] [Accepted: 02/16/2024] [Indexed: 03/18/2024]
Abstract
Pharmacogenomics aims to provide personalized therapy to patients based on their genetic variability. However, accurate prediction of cancer drug response (CDR) is challenging due to genetic heterogeneity. Since clinical data are limited, most studies predicting drug response use preclinical data to train models. However, such models might not be generalizable to external clinical data due to differences between the preclinical and clinical datasets. In this study, a Precision Medicine Prediction using an Adversarial Network for Cancer Drug Response (PANCDR) model is proposed. PANCDR consists of two sub-models, an adversarial model and a CDR prediction model. The adversarial model reduces the gap between the preclinical and clinical datasets, while the CDR prediction model extracts features and predicts responses. PANCDR was trained using both preclinical data and unlabeled clinical data. Subsequently, it was tested on external clinical data, including The Cancer Genome Atlas and brain tumor patients. PANCDR outperformed other machine learning models in predicting external test data. Our results demonstrate the robustness of PANCDR and its potential in precision medicine by recommending patient-specific drug candidates. The PANCDR codes and data are available at https://github.com/DMCB-GIST/PANCDR.
Affiliation(s)
- Juyeon Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 61005, Gwangju, South Korea
- Sung-Hye Park
  - Department of Pathology, Seoul National University Hospital, Seoul National University College of Medicine, 03080, Seoul, South Korea
  - Neuroscience Research Institute, Seoul National University College of Medicine, 03080, Seoul, South Korea
- Hyunju Lee
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 61005, Gwangju, South Korea
  - Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, 61005, Gwangju, South Korea
3. Yuan S, Chen YC, Tsai CH, Chen HW, Shieh GS. Feature selection translates drug response predictors from cell lines to patients. Front Genet 2023; 14:1217414. [PMID: 37519889 PMCID: PMC10382684 DOI: 10.3389/fgene.2023.1217414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Accepted: 06/26/2023] [Indexed: 08/01/2023]
Abstract
Targeted therapies and chemotherapies are prevalent in cancer treatment. Identification of predictive markers to stratify cancer patients who will respond to these therapies remains challenging because patient drug response data are limited. As large amounts of drug response data have been generated by cell lines, methods to efficiently translate cell-line-trained predictors to human tumors will be useful in clinical practice. Here, we propose versatile feature selection procedures that can be combined with any classifier. For demonstration, we combined the feature selection procedures with a (linear) logit model and a (non-linear) K-nearest neighbor classifier and trained them on cell lines, yielding LogitDA and KNNDA, respectively. We show that LogitDA/KNNDA significantly outperform existing methods, e.g., a logistic model and a deep learning method trained on thousands of genes, in prediction AUC (0.70-1.00 for seven of the ten drugs tested) and are interpretable. This may be due to the fact that sample sizes are often limited in the area of drug response prediction. We further derive a novel adjustment on the prediction cutoff for LogitDA to yield a prediction accuracy of 0.70-0.93 for seven drugs, including erlotinib and cetuximab, whose pathways relevant to anti-cancer therapies are also uncovered. These results indicate that our methods can efficiently translate cell-line-trained predictors to tumors.
Affiliation(s)
- Shinsheng Yuan
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
  - Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
- Yen-Chou Chen
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Chi-Hsuan Tsai
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Huei-Wen Chen
  - College of Medicine, Graduate Institute of Toxicology, National Taiwan University, Taipei, Taiwan
- Grace S. Shieh
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
  - Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
  - Data Science Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
4. Partin A, Brettin TS, Zhu Y, Narykov O, Clyde A, Overbeek J, Stevens RL. Deep learning methods for drug response prediction in cancer: Predominant and emerging trends. Front Med (Lausanne) 2023; 10:1086097. [PMID: 36873878 PMCID: PMC9975164 DOI: 10.3389/fmed.2023.1086097] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 11/01/2022] [Accepted: 01/23/2023] [Indexed: 02/17/2023]
Abstract
Cancer claims millions of lives yearly worldwide. While many therapies have been made available in recent years, by and large cancer remains unsolved. Exploiting computational predictive models to study and treat cancer holds great promise in improving drug development and personalized design of treatment plans, ultimately suppressing tumors, alleviating suffering, and prolonging lives of patients. A wave of recent papers demonstrates promising results in predicting cancer response to drug treatments while utilizing deep learning methods. These papers investigate diverse data representations, neural network architectures, learning methodologies, and evaluation schemes. However, deciphering promising predominant and emerging trends is difficult due to the variety of explored methods and the lack of a standardized framework for comparing drug response prediction models. To obtain a comprehensive landscape of deep learning methods, we conducted an extensive search and analysis of deep learning models that predict the response to single drug treatments. A total of 61 deep learning-based models have been curated, and summary plots were generated. Based on the analysis, observable patterns and prevalence of methods have been revealed. This review helps readers better understand the current state of the field and identify major challenges and promising solution paths.
Affiliation(s)
- Alexander Partin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Thomas S. Brettin
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Yitan Zhu
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Oleksandr Narykov
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Austin Clyde
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Jamie Overbeek
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
- Rick L. Stevens
  - Division of Data Science and Learning, Argonne National Laboratory, Lemont, IL, United States
  - Department of Computer Science, The University of Chicago, Chicago, IL, United States
5. Shen B, Feng F, Li K, Lin P, Ma L, Li H. A systematic assessment of deep learning methods for drug response prediction: from in vitro to clinical applications. Brief Bioinform 2023; 24:6961794. [PMID: 36575826 DOI: 10.1093/bib/bbac605] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/08/2022] [Revised: 10/30/2022] [Accepted: 12/09/2022] [Indexed: 12/29/2022]
Abstract
Drug response prediction is an important problem in personalized cancer therapy. Among various newly developed models, significant improvement in prediction performance has been reported using deep learning methods. However, systematic comparisons of deep learning methods, especially of the transferability from preclinical models to clinical cohorts, are currently lacking. To provide a more rigorous assessment, we benchmarked six representative deep learning methods for drug response prediction in multiple application scenarios using nine evaluation metrics, including the overall prediction accuracy, predictability of each drug, potential associated factors and transferability to clinical cohorts. Most methods show promising prediction within cell line datasets, and TGSA, with its lower time cost and better performance, is recommended. Although the performance metrics decrease when applying models trained on cell lines to patients, a certain amount of power to distinguish clinical response on some drugs can be maintained using CRDNN and TGSA. With these assessments, we provide guidance for researchers to choose appropriate methods, as well as insights into future directions for the development of more effective methods in clinical scenarios.
Affiliation(s)
- Bihan Shen
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Fangyoumin Feng
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Kunshi Li
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Ping Lin
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Liangxiao Ma
  - Bio-Med Big Data Center at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Hong Li
  - Cancer Systems Biology group at Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
6. Yingtaweesittikul H, Wu J, Mongia A, Peres R, Ko K, Nagarajan N, Suphavilai C. CREAMMIST: an integrative probabilistic database for cancer drug response prediction. Nucleic Acids Res 2022; 51:D1242-D1248. [PMID: 36259664 PMCID: PMC9825458 DOI: 10.1093/nar/gkac911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Revised: 09/18/2022] [Accepted: 10/11/2022] [Indexed: 01/30/2023]
Abstract
Extensive in vitro cancer drug screening datasets have enabled scientists to identify biomarkers and develop machine learning models for predicting drug sensitivity. While most advancements have focused on omics profiles, cancer drug sensitivity scores precalculated by the original sources are often used as-is, without consideration for variability between studies. It is well known that significant inconsistencies exist between the drug sensitivity scores across datasets due to differences in experimental setups and preprocessing methods used to obtain the sensitivity scores. As a result, many studies opt to focus only on a single dataset, leading to underutilization of available data and a limited interpretation of cancer pharmacogenomics analysis. To overcome these limitations, we have developed CREAMMIST (https://creammist.mtms.dev), an integrative database that enables users to obtain an integrative dose-response curve that captures uncertainty (or high certainty when multiple datasets align well) across five widely used cancer cell-line drug-response datasets. We utilized a Bayesian framework to systematically integrate all available dose-response values across datasets (>14 million dose-response data points). CREAMMIST provides easy-to-use statistics derived from the integrative dose-response curves for various downstream analyses such as identifying biomarkers, selecting drug concentrations for experiments, and training robust machine learning models.
Affiliation(s)
- Jiaxi Wu
  - Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Aanchal Mongia
  - Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Rafael Peres
  - Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Karrie Ko
  - Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Chayaporn Suphavilai
  - To whom correspondence should be addressed. Tel: +65 86213683; Fax: +65 68088292;
7. Ogunleye AZ, Piyawajanusorn C, Gonçalves A, Ghislat G, Ballester PJ. Interpretable Machine Learning Models to Predict the Resistance of Breast Cancer Patients to Doxorubicin from Their microRNA Profiles. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2022; 9:e2201501. [PMID: 35785523 PMCID: PMC9403644 DOI: 10.1002/advs.202201501] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/15/2022] [Revised: 06/02/2022] [Indexed: 05/05/2023]
Abstract
Doxorubicin is a common treatment for breast cancer. However, not all patients respond to this drug, which sometimes causes life-threatening side effects. Accurately anticipating doxorubicin-resistant patients would therefore make it possible to spare them this risk while considering alternative treatments without delay. Stratifying patients based on molecular markers in their pretreatment tumors is a promising approach to advance toward this ambitious goal, but single-gene markers such as HER2 expression have not been shown to be sufficiently predictive. The recent availability of matched doxorubicin-response and diverse molecular profiles across breast cancer patients now permits analysis at a much larger scale. 16 machine learning algorithms and 8 molecular profiles are systematically evaluated on the same cohort of patients. Only 2 of the 128 resulting models are substantially predictive, showing that they can be easily missed by a standard-scale analysis. The best model is a classification and regression tree (CART) nonlinearly combining 4 selected miRNA isoforms to predict doxorubicin response (median Matthews correlation coefficient (MCC) and area under the curve (AUC) of 0.56 and 0.80, respectively). By contrast, HER2 expression is significantly less predictive (median MCC and AUC of 0.14 and 0.57, respectively). As the predictive accuracy of this CART model increases with larger training sets, its update with future data should result in even better accuracy.
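For readers unfamiliar with the two metrics reported in this abstract, the following minimal sketch computes MCC from a binary confusion matrix and AUC as the fraction of correctly ranked positive/negative pairs. The function names, labels and scores are hypothetical toy values chosen for illustration; they are not data from the study, and the 0.5 decision threshold is likewise an assumption.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC from the binary confusion matrix; returns 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = responder, 0 = non-responder.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

mcc = matthews_corrcoef(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"MCC = {mcc:.3f}, AUC = {auc:.3f}")
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, while AUC is threshold-free and ranges from 0 to 1 with 0.5 corresponding to random ranking, which is why the two numbers in the abstract are not directly comparable in scale.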
Affiliation(s)
- Adeolu Z. Ogunleye
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Chayanit Piyawajanusorn
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Anthony Gonçalves
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Ghita Ghislat
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
- Pedro J. Ballester
  - Cancer Research Center of Marseille (CRCM), INSERM U1068, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Institut Paoli-Calmettes, Marseille F-13009, France
  - Cancer Research Center of Marseille (CRCM), Aix-Marseille Université, Marseille F-13284, France
  - Cancer Research Center of Marseille (CRCM), CNRS UMR7258, Marseille F-13009, France
  - Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
8. Out-of-distribution generalization from labelled and unlabelled gene expression data for drug response prediction. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00408-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]