1
Boon NJ, Oliveira RA, Körner PR, Kochavi A, Mertens S, Malka Y, Voogd R, van der Horst SEM, Huismans MA, Smabers LP, Draper JM, Wessels LFA, Haahr P, Roodhart JML, Schumacher TNM, Snippert HJ, Agami R, Brummelkamp TR. DNA damage induces p53-independent apoptosis through ribosome stalling. Science 2024; 384:785-792. [PMID: 38753784 DOI: 10.1126/science.adh7950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 04/11/2024] [Indexed: 05/18/2024]
Abstract
In response to excessive DNA damage, human cells can activate p53 to induce apoptosis. Cells lacking p53 can still undergo apoptosis upon DNA damage, yet the responsible pathways are unknown. We observed that p53-independent apoptosis in response to DNA damage coincided with translation inhibition, which was characterized by ribosome stalling on rare leucine-encoding UUA codons and globally curtailed translation initiation. A genetic screen identified the transfer RNAse SLFN11 and the kinase GCN2 as factors required for UUA stalling and global translation inhibition, respectively. Stalled ribosomes activated a ribotoxic stress signal conveyed by the ribosome sensor ZAKα to the apoptosis machinery. These results provide an explanation for the frequent inactivation of SLFN11 in chemotherapy-unresponsive tumors and highlight ribosome stalling as a signaling event affecting cell fate in response to DNA damage.
Affiliation(s)
- Nicolaas J Boon
  - Oncode Institute, Utrecht, Netherlands
  - Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, Netherlands
- Rafaela A Oliveira
  - Oncode Institute, Utrecht, Netherlands
  - Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, Netherlands
- Pierré-René Körner
  - Oncode Institute, Utrecht, Netherlands
  - Division of Oncogenomics, Netherlands Cancer Institute, Amsterdam, Netherlands
- Adva Kochavi
  - Oncode Institute, Utrecht, Netherlands
  - Division of Oncogenomics, Netherlands Cancer Institute, Amsterdam, Netherlands
- Sander Mertens
  - Oncode Institute, Utrecht, Netherlands
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, Netherlands
- Yuval Malka
  - Oncode Institute, Utrecht, Netherlands
  - Division of Oncogenomics, Netherlands Cancer Institute, Amsterdam, Netherlands
- Rhianne Voogd
  - Department of Molecular Oncology and Immunology, Netherlands Cancer Institute, Amsterdam, Netherlands
- Suzanne E M van der Horst
  - Oncode Institute, Utrecht, Netherlands
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, Netherlands
- Maarten A Huismans
  - Oncode Institute, Utrecht, Netherlands
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, Netherlands
- Lidwien P Smabers
  - Department of Medical Oncology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Jonne M Draper
  - Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, Netherlands
- Lodewyk F A Wessels
  - Oncode Institute, Utrecht, Netherlands
  - Division of Molecular Carcinogenesis, Netherlands Cancer Institute, Amsterdam, Netherlands
- Peter Haahr
  - Oncode Institute, Utrecht, Netherlands
  - Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, Netherlands
  - Center for Gene Expression, Department of Cellular and Molecular Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jeanine M L Roodhart
  - Department of Medical Oncology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Ton N M Schumacher
  - Oncode Institute, Utrecht, Netherlands
  - Department of Molecular Oncology and Immunology, Netherlands Cancer Institute, Amsterdam, Netherlands
- Hugo J Snippert
  - Oncode Institute, Utrecht, Netherlands
  - Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, Netherlands
- Reuven Agami
  - Oncode Institute, Utrecht, Netherlands
  - Division of Oncogenomics, Netherlands Cancer Institute, Amsterdam, Netherlands
- Thijn R Brummelkamp
  - Oncode Institute, Utrecht, Netherlands
  - Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, Netherlands
2
Kaushik AC, Zhao Z. Machine learning-driven exploration of drug therapies for triple-negative breast cancer treatment. Front Mol Biosci 2023; 10:1215204. [PMID: 37602329 PMCID: PMC10436744 DOI: 10.3389/fmolb.2023.1215204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 07/21/2023] [Indexed: 08/22/2023] Open
Abstract
Breast cancer is the second leading cause of cancer death in women among all cancer types. It is highly heterogeneous in nature, which means that the tumors have different morphologies and there is heterogeneity even among people who have the same type of tumor. Several staging and classifying systems have been developed due to the variability of different types of breast cancer. Due to high heterogeneity, personalized treatment has become a new strategy. Out of all breast cancer subtypes, triple-negative breast cancer (TNBC) comprises ∼10%-15%. TNBC refers to the subtype of breast cancer where cells do not express estrogen receptors, progesterone receptors, or human epidermal growth factor receptors (ERs, PRs, and HERs). Tumors in TNBC have a diverse set of genetic markers and prognostic indicators. We scanned the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) databases for potential drugs using human breast cancer cell lines and drug sensitivity data. Three different machine-learning approaches were used to evaluate the prediction of six effective drugs against the TNBC cell lines. The top biomarkers were then shortlisted on the basis of their involvement in breast cancer and further subjected to testing for radiation resistance using data from the Cleveland database. It was observed that Panobinostat, PLX4720, Lapatinib, Nilotinib, Selumetinib, and Tanespimycin were six effective drugs against the TNBC cell lines. We could identify potential derivatives that may be used against approved drugs. Only one biomarker (SETD7) was sensitive to all six drugs on the shortlist, while two others (SRARP and YIPF5) were sensitive to both radiation and drugs. Furthermore, we did not find any radioresistance markers for the TNBC. The proposed biomarkers and drug sensitivity analysis will provide potential candidates for future clinical investigation.
Affiliation(s)
- Aman Chandra Kaushik
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, United States
  - MD Anderson Cancer Center, UTHealth Graduate School of Biomedical Sciences, Houston, TX, United States
3
Mehmood A, Nawab S, Jin Y, Hassan H, Kaushik AC, Wei DQ. Ranking Breast Cancer Drugs and Biomarkers Identification Using Machine Learning and Pharmacogenomics. ACS Pharmacol Transl Sci 2023; 6:399-409. [PMID: 36926455 PMCID: PMC10012252 DOI: 10.1021/acsptsci.2c00212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Indexed: 02/26/2023]
Abstract
Breast cancer is one of the major causes of death in women worldwide. It is a diverse illness with substantial intersubject heterogeneity, even among individuals with the same type of tumor, and customized therapy has become increasingly important in this sector. Because of the clinical and physical variability of different kinds of breast cancers, multiple staging and classification systems have been developed. As a result, these tumors exhibit a wide range of gene expression and prognostic indicators. To date, no comprehensive investigation of model training procedures on information from numerous cell line screenings has been conducted together with radiation data. We used human breast cancer cell lines and drug sensitivity information from the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) databases to scan for potential drugs using cell line data. The results were further validated through three machine learning approaches: Elastic Net, LASSO, and Ridge. Next, we selected top-ranked biomarkers based on their role in breast cancer and tested them further for their resistance to radiation using the data from the Cleveland database. We have identified six drugs, Palbociclib, Panobinostat, PD-0325901, PLX4720, Selumetinib, and Tanespimycin, that perform significantly well on breast cancer cell lines. Also, five biomarkers, TNFSF15, DCAF6, KDM6A, PHETA2, and IFNGR1, are sensitive to all six shortlisted drugs and show sensitivity to radiation. The proposed biomarkers and drug sensitivity analysis are helpful in translational cancer studies and provide valuable insights for clinical trial design.
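The three penalized regressions named in this abstract (Elastic Net, LASSO, and Ridge) differ only in the penalty added to the least-squares loss. The sketch below is illustrative and is not the authors' pipeline: the function names and the `alpha` / `l1_ratio` parameterisation are assumptions chosen for this example, and real libraries scale these terms differently. It also shows the soft-thresholding step that lets an L1 penalty set small coefficients exactly to zero, which is why LASSO-type models are often used to shortlist biomarkers.

```python
import math


def ridge_penalty(weights, alpha):
    """L2 penalty: alpha times the sum of squared coefficients."""
    return alpha * sum(w * w for w in weights)


def lasso_penalty(weights, alpha):
    """L1 penalty: alpha times the sum of absolute coefficients."""
    return alpha * sum(abs(w) for w in weights)


def elastic_net_penalty(weights, alpha, l1_ratio):
    """Weighted mix of both penalties (assumed parameterisation).

    l1_ratio = 1.0 reduces to the LASSO penalty, 0.0 to the Ridge penalty.
    """
    l1_part = l1_ratio * lasso_penalty(weights, alpha)
    l2_part = (1.0 - l1_ratio) * ridge_penalty(weights, alpha)
    return l1_part + l2_part


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    magnitude = max(abs(z) - t, 0.0)
    return math.copysign(magnitude, z)


if __name__ == "__main__":
    # Hypothetical coefficients for three gene-expression features.
    coefs = [0.5, -2.0, 0.0]
    print(ridge_penalty(coefs, alpha=1.0))
    print(lasso_penalty(coefs, alpha=1.0))
    print(elastic_net_penalty(coefs, alpha=1.0, l1_ratio=0.5))
    print([soft_threshold(c, 0.6) for c in coefs])
```

In this toy setting, a threshold of 0.6 removes the first feature while only shrinking the second, which mirrors how an L1 term yields a sparse set of candidate markers.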
Affiliation(s)
- Aamir Mehmood
  - Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, P.R. China
- Sadia Nawab
  - State Key Laboratory of Microbial Metabolism and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, P.R. China
- Yifan Jin
  - Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, P.R. China
- Hesham Hassan
  - Department of Pathology, College of Medicine, King Khalid University, Abha 61421, Saudi Arabia
  - Department of Pathology, Faculty of Medicine, Assiut University, Assiut 71515, Egypt
- Aman Chandra Kaushik
  - Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, P.R. China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
  - Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan 473006, P.R. China
  - Peng Cheng National Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, Guangdong 518055, P.R. China
4
Diaz-Uriarte R, Gómez de Lope E, Giugno R, Fröhlich H, Nazarov PV, Nepomuceno-Chamorro IA, Rauschenberger A, Glaab E. Ten quick tips for biomarker discovery and validation analyses using machine learning. PLoS Comput Biol 2022; 18:e1010357. [PMID: 35951526 PMCID: PMC9371329 DOI: 10.1371/journal.pcbi.1010357] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/19/2022] Open
Affiliation(s)
- Ramon Diaz-Uriarte
  - Department of Biochemistry, School of Medicine, Universidad Autónoma de Madrid, Instituto de Investigaciones Biomédicas ‘Alberto Sols’ (UAM-CSIC), Madrid, Spain
- Elisa Gómez de Lope
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
- Rosalba Giugno
  - Department of Computer Science, University of Verona, Verona, Italy
- Holger Fröhlich
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Centre for IT (b-it), Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Petr V. Nazarov
  - Department of Cancer Research, Luxembourg Institute of Health, Strassen, Luxembourg
- Armin Rauschenberger
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
- Enrico Glaab
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg
5
Functional regulations between genetic alteration-driven genes and drug target genes acting as prognostic biomarkers in breast cancer. Sci Rep 2022; 12:10641. [PMID: 35739271 PMCID: PMC9226112 DOI: 10.1038/s41598-022-13835-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/26/2022] [Accepted: 05/30/2022] [Indexed: 12/19/2022] Open
Abstract
Differences in genetic molecular features, including mutation, copy number alterations and DNA methylation, can explain interindividual variability in response to anti-cancer drugs in cancer patients. However, identifying genetic alteration-driven genes and characterizing their functional mechanisms in different cancer types are still major challenges for cancer studies. Here, we systematically identified functional regulations between genetic alteration-driven genes and drug target genes and their potential prognostic roles in breast cancer. We identified two mutation and copy number-driven gene pairs (PARP1-ACSL1 and PARP1-SRD5A3), three DNA methylation-driven gene pairs (PRLR-CDKN1C, PRLR-PODXL2 and PRLR-SRD5A3), six gene pairs between mutation-driven genes and drug target genes (SLC19A1-SLC47A2, SLC19A1-SRD5A3, AKR1C3-SLC19A1, ABCB1-SRD5A3, NR3C2-SRD5A3 and AKR1C3-SRD5A3), and four copy number-driven gene pairs (ADIPOR2-SRD5A3, CASP12-SRD5A3, SLC39A11-SRD5A3 and GALNT2-SRD5A3) that all served as prognostic biomarkers of breast cancer. In particular, PARP1 was found to be upregulated by simultaneous copy number amplification and gene mutation. Copy number deletion and downregulated expression of ACSL1 and upregulation of SRD5A3 were both observed in breast cancers. Moreover, copy number deletion of ACSL1 was associated with increased resistance to PARP inhibitors. The PARP1-ACSL1 pair significantly correlated with poor overall survival in breast cancer owing to the suppression of the MAPK, mTOR and NF-kB signaling pathways, which induces apoptosis, autophagy and prevents inflammatory processes. Loss of SRD5A3 expression was also associated with increased sensitivity to PARP inhibitors. The PARP1-SRD5A3 pair significantly correlated with poor overall survival in breast cancer through regulating androgen receptors to induce cell proliferation.
These results demonstrate that genetic alteration-driven gene pairs might serve as potential biomarkers for the prognosis of breast cancer and facilitate the identification of combination therapeutic targets for breast cancers.
6
Liu M, Shen X, Pan W. Deep reinforcement learning for personalized treatment recommendation. Stat Med 2022; 41:4034-4056. [PMID: 35716038 PMCID: PMC9427729 DOI: 10.1002/sim.9491] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/14/2021] [Revised: 05/22/2022] [Accepted: 05/25/2022] [Indexed: 12/12/2022]
Abstract
In precision medicine, the ultimate goal is to recommend the most effective treatment to an individual patient based on patient-specific molecular and clinical profiles, possibly high-dimensional. To advance cancer treatment, large-scale screenings of cancer cell lines against chemical compounds have been performed to help better understand the relationship between genomic features and drug response; existing machine learning approaches use exclusively supervised learning, including penalized regression and recommender systems. However, it would be more efficient to apply reinforcement learning to sequentially learn as data accrue, including selecting the most promising therapy for a patient given individual molecular and clinical features and then collecting and learning from the corresponding data. In this article, we propose a novel personalized ranking system called Proximal Policy Optimization Ranking (PPORank), which ranks the drugs based on their predicted effects per cell line (or patient) in the framework of deep reinforcement learning (DRL). Modeled as a Markov decision process, the proposed method learns to recommend the most suitable drugs sequentially and continuously over time. As a proof-of-concept, we conduct experiments on two large-scale cancer cell line data sets in addition to simulated data. The results demonstrate that the proposed DRL-based PPORank outperforms the state-of-the-art competitors based on supervised learning. Taken together, we conclude that novel methods in the framework of DRL have great potential for precision medicine and should be further studied.
Affiliation(s)
- Mingyang Liu
  - School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
- Xiaotong Shen
  - School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
- Wei Pan
  - Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota, USA
7
Computational Screening of Anti-Cancer Drugs Identifies a New BRCA Independent Gene Expression Signature to Predict Breast Cancer Sensitivity to Cisplatin. Cancers (Basel) 2022; 14:cancers14102404. [PMID: 35626009 PMCID: PMC9139442 DOI: 10.3390/cancers14102404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Revised: 04/29/2022] [Accepted: 05/02/2022] [Indexed: 12/10/2022] Open
Abstract
Simple Summary: Using a collection of publicly available drug screening resources, we identified different partners of genes associated with either sensitivity or resistance to 90 anti-cancer therapies. When subsequently applying these signatures to multiple datasets, we found that these predictive models could predict a large range of drug responses in patient samples. In particular, we discovered a new gene signature to identify breast cancer tumors that are likely to respond to cisplatin in the absence of BRCA1 mutations. This work constitutes an important advance to accelerate the application of platinum-based therapies in patient groups that are not routinely treated with these drugs. In the future, this approach may help to guide the choice of drugs based on the molecular profile of the tumors.
Abstract: The development of therapies that target specific disease subtypes has dramatically improved outcomes for patients with breast cancer. However, survival gains have not been uniform across patients, even within a given molecular subtype. Large collections of publicly available drug screening data matched with transcriptomic measurements have facilitated the development of computational models that predict response to therapy. Here, we generated a series of predictive gene signatures to estimate the sensitivity of breast cancer samples to 90 drugs, comprising FDA-approved drugs or compounds in early development. To achieve this, we used a cell line-based drug screen with matched transcriptomic data to derive in silico models that we validated in large independent datasets obtained from cell lines and patient-derived xenograft (PDX) models. Robust computational signatures were obtained for 28 drugs and used to predict drug efficacy in a set of PDX models. We found that our signature for cisplatin can be used to identify tumors that are likely to respond to this drug, even in the absence of the BRCA-1 mutation routinely used to select patients for platinum-based therapies. This clinically relevant observation was confirmed in multiple PDXs. Our study foreshadows an effective delivery approach for precision medicine.
8
Integration of Omics and Phenotypic Data for Precision Medicine. Methods in Molecular Biology (Clifton, N.J.) 2022; 2486:19-35. [PMID: 35437716 DOI: 10.1007/978-1-0716-2265-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Over the past two decades, biomedical research has been moving toward a big-data-driven approach. The underlying causes of this transition include the ability to gather genetic or molecular profiles of humans faster, the increasing adoption of electronic health record (EHR) systems, and the growing interest in linking omics and phenotypic data for analysis. The integration of individuals' biological data (e.g., genomics, proteomics, metabolomics) and health-care data has created unprecedented opportunities for precision medicine, that is, a medical model that uses a patient's unique information, mainly genetic, to prevent, diagnose, or treat disease. This chapter reviews the research opportunities and applications of integrating omics and phenotypic data for precision medicine, such as understanding the relationship between genotype and phenotype, disease subtyping, and diagnosis or prediction of adverse outcomes. We review recent advanced methods, particularly the machine learning and deep learning-based approaches used for harnessing and harmonizing multiomics and phenotypic data to address these applications. Finally, we discuss the challenges and future directions.
9
Nguyen GTT, Vu HD, Le DH. Integrating Molecular Graph Data of Drugs and Multiple -Omic Data of Cell Lines for Drug Response Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:710-717. [PMID: 34260355 DOI: 10.1109/tcbb.2021.3096960] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/13/2023]
Abstract
Previous studies have either learned drug features from string or numeric representations, which are not natural forms of drugs, or used only genomic data of cell lines for the drug response prediction problem. Here, we propose a deep learning model, GraOmicDRP, that learns drug features from their graph representation and integrates multiple -omic data of cell lines. In GraOmicDRP, drugs are represented as graphs of bindings among atoms, while cell lines are depicted by not only genomic but also transcriptomic and epigenomic data. Graph convolutional and convolutional neural networks were used to learn the representations of drugs and cell lines, respectively. The two representations were then combined to represent each drug-cell line pair. Finally, the response value of each pair was predicted by a fully connected network. Experimental results indicate that transcriptomic data perform best among single -omic data types, while combinations of transcriptomic and other -omic data achieved the best performance overall in terms of both Root Mean Square Error and Pearson correlation coefficient. In addition, we show that GraOmicDRP outperforms some state-of-the-art methods, including ones integrating -omic data with drug information, such as GraphDRP, and ones using -omic data without drug information, such as DeepDR and MOLI.
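The two evaluation measures named in this abstract, Root Mean Square Error and the Pearson correlation coefficient, can be computed as in the minimal sketch below. It uses only the Python standard library; the function names and the toy response values are illustrative assumptions and are not taken from the GraOmicDRP code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted responses."""
    n = len(y_true)
    squared_error = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_error / n)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical observed and predicted drug-response values.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.5, 3.5, 4.5]
    print(rmse(observed, predicted))
    print(pearson(observed, predicted))
```

The toy data illustrate why both measures are usually reported together: a prediction that is offset by a constant keeps a perfect correlation while still carrying a nonzero error.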
10
Firoozbakht F, Yousefi B, Schwikowski B. An overview of machine learning methods for monotherapy drug response prediction. Brief Bioinform 2022; 23:bbab408. [PMID: 34619752 PMCID: PMC8769705 DOI: 10.1093/bib/bbab408] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 07/07/2021] [Revised: 08/25/2021] [Accepted: 09/06/2021] [Indexed: 12/11/2022] Open
Abstract
For an increasing number of preclinical samples, both detailed molecular profiles and their responses to various drugs are becoming available. Efforts to understand, and predict, drug responses in a data-driven manner have led to a proliferation of machine learning (ML) methods, with the longer term ambition of predicting clinical drug responses. Here, we provide a uniquely wide and deep systematic review of the rapidly evolving literature on monotherapy drug response prediction, with a systematic characterization and classification that comprises more than 70 ML methods in 13 subclasses, their input and output data types, modes of evaluation, and code and software availability. ML experts are provided with a fundamental understanding of the biological problem, and how ML methods are configured for it. Biologists and biomedical researchers are introduced to the basic principles of applicable ML methods, and their application to the problem of drug response prediction. We also provide systematic overviews of commonly used data sources used for training and evaluation methods.
Affiliation(s)
- Farzaneh Firoozbakht
  - Systems Biology Group, Department of Computational Biology, Institut Pasteur, Paris, France
- Behnam Yousefi
  - Systems Biology Group, Department of Computational Biology, Institut Pasteur, Paris, France
  - Sorbonne Université, École Doctorale Complexite du Vivant, Paris, France
- Benno Schwikowski
  - Systems Biology Group, Department of Computational Biology, Institut Pasteur, Paris, France
11
Nguyen T, Nguyen GTT, Nguyen T, Le DH. Graph Convolutional Networks for Drug Response Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:146-154. [PMID: 33606633 DOI: 10.1109/tcbb.2021.3060430] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Indexed: 06/12/2023]
Abstract
BACKGROUND: Drug response prediction is an important problem in computational personalized medicine. Many machine-learning-based methods, especially deep learning-based ones, have been proposed for this task. However, these methods often represent the drugs as strings, which are not a natural way to depict molecules. Also, interpretation (e.g., which mutations or copy number aberrations contribute to the drug response) has not been considered thoroughly. METHODS: In this study, we propose a novel method, GraphDRP, based on graph convolutional networks for the problem. In GraphDRP, drugs were represented as molecular graphs directly capturing the bonds among atoms, while cell lines were depicted as binary vectors of genomic aberrations. Representative features of drugs and cell lines were learned by convolution layers, then combined to represent each drug-cell line pair. Finally, the response value of each drug-cell line pair was predicted by a fully-connected neural network. Four variants of graph convolutional networks were used for learning the features of drugs. RESULTS: We found that GraphDRP outperforms tCNNS in all performance measures for all experiments. Also, through saliency maps of the resulting GraphDRP models, we discovered the contribution of the genomic aberrations to the responses. CONCLUSION: Representing drugs as graphs can improve the performance of drug response prediction. Availability of data and materials: Data and source code can be downloaded at https://github.com/hauldhut/GraphDRP.
12
Nicolle R, Gayet O, Bigonnet M, Roques J, Chanez B, Puleo F, Augustin J, Emile JF, Svrcek M, Arsenijevic T, Hammel P, Rebours V, Giovannini M, Grandval P, Dahan L, Moutardier V, Mitry E, Van Laethem JL, Bachet JB, Cros J, Iovanna J, Dusetti NJ. Relevance of biopsy-derived pancreatic organoids in the development of efficient transcriptomic signatures to predict adjuvant chemosensitivity in pancreatic cancer. Transl Oncol 2021; 16:101315. [PMID: 34906890 PMCID: PMC8681024 DOI: 10.1016/j.tranon.2021.101315] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/06/2021] [Revised: 12/06/2021] [Accepted: 12/08/2021] [Indexed: 02/07/2023] Open
Abstract
Most patients with pancreatic cancer are treated with chemotherapy. Treatment selection is not personalized to the tumor characteristics. Signatures predicting chemotherapy efficiency are essential for personalizing treatments. An RNA signature of gemcitabine sensitivity is developed by leveraging the dissimilarities between 2D and 3D in vitro models. Combining different in vitro models can help in defining clinically efficient transcriptomic signatures.
Pancreatic ductal adenocarcinoma (PDAC) patients are frequently treated with chemotherapy. Even if personalized therapy based on molecular analysis can be performed for some tumors, PDAC regimen selection is still mainly based on patients' performance status and expected efficacy. Therefore, the establishment of molecular predictors of chemotherapeutic efficacy could potentially improve prognosis by tailoring treatments. We have recently developed an RNA-based signature that predicts the efficacy of adjuvant gemcitabine using 38 PDAC primary cell cultures. While its efficiency was demonstrated, a significant association with the classical/basal-like PDAC spectrum was observed. We hypothesized that this flaw was due to the basal-like biased phenotype of the cellular models used in our strategy. To overcome this limitation, we generated a prospective cohort of 27 consecutive biopsy-derived pancreatic organoids (BDPO) and included them in the signature identification strategy. As BDPO do not have the same biased phenotype as primary cell cultures, we expected the two models to compensate for each other and cover a broader range of molecular phenotypes. We then obtained an improved signature predicting gemcitabine sensitivity that was validated in a cohort of 300 resected PDAC patients who had or had not received adjuvant gemcitabine. We demonstrated a significant association between the improved signature and the overall and disease-free survival in patients predicted as sensitive and treated with adjuvant gemcitabine. We therefore propose that including BDPO alongside primary cell cultures represents a powerful strategy that helps to overcome the limitations of primary cell cultures, producing unbiased RNA-based signatures predictive of adjuvant treatments in PDAC.
Affiliation(s)
- R Nicolle
  - Tumor Identity Card Program (CIT), French League Against Cancer, Paris, France
- O Gayet
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France
- M Bigonnet
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France
- J Roques
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France
- B Chanez
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France; Paoli-Calmettes Institut, Marseille, France
- F Puleo
  - Laboratory of Experimental Gastroenterology (Université Libre de Bruxelles), Brussels, Belgium; Department of Gastroenterology and Digestive Oncology, Delta Hospital, Center Hospitalier Interregional Edith Cavell, Brussels, Belgium
- J Augustin
  - Department of Pathology, AP-HP, Henri Mondor University Hospital, Créteil, France
- J F Emile
  - Ambroise Paré Hospital, Boulogne, AP-HP, Boulogne-Billancourt, France
- M Svrcek
  - Department of Pathology, Saint-Antoine Hospital, Sorbonne University, UPMC University, Paris, France
- T Arsenijevic
  - Laboratory of Experimental Gastroenterology (Université Libre de Bruxelles), Brussels, Belgium; Department of Gastroenterology and Digestive Oncology, Erasme Hospital, Brussels, Belgium
- P Hammel
  - Department of Digestive Oncology, Paul Brousse Hospital, APHP, Villejuif, France
- V Rebours
  - Université de Paris, Department of Pancreatology, Beaujon Hospital, APHP, Clichy, France
- M Giovannini
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France; Paoli-Calmettes Institut, Marseille, France
- P Grandval
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France; Université de Paris, Department of Pancreatology, Beaujon Hospital, APHP, Clichy, France
- L Dahan
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France; La Timone Hospital, Marseille, France
- V Moutardier
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France; Nord Hospital, Marseille, France
- E Mitry
  - Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France; Paoli-Calmettes Institut, Marseille, France
- J L Van Laethem
  - Laboratory of Experimental Gastroenterology (Université Libre de Bruxelles), Brussels, Belgium; Department of Gastroenterology and Digestive Oncology, Erasme Hospital, Brussels, Belgium
- J B Bachet
  - Department of Gastroenterology, Pitié-Salpetrière Hospital, Sorbonne University, UPMC University, Paris, France
| | - J Cros
- Université de Paris, Department of Pathology, Beaujon Hospital, APHP, Clichy, France
| | - J Iovanna
- Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France; Paoli-Calmettes Institut, Marseille, France
| | - N J Dusetti
- Cancer Research Center of Marseille, CRCM, Inserm, CNRS, Paoli-Calmettes Institut, Aix-Marseille University, Marseille, France.
| |
|
13
|
Mourragui SMC, Loog M, Vis DJ, Moore K, Manjon AG, van de Wiel MA, Reinders MJT, Wessels LFA. Predicting patient response with models trained on cell lines and patient-derived xenografts by nonlinear transfer learning. Proc Natl Acad Sci U S A 2021; 118:e2106682118. [PMID: 34873056 PMCID: PMC8670522 DOI: 10.1073/pnas.2106682118] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Accepted: 10/18/2021] [Indexed: 12/13/2022]
Abstract
Preclinical models have been the workhorse of cancer research, producing massive amounts of drug response data. Unfortunately, translating response biomarkers derived from these datasets to human tumors has proven to be particularly challenging. To address this challenge, we developed TRANSACT, a computational framework that builds a consensus space to capture biological processes common to preclinical models and human tumors and exploits this space to construct drug response predictors that robustly transfer from preclinical models to human tumors. TRANSACT performs favorably compared to four competing approaches, including two deep learning approaches, on a set of 23 drug prediction challenges on The Cancer Genome Atlas and 226 metastatic tumors from the Hartwig Medical Foundation. We demonstrate that response predictions deliver a robust performance for a number of therapies of high clinical importance: platinum-based chemotherapies, gemcitabine, and paclitaxel. In contrast to other approaches, we demonstrate the interpretability of the TRANSACT predictors by correctly identifying known biomarkers of targeted therapies, and we propose potential mechanisms that mediate the resistance to two chemotherapeutic agents.
Affiliation(s)
- Soufiane M C Mourragui
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
| | - Marco Loog
- Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
- Department of Computer Science, University of Copenhagen, 2100 Copenhagen, Denmark
| | - Daniel J Vis
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Kat Moore
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Anna G Manjon
- Division of Cell Biology, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Mark A van de Wiel
- Epidemiology and Biostatistics, Amsterdam University Medical Center, 1105 AZ Amsterdam, The Netherlands
- Medical Research Council Biostatistics Unit, Cambridge University, Cambridge CB2 0SR, United Kingdom
| | - Marcel J T Reinders
- Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands
| | - Lodewyk F A Wessels
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
| |
|
14
|
Wei S, Tao J, Xu J, Chen X, Wang Z, Zhang N, Zuo L, Jia Z, Chen H, Sun H, Yan Y, Zhang M, Lv H, Kong F, Duan L, Ma Y, Liao M, Xu L, Feng R, Liu G, Project TEWAS, Jiang Y. Ten Years of EWAS. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2021; 8:e2100727. [PMID: 34382344 PMCID: PMC8529436 DOI: 10.1002/advs.202100727] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/22/2021] [Revised: 05/11/2021] [Indexed: 06/13/2023]
Abstract
Epigenome-wide association studies (EWAS) have been applied to analyze DNA methylation variation in complex diseases for a decade, and the epigenome has gradually become a hot research topic. DNA methylation microarrays and next-generation and third-generation sequencing technologies have provided a high-quality platform for EWAS. Here, the progress of EWAS research and its contributions to clinical applications are reviewed, with a focus on the achievements in four typical diseases. Finally, the challenges encountered by EWAS are presented, and bold predictions for its future development are made.
Affiliation(s)
- Siyu Wei
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
| | - Junxian Tao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
| | - Jing Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
| | - Xingyu Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Zhaoyang Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Nan Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Lijiao Zuo
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Zhe Jia
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Haiyan Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Hongmei Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Yubo Yan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Mingming Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Hongchao Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Fanwu Kong
- The EWAS Project, Harbin, China
- Department of Nephrology, The Second Affiliated Hospital, Harbin Medical University, Harbin 150001, China
| | - Lian Duan
- The EWAS Project, Harbin, China
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
| | - Ye Ma
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
| | - Mingzhi Liao
- The EWAS Project, Harbin, China
- College of Life Sciences, Northwest A&F University, Yangling, Shanxi 712100, China
| | - Liangde Xu
- The EWAS Project, Harbin, China
- School of Biomedical Engineering, Wenzhou Medical University, Wenzhou 325035, China
| | - Rennan Feng
- The EWAS Project, Harbin, China
- Department of Nutrition and Food Hygiene, Public Health College, Harbin Medical University, Harbin 150081, China
| | - Guiyou Liu
- The EWAS Project, Harbin, China
- Beijing Institute for Brain Disorders, Capital Medical University, Beijing 100069, China
| | | | - Yongshuai Jiang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
| |
|
15
|
Meybodi FY, Eslahchi C. Predicting Anti-Cancer Drug Response by Finding Optimal Subset of Drugs. Bioinformatics 2021; 37:4509-4516. [PMID: 34170297 DOI: 10.1093/bioinformatics/btab466] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/22/2021] [Revised: 05/26/2021] [Accepted: 06/22/2021] [Indexed: 11/14/2022]
Abstract
MOTIVATION One of the most difficult challenges in precision medicine is determining the best treatment strategy for each patient based on personal information. Because determining drug response in vitro is extremely expensive, time-consuming, and virtually impossible given the large number of cell lines and drugs, computational methods are needed. RESULTS MinDrug is a method for predicting anti-cancer drug response that tries to identify the best subset of drugs, namely those most similar to the other drugs. MinDrug predicts the anti-cancer drug response of a new cell line using information from the drugs in this subset and their connections to other drugs. MinDrug employs a heuristic star algorithm to identify an optimal subset of drugs and Elastic-Net regression to predict anti-cancer drug response in a new cell line. To test MinDrug, we use both statistical and biological methods to assess the selected drugs. MinDrug is also compared to four state-of-the-art approaches using various k-fold cross-validations on two large public datasets: GDSC and CCLE. MinDrug outperforms the other approaches in terms of precision, robustness, and speed. Furthermore, we compare the evaluation results of all the approaches on an external dataset whose statistical distribution is not exactly the same as that of the training data. The results show that MinDrug continues to outperform the other approaches. AVAILABILITY MinDrug's source code can be found at https://github.com/yassaee/MinDrug. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
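The Elastic-Net step mentioned in this abstract can be illustrated with a minimal sketch. It is not MinDrug's implementation: the synthetic data, the function names, and the penalty parameterization (a squared-error loss scaled by 1/(2n), with `alpha` and `l1_ratio` splitting the penalty between the L1 and L2 terms) are all assumptions chosen for illustration. The sketch predicts the response to one target drug from the responses to a small subset of reference drugs, using plain coordinate descent.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def elastic_net_fit(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0) / n_samples
    for _ in range(n_iter):
        for j in range(n_features):
            # Residual with feature j's current contribution removed.
            residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n_samples
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1.0 - l1_ratio)
            )
    return w

# Hypothetical data: 60 training cell lines, responses to 5 subset drugs.
rng = np.random.default_rng(0)
subset_response = rng.normal(size=(60, 5))
true_w = np.array([1.5, 0.0, -2.0, 0.0, 0.0])
target_response = subset_response @ true_w + 0.01 * rng.normal(size=60)

weights = elastic_net_fit(subset_response, target_response, alpha=0.01)

# Predict the target drug's response for a new cell line from its
# (hypothetical) measured responses to the subset drugs.
new_cell_line = rng.normal(size=5)
predicted = float(new_cell_line @ weights)
print(weights, predicted)
```

With a very large `alpha`, every coefficient is thresholded to zero, which is the behaviour expected from the L1 part of the penalty.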
Affiliation(s)
- Fatemeh Yassaee Meybodi
- Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
| | - Changiz Eslahchi
- Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran; School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
| |
|
16
|
van de Wiel MA, van Nee MM, Rauschenberger A. Fast Cross-validation for Multi-penalty High-dimensional Ridge Regression. J Comput Graph Stat 2021. [DOI: 10.1080/10618600.2021.1904962] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
Affiliation(s)
- Mark A. van de Wiel
- Department of Epidemiology and Data Science, Amsterdam University Medical Centers, Amsterdam, The Netherlands
| | - Mirrelijn M. van Nee
- Department of Epidemiology and Data Science, Amsterdam University Medical Centers, Amsterdam, The Netherlands
| | - Armin Rauschenberger
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
| |
|
17
|
Ye Z, Yang W, Yang Y, Ouyang D. Interpretable machine learning methods for in vitro pharmaceutical formulation development. Food Frontiers 2021. [DOI: 10.1002/fft2.78] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/21/2022]
Affiliation(s)
- Zhuyifan Ye
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
| | - Wenmian Yang
- State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau, China
| | - Yilong Yang
- School of Software, Beihang University, Beijing, China
| | - Defang Ouyang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
| |
|
18
|
Tozzo V, Azencott CA, Fiorini S, Fava E, Trucco A, Barla A. Where Do We Stand in Regularization for Life Science Studies? J Comput Biol 2021; 29:213-232. [PMID: 33926217 PMCID: PMC8968832 DOI: 10.1089/cmb.2019.0371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
Abstract
More and more biologists and bioinformaticians turn to machine learning to analyze large amounts of data. In this context, it is crucial to understand which is the most suitable data analysis pipeline for achieving reliable results. This process may be challenging, due to a variety of factors, the most crucial ones being the data type and the general goal of the analysis (e.g., explorative or predictive). Life science data sets require further consideration as they often contain measures with a low signal-to-noise ratio, high-dimensional observations, and relatively few samples. In this complex setting, regularization, which can be defined as the introduction of additional information to solve an ill-posed problem, is the tool of choice to obtain robust models. Different regularization practices may be used depending both on characteristics of the data and of the question asked, and different choices may lead to different results. In this article, we provide a comprehensive description of the impact and importance of regularization techniques in life science studies. In particular, we provide an intuition of what regularization is and of the different ways it can be implemented and exploited. We propose four general life sciences problems in which regularization is fundamental and should be exploited for robustness. For each of these large families of problems, we enumerate different techniques as well as examples and case studies. Lastly, we provide a unified view of how to approach each data type with various regularization techniques.
Affiliation(s)
- Veronica Tozzo
- Department of Informatics, Bioengineering, Robotics and System Engineering-DIBRIS, University of Genoa, Genoa, Italy
| | - Chloé-Agathe Azencott
- Centre for Computational Biology-CBIO, MINES ParisTech, PSL Research University, Paris, France; Institut Curie, PSL Research University, Paris, France; INSERM, U900, Paris, France
| | | | - Emanuele Fava
- Department of Electrical, Electronic, Telecommunications Engineering, and Naval Architecture (DITEN), University of Genoa, Genoa, Italy
| | - Andrea Trucco
- Department of Electrical, Electronic, Telecommunications Engineering, and Naval Architecture (DITEN), University of Genoa, Genoa, Italy
| | - Annalisa Barla
- Department of Informatics, Bioengineering, Robotics and System Engineering-DIBRIS, University of Genoa, Genoa, Italy
| |
|
19
|
Bengtsson A, Andersson R, Rahm J, Ganganna K, Andersson B, Ansari D. Organoid technology for personalized pancreatic cancer therapy. Cell Oncol (Dordr) 2021; 44:251-260. [PMID: 33492660 PMCID: PMC7985124 DOI: 10.1007/s13402-021-00585-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/06/2020] [Revised: 12/29/2020] [Accepted: 01/02/2021] [Indexed: 12/19/2022]
Abstract
BACKGROUND Pancreatic ductal adenocarcinoma has the lowest survival rate among all major cancers and is the third leading cause of cancer-related mortality. The stagnant survival statistics and dismal response rates to current therapeutics highlight the need for more efficient preclinical models. Patient-derived organoids (PDOs) offer new possibilities as powerful preclinical models able to account for interpatient variability. Organoid development can be divided into four different key phases: establishment, propagation, drug screening and response prediction. Establishment entails tailored tissue extraction and growth protocols, propagation requires consistent multiplication and passaging, while drug screening and response prediction will benefit from shorter and more precise assays, and clear decision-making tools. CONCLUSIONS This review attempts to outline the most important challenges that remain in exploiting organoid platforms for drug discovery and clinical applications. Some of these challenges may be overcome by novel methods that are under investigation, such as 3D bioprinting systems, microfluidic systems, optical metabolic imaging and liquid handling robotics. We also propose an optimized organoid workflow inspired by all technical solutions we have presented.
Affiliation(s)
- Axel Bengtsson
- Department of Surgery, Clinical Sciences Lund, Skåne University Hospital, Lund University, Skåne University Hospital, Lund, SE-221 85, Lund, Sweden
| | - Roland Andersson
- Department of Surgery, Clinical Sciences Lund, Skåne University Hospital, Lund University, Skåne University Hospital, Lund, SE-221 85, Lund, Sweden
| | - Jonas Rahm
- Department of Surgery, Clinical Sciences Lund, Skåne University Hospital, Lund University, Skåne University Hospital, Lund, SE-221 85, Lund, Sweden
| | - Karthik Ganganna
- Department of Surgery, Clinical Sciences Lund, Skåne University Hospital, Lund University, Skåne University Hospital, Lund, SE-221 85, Lund, Sweden
| | - Bodil Andersson
- Department of Surgery, Clinical Sciences Lund, Skåne University Hospital, Lund University, Skåne University Hospital, Lund, SE-221 85, Lund, Sweden
| | - Daniel Ansari
- Department of Surgery, Clinical Sciences Lund, Skåne University Hospital, Lund University, Skåne University Hospital, Lund, SE-221 85, Lund, Sweden.
| |
|
20
|
Chi C, Ye Y, Chen B, Huang H. Bipartite graph-based approach for clustering of cell lines by gene expression-drug response associations. Bioinformatics 2021; 37:2617-2626. [PMID: 33682877 PMCID: PMC8428606 DOI: 10.1093/bioinformatics/btab143] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/19/2020] [Revised: 02/16/2021] [Accepted: 03/01/2021] [Indexed: 01/29/2023]
Abstract
MOTIVATION In pharmacogenomic studies, the biological context of cell lines influences the predictive ability of drug-response models and the discovery of biomarkers. Thus, similar cell lines are often studied together based on prior knowledge of biological annotations. However, this selection approach is not scalable with the number of annotations, and the relationship between gene-drug association patterns and biological context may not be obvious. RESULTS We present a procedure to compare cell lines based on their gene-drug association patterns. Starting with a grouping of cell lines from biological annotation, we model gene-drug association patterns for each group as a bipartite graph between genes and drugs. This is accomplished by applying sparse canonical correlation analysis (SCCA) to extract the gene-drug associations, and using the canonical vectors to construct the edge weights. Then, we introduce a nuclear norm-based dissimilarity measure to compare the bipartite graphs. Accompanying our procedure is a permutation test to evaluate the significance of similarity of cell line groups in terms of gene-drug associations. In the pharmacogenomics datasets CTRP2, GDSC2, and CCLE, hierarchical clustering of carcinoma groups based on this dissimilarity measure uniquely reveals clustering patterns driven by carcinoma subtype rather than primary site. Next, we show that the top associated drugs or genes from SCCA can be used to characterize the clustering patterns of haematopoietic and lymphoid malignancies. Finally, we confirm by simulation that when drug responses are linearly-dependent on expression, our approach is the only one that can effectively infer the true hierarchy compared to existing approaches. AVAILABILITY Bipartite graph-based hierarchical clustering is implemented in R and can be obtained from CRAN: https://CRAN.R-project.org/package=hierBipartite. The source code is available at https://github.com/CalvinTChi/hierBipartite. 
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
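The comparison of bipartite graphs described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the hierBipartite implementation: the edge weights are assumed to be the absolute outer product of a gene canonical vector and a drug canonical vector, and the dissimilarity is assumed to be the nuclear norm of the difference of two weight matrices, normalized by the sum of their nuclear norms. The function names and toy vectors are hypothetical.

```python
import numpy as np

def edge_weights(gene_vec, drug_vec):
    """Bipartite edge-weight matrix (genes x drugs) from canonical vectors.

    The absolute value removes the sign ambiguity of canonical vectors
    (an assumption made for this sketch).
    """
    return np.abs(np.outer(gene_vec, drug_vec))

def nuclear_dissimilarity(A, B):
    """Normalized nuclear-norm distance between two weight matrices.

    By the triangle inequality the result lies in [0, 1].
    """
    numerator = np.linalg.norm(A - B, ord="nuc")
    denominator = np.linalg.norm(A, ord="nuc") + np.linalg.norm(B, ord="nuc")
    return float(numerator / denominator) if denominator > 0 else 0.0

# Two toy cell line groups whose gene-drug associations do not overlap.
group_a = edge_weights(np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.0]))
group_b = edge_weights(np.array([0.0, 1.0, 0.0]), np.array([0.0, 1.0]))

same = nuclear_dissimilarity(group_a, group_a)
different = nuclear_dissimilarity(group_a, group_b)
print(same, different)
```

A matrix of such pairwise dissimilarities could then be passed to any standard hierarchical clustering routine.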
Affiliation(s)
- Calvin Chi
- Center for Computational Biology, University of California, Berkeley, CA 94720, USA
| | - Yuting Ye
- Division of Biostatistics, University of California, Berkeley, CA 94720, USA
| | - Bin Chen
- Department of Pediatrics and Human Development, Michigan State University, Grand Rapids, MI 48912, USA; Department of Pharmacology and Toxicology, Michigan State University, Grand Rapids, MI 48824, USA
| | - Haiyan Huang
- Center for Computational Biology, University of California, Berkeley, CA 94720, USA; Department of Statistics, University of California, Berkeley, CA 94720, USA
| |
|
21
|
Personalized logical models to investigate cancer response to BRAF treatments in melanomas and colorectal cancers. PLoS Comput Biol 2021; 17:e1007900. [PMID: 33507915 PMCID: PMC7872233 DOI: 10.1371/journal.pcbi.1007900] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/16/2020] [Revised: 02/09/2021] [Accepted: 12/21/2020] [Indexed: 11/19/2022]
Abstract
The study of response to cancer treatments has benefited greatly from the contribution of different omics data but their interpretation is sometimes difficult. Some mathematical models based on prior biological knowledge of signaling pathways facilitate this interpretation but often require fitting of their parameters using perturbation data. We propose a more qualitative mechanistic approach, based on logical formalism and on the sole mapping and interpretation of omics data, and able to recover differences in sensitivity to gene inhibition without model training. This approach is showcased by the study of BRAF inhibition in patients with melanomas and colorectal cancers who experience significant differences in sensitivity despite similar omics profiles. We first gather information from literature and build a logical model summarizing the regulatory network of the mitogen-activated protein kinase (MAPK) pathway surrounding BRAF, with factors involved in the BRAF inhibition resistance mechanisms. The relevance of this model is verified by automatically assessing that it qualitatively reproduces response or resistance behaviors identified in the literature. Data from over 100 melanoma and colorectal cancer cell lines are then used to validate the model’s ability to explain differences in sensitivity. This generic model is transformed into personalized cell line-specific logical models by integrating the omics information of the cell lines as constraints of the model. The use of mutations alone allows personalized models to correlate significantly with experimental sensitivities to BRAF inhibition, both from drug and CRISPR targeting, and even better with the joint use of mutations and RNA, supporting multi-omics mechanistic models. A comparison of these untrained models with learning approaches highlights similarities in interpretation and complementarity depending on the size of the datasets. 
This parsimonious pipeline, which can easily be extended to other biological questions, makes it possible to explore the mechanistic causes of the response to treatment, on an individualized basis. We constructed a logical model to study, from a dynamical perspective, the differences between melanomas and colorectal cancers that share the same BRAF mutations but exhibit different sensitivities to anti-BRAF treatments. The model was built from the literature and completed from existing pathway databases. The model encompasses the key proteins of the MAPK pathway and was made specific to each cancer cell line (100 melanoma and colorectal cell lines from public database) using available omics data, including mutations and RNAseq data. It can simulate the effect of drugs and show high correlation with experimental results. Moreover, the structure of the network confirms both the importance of the reactivation of the MAPK pathway through CRAF and the involvement of PI3K/AKT pathway in the mechanisms of resistance to BRAF inhibition. The study shows that, because of the low number of samples, the mechanistic approach that we propose provides different insights than powerful standard machine learning methodologies would, showing the complementarity between the two approaches. An important aspect to mention is that the mechanistic approach presented here does not rely on training datasets but directly interprets and maps data on the model to simulate drug responses.
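The idea of personalizing a generic logical model with omics constraints, and simulating drug response without any training, can be sketched with a toy Boolean network. The rules below are a deliberately simplified, hypothetical stand-in for the published MAPK model: the node set, the update rules, the `egfr_expressed` input, and the two example profiles are assumptions for illustration only. The sketch only reproduces the qualitative point that signaling through CRAF can keep ERK active when BRAF is inhibited.

```python
def simulate(profile, braf_inhibitor, max_steps=20):
    """Synchronously update a toy MAPK Boolean network until it is stable.

    profile holds cell-line-specific constraints (hypothetical keys):
    egfr_expressed, ras_mutated, braf_mutated.
    """
    state = {node: False for node in ("EGFR", "RAS", "BRAF", "CRAF", "MEK", "ERK")}
    for _ in range(max_steps):
        new_state = {
            # Omics data enter as fixed inputs or forced activations.
            "EGFR": profile["egfr_expressed"],
            "RAS": state["EGFR"] or profile["ras_mutated"],
            "BRAF": (state["RAS"] or profile["braf_mutated"]) and not braf_inhibitor,
            "CRAF": state["RAS"],
            "MEK": state["BRAF"] or state["CRAF"],
            "ERK": state["MEK"],
        }
        if new_state == state:
            break  # fixed point reached
        state = new_state
    return state

# Two illustrative profiles sharing the same BRAF mutation.
melanoma_like = {"egfr_expressed": False, "ras_mutated": False, "braf_mutated": True}
colorectal_like = {"egfr_expressed": True, "ras_mutated": False, "braf_mutated": True}

for name, profile in (("melanoma-like", melanoma_like), ("colorectal-like", colorectal_like)):
    untreated = simulate(profile, braf_inhibitor=False)["ERK"]
    treated = simulate(profile, braf_inhibitor=True)["ERK"]
    print(name, "ERK untreated:", untreated, "ERK treated:", treated)
```

Because this toy network has no feedback loops, the synchronous update always reaches a fixed point. A model with feedback would need a different readout, such as stochastic simulation.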
|
22
|
Yao H, Liang Q, Qian X, Wang J, Sham PC, Li MJ. Methods and resources to access mutation-dependent effects on cancer drug treatment. Brief Bioinform 2020; 21:1886-1903. [PMID: 31750520 DOI: 10.1093/bib/bbz109] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/18/2019] [Revised: 07/31/2019] [Accepted: 08/01/2019] [Indexed: 12/13/2022]
Abstract
In clinical cancer treatment, genomic alterations often affect the response of patients to anticancer drugs. Studies have shown that molecular features of tumors can be biomarkers predictive of sensitivity or resistance to anticancer agents, but the identification of actionable mutations is often constrained by an incomplete understanding of cancer genomes. Recent progress in next-generation sequencing technology has greatly facilitated the extensive molecular characterization of tumors and promoted precision medicine in cancer. A growing number of clinical studies, cancer cell line studies, CRISPR screening studies, and patient-derived model studies have been performed to identify potential actionable mutations predictive of drug response, providing rich resources of molecularly and pharmacologically profiled cancer samples at different levels. This abundance of data also enables the development of various computational models and algorithms that address drug sensitivity prediction, biomarker identification, and in silico drug prioritization by integrating multiomics data. Here, we review the recent development of methods and resources that identify mutation-dependent effects for cancer treatment in clinical studies, functional genomics studies, and computational studies, and discuss the remaining gaps and future directions in this area.
Affiliation(s)
- Hongcheng Yao
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Qian Liang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xinyi Qian
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Junwen Wang
- Department of Health Sciences Research & Center for Individualized Medicine, Mayo Clinic, Scottsdale, USA
| | - Pak Chung Sham
- Center for Genomic Sciences, The University of Hong Kong, Hong Kong SAR, China; Departments of Psychiatry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Mulin Jun Li
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China; Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| |
|
23
|
Kusch N, Schuppert A. Two-step multi-omics modelling of drug sensitivity in cancer cell lines to identify driving mechanisms. PLoS One 2020; 15:e0238961. [PMID: 33226984 PMCID: PMC7682852 DOI: 10.1371/journal.pone.0238961] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/25/2020] [Accepted: 10/30/2020] [Indexed: 11/18/2022]
Abstract
Drug sensitivity prediction models for human cancer cell lines constitute important tools in identifying potential computational biomarkers for responsiveness in a pre-clinical setting. Integrating information derived from a range of heterogeneous data is crucial, but remains non-trivial, as differences in data structures may hinder fitting algorithms from assigning adequate weights to complementary information that is contained in distinct omics data. In order to counteract this effect that tends to lead to just one data type dominating supposedly multi-omics models, we developed a novel tool that enables users to train single-omics models separately in a first step and to integrate them into a multi-omics model in a second step. Extensive ablation studies are performed in order to facilitate an in-depth evaluation of the respective contributions of singular data types and of combinations thereof, effectively identifying redundancies and interdependencies between them. Moreover, the integration of the single-omics models is realized by a range of distinct classification algorithms, thus allowing for a performance comparison. Sets of molecular events and tissue types found to be related to significant shifts in drug sensitivity are returned to facilitate a comprehensive and straightforward analysis of potential computational biomarkers for drug responsiveness. Our two-step approach yields sets of actual multi-omics pan-cancer classification models that are highly predictive for a majority of drugs in the GDSC data base. In the context of targeted drugs with particular modes of action, its predictive performances compare favourably to those of classification models that incorporate multi-omics data in a simple one-step approach. 
Additionally, case studies demonstrate that it succeeds both in correctly identifying known key biomarkers for sensitivity towards specific drug compounds as well as in providing sets of potential candidates for additional computational biomarkers.
Affiliation(s)
- Nina Kusch
- Joint Research Center for Computational Biomedicine, RWTH Aachen University, Aachen, Germany
- Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany
- Uniklinik Aachen, Aachen, Germany
| | - Andreas Schuppert
- Joint Research Center for Computational Biomedicine, RWTH Aachen University, Aachen, Germany
- Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany
- Uniklinik Aachen, Aachen, Germany
|
24
|
Münch MM, van de Wiel MA, Richardson S, Leday GGR. Drug sensitivity prediction with normal inverse Gaussian shrinkage informed by external data. Biom J 2020; 63:289-304. [PMID: 33155717 PMCID: PMC7891636 DOI: 10.1002/bimj.201900371] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/30/2019] [Revised: 04/30/2020] [Accepted: 06/03/2020] [Indexed: 11/09/2022]
Abstract
In precision medicine, a common problem is drug sensitivity prediction from cancer tissue cell lines. These types of problems entail modelling multivariate drug responses on high-dimensional molecular feature sets in typically >1000 cell lines. The dimensions of the problem require specialised models and estimation methods. In addition, external information on both the drugs and the features is often available. We propose to model the drug responses through a linear regression with shrinkage enforced through a normal inverse Gaussian prior. We let the prior depend on the external information, and estimate the model and external information dependence in an empirical-variational Bayes framework. We demonstrate the usefulness of this model in both a simulated setting and in the publicly available Genomics of Drug Sensitivity in Cancer data.
Affiliation(s)
- Magnus M Münch
- Department of Epidemiology & Biostatistics, Amsterdam UMC, VU University, Amsterdam, The Netherlands
- Mathematical Institute, Leiden University, Leiden, The Netherlands
- MRC Biostatistics Unit, University of Cambridge, Cambridge Institute of Public Health, Forvie Site, Robinson Way, Cambridge Biomedical Campus, Cambridge, United Kingdom
| | - Mark A van de Wiel
- Department of Epidemiology & Biostatistics, Amsterdam UMC, VU University, Amsterdam, The Netherlands
- MRC Biostatistics Unit, University of Cambridge, Cambridge Institute of Public Health, Forvie Site, Robinson Way, Cambridge Biomedical Campus, Cambridge, United Kingdom
| | - Sylvia Richardson
- MRC Biostatistics Unit, University of Cambridge, Cambridge Institute of Public Health, Forvie Site, Robinson Way, Cambridge Biomedical Campus, Cambridge, United Kingdom
| | - Gwenaël G R Leday
- MRC Biostatistics Unit, University of Cambridge, Cambridge Institute of Public Health, Forvie Site, Robinson Way, Cambridge Biomedical Campus, Cambridge, United Kingdom
|
25
|
Kong J, Lee H, Kim D, Han SK, Ha D, Shin K, Kim S. Network-based machine learning in colorectal and bladder organoid models predicts anti-cancer drug efficacy in patients. Nat Commun 2020; 11:5485. [PMID: 33127883 PMCID: PMC7599252 DOI: 10.1038/s41467-020-19313-8] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 05/25/2020] [Accepted: 10/07/2020] [Indexed: 12/13/2022]
Abstract
Cancer patient classification using predictive biomarkers for anti-cancer drug responses is essential for improving therapeutic outcomes. However, current machine-learning-based predictions of drug response often fail to identify robust translational biomarkers from preclinical models. Here, we present a machine-learning framework to identify robust drug biomarkers by taking advantage of network-based analyses using pharmacogenomic data derived from three-dimensional organoid culture models. The biomarkers identified by our approach accurately predict the drug responses of 114 colorectal cancer patients treated with 5-fluorouracil and 77 bladder cancer patients treated with cisplatin. We further confirm our biomarkers using external transcriptomic datasets of drug-sensitive and -resistant isogenic cancer cell lines. Finally, concordance analysis between the transcriptomic biomarkers and independent somatic mutation-based biomarkers further validates our method. This work presents a method to predict cancer patient drug responses using pharmacogenomic data derived from organoid models by combining the application of gene modules and network-based approaches.
Affiliation(s)
- JungHo Kong
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790-784, Korea
| | - Heetak Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790-784, Korea
| | - Donghyo Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790-784, Korea
| | - Seong Kyu Han
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790-784, Korea
| | - Doyeon Ha
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790-784, Korea
| | - Kunyoo Shin
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790-784, Korea.
- Institute of Convergence Science, Yonsei University, Seoul, 120-749, Korea.
| | - Sanguk Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790-784, Korea.
- Institute of Convergence Science, Yonsei University, Seoul, 120-749, Korea.
|
26
|
Ahmadi Moughari F, Eslahchi C. ADRML: anticancer drug response prediction using manifold learning. Sci Rep 2020; 10:14245. [PMID: 32859983 PMCID: PMC7456328 DOI: 10.1038/s41598-020-71257-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/12/2020] [Accepted: 08/13/2020] [Indexed: 12/05/2022]
Abstract
One of the prominent challenges in precision medicine is to select the most appropriate treatment strategy for each patient based on personalized information. The availability of massive data about drugs and cell lines facilitates the possibility of proposing efficient computational models for predicting anticancer drug response. In this study, we propose ADRML, a model for Anticancer Drug Response Prediction using Manifold Learning to systematically integrate the cell line information with the drug information to make accurate predictions about drug therapeutics. The proposed model maps the drug response matrix into the lower-rank spaces that lead to obtaining new perspectives about cell lines and drugs. The drug response for a new cell line-drug pair is computed using the low-rank features. The evaluation of ADRML performance on various types of cell lines and drug information, in addition to the comparisons with previously proposed methods, shows that ADRML provides accurate and robust predictions. Further investigations about the association between drug response and pathway activity scores reveal that the predicted drug responses can shed light on the underlying drug mechanism. Also, the case studies suggest that the predictions of ADRML about novel cell line-drug pairs are validated by reliable pieces of evidence from the literature. Consequently, the evaluations verify that ADRML can be used in accurately predicting and imputing the anticancer drug response.
Affiliation(s)
- Fatemeh Ahmadi Moughari
- Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
| | - Changiz Eslahchi
- Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
- School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
|
27
|
Patel SK, George B, Rai V. Artificial Intelligence to Decode Cancer Mechanism: Beyond Patient Stratification for Precision Oncology. Front Pharmacol 2020; 11:1177. [PMID: 32903628 PMCID: PMC7438594 DOI: 10.3389/fphar.2020.01177] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/25/2020] [Accepted: 07/20/2020] [Indexed: 12/13/2022]
Abstract
The multitude of multi-omics data generated cost-effectively using advanced high-throughput technologies has created a challenging domain for research in Artificial Intelligence (AI). Data curation poses a significant challenge as different parameters, instruments, and sample preparation approaches are employed for generating these big data sets. AI could reduce the fuzziness and randomness in data handling and build a platform for the data ecosystem, and thus serve as the primary choice for data mining and big data analysis to make informed decisions. However, applying AI remains intricate for researchers/clinicians lacking specific training in computational tools and informatics. Cancer is a major cause of death worldwide, accounting for an estimated 9.6 million deaths in 2018. Certain cancers, such as pancreatic and gastric cancers, are detected only after they have reached their advanced stages with frequent relapses. Cancer is one of the most complex diseases affecting a range of organs with diverse disease progression mechanisms and effectors ranging from gene-epigenetics to a wide array of metabolites. Hence a comprehensive study, including genomics, epi-genomics, transcriptomics, proteomics, and metabolomics, along with medical/mass-spectrometry imaging, patient clinical history, treatments provided, genetics, and disease endemicity, is essential. The Cancer Moonshot℠ Research Initiatives by the NIH National Cancer Institute aim to collect as much information as possible from different regions of the world and build a cancer data repository. AI could play an immense role in (a) analysis of complex and heterogeneous data sets (multi-omics and/or inter-omics), (b) data integration to provide a holistic disease molecular mechanism, (c) identification of diagnostic and prognostic markers, and (d) monitoring of patients' responses to drugs/treatments and recovery.
AI enables precision disease management well beyond the prevalent disease stratification patterns, such as differential expression and supervised classification. This review highlights critical advances and challenges in omics data analysis, dealing with data variability from lab-to-lab, and data integration. We also describe methods used in data mining and AI methods to obtain robust results for precision medicine from "big" data. In the future, AI could be expanded to achieve ground-breaking progress in disease management.
Affiliation(s)
- Sandip Kumar Patel
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, India
- Buck Institute for Research on Aging, Novato, CA, United States
| | - Bhawana George
- Department of Hematopathology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
| | - Vineeta Rai
- Department of Entomology & Plant Pathology, North Carolina State University, Raleigh, NC, United States
|
28
|
Turnhoff LK, Hadizadeh Esfahani A, Montazeri M, Kusch N, Schuppert A. FORESEE: a tool for the systematic comparison of translational drug response modeling pipelines. Bioinformatics 2020; 35:3846-3848. [PMID: 30821320 PMCID: PMC6761955 DOI: 10.1093/bioinformatics/btz145] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/11/2018] [Revised: 02/11/2019] [Accepted: 02/25/2019] [Indexed: 12/27/2022]
Abstract
Summary: Translational models that utilize omics data generated in in vitro studies to predict the drug efficacy of anti-cancer compounds in patients are highly distinct, which complicates the benchmarking process for new computational approaches. In reaction to this, we introduce the uniFied translatiOnal dRug rESponsE prEdiction platform FORESEE, an open-source R-package. FORESEE not only provides a uniform data format for public cell line and patient datasets, but also establishes a standardized environment for drug response prediction pipelines, incorporating various state-of-the-art pre-processing methods, model training algorithms and validation techniques. The modular implementation of individual elements of the pipeline facilitates a straightforward development of combinatorial models, which can be used to re-evaluate and improve already existing pipelines as well as to develop new ones. Availability and implementation: FORESEE is licensed under GNU General Public License v3.0 and available at https://github.com/JRC-COMBINE/FORESEE and https://doi.org/10.17605/OSF.IO/RF6QK, and provides vignettes for documentation and application both online and in the Supplementary Files 2 and 3. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lisa-Katrin Turnhoff
- Joint Research Center for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Aachen, Germany
- Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany
| | - Ali Hadizadeh Esfahani
- Joint Research Center for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Aachen, Germany
- Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany
| | - Maryam Montazeri
- Joint Research Center for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Aachen, Germany
- Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany
| | - Nina Kusch
- Joint Research Center for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Aachen, Germany
- Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany
| | - Andreas Schuppert
- Joint Research Center for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Aachen, Germany
- Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany
|
29
|
Kurilov R, Haibe-Kains B, Brors B. Assessment of modelling strategies for drug response prediction in cell lines and xenografts. Sci Rep 2020; 10:2849. [PMID: 32071383 PMCID: PMC7028927 DOI: 10.1038/s41598-020-59656-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 08/01/2018] [Accepted: 01/23/2020] [Indexed: 12/20/2022]
Abstract
Data from several large high-throughput drug response screens have become available to the scientific community recently. Although many efforts have been made to use this information to predict drug sensitivity, our ability to accurately predict drug response based on genetic data remains limited. In order to systematically examine how different aspects of modelling affect the resulting prediction accuracy, we built a range of models for seven drugs (erlotinib, paclitaxel, lapatinib, PLX4720, sorafenib, nutlin-3 and nilotinib) using data from the largest available cell line and xenograft drug sensitivity screens. We found that the drug response metric, the choice of the molecular data type and the number of training samples have a substantial impact on prediction accuracy. We also compared the tasks of drug response prediction with tissue type prediction and found that, unlike for drug response, tissue type can be predicted with high accuracy. Furthermore, we assessed our ability to predict drug response in four xenograft cohorts (treated either with erlotinib, gemcitabine or paclitaxel) using models trained on cell line data. We could predict response in an erlotinib-treated cohort with a moderate accuracy (correlation ≈ 0.5), but were unable to correctly predict responses in cohorts treated with gemcitabine or paclitaxel.
Affiliation(s)
- Roman Kurilov
- Division of Applied Bioinformatics, German Cancer Research Center, Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, Toronto, Ontario, M5G 1L7, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, M5G 1L7, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, M5T 3A1, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, M5G 1L7, Canada
| | - Benedikt Brors
- Division of Applied Bioinformatics, German Cancer Research Center, Heidelberg, Germany
- National Center for Tumor Diseases (NCT), Heidelberg, Germany
- German Cancer Consortium (DKTK), Core Center Heidelberg, Heidelberg, Germany
|
30
|
Corsello SM, Nagari RT, Spangler RD, Rossen J, Kocak M, Bryan JG, Humeidi R, Peck D, Wu X, Tang AA, Wang VM, Bender SA, Lemire E, Narayan R, Montgomery P, Ben-David U, Garvie CW, Chen Y, Rees MG, Lyons NJ, McFarland JM, Wong BT, Wang L, Dumont N, O'Hearn PJ, Stefan E, Doench JG, Harrington CN, Greulich H, Meyerson M, Vazquez F, Subramanian A, Roth JA, Bittker JA, Boehm JS, Mader CC, Tsherniak A, Golub TR. Discovering the anti-cancer potential of non-oncology drugs by systematic viability profiling. Nat Cancer 2020; 1:235-248. [PMID: 32613204 PMCID: PMC7328899 DOI: 10.1038/s43018-019-0018-6] [Citation(s) in RCA: 370] [Impact Index Per Article: 92.5] [Received: 10/15/2019] [Accepted: 12/06/2019] [Indexed: 12/26/2022]
Abstract
Anti-cancer uses of non-oncology drugs have occasionally been found, but such discoveries have been serendipitous. We sought to create a public resource containing the growth inhibitory activity of 4,518 drugs tested across 578 human cancer cell lines. We used PRISM, a molecular barcoding method, to screen drugs against cell lines in pools. An unexpectedly large number of non-oncology drugs selectively inhibited subsets of cancer cell lines in a manner predictable from the cell lines' molecular features. Our findings include compounds that killed by inducing PDE3A-SLFN12 complex formation; vanadium-containing compounds whose killing depended on the sulfate transporter SLC26A2; the alcohol dependence drug disulfiram, which killed cells with low expression of metallothioneins; and the anti-inflammatory drug tepoxalin, which killed via the multi-drug resistance protein ABCB1. The PRISM drug repurposing resource (https://depmap.org/repurposing) is a starting point to develop new oncology therapeutics, and more rarely, for potential direct clinical translation.
Affiliation(s)
- Steven M Corsello
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | | | | | - Jordan Rossen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Mustafa Kocak
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jordan G Bryan
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Duke University, Durham, NC, USA
| | - Ranad Humeidi
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - David Peck
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Xiaoyun Wu
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Andrew A Tang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Vickie M Wang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Evan Lemire
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Rajiv Narayan
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Uri Ben-David
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
| | | | - Yejia Chen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | | | - Bang T Wong
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Li Wang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- 10x Genomics, Pleasanton, CA, USA
| | - Nancy Dumont
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Patrick J O'Hearn
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Relay Therapeutics, Cambridge, MA, USA
| | - Eric Stefan
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biogen, Cambridge, MA, USA
| | - John G Doench
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | - Matthew Meyerson
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | | | | | | | - Joshua A Bittker
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Vertex Pharmaceuticals, Boston, MA, USA
| | - Jesse S Boehm
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Christopher C Mader
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Flatiron Health, New York, NY, USA
| | | | - Todd R Golub
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Harvard Medical School, Boston, MA, USA.
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, USA.
- Howard Hughes Medical Institute, Chevy Chase, MD, USA.
|
31
|
Chen J, Zhang L. A survey and systematic assessment of computational methods for drug response prediction. Brief Bioinform 2020; 22:232-246. [PMID: 31927568 DOI: 10.1093/bib/bbz164] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 12/19/2022]
Abstract
Drug response prediction arises from both basic and clinical research of personalized therapy, as well as drug discovery for cancers. With gene expression profiles and other omics data being available for over 1000 cancer cell lines and tissues, different machine learning approaches have been applied to drug response prediction. These methods appear in a body of literature and have been evaluated on different datasets with only one or two accuracy metrics. We systematically assess 17 representative methods for drug response prediction, which have been developed in the past 5 years, on four large public datasets in nine metrics. This study provides insights and lessons for future research into drug response prediction.
|
32
|
Driehuis E, van Hoeck A, Moore K, Kolders S, Francies HE, Gulersonmez MC, Stigter ECA, Burgering B, Geurts V, Gracanin A, Bounova G, Morsink FH, Vries R, Boj S, van Es J, Offerhaus GJA, Kranenburg O, Garnett MJ, Wessels L, Cuppen E, Brosens LAA, Clevers H. Pancreatic cancer organoids recapitulate disease and allow personalized drug screening. Proc Natl Acad Sci U S A 2019; 116:26580-26590. [PMID: 31818951 PMCID: PMC6936689 DOI: 10.1073/pnas.1911273116] [Citation(s) in RCA: 249] [Impact Index Per Article: 49.8] [Indexed: 12/21/2022]
Abstract
We report the derivation of 30 patient-derived organoid lines (PDOs) from tumors arising in the pancreas and distal bile duct. PDOs recapitulate tumor histology and contain genetic alterations typical of pancreatic cancer. In vitro testing of a panel of 76 therapeutic agents revealed sensitivities currently not exploited in the clinic, and underscores the importance of personalized approaches for effective cancer treatment. The PRMT5 inhibitor EZP015556, shown to target MTAP (a gene commonly lost in pancreatic cancer)-negative tumors, was validated as such, but also appeared to constitute an effective therapy for a subset of MTAP-positive tumors. Taken together, the work presented here provides a platform to identify novel therapeutics to target pancreatic tumor cells using PDOs.
Affiliation(s)
- Else Driehuis
- Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, The Netherlands
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, 3584 CT Utrecht, The Netherlands
| | - Arne van Hoeck
- Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, The Netherlands
- Center for Molecular Medicine, University Medical Center Utrecht, 3584 CG Utrecht, The Netherlands
| | - Kat Moore
- Division of Molecular Carcinogenesis, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Sigrid Kolders
- Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, The Netherlands
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, 3584 CT Utrecht, The Netherlands
| | | | - M. Can Gulersonmez
- Department of Molecular Cancer Research, Center Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht 3584 CM, The Netherlands
| | - Edwin C. A. Stigter
- Department of Molecular Cancer Research, Center Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht 3584 CM, The Netherlands
| | - Boudewijn Burgering
- Department of Molecular Cancer Research, Center Molecular Medicine, Oncode Institute, University Medical Center Utrecht, Utrecht 3584 CM, The Netherlands
| | - Veerle Geurts
- Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, The Netherlands
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, 3584 CT Utrecht, The Netherlands
| | - Ana Gracanin
- Hubrecht Organoid Technology, Utrecht 3584 CM, The Netherlands
| | - Gergana Bounova
- Division of Molecular Carcinogenesis, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Folkert H. Morsink
- Department of Pathology, University Medical Center Utrecht, Utrecht 3584 CM, The Netherlands
| | - Robert Vries
- Hubrecht Organoid Technology, Utrecht 3584 CM, The Netherlands
| | - Sylvia Boj
- Hubrecht Organoid Technology, Utrecht 3584 CM, The Netherlands
| | - Johan van Es
- Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, The Netherlands
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, 3584 CT Utrecht, The Netherlands
| | - G. Johan A. Offerhaus
- Department of Pathology, University Medical Center Utrecht, Utrecht 3584 CM, The Netherlands
| | - Onno Kranenburg
- Utrecht Platform for Organoid Technology, Utrecht Medical Center Utrecht, Utrecht 3584 CM, The Netherlands
| | | | - Lodewyk Wessels
- Division of Molecular Carcinogenesis, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
| | - Edwin Cuppen
- Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, The Netherlands
- Center for Molecular Medicine, University Medical Center Utrecht, 3584 CG Utrecht, The Netherlands
- Hartwig Medical Foundation, 1098 XH Amsterdam, The Netherlands
- Center for Personalized Cancer Treatment, University Medical Center Utrecht, Utrecht 3584 CM, The Netherlands
| | - Lodewijk A. A. Brosens
- Department of Pathology, University Medical Center Utrecht, Utrecht 3584 CM, The Netherlands
| | - Hans Clevers
- Oncode Institute, University Medical Center Utrecht, 3584 CX Utrecht, The Netherlands
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, 3584 CT Utrecht, The Netherlands
- Princess Maxima Center, Utrecht 3584 CS, The Netherlands
|
33
|
Güvenç Paltun B, Mamitsuka H, Kaski S. Improving drug response prediction by integrating multiple data sources: matrix factorization, kernel and network-based approaches. Brief Bioinform 2019; 22:346-359. [PMID: 31838491 PMCID: PMC7820853 DOI: 10.1093/bib/bbz153] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 07/09/2019] [Revised: 11/01/2019] [Accepted: 11/04/2019] [Indexed: 12/17/2022]
Abstract
Predicting the response of cancer cell lines to specific drugs is one of the central problems in personalized medicine, where the cell lines show diverse characteristics. Researchers have developed a variety of computational methods to discover associations between drugs and cell lines, and improved drug sensitivity analyses by integrating heterogeneous biological data. However, choosing informative data sources and methods that can incorporate multiple sources efficiently is the challenging part of successful analysis in personalized medicine. The reason is that finding decisive factors of cancer and developing methods that can overcome the problems of integrating data, such as differences in data structures and data complexities, are difficult. In this review, we summarize recent advances in data integration-based machine learning for drug response prediction, by categorizing methods as matrix factorization-based, kernel-based and network-based methods. We also present a short description of relevant databases used as a benchmark in drug response prediction analyses, followed by providing a brief discussion of challenges faced in integrating and interpreting data from multiple sources. Finally, we address the advantages of combining multiple heterogeneous data sources on drug sensitivity analysis by showing an experimental comparison. Contact: betul.guvenc@aalto.fi
Affiliation(s)
- Betül Güvenç Paltun
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
| | - Hiroshi Mamitsuka
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
| | - Samuel Kaski
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
|
34
|
Abstract
This paper introduces the paired lasso: a generalisation of the lasso for paired covariate settings. Our aim is to predict a single response from two high-dimensional covariate sets. We assume a one-to-one correspondence between the covariate sets, with each covariate in one set forming a pair with a covariate in the other set. Paired covariates arise, for example, when two transformations of the same data are available. It is often unknown which of the two covariate sets leads to better predictions, or whether the two covariate sets complement each other. The paired lasso addresses this problem by weighting the covariates to improve the selection from the covariate sets and the covariate pairs. It thereby combines information from both covariate sets and accounts for the paired structure. We tested the paired lasso on more than 2000 classification problems with experimental genomics data, and found that for estimating sparse but predictive models, the paired lasso outperforms the standard and the adaptive lasso. The R package is available from CRAN.
|
35
|
Kim Y, Bismeijer T, Zwart W, Wessels LFA, Vis DJ. Genomic data integration by WON-PARAFAC identifies interpretable factors for predicting drug-sensitivity in vivo. Nat Commun 2019; 10:5034. [PMID: 31695042 PMCID: PMC6834616 DOI: 10.1038/s41467-019-13027-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/26/2018] [Accepted: 10/10/2019] [Indexed: 01/20/2023] Open
Abstract
Integrative analyses that summarize and link molecular data to treatment sensitivity are crucial to capture the biological complexity which is essential to further precision medicine. We introduce Weighted Orthogonal Nonnegative parallel factor analysis (WON-PARAFAC), a data integration method that identifies sparse and interpretable factors. WON-PARAFAC summarizes the GDSC1000 cell line compendium in 130 factors. We interpret the factors based on their association with recurrent molecular alterations, pathway enrichment, cancer type, and drug-response. Crucially, the cell line derived factors capture the majority of the relevant biological variation in Patient-Derived Xenograft (PDX) models, strongly suggesting our factors capture invariant and generalizable aspects of cancer biology. Furthermore, drug response in cell lines is better and more consistently translated to PDXs using factor-based predictors as compared to raw feature-based predictors. WON-PARAFAC efficiently summarizes and integrates multiway high-dimensional genomic data and enhances translatability of drug response prediction from cell lines to patient-derived xenografts.
Affiliation(s)
- Yongsoo Kim
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Department of Pathology, VU University Medical Center, Amsterdam, The Netherlands
| | - Tycho Bismeijer
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Wilbert Zwart
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
| | - Lodewyk F A Wessels
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Faculty of EEMCS, Delft University of Technology, Delft, The Netherlands
| | - Daniel J Vis
- Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands
|
36
|
Koromina M, Pandi MT, Patrinos GP. Rethinking Drug Repositioning and Development with Artificial Intelligence, Machine Learning, and Omics. OMICS: A Journal of Integrative Biology 2019; 23:539-548. [PMID: 31651216 DOI: 10.1089/omi.2019.0151] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Indexed: 01/01/2023]
Abstract
Pharmaceutical industry and the art and science of drug development are sorely in need of novel transformative technologies in the current age of digital health and artificial intelligence (AI). Often described as game-changing technologies, AI and machine learning algorithms have slowly but surely begun to revolutionize pharmaceutical industry and drug development over the past 5 years. In this expert review, we describe the most frequently used machine learning algorithms in drug development pipelines and the -omics databases well poised to support machine learning and drug discovery. Subsequently, we analyze the emerging new computational approaches to drug discovery and the in silico pipelines for drug repositioning and the synergies among -omics system sciences, AI and machine learning. As with system sciences, AI and machine learning embody a system scale and Big Data driven vision for drug discovery and development. We conclude with a future outlook on the ways in which machine learning approaches can be implemented to buttress and expedite drug discovery and precision medicine. As AI and machine learning are rapidly entering pharmaceutical industry and the art and science of drug development, we need to critically examine the attendant prospects and challenges to benefit patients and public health.
Affiliation(s)
- Maria Koromina
- Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
| | - Maria-Theodora Pandi
- Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
| | - George P Patrinos
- Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
- Department of Pathology, College of Medicine and Health Sciences, United Arab Emirates University, Al-Ain, Abu Dhabi
- Zayed Center of Health Sciences, United Arab Emirates University, Al-Ain, Abu Dhabi
|
37
|
Parca L, Pepe G, Pietrosanto M, Galvan G, Galli L, Palmeri A, Sciandrone M, Ferrè F, Ausiello G, Helmer-Citterich M. Modeling cancer drug response through drug-specific informative genes. Sci Rep 2019; 9:15222. [PMID: 31645597 PMCID: PMC6811538 DOI: 10.1038/s41598-019-50720-0] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 02/08/2019] [Accepted: 09/06/2019] [Indexed: 12/18/2022] Open
Abstract
Recent advances in pharmacogenomics have generated a wealth of data of different types whose analysis have helped in the identification of signatures of different cellular sensitivity/resistance responses to hundreds of chemical compounds. Among the different data types, gene expression has proven to be the more successful for the inference of drug response in cancer cell lines. Although effective, the whole transcriptome can introduce noise in the predictive models, since specific mechanisms are required for different drugs and these realistically involve only part of the proteins encoded in the genome. We analyzed the pharmacogenomics data of 961 cell lines tested with 265 anti-cancer drugs and developed different machine learning approaches for dissecting the genome systematically and predict drug responses using both drug-unspecific and drug-specific genes. These methodologies reach better response predictions for the vast majority of the screened drugs using tens to few hundreds genes specific to each drug instead of the whole genome, thus allowing a better understanding and interpretation of drug-specific response mechanisms which are not necessarily restricted to the drug known targets.
Affiliation(s)
- Luca Parca
- Department of Biology, University of Rome "Tor Vergata", Rome, Italy
| | - Gerardo Pepe
- Department of Biology, University of Rome "Tor Vergata", Rome, Italy
| | - Marco Pietrosanto
- Department of Biology, University of Rome "Tor Vergata", Rome, Italy
| | - Giulio Galvan
- Department of Information Engineering, University of Florence, Florence, Italy
| | - Leonardo Galli
- Department of Information Engineering, University of Florence, Florence, Italy
| | - Antonio Palmeri
- Department of Biology, University of Rome "Tor Vergata", Rome, Italy
- Celgene Institute for Translational Research Europe, Sevilla, Spain
| | - Marco Sciandrone
- Department of Information Engineering, University of Florence, Florence, Italy
| | - Fabrizio Ferrè
- Department of Pharmacy and Biotechnology, University of Bologna Alma Mater, Bologna, Italy
| | - Gabriele Ausiello
- Department of Biology, University of Rome "Tor Vergata", Rome, Italy
|
38
|
Aben N, Westerhuis JA, Song Y, Kiers HAL, Michaut M, Smilde AK, Wessels LFA. iTOP: inferring the topology of omics data. Bioinformatics 2019; 34:i988-i996. [PMID: 30423084 PMCID: PMC6129292 DOI: 10.1093/bioinformatics/bty636] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022] Open
Abstract
Motivation: In biology, we are often faced with multiple datasets recorded on the same set of objects, such as multi-omics and phenotypic data of the same tumors. These datasets are typically not independent from each other. For example, methylation may influence gene expression, which may, in turn, influence drug response. Such relationships can strongly affect analyses performed on the data, as we have previously shown for the identification of biomarkers of drug response. Therefore, it is important to be able to chart the relationships between datasets. Results: We present iTOP, a methodology to infer a topology of relationships between datasets. We base this methodology on the RV coefficient, a measure of matrix correlation, which can be used to determine how much information is shared between two datasets. We extended the RV coefficient for partial matrix correlations, which allows the use of graph reconstruction algorithms, such as the PC algorithm, to infer the topologies. In addition, since multi-omics data often contain binary data (e.g. mutations), we also extended the RV coefficient for binary data. Applying iTOP to pharmacogenomics data, we found that gene expression acts as a mediator between most other datasets and drug response: only proteomics clearly shares information with drug response that is not present in gene expression. Based on this result, we used TANDEM, a method for drug response prediction, to identify which variables predictive of drug response were distinct to either gene expression or proteomics. Availability and implementation: An implementation of our methodology is available in the R package iTOP on CRAN. Additionally, an R Markdown document with code to reproduce all figures is provided as Supplementary Material. Supplementary information: Supplementary data are available at Bioinformatics online.
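For readers unfamiliar with the RV coefficient mentioned in this abstract, the following is a minimal sketch of the standard, unmodified coefficient only. It does not reproduce the partial or binary extensions that iTOP introduces, and it is not taken from the iTOP package. The function names (`center_columns`, `gram`, `rv_coefficient`) are illustrative assumptions. The inputs are two data matrices whose rows are the same objects (for example, the same tumors) in the same order.

```python
from math import sqrt


def center_columns(m):
    """Subtract each column's mean so that every column has mean zero."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def gram(m):
    """Return the objects-by-objects configuration matrix M M^T."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]


def trace_of_product(a, b):
    """Return tr(A B) for two square matrices of the same size."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def rv_coefficient(x, y):
    """Standard RV coefficient between two datasets measured on the same rows.

    RV = tr(Sx Sy) / sqrt(tr(Sx Sx) * tr(Sy Sy)), where Sx = X X^T and
    Sy = Y Y^T are computed from column-centered X and Y.
    Assumes neither dataset is constant (otherwise the denominator is zero).
    """
    sx = gram(center_columns(x))
    sy = gram(center_columns(y))
    denom = sqrt(trace_of_product(sx, sx) * trace_of_product(sy, sy))
    return trace_of_product(sx, sy) / denom


if __name__ == "__main__":
    # Hypothetical toy data: three objects, two variables per dataset.
    expression = [[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]]
    rescaled = [[2.0 * v for v in row] for row in expression]
    print(rv_coefficient(expression, rescaled))
```

The coefficient lies between 0 and 1 and is unchanged by rescaling a dataset, which is why it can compare datasets with different numbers of variables as long as the objects match.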
Affiliation(s)
- Nanne Aben
- Division of Molecular Carcinogenesis, Oncode Institute, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Faculty of EEMCS, Delft University of Technology, Delft, The Netherlands
| | - Johan A Westerhuis
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
| | - Yipeng Song
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
| | - Henk A L Kiers
- Heymans Institute, University of Groningen, Groningen, The Netherlands
| | - Magali Michaut
- Division of Molecular Carcinogenesis, Oncode Institute, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Age K Smilde
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
| | - Lodewyk F A Wessels
- Division of Molecular Carcinogenesis, Oncode Institute, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Faculty of EEMCS, Delft University of Technology, Delft, The Netherlands
- Cancer Genomics Netherlands, Utrecht, The Netherlands
|
39
|
Abstract
Motivation: Large-scale screenings of cancer cell lines with detailed molecular profiles against libraries of pharmacological compounds are currently being performed in order to gain a better understanding of the genetic component of drug response and to enhance our ability to recommend therapies given a patient's molecular profile. These comprehensive screens differ from the clinical setting in which (i) medical records only contain the response of a patient to very few drugs, (ii) drugs are recommended by doctors based on their expert judgment and (iii) selecting the most promising therapy is often more important than accurately predicting the sensitivity to all potential drugs. Current regression models for drug sensitivity prediction fail to account for these three properties. Results: We present a machine learning approach, named Kernelized Rank Learning (KRL), that ranks drugs based on their predicted effect per cell line (patient), circumventing the difficult problem of precisely predicting the sensitivity to the given drug. Our approach outperforms several state-of-the-art predictors in drug recommendation, particularly if the training dataset is sparse, and generalizes to patient data. Our work phrases personalized drug recommendation as a new type of machine learning problem with translational potential to the clinic. Availability and implementation: The Python implementation of KRL and scripts for running our experiments are available at https://github.com/BorgwardtLab/Kernelized-Rank-Learning. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiao He
- Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Lukas Folkman
- Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Karsten Borgwardt
- Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
|
40
|
Singh A, Shannon CP, Gautier B, Rohart F, Vacher M, Tebbutt SJ, Lê Cao KA. DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays. Bioinformatics 2019; 35:3055-3062. [PMID: 30657866 PMCID: PMC6735831 DOI: 10.1093/bioinformatics/bty1054] [Citation(s) in RCA: 375] [Impact Index Per Article: 75.0] [Received: 05/29/2018] [Revised: 12/17/2018] [Accepted: 01/14/2019] [Indexed: 12/15/2022] Open
Abstract
Motivation: In the continuously expanding omics era, novel computational and statistical strategies are needed for data integration and identification of biomarkers and molecular signatures. We present Data Integration Analysis for Biomarker discovery using Latent cOmponents (DIABLO), a multi-omics integrative method that seeks for common information across different data types through the selection of a subset of molecular features, while discriminating between multiple phenotypic groups. Results: Using simulations and benchmark multi-omics studies, we show that DIABLO identifies features with superior biological relevance compared with existing unsupervised integrative methods, while achieving predictive performance comparable to state-of-the-art supervised approaches. DIABLO is versatile, allowing for modular-based analyses and cross-over study designs. In two case studies, DIABLO identified both known and novel multi-omics biomarkers consisting of mRNAs, miRNAs, CpGs, proteins and metabolites. Availability and implementation: DIABLO is implemented in the mixOmics R Bioconductor package with functions for parameters' choice and visualization to assist in the interpretation of the integrative analyses, along with tutorials on http://mixomics.org and in our Bioconductor vignette. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Amrit Singh
- Prevention of Organ Failure (PROOF) Centre of Excellence, University of British Columbia, Vancouver, BC, Canada
| | - Casey P Shannon
- Prevention of Organ Failure (PROOF) Centre of Excellence, University of British Columbia, Vancouver, BC, Canada
| | - Benoît Gautier
- The University of Queensland Diamantina Institute, Translational Research Institute, Woolloongabba, Queensland, Australia
| | - Florian Rohart
- Institute for Molecular Bioscience, The University of Queensland, St Lucia, Queensland, Australia
| | - Michaël Vacher
- Australian eHealth Research Centre, Commonwealth Scientific and Industrial Research Organisation, Brisbane, Queensland, Australia
| | - Scott J Tebbutt
- Prevention of Organ Failure (PROOF) Centre of Excellence, University of British Columbia, Vancouver, BC, Canada
| | - Kim-Anh Lê Cao
- Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Melbourne, Australia
|
41
|
Rydenfelt M, Wongchenko M, Klinger B, Yan Y, Blüthgen N. The cancer cell proteome and transcriptome predicts sensitivity to targeted and cytotoxic drugs. Life Sci Alliance 2019; 2:e201900445. [PMID: 31253656 PMCID: PMC6600015 DOI: 10.26508/lsa.201900445] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/03/2019] [Revised: 06/13/2019] [Accepted: 06/14/2019] [Indexed: 12/21/2022] Open
Abstract
This study shows that the proteomic and transcriptomic states of cancer cells are more predictive of drug sensitivity than genomic markers for most drugs, both within and across tumor types. Tumors of different molecular subtypes can show strongly deviating responses to drug treatment, making stratification of patients based on molecular markers an important part of cancer therapy. Pharmacogenomic studies have led to the discovery of selected genomic markers (e.g., BRAFV600E), whereas transcriptomic and proteomic markers so far have been largely absent in clinical use, thus constituting a potentially valuable resource for further substratification of patients. To systematically assess the explanatory power of different -omics data types, we assembled a panel of 49 melanoma cell lines, including genomic, transcriptomic, proteomic, and pharmacological data, showing that drug sensitivity models trained on transcriptomic or proteomic data outperform genomic-based models for most drugs. These results were confirmed in eight additional tumor types using published datasets. Furthermore, we show that drug sensitivity models can be transferred between tumor types, although after correcting for training sample size, transferred models perform worse than within-tumor–type predictions. Our results suggest that transcriptomic/proteomic signals may be alternative biomarker candidates for the stratification of patients without known genomic markers.
Affiliation(s)
- Mattias Rydenfelt
- Charité-Universitätsmedizin, Institute of Pathology, Berlin, Germany
| | - Matthew Wongchenko
- Genentech Inc., Oncology Biomarker Development, South San Francisco CA, USA
| | - Bertram Klinger
- Charité-Universitätsmedizin, Institute of Pathology, Berlin, Germany
- Humboldt Universität zu Berlin, Integrative Research Institute for the Life Sciences, Berlin, Germany
| | - Yibing Yan
- Genentech Inc., Oncology Biomarker Development, South San Francisco CA, USA
| | - Nils Blüthgen
- Charité-Universitätsmedizin, Institute of Pathology, Berlin, Germany
- Humboldt Universität zu Berlin, Integrative Research Institute for the Life Sciences, Berlin, Germany
|
42
|
Hornung R, Wright MN. Block Forests: random forests for blocks of clinical and omics covariate data. BMC Bioinformatics 2019; 20:358. [PMID: 31248362 PMCID: PMC6598279 DOI: 10.1186/s12859-019-2942-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/15/2019] [Accepted: 06/07/2019] [Indexed: 12/25/2022] Open
Abstract
Background: In the last years more and more multi-omics data are becoming available, that is, data featuring measurements of several types of omics data for each patient. Using multi-omics data as covariate data in outcome prediction is both promising and challenging due to the complex structure of such data. Random forest is a prediction method known for its ability to render complex dependency patterns between the outcome and the covariates. Against this background we developed five candidate random forest variants tailored to multi-omics covariate data. These variants modify the split point selection of random forest to incorporate the block structure of multi-omics data and can be applied to any outcome type for which a random forest variant exists, such as categorical, continuous and survival outcomes. Using 20 publicly available multi-omics data sets with survival outcome we compared the prediction performances of the block forest variants with alternatives. We also considered the common special case of having clinical covariates and measurements of a single omics data type available. Results: We identify one variant termed “block forest” that outperformed all other approaches in the comparison study. In particular, it performed significantly better than standard random survival forest (adjusted p-value: 0.027). The two best performing variants have in common that the block choice is randomized in the split point selection procedure. In the case of having clinical covariates and a single omics data type available, the improvements of the variants over random survival forest were larger than in the case of the multi-omics data. The degrees of improvements over random survival forest varied strongly across data sets. Moreover, considering all clinical covariates mandatorily improved the performance. This result should however be interpreted with caution, because the level of predictive information contained in clinical covariates depends on the specific application. Conclusions: The new prediction method block forest for multi-omics data can significantly improve the prediction performance of random forest and outperformed alternatives in the comparison. Block forest is particularly effective for the special case of using clinical covariates in combination with measurements of a single omics data type. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2942-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Roman Hornung
- Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, Munich, 81377, Germany
| | - Marvin N Wright
- Leibniz Institute for Prevention Research and Epidemiology - BIPS, Achterstr. 30, Bremen, 28359, Germany
- Section of Biostatistics, Department of Public Health, University of Copenhagen, Øster Farimagsgade 5, Copenhagen, 1014, Denmark
|
43
|
Luan J, Gao X, Hu F, Zhang Y, Gou X. SLFN11 is a general target for enhancing the sensitivity of cancer to chemotherapy (DNA-damaging agents). J Drug Target 2019; 28:33-40. [PMID: 31092045 DOI: 10.1080/1061186x.2019.1616746] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/10/2023]
Abstract
In patients with cancer, drug tolerance often occurs during the use of chemotherapy drugs, seriously affecting patient prognosis and survival. Therefore, scientists began to study the factors that affect chemotherapy drug sensitivity, and the high correlation between Schlafen-11 (SLFN11) and sensitivity to chemical drugs (mainly DNA-damaging agents, DDAs) has received increasing attention since it was discovered through bioinformatics analyses. Regarding the mechanism, SLFN11 may sensitise cells to chemotherapy drugs by preventing DNA damage repair. In recent years, SLFN11 has gradually become a hot research topic, and the results are enriching our understanding of this molecule. Indeed, the biological functions of SLFN11 under normal physiological conditions and in cancer, changes in its expression levels and mechanisms promoting apoptosis within the context of chemotherapeutic interventions have gradually been uncovered. Studies to date provide knowledge and the experimental and theoretical bases underlying SLFN11 and its effects on sensitivity to chemotherapy drugs. This review summarises the existing research on SLFN11 with the aim of achieving a more comprehensive understanding and furthering the development of strategies to target SLFN11 in the treatment of cancer.
Affiliation(s)
- Jing Luan
- Shaanxi Key Laboratory of Brain Disorders & Institute of Basic and Translational Medicine, Xi'an Medical University, Xi'an, Shaanxi, China
| | - Xingchun Gao
- Shaanxi Key Laboratory of Brain Disorders & Institute of Basic and Translational Medicine, Xi'an Medical University, Xi'an, Shaanxi, China
| | - Fengrui Hu
- Shaanxi Key Laboratory of Brain Disorders & Institute of Basic and Translational Medicine, Xi'an Medical University, Xi'an, Shaanxi, China
| | - Yuelin Zhang
- Shaanxi Key Laboratory of Brain Disorders & Institute of Basic and Translational Medicine, Xi'an Medical University, Xi'an, Shaanxi, China
| | - Xingchun Gou
- Shaanxi Key Laboratory of Brain Disorders & Institute of Basic and Translational Medicine, Xi'an Medical University, Xi'an, Shaanxi, China
|
44
|
Su R, Liu X, Wei L, Zou Q. Deep-Resp-Forest: A deep forest model to predict anti-cancer drug response. Methods 2019; 166:91-102. [PMID: 30772464 DOI: 10.1016/j.ymeth.2019.02.009] [Citation(s) in RCA: 135] [Impact Index Per Article: 27.0] [Received: 11/13/2018] [Revised: 01/13/2019] [Accepted: 02/10/2019] [Indexed: 12/01/2022] Open
Abstract
The identification of therapeutic biomarkers predictive of drug response is crucial in personalized medicine. A number of computational models to predict response of anti-cancer drugs have been developed as the establishment of several pharmacogenomics screening databases. In our study, we proposed a deep cascaded forest model, Deep-Resp-Forest, to classify the anti-cancer drug response as "sensitive" or "resistant". We made three contributions in this study. Firstly, diverse molecular data could be effectively integrated to provide more information than single type of data for the classification. Combination of two types of data were tested here. Secondly, two structures based on the multi-grained scanning to transform the raw features into high-dimensional feature vectors and integrate the diverse data were proposed in our study. Thirdly, the original deep and time-consuming architecture of cascade forest was improved by a feature optimization operation, which emphasized the most discriminative features across layers. We evaluated the proposed method on the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) data sets and then compared with the Support Vector Machine. The proposed Deep-Resp-Forest has demonstrated the promising use of deep learning and deep forest approach on the drug response prediction tasks. The R implementation for running our experiments is available at https://github.com/RanSuLab/Deep-Resp-Forest.
Affiliation(s)
- Ran Su
- School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Xinyi Liu
- School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Leyi Wei
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
|
45
|
Mirza B, Wang W, Wang J, Choi H, Chung NC, Ping P. Machine Learning and Integrative Analysis of Biomedical Big Data. Genes (Basel) 2019; 10:E87. [PMID: 30696086 PMCID: PMC6410075 DOI: 10.3390/genes10020087] [Citation(s) in RCA: 153] [Impact Index Per Article: 30.6] [Received: 12/02/2018] [Revised: 01/08/2019] [Accepted: 01/21/2019] [Indexed: 12/11/2022] Open
Abstract
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Affiliation(s)
- Bilal Mirza
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
| | - Wei Wang
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Scalable Analytics Institute (ScAi), University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
| | - Jie Wang
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
| | - Howard Choi
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
| | - Neo Christopher Chung
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland.
| | - Peipei Ping
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Scalable Analytics Institute (ScAi), University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Medicine (Cardiology), University of California Los Angeles, Los Angeles, CA 90095, USA.
|
46
|
Wang Y, Cho DY, Lee H, Fear J, Oliver B, Przytycka TM. Reprogramming of regulatory network using expression uncovers sex-specific gene regulation in Drosophila. Nat Commun 2018; 9:4061. [PMID: 30283019 PMCID: PMC6170494 DOI: 10.1038/s41467-018-06382-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/24/2018] [Accepted: 08/13/2018] [Indexed: 02/07/2023] Open
Abstract
Gene regulatory networks (GRNs) describe regulatory relationships between transcription factors (TFs) and their target genes. Computational methods to infer GRNs typically combine evidence across different conditions to infer context-agnostic networks. We develop a method, Network Reprogramming using EXpression (NetREX), that constructs a context-specific GRN given context-specific expression data and a context-agnostic prior network. NetREX remodels the prior network to obtain the topology that provides the best explanation for expression data. Because NetREX utilizes prior network topology, we also develop PriorBoost, a method that evaluates a prior network in terms of its consistency with the expression data. We validate NetREX and PriorBoost using the "gold standard" E. coli GRN from the DREAM5 network inference challenge and apply them to construct sex-specific Drosophila GRNs. NetREX constructed sex-specific Drosophila GRNs that, on all applied measures, outperform networks obtained from other methods, indicating that NetREX is an important milestone toward building more accurate GRNs.
Affiliation(s)
- Yijie Wang
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA
| | - Dong-Yeon Cho
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA
| | - Hangnoh Lee
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA
| | - Justin Fear
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA
| | - Brian Oliver
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA.
| | - Teresa M Przytycka
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA.
|
47
|
Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data. BMC Bioinformatics 2018; 19:322. [PMID: 30208855 PMCID: PMC6134797 DOI: 10.1186/s12859-018-2344-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 02/19/2018] [Accepted: 08/29/2018] [Indexed: 12/18/2022]
Abstract
BACKGROUND: The inclusion of high-dimensional omics data in prediction models has become a well-studied topic in the last decades. Although most of these methods do not account for possibly different types of variables in the set of covariates available in the same dataset, there are many such scenarios where the variables can be structured in blocks of different types, e.g., clinical, transcriptomic, and methylation data. To date, there exist a few computationally intensive approaches that make use of block structures of this kind. RESULTS: In this paper we present priority-Lasso, an intuitive and practical analysis strategy for building prediction models based on Lasso that takes such block structures into account. It requires the definition of a priority order of blocks of data. Lasso models are calculated successively for every block and the fitted values of every step are included as an offset in the fit of the next step. We apply priority-Lasso in different settings on an acute myeloid leukemia (AML) dataset consisting of clinical variables, cytogenetics, gene mutations and expression variables, and compare its performance on an independent validation dataset to the performance of standard Lasso models. CONCLUSION: The results show that priority-Lasso is able to keep pace with Lasso in terms of prediction accuracy. Variables of blocks with higher priorities are favored over variables of blocks with lower priority, which results in easily usable and transportable models for clinical practice.
|
48
|
Ali M, Aittokallio T. Machine learning and feature selection for drug response prediction in precision oncology applications. Biophys Rev 2018; 11:31-39. [PMID: 30097794 PMCID: PMC6381361 DOI: 10.1007/s12551-018-0446-z] [Citation(s) in RCA: 102] [Impact Index Per Article: 17.0] [Received: 04/17/2018] [Accepted: 07/22/2018] [Indexed: 02/07/2023]
Abstract
In-depth modeling of the complex interplay among multiple omics data measured from cancer cell lines or patient tumors is providing new opportunities toward identification of tailored therapies for individual cancer patients. Supervised machine learning algorithms are increasingly being applied to the omics profiles as they enable integrative analyses among the high-dimensional data sets, as well as personalized predictions of therapy responses using multi-omics panels of response-predictive biomarkers identified through feature selection and cross-validation. However, technical variability and frequent missingness in input "big data" require the application of dedicated data preprocessing pipelines that often lead to some loss of information and compressed view of the biological signal. We describe here the state-of-the-art machine learning methods for anti-cancer drug response modeling and prediction and give our perspective on further opportunities to make better use of high-dimensional multi-omics profiles along with knowledge about cancer pathways targeted by anti-cancer compounds when predicting their phenotypic responses.
Affiliation(s)
- Mehreen Ali
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, FI-00290, Helsinki, Finland.
- Helsinki Institute for Information Technology (HIIT), Aalto University, FI-02150, Espoo, Finland.
| | - Tero Aittokallio
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, FI-00290, Helsinki, Finland.
- Helsinki Institute for Information Technology (HIIT), Aalto University, FI-02150, Espoo, Finland.
- Department of Mathematics and Statistics, University of Turku, FI-20014, Turku, Finland.
|