26
|
Margolin AA. To Catch a Pre-Leukemia. Sci Transl Med 2014. [DOI: 10.1126/scitranslmed.3008713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
DNMT3A mutations occur in pre-leukemic hematopoietic stem cells and define early-stage events in the progression to acute myeloid leukemia.
|
27
|
Margolin AA. Moving from Unknown Unknown to Known Unknown. Sci Transl Med 2014. [DOI: 10.1126/scitranslmed.3008434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Genomic analysis of nearly 5000 tumors demonstrates that we are far from finding all the cancer genes.
|
28
|
Neto EC, Jang IS, Friend SH, Margolin AA. The Stream algorithm: computationally efficient ridge-regression via Bayesian model averaging, and applications to pharmacogenomic prediction of cancer cell line sensitivity. Pacific Symposium on Biocomputing 2014:27-38. [PMID: 24297531 PMCID: PMC3911888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Computational efficiency is important for learning algorithms operating in the "large p, small n" setting. In computational biology, the analysis of data sets containing tens of thousands of features ("large p"), but only a few hundred samples ("small n"), is nowadays routine, and regularized regression approaches such as ridge-regression, lasso, and elastic-net are popular choices. In this paper we propose a novel and highly efficient Bayesian inference method for fitting ridge-regression. Our method is fully analytical, and bypasses the need for expensive tuning parameter optimization, via cross-validation, by employing Bayesian model averaging over the grid of tuning parameters. Additional computational efficiency is achieved by adopting the singular value decomposition reparametrization of the ridge-regression model, replacing computationally expensive inversions of large p × p matrices by efficient inversions of small and diagonal n × n matrices. We show in simulation studies and in the analysis of two large cancer cell line data panels that our algorithm achieves slightly better predictive performance than cross-validated ridge-regression while requiring only a fraction of the computation time. Furthermore, in comparisons based on the cell line data sets, our algorithm systematically out-performs the lasso in both predictive performance and computation time, and shows equivalent predictive performance, but considerably smaller computation time, than the elastic-net.
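The singular value decomposition step mentioned in the abstract can be illustrated with a minimal sketch. It assumes NumPy is available; the function name `ridge_path_svd`, the matrix sizes, and the penalty grid are illustrative choices, not taken from the paper. It shows only how one thin SVD can be reused across a grid of penalties so that each fit needs a diagonal rescaling; the Bayesian model averaging weights of the Stream algorithm are not reproduced here.

```python
import numpy as np

def ridge_path_svd(X, y, lambdas):
    """Ridge coefficients for every penalty in `lambdas` from one thin SVD of X."""
    # Thin SVD: U is n x k, s has length k, Vt is k x p, with k = min(n, p).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y  # projection of the response, reused for every penalty
    betas = []
    for lam in lambdas:
        # Only a diagonal system is solved: d_i = s_i / (s_i^2 + lam).
        d = s / (s ** 2 + lam)
        betas.append(Vt.T @ (d * Uty))
    return np.array(betas)

rng = np.random.default_rng(0)
n, p = 20, 200  # illustrative "small n, large p" sizes (assumed)
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
lambdas = [0.1, 1.0, 10.0]  # assumed penalty grid
betas = ridge_path_svd(X, y, lambdas)

# Reference: direct p x p solve of (X'X + lam I) beta = X'y for each penalty.
direct = np.array(
    [np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y) for lam in lambdas]
)
```

Both routes should give the same coefficients; the SVD route avoids forming or factorizing a p × p matrix for each penalty value.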
|
29
|
Jang IS, Neto EC, Guinney J, Friend SH, Margolin AA. Systematic assessment of analytical methods for drug sensitivity prediction from cancer cell line data. Pacific Symposium on Biocomputing 2014:63-74. [PMID: 24297534 PMCID: PMC3995541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Large-scale pharmacogenomic screens of cancer cell lines have emerged as an attractive pre-clinical system for identifying tumor genetic subtypes with selective sensitivity to targeted therapeutic strategies. Application of modern machine learning approaches to pharmacogenomic datasets has demonstrated the ability to infer genomic predictors of compound sensitivity. Such modeling approaches entail many analytical design choices; however, a systematic study evaluating the relative performance attributable to each design choice is not yet available. In this work, we evaluated over 110,000 different models, based on a multifactorial experimental design testing systematic combinations of modeling factors within several categories of modeling choices, including: type of algorithm, type of molecular feature data, compound being predicted, method of summarizing compound sensitivity values, and whether predictions are based on discretized or continuous response values. Our results suggest that model input data (type of molecular features and choice of compound) are the primary factors explaining model performance, followed by choice of algorithm. Our results also provide a statistically principled set of recommended modeling guidelines, including: using elastic net or ridge regression with input features from all genomic profiling platforms, most importantly, gene expression features, to predict continuous-valued sensitivity scores summarized using the area under the dose response curve, with pathway targeted compounds most likely to yield the most accurate predictors. In addition, our study provides a publicly available resource of all modeling results, an open source code base, and experimental design for researchers throughout the community to build on our results and assess novel methodologies or applications in related predictive modeling problems.
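As a rough illustration of summarizing sensitivity by the area under the dose response curve, the sketch below applies the trapezoidal rule to viability measured across a dose series. The function name `dose_response_auc`, the log10 dose axis, the normalization by the dose range, and the four-point curve are all assumptions made for the example; the study's own summarization procedure may differ.

```python
import math

def dose_response_auc(concentrations, viabilities):
    """Trapezoidal area under viability vs. log10(concentration), scaled by the dose range."""
    if len(concentrations) != len(viabilities) or len(concentrations) < 2:
        raise ValueError("need at least two matched dose/response points")
    points = sorted(zip(concentrations, viabilities))
    xs = [math.log10(c) for c, _ in points]
    ys = [v for _, v in points]
    area = sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2 for i in range(len(xs) - 1)
    )
    return area / (xs[-1] - xs[0])  # normalise so a flat curve at 1.0 gives 1.0

# Hypothetical 4-point curve: viability as a fraction of untreated control.
doses = [0.01, 0.1, 1.0, 10.0]
viability = [1.0, 0.9, 0.5, 0.1]
auc = dose_response_auc(doses, viability)
```

Under this convention a lower value indicates a more sensitive cell line; some pipelines report one minus this quantity instead.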
|
30
|
Abstract
Genomic analysis of 12 cancers identifies common somatic mutations across tumor types.
|
31
|
Margolin AA. It Takes Two to Make a RAF Signal Right. Sci Transl Med 2013. [DOI: 10.1126/scitranslmed.3007774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
A new technology allows single molecule–resolution imaging of RAS multimer formation.
|
32
|
Omberg L, Ellrott K, Yuan Y, Kandoth C, Wong C, Kellen MR, Friend SH, Stuart J, Liang H, Margolin AA. Enabling transparent and collaborative computational analysis of 12 tumor types within The Cancer Genome Atlas. Nat Genet 2013; 45:1121-6. [PMID: 24071850 PMCID: PMC3950337 DOI: 10.1038/ng.2761] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.6] [Indexed: 12/20/2022]
Abstract
The Cancer Genome Atlas Pan-Cancer Analysis Working Group collaborated on the Synapse software platform to share and evolve data, results and methodologies while performing integrative analysis of molecular profiling data from 12 tumor types. The group's work serves as a pilot case study that provides (i) a template for future large collaborative studies; (ii) a system to support collaborative projects; and (iii) a public resource of highly curated data, results and automated systems for the evaluation of community-developed models.
|
33
|
Margolin AA. Co-opting a Tumor Predator to Provide Life Support. Sci Transl Med 2013. [DOI: 10.1126/scitranslmed.3007484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Tumor cells acquire resistance to anti-VEGF therapy by mobilizing immune cells through an IL-17–mediated paracrine signaling network.
|
34
|
Margolin AA. The Bionic Cancer-Resistant Extracellular Matrix of the Naked Mole-Rat. Sci Transl Med 2013. [DOI: 10.1126/scitranslmed.3007045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Naked mole-rats resist cancer through a high-molecular-mass version of the extracellular matrix component hyaluronan.
|
35
|
Margolin AA. The Enemy of the Enemy of My Enemy Is a Therapeutic Target. Sci Transl Med 2013. [DOI: 10.1126/scitranslmed.3006748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
A phase 1 clinical trial of the monoclonal antibody to PD-1, lambrolizumab.
|
36
|
Bilal E, Dutkowski J, Guinney J, Jang IS, Logsdon BA, Pandey G, Sauerwine BA, Shimoni Y, Moen Vollan HK, Mecham BH, Rueda OM, Tost J, Curtis C, Alvarez MJ, Kristensen VN, Aparicio S, Børresen-Dale AL, Caldas C, Califano A, Friend SH, Ideker T, Schadt EE, Stolovitzky GA, Margolin AA. Improving breast cancer survival analysis through competition-based multidimensional modeling. PLoS Comput Biol 2013; 9:e1003047. [PMID: 23671412 PMCID: PMC3649990 DOI: 10.1371/journal.pcbi.1003047] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Received: 10/24/2012] [Accepted: 03/18/2013] [Indexed: 01/09/2023] Open
Abstract
Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models.
|
37
|
Margolin AA. Cancer Therapeutics: Best Informed by Genes or Genomes? Sci Transl Med 2013. [DOI: 10.1126/scitranslmed.3006450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
General genomic features such as mutation frequency and copy number variation define prognosis-related classes of endometrial carcinoma better than mutations in cancer genes.
|
38
|
Margolin AA, Bilal E, Huang E, Norman TC, Ottestad L, Mecham BH, Sauerwine B, Kellen MR, Mangravite LM, Furia MD, Vollan HKM, Rueda OM, Guinney J, Deflaux NA, Hoff B, Schildwachter X, Russnes HG, Park D, Vang VO, Pirtle T, Youseff L, Citro C, Curtis C, Kristensen VN, Hellerstein J, Friend SH, Stolovitzky G, Aparicio S, Caldas C, Børresen-Dale AL. Systematic analysis of challenge-driven improvements in molecular prognostic models for breast cancer. Sci Transl Med 2013; 5:181re1. [PMID: 23596205 PMCID: PMC3897241 DOI: 10.1126/scitranslmed.3006112] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.0] [Indexed: 01/18/2023]
Abstract
Although molecular prognostics in breast cancer are among the most successful examples of translating genomic analysis to clinical applications, optimal approaches to breast cancer clinical risk prediction remain controversial. The Sage Bionetworks-DREAM Breast Cancer Prognosis Challenge (BCC) is a crowdsourced research study for breast cancer prognostic modeling using genome-scale data. The BCC provided a community of data analysts with a common platform for data access and blinded evaluation of model accuracy in predicting breast cancer survival on the basis of gene expression data, copy number data, and clinical covariates. This approach offered the opportunity to assess whether a crowdsourced community Challenge would generate models of breast cancer prognosis commensurate with or exceeding current best-in-class approaches. The BCC comprised multiple rounds of blinded evaluations on held-out portions of data on 1981 patients, resulting in more than 1400 models submitted as open source code. Participants then retrained their models on the full data set of 1981 samples and submitted up to five models for validation in a newly generated data set of 184 breast cancer patients. Analysis of the BCC results suggests that the best-performing modeling strategy outperformed previously reported methods in blinded evaluations; model performance was consistent across several independent evaluations; and aggregating community-developed models achieved performance on par with the best-performing individual models.
|
39
|
Margolin AA. Genomics of T-ALL Reveals a Weapon of an Evasive Foe. Sci Transl Med 2013. [DOI: 10.1126/scitranslmed.3006150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Genomic profiling of T-ALL samples at diagnosis, remission, and relapse identifies NT5C2 as a chemoresistance gene.
|
40
|
Shao DD, Tsherniak A, Gopal S, Weir BA, Tamayo P, Stransky N, Schumacher SE, Zack TI, Beroukhim R, Garraway LA, Margolin AA, Root DE, Hahn WC, Mesirov JP. ATARiS: computational quantification of gene suppression phenotypes from multisample RNAi screens. Genome Res 2012; 23:665-78. [PMID: 23269662 PMCID: PMC3613583 DOI: 10.1101/gr.143586.112] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.9] [Indexed: 12/20/2022]
Abstract
Genome-scale RNAi libraries enable the systematic interrogation of gene function. However, the interpretation of RNAi screens is complicated by the observation that RNAi reagents designed to suppress the mRNA transcripts of the same gene often produce a spectrum of phenotypic outcomes due to differential on-target gene suppression or perturbation of off-target transcripts. Here we present a computational method, Analytic Technique for Assessment of RNAi by Similarity (ATARiS), that takes advantage of patterns in RNAi data across multiple samples in order to enrich for RNAi reagents whose phenotypic effects relate to suppression of their intended targets. By summarizing only such reagent effects for each gene, ATARiS produces quantitative, gene-level phenotype values, which provide an intuitive measure of the effect of gene suppression in each sample. This method is robust for data sets that contain as few as 10 samples and can be used to analyze screens of any number of targeted genes. We used this analytic approach to interrogate RNAi data derived from screening more than 100 human cancer cell lines and identified HNF1B as a transforming oncogene required for the survival of cancer cells that harbor HNF1B amplifications. ATARiS is publicly available at http://broadinstitute.org/ataris.
|
41
|
Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehár J, Kryukov GV, Sonkin D, Reddy A, Liu M, Murray L, Berger MF, Monahan JE, Morais P, Meltzer J, Korejwa A, Jané-Valbuena J, Mapa FA, Thibault J, Bric-Furlong E, Raman P, Shipway A, Engels IH, Cheng J, Yu GK, Yu J, Aspesi P, de Silva M, Jagtap K, Jones MD, Wang L, Hatton C, Palescandolo E, Gupta S, Mahan S, Sougnez C, Onofrio RC, Liefeld T, MacConaill L, Winckler W, Reich M, Li N, Mesirov JP, Gabriel SB, Getz G, Ardlie K, Chan V, Myer VE, Weber BL, Porter J, Warmuth M, Finan P, Harris JL, Meyerson M, Golub TR, Morrissey MP, Sellers WR, Schlegel R, Garraway LA. Addendum: The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012. [DOI: 10.1038/nature11735] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 11/09/2022]
|
42
|
Wei G, Margolin AA, Haery L, Brown E, Cucolo L, Julian B, Shehata S, Kung AL, Beroukhim R, Golub TR. Chemical genomics identifies small-molecule MCL1 repressors and BCL-xL as a predictor of MCL1 dependency. Cancer Cell 2012; 21:547-62. [PMID: 22516262 PMCID: PMC3685408 DOI: 10.1016/j.ccr.2012.02.028] [Citation(s) in RCA: 145] [Impact Index Per Article: 12.1] [Received: 05/10/2011] [Revised: 12/12/2011] [Accepted: 02/27/2012] [Indexed: 01/07/2023]
Abstract
MCL1, which encodes the antiapoptotic protein MCL1, is among the most frequently amplified genes in human cancer. A chemical genomic screen identified compounds, including anthracyclines, that decreased MCL1 expression. Genomic profiling indicated that these compounds were global transcriptional repressors that preferentially affect MCL1 due to its short mRNA half-life. Transcriptional repressors and MCL1 shRNAs induced apoptosis in the same cancer cell lines and could be rescued by physiological levels of ectopic MCL1 expression. Repression of MCL1 released the proapoptotic protein BAK from MCL1, and Bak deficiency conferred resistance to transcriptional repressors. A computational model, validated in vivo, indicated that high BCL-xL expression confers resistance to MCL1 repression, thereby identifying a patient-selection strategy for the clinical development of MCL1 inhibitors.
|
43
|
Stransky N, Kryukov GV, Caponigro G, Barretina J, Venkatesan K, Margolin AA, Wilson CJ, Lehar J, Jones MD, Palescandolo E, Sougnez C, Onofrio RC, MacConaill L, Ardlie K, Golub TR, Morrissey M, Sellers WR, Schlegel R, Garraway LA. Abstract 5114: Integrative analysis of the Cancer Cell Line Encyclopedia reveals genetic and transcriptional predictors of compound sensitivity. Cancer Res 2012. [DOI: 10.1158/1538-7445.am2012-5114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The Cancer Cell Line Encyclopedia (CCLE) represents a collaborative effort to assemble a comprehensive resource of human cancer models for basic and translational research. It contains detailed genetic profiling of approximately 1,000 human cancer cell lines spanning many tumor types. Thus far, high-density SNP array data, gene expression microarray data and mutation data from hybrid capture sequencing of 1,650 cancer genes have been obtained. Additionally, we have assessed the sensitivity of these same cell lines using a series of pharmacological compounds that represent both conventional cytotoxic and targeted agents. One major goal of the CCLE effort involves systematic integration of the genomic and pharmacologic datasets in order to identify putative targets of prevalent genetic alterations as well as predictors and modifiers of pharmacologic sensitivity and resistance. The availability of high-quality data generated by uniform criteria across hundreds of cell lines markedly enhances the statistical power to discover genetic alterations involved in carcinogenesis and molecular predictors of pharmacologic vulnerability. We developed a framework based on an elastic net machine-learning regression algorithm, combined with a bootstrapping procedure, to derive predictive models of the sensitivity to each compound, using all genetic features of the cell lines in the collection. Through this computational prediction approach, we have not only rediscovered molecular features predicting response to most drugs in our set but also uncovered novel potential biomarkers of sensitivity and resistance to targeted agents and chemotherapy drugs. For instance, we have found that response to topoisomerase 1 inhibitors seems to be linked to the expression of a single gene. We have also observed that tissue lineage is a key predictor for sensitivity to certain compounds, providing rationale for clinical trials of these drugs in particular cancer types, and we identified potential stratifiers for existing EGFR targeted therapies. Finally, we have found an additional target for a certain chemotype of MEK inhibitors, and shown that this interaction was responsible for growth suppression, which might be a new indication for this kind of drug. Our cell line-based platform provides a valuable tool for the development of personalized cancer medicine, revealing critical tumor dependencies and helping to stratify patients for clinical trials.
Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 103rd Annual Meeting of the American Association for Cancer Research; 2012 Mar 31-Apr 4; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2012;72(8 Suppl):Abstract nr 5114. doi:1538-7445.AM2012-5114
|
44
|
Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehár J, Kryukov GV, Sonkin D, Reddy A, Liu M, Murray L, Berger MF, Monahan JE, Morais P, Meltzer J, Korejwa A, Jané-Valbuena J, Mapa FA, Thibault J, Bric-Furlong E, Raman P, Shipway A, Engels IH, Cheng J, Yu GK, Yu J, Aspesi P, de Silva M, Jagtap K, Jones MD, Wang L, Hatton C, Palescandolo E, Gupta S, Mahan S, Sougnez C, Onofrio RC, Liefeld T, MacConaill L, Winckler W, Reich M, Li N, Mesirov JP, Gabriel SB, Getz G, Ardlie K, Chan V, Myer VE, Weber BL, Porter J, Warmuth M, Finan P, Harris JL, Meyerson M, Golub TR, Morrissey MP, Sellers WR, Schlegel R, Garraway LA. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012; 483:603-7. [PMID: 22460905 PMCID: PMC3320027 DOI: 10.1038/nature11003] [Citation(s) in RCA: 5306] [Impact Index Per Article: 442.2] [Received: 07/25/2011] [Accepted: 03/01/2012] [Indexed: 02/07/2023]
Abstract
The systematic translation of cancer genomic data into knowledge of tumour biology and therapeutic possibilities remains challenging. Such efforts should be greatly aided by robust preclinical model systems that reflect the genomic diversity of human cancers and for which detailed genetic and pharmacological annotation is available. Here we describe the Cancer Cell Line Encyclopedia (CCLE): a compilation of gene expression, chromosomal copy number and massively parallel sequencing data from 947 human cancer cell lines. When coupled with pharmacological profiles for 24 anticancer drugs across 479 of the cell lines, this collection allowed identification of genetic, lineage, and gene-expression-based predictors of drug sensitivity. In addition to known predictors, we found that plasma cell lineage correlated with sensitivity to IGF1 receptor inhibitors; AHR expression was associated with MEK inhibitor efficacy in NRAS-mutant lines; and SLFN11 expression predicted sensitivity to topoisomerase inhibitors. Together, our results indicate that large, annotated cell-line collections may help to enable preclinical stratification schemata for anticancer agents. The generation of genetic predictions of drug response in the preclinical setting and their incorporation into cancer clinical trial design could speed the emergence of 'personalized' therapeutic regimens.
MESH Headings
- Antineoplastic Agents/pharmacology
- Cell Line, Tumor
- Cell Lineage
- Chromosomes, Human/genetics
- Clinical Trials as Topic/methods
- Databases, Factual
- Drug Screening Assays, Antitumor/methods
- Encyclopedias as Topic
- Gene Expression Profiling
- Gene Expression Regulation, Neoplastic
- Genes, ras/genetics
- Genome, Human/genetics
- Genomics
- Humans
- Mitogen-Activated Protein Kinase Kinases/antagonists & inhibitors
- Mitogen-Activated Protein Kinase Kinases/metabolism
- Models, Biological
- Neoplasms/drug therapy
- Neoplasms/genetics
- Neoplasms/metabolism
- Neoplasms/pathology
- Pharmacogenetics
- Plasma Cells/cytology
- Plasma Cells/drug effects
- Plasma Cells/metabolism
- Precision Medicine/methods
- Receptor, IGF Type 1/antagonists & inhibitors
- Receptor, IGF Type 1/metabolism
- Receptors, Aryl Hydrocarbon/genetics
- Receptors, Aryl Hydrocarbon/metabolism
- Sequence Analysis, DNA
- Topoisomerase Inhibitors/pharmacology
|
45
|
Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehár J, Kryukov GV, Sonkin D, Reddy A, Liu M, Murray L, Berger MF, Monahan JE, Morais P, Meltzer J, Korejwa A, Jané-Valbuena J, Mapa FA, Thibault J, Bric-Furlong E, Raman P, Shipway A, Engels IH, Cheng J, Yu GK, Yu J, Aspesi P, de Silva M, Jagtap K, Jones MD, Wang L, Hatton C, Palescandolo E, Gupta S, Mahan S, Sougnez C, Onofrio RC, Liefeld T, MacConaill L, Winckler W, Reich M, Li N, Mesirov JP, Gabriel SB, Getz G, Ardlie K, Chan V, Myer VE, Weber BL, Porter J, Warmuth M, Finan P, Harris JL, Meyerson M, Golub TR, Morrissey MP, Sellers WR, Schlegel R, Garraway LA. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012. [PMID: 22460905 DOI: 10.1038/nature1100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/12/2022]
|
46
|
Margolin AA, Ong SE, Schenone M, Gould R, Schreiber SL, Carr SA, Golub TR. Empirical Bayes analysis of quantitative proteomics experiments. PLoS One 2009; 4:e7454. [PMID: 19829701 PMCID: PMC2759080 DOI: 10.1371/journal.pone.0007454] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Received: 05/15/2009] [Accepted: 09/15/2009] [Indexed: 12/23/2022] Open
Abstract
Background: Advances in mass spectrometry-based proteomics have enabled the incorporation of proteomic data into systems approaches to biology. However, development of analytical methods has lagged behind. Here we describe an empirical Bayes framework for quantitative proteomics data analysis. The method provides a statistical description of each experiment, including the number of proteins that differ in abundance between 2 samples, the experiment's statistical power to detect them, and the false-positive probability of each protein. Methodology/Principal Findings: We analyzed 2 types of mass spectrometric experiments. First, we showed that the method identified the protein targets of small-molecules in affinity purification experiments with high precision. Second, we re-analyzed a mass spectrometric data set designed to identify proteins regulated by microRNAs. Our results were supported by sequence analysis of the 3′ UTR regions of predicted target genes, and we found that the previously reported conclusion that a large fraction of the proteome is regulated by microRNAs was not supported by our statistical analysis of the data. Conclusions/Significance: Our results highlight the importance of rigorous statistical analysis of proteomic data, and the method described here provides a statistical framework to robustly and reliably interpret such data.
|
47
|
Wang K, Saito M, Bisikirska BC, Alvarez MJ, Lim WK, Rajbhandari P, Shen Q, Nemenman I, Basso K, Margolin AA, Klein U, Dalla-Favera R, Califano A. Genome-wide identification of post-translational modulators of transcription factor activity in human B cells. Nat Biotechnol 2009; 27:829-39. [PMID: 19741643 PMCID: PMC2753889 DOI: 10.1038/nbt.1563] [Citation(s) in RCA: 171] [Impact Index Per Article: 11.4] [Received: 05/15/2009] [Accepted: 08/11/2009] [Indexed: 01/06/2023]
Abstract
The ability of a transcription factor (TF) to regulate its targets is modulated by a variety of genetic and epigenetic mechanisms, resulting in highly context-dependent regulatory networks. However, high-throughput methods for the identification of proteins that affect TF activity are still largely unavailable. Here we introduce an algorithm, modulator inference by network dynamics (MINDy), for the genome-wide identification of post-translational modulators of TF activity within a specific cellular context. When used to dissect the regulation of MYC activity in human B lymphocytes, the approach inferred novel modulators of MYC function, which act by distinct mechanisms, including protein turnover, transcription complex formation and selective enzyme recruitment. MINDy is generally applicable to study the post-translational modulation of mammalian TFs in any cellular context. As such it can be used to dissect context-specific signaling pathways and combinatorial transcriptional regulation.
|
48
|
Margolin AA, Califano A. Theory and limitations of genetic network inference from microarray data. Ann N Y Acad Sci 2007; 1115:51-72. [PMID: 17925348 DOI: 10.1196/annals.1407.019] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.0] [Indexed: 11/12/2022]
Abstract
Since the advent of gene expression microarray technology more than 10 years ago, many computational approaches have been developed aimed at using statistical associations between mRNA abundance profiles to predict transcriptional regulatory interactions. The ultimate goal is to develop causal network models describing the transcriptional influences that genes exert on each other (via their protein products), which can be used to predict network disruptions (e.g., mutations) leading to a disease phenotype, as well as the appropriate therapeutic intervention. However, microarray data measure only a small component of the interacting variables in a genetic regulatory network, as cells are known to regulate gene expression via many diverse mechanisms. Although many researchers have acknowledged the questionable interpretation of statistical dependencies between mRNA profiles, very little work has been done on theoretically characterizing the nature of inferred dependencies using models that account for unobserved interacting variables. In this work, we review the theory behind reverse engineering algorithms derived from three separate disciplines (system control theory, graphical models, and information theory) and highlight several mathematical relationships between the various methods. We then apply recent theoretical work on constructing graphical models with latent variables to the context of reverse engineering genetic networks. We demonstrate that even the addition of simple latent variables induces statistical dependencies between non-directly interacting (e.g., co-regulated) genes that cannot be eliminated by conditioning on any observed variables.
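The closing claim about latent variables can be made concrete with a toy simulation. This is a minimal linear-Gaussian sketch under assumed noise levels and sample size, not the analysis performed in the paper; all variable and function names are hypothetical. Two genes share an unobserved regulator, and their correlation persists after conditioning on an observed but unrelated gene, while it vanishes only when conditioning on the hidden regulator itself.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def partial_corr(a, b, c):
    """Correlation of a and b after linearly conditioning on c."""
    rab, rac, rbc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (rab - rac * rbc) / ((1 - rac ** 2) * (1 - rbc ** 2)) ** 0.5

rng = random.Random(0)
n = 5000  # assumed number of simulated samples
hidden = [rng.gauss(0, 1) for _ in range(n)]      # unobserved regulator activity
gene_a = [h + rng.gauss(0, 0.5) for h in hidden]  # co-regulated target 1
gene_b = [h + rng.gauss(0, 0.5) for h in hidden]  # co-regulated target 2
gene_c = [rng.gauss(0, 1) for _ in range(n)]      # observed, unrelated gene

r_marginal = pearson(gene_a, gene_b)
r_given_observed = partial_corr(gene_a, gene_b, gene_c)
r_given_hidden = partial_corr(gene_a, gene_b, hidden)
```

In this setup the two targets never interact directly, yet no observed variable explains away their dependency, which is the kind of spurious edge the abstract warns about.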
|
49
|
Margolin AA, Wang K, Lim WK, Kustagi M, Nemenman I, Califano A. Reverse engineering cellular networks. Nat Protoc 2007; 1:662-71. [PMID: 17406294 DOI: 10.1038/nprot.2006.106] [Citation(s) in RCA: 238] [Impact Index Per Article: 14.0] [Indexed: 01/22/2023]
Abstract
We describe a computational protocol for the ARACNE algorithm, an information-theoretic method for identifying transcriptional interactions between gene products using microarray expression profile data. Similar to other algorithms, ARACNE predicts potential functional associations among genes, or novel functions for uncharacterized genes, by identifying statistical dependencies between gene products. However, based on biochemical validation, literature searches and DNA binding site enrichment analysis, ARACNE has also proven effective in identifying bona fide transcriptional targets, even in complex mammalian networks. Thus we envision that predictions made by ARACNE, especially when supplemented with prior knowledge or additional data sources, can provide appropriate hypotheses for the further investigation of cellular networks. While the examples in this protocol use only gene expression profile data, the algorithm's theoretical basis readily extends to a variety of other high-throughput measurements, such as pathway-specific or genome-wide proteomics, microRNA and metabolomics data. As these data become readily available, we expect that ARACNE might prove increasingly useful in elucidating the underlying interaction models. For a microarray data set containing approximately 10,000 probes, reconstructing the network around a single probe completes in several minutes using a desktop computer with a Pentium 4 processor. Reconstructing a genome-wide network generally requires a computational cluster, especially if the recommended bootstrapping procedure is used.
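The idea of scoring statistical dependency between two expression profiles with an information-theoretic measure can be sketched as follows. This is a simplified plug-in estimate using equal-width bins, chosen here for brevity; it is not ARACNE's own estimator, and the later steps of the protocol are not shown. The function name, bin count, and simulated profiles are assumptions made for the example.

```python
import math
import random
from collections import Counter

def mutual_information(x, y, bins=8):
    """Plug-in mutual information (in nats) between two profiles after equal-width binning."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against a constant profile
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum of p(i, j) * log(p(i, j) / (p(i) * p(j))) over occupied cells.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j])) for (i, j), c in joint.items()
    )

rng = random.Random(1)
n = 2000  # assumed number of simulated expression profiles
regulator = [rng.gauss(0, 1) for _ in range(n)]
target = [r + rng.gauss(0, 0.3) for r in regulator]  # depends on the regulator
unrelated = [rng.gauss(0, 1) for _ in range(n)]      # independent of it

mi_dependent = mutual_information(regulator, target)
mi_independent = mutual_information(regulator, unrelated)
```

A dependent pair scores well above an independent pair, whose estimate stays near zero apart from the small positive bias of the plug-in estimator.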
|
50
|
|