Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Taskesen E, Huisman SM, Mahfouz A, Krijthe JH, de Ridder J, van de Stolpe A, van den Akker E, Verhaegh W, Reinders MJ. Pan-cancer subtyping in a 2D-map shows substructures that are driven by specific combinations of molecular characteristics. Sci Rep 2016;6:24949. [PMID: 27109935 DOI: 10.1038/srep24949] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 12/29/2015] [Accepted: 04/07/2016] [Indexed: 02/02/2023] [Open Access]

Cited by Other Article(s)

Belova T, Biondi N, Hsieh PH, Lutsik P, Chudasama P, Kuijjer M. Heterogeneity in the gene regulatory landscape of leiomyosarcoma. NAR Cancer 2023;5:zcad037. [PMID: 37492373 PMCID: PMC10365024 DOI: 10.1093/narcan/zcad037] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/09/2023] [Revised: 07/06/2023] [Accepted: 07/18/2023] [Indexed: 07/27/2023] [Open Access]

Marquardt A, Kollmannsberger P, Krebs M, Argentiero A, Knott M, Solimando AG, Kerscher AG. Visual Clustering of Transcriptomic Data from Primary and Metastatic Tumors-Dependencies and Novel Pitfalls. Genes (Basel) 2022;13:genes13081335. [PMID: 35893071 PMCID: PMC9394300 DOI: 10.3390/genes13081335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 07/20/2022] [Accepted: 07/23/2022] [Indexed: 02/06/2023] [Open Access]

Zhang Z, Hernandez K, Savage J, Li S, Miller D, Agrawal S, Ortuno F, Staudt LM, Heath A, Grossman RL. Uniform genomic data analysis in the NCI Genomic Data Commons. Nat Commun 2021;12:1226. [PMID: 33619257 PMCID: PMC7900240 DOI: 10.1038/s41467-021-21254-9] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/16/2019] [Accepted: 01/14/2021] [Indexed: 12/28/2022] [Open Access]

Zhao Y, Pan Z, Namburi S, Pattison A, Posner A, Balachander S, Paisie CA, Reddi HV, Rueter J, Gill AJ, Fox S, Raghav KPS, Flynn WF, Tothill RW, Li S, Karuturi RKM, George J. CUP-AI-Dx: A tool for inferring cancer tissue of origin and molecular subtype using RNA gene-expression data and artificial intelligence. EBioMedicine 2020;61:103030. [PMID: 33039710 PMCID: PMC7553237 DOI: 10.1016/j.ebiom.2020.103030] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 02/04/2020] [Revised: 09/10/2020] [Accepted: 09/11/2020] [Indexed: 12/12/2022] [Open Access]

Abstract

BACKGROUND

Cancer of unknown primary (CUP), representing approximately 3-5% of all malignancies, is defined as metastatic cancer where a primary site of origin cannot be found despite a standard diagnostic workup. Because knowledge of a patient's primary cancer remains fundamental to their treatment, CUP patients are significantly disadvantaged and most have a poor survival outcome. Developing robust and accessible diagnostic methods for resolving cancer tissue of origin, therefore, has significant value for CUP patients.

METHODS

We developed an RNA-based classifier called CUP-AI-Dx that utilizes a 1D Inception convolutional neural network (1D-Inception) model to infer a tumor's primary tissue of origin. CUP-AI-Dx was trained using the transcriptional profiles of 18,217 primary tumours representing 32 cancer types from The Cancer Genome Atlas project (TCGA) and International Cancer Genome Consortium (ICGC). Gene expression data was ordered by gene chromosomal coordinates as input to the 1D-CNN model, and the model utilizes multiple convolutional kernels with different configurations simultaneously to improve generality. The model was optimized through extensive hyperparameter tuning, including different max-pooling layers and dropout settings. For 11 tumour types, we also developed a random forest model that can classify the tumour's molecular subtype according to prior TCGA studies. The optimised CUP-AI-Dx tissue of origin classifier was tested on 394 metastatic samples from 11 tumour types from TCGA and 92 formalin-fixed paraffin-embedded (FFPE) samples representing 18 cancer types from two clinical laboratories. The CUP-AI-Dx molecular subtype classifier was also tested on independent ovarian and breast cancer microarray datasets.

FINDINGS

CUP-AI-Dx identifies the primary site with an overall top-1-accuracy of 98.54% in cross-validation and 96.70% on a test dataset. When applied to two independent clinical-grade RNA-seq datasets generated from two different institutes from the US and Australia, our model predicted the primary site with a top-1-accuracy of 86.96% and 72.46% respectively.

INTERPRETATION

The CUP-AI-Dx predicts tumour primary site and molecular subtype with high accuracy and therefore can be used to assist the diagnostic work-up of cancers of unknown primary or uncertain origin using a common and accessible genomics platform.

FUNDING

NIH R35 GM133562, NCI P30 CA034196, Victorian Cancer Agency Australia.
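
The METHODS paragraph of the abstract above mentions two ideas that are easy to illustrate: ordering expression values by chromosomal coordinate, and applying several convolutional kernels of different widths to the same input in parallel (the Inception idea). The following is a minimal sketch of those two steps only, in plain Python. The gene names, coordinates, kernel values, and function names are hypothetical and chosen for illustration; the published model uses learned weights, pooling, dropout, and a classification head that are not shown here.

```python
def order_by_chromosome(expression, coordinates):
    """Sort genes by (chromosome index, start position) and return the
    ordered gene names together with their expression values.

    expression:  dict mapping gene name -> expression value
    coordinates: dict mapping gene name -> (chromosome index, start position)
    Both dicts and their contents are illustrative assumptions.
    """
    genes = sorted(expression, key=lambda g: coordinates[g])
    return genes, [expression[g] for g in genes]


def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding so that the output has the
    same length as the input (intended for odd kernel lengths)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def inception_block(signal, kernels):
    """Apply every kernel to the same input and stack the results as
    output channels, one channel per kernel width."""
    return [conv1d_same(signal, kernel) for kernel in kernels]


if __name__ == "__main__":
    # Hypothetical genes and coordinates, not taken from the paper.
    expression = {"GENE_C": 3.0, "GENE_A": 1.0, "GENE_B": 2.0}
    coordinates = {"GENE_A": (1, 100), "GENE_B": (1, 5000), "GENE_C": (2, 50)}
    genes, values = order_by_chromosome(expression, coordinates)
    # Fixed all-ones kernels of width 1, 3, and 5 stand in for learned filters.
    channels = inception_block(values, [[1.0], [1.0] * 3, [1.0] * 5])
    print(genes)
    print(channels)
```

Ordering by genomic position gives neighbouring input positions a biological meaning, which is what lets kernels of different widths pick up signals at different genomic scales.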

Affiliation(s)

Yue Zhao: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Ziwei Pan: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA; Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA
Sandeep Namburi: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Andrew Pattison: Department of Clinical Pathology and Centre for Cancer Research, University of Melbourne, Parkville, Melbourne, Australia
Atara Posner: Department of Clinical Pathology and Centre for Cancer Research, University of Melbourne, Parkville, Melbourne, Australia
Shiva Balachander: Department of Clinical Pathology and Centre for Cancer Research, University of Melbourne, Parkville, Melbourne, Australia
Carolyn A Paisie: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Honey V Reddi: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA; The Jackson Laboratory Cancer Center, Bar Harbor, ME, USA
Jens Rueter: The Jackson Laboratory Cancer Center, Bar Harbor, ME, USA
Anthony J Gill: Cancer Diagnosis and Pathology Group, Kolling Institute of Medical Research, Royal North Shore Hospital, St Leonards, New South Wales 2065 Australia; NSW Health Pathology, Department of Anatomical Pathology, Royal North Shore Hospital, Sydney, New South Wales 2065 Australia; Department of Anatomical Pathology, Douglass Hanly Moir Pathology, Macquarie Park, New South Wales 2113 Australia; University of Sydney, Sydney, New South Wales 2006 Australia
Stephen Fox: Peter MacCallum Cancer Centre, Department of Pathology, University of Melbourne, Victoria, Australia
Kanwal P S Raghav: Department of Gastrointestinal Medical Oncology, Division of Cancer Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
William F Flynn: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
Richard W Tothill: Department of Clinical Pathology and Centre for Cancer Research, University of Melbourne, Parkville, Melbourne, Australia; Peter MacCallum Cancer Centre, Parkville, Melbourne, Australia
Sheng Li: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA; The Jackson Laboratory Cancer Center, Bar Harbor, ME, USA; Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA; Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA
R Krishna Murthy Karuturi: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA; The Jackson Laboratory Cancer Center, Bar Harbor, ME, USA; Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA
Joshy George: The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA; The Jackson Laboratory Cancer Center, Bar Harbor, ME, USA

Godichon-Baggioni A, Maugis-Rabusseau C, Rau A. Multiview cluster aggregation and splitting, with an application to multiomic breast cancer data. Ann Appl Stat 2020. [DOI: 10.1214/19-aoas1317] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]

González-Reymúndez A, Vázquez AI. Multi-omic signatures identify pan-cancer classes of tumors beyond tissue of origin. Sci Rep 2020;10:8341. [PMID: 32433524 PMCID: PMC7239905 DOI: 10.1038/s41598-020-65119-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/12/2019] [Accepted: 04/07/2020] [Indexed: 02/08/2023] [Open Access]

Coretto P, Serra A, Tagliaferri R. Robust clustering of noisy high-dimensional gene expression data for patients subtyping. Bioinformatics 2019;34:4064-4072. [PMID: 29939219 DOI: 10.1093/bioinformatics/bty502] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/03/2018] [Accepted: 06/19/2018] [Indexed: 12/12/2022] [Open Access]

Abstract

Motivation

One of the most important research areas in personalized medicine is the discovery of disease sub-types with relevance in clinical applications. This is usually accomplished by exploring gene expression data with unsupervised clustering methodologies. Then, with the advent of multiple omics technologies, data integration methodologies have been further developed to obtain better performances in patient separability. However, these methods do not guarantee the survival separability of the patients in different clusters.

Results

We propose a new methodology that first computes a robust and sparse correlation matrix of the genes, then decomposes it and projects the patient data onto the first m spectral components of the correlation matrix. A robust, noise-adaptive clustering algorithm is then applied. The clustering is set up to optimize the separation between survival curves estimated cluster-wise. The method is able to identify clusters that have different omics signatures and also statistically significant differences in survival time. The proposed methodology is tested on five cancer datasets downloaded from The Cancer Genome Atlas repository. The proposed method is compared with the Similarity Network Fusion (SNF) approach and with model-based clustering based on Student's t-distribution (TMIX). Our method obtains a better performance in terms of survival separability, even though it uses a single gene expression view compared to the multi-view approach of the SNF method. Finally, a pathway-based analysis is performed to highlight the biological processes that differentiate the obtained patient groups.

Availability and implementation

Our R source code is available online at https://github.com/angy89/RobustClusteringPatientSubtyping.

Supplementary information

Supplementary data are available at Bioinformatics online.
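
The Results paragraph of the abstract above describes projecting patient data onto the first m spectral components of a gene correlation matrix before clustering. The sketch below illustrates only that projection step, under loud assumptions: it substitutes an ordinary Pearson correlation matrix for the robust and sparse estimator the authors describe, it finds the leading eigenvectors by power iteration with deflation, and it stops before any clustering or survival optimisation. All function names are hypothetical and are not taken from the authors' R code.

```python
import math
import random


def pearson_correlation_matrix(rows):
    """Gene-by-gene Pearson correlation from a patients x genes table.

    Assumption: plain Pearson correlation stands in for the robust, sparse
    estimator in the abstract. No gene may be constant across patients.
    Returns the correlation matrix and the column-centred data.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    norms = [math.sqrt(sum(c[j] ** 2 for c in centered)) for j in range(p)]
    corr = [
        [sum(c[i] * c[j] for c in centered) / (norms[i] * norms[j])
         for j in range(p)]
        for i in range(p)
    ]
    return corr, centered


def top_eigenvectors(matrix, m, iterations=200, seed=0):
    """Leading m eigenvectors of a symmetric positive semi-definite matrix,
    found by power iteration followed by deflation."""
    rng = random.Random(seed)
    p = len(matrix)
    work = [row[:] for row in matrix]
    vectors = []
    for _ in range(m):
        v = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for this vector.
        eigenvalue = sum(
            v[i] * sum(work[i][j] * v[j] for j in range(p)) for i in range(p)
        )
        vectors.append(v)
        # Deflation: remove the component just found from the matrix.
        for i in range(p):
            for j in range(p):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return vectors


def project(centered, vectors):
    """Coordinates of each centred patient profile on each component."""
    return [
        [sum(c[j] * v[j] for j in range(len(v))) for v in vectors]
        for c in centered
    ]


if __name__ == "__main__":
    # Toy data: 4 patients x 3 genes; gene 1 is exactly twice gene 0.
    patients = [[1.0, 2.0, 0.0], [1.0, 2.0, 2.0],
                [3.0, 6.0, 0.0], [3.0, 6.0, 2.0]]
    corr, centered = pearson_correlation_matrix(patients)
    components = top_eigenvectors(corr, m=2)
    for scores in project(centered, components):
        print([round(s, 3) for s in scores])
```

The signs of the resulting coordinates are arbitrary, since an eigenvector and its negation are equally valid; any downstream clustering should not depend on them.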

Abrams ZB, Zucker M, Wang M, Asiaee Taheri A, Abruzzo LV, Coombes KR. Thirty biologically interpretable clusters of transcription factors distinguish cancer type. BMC Genomics 2018;19:738. [PMID: 30305013 PMCID: PMC6180590 DOI: 10.1186/s12864-018-5093-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/25/2018] [Accepted: 09/19/2018] [Indexed: 12/27/2022] [Open Access]

Abstract

Background

Transcription factors are essential regulators of gene expression and play critical roles in development, differentiation, and in many cancers. To carry out their regulatory programs, they must cooperate in networks and bind simultaneously to sites in promoter or enhancer regions of genes. We hypothesize that the mRNA co-expression patterns of transcription factors can be used both to learn how they cooperate in networks and to distinguish between cancer types.

Results

We recently developed a new algorithm, Thresher, that combines principal component analysis, outlier filtering, and von Mises-Fisher mixture models to cluster genes (in this case, transcription factors) based on expression, determining the optimal number of clusters in the process. We applied Thresher to the RNA-Seq expression data of 486 transcription factors from more than 10,000 samples of 33 kinds of cancer studied in The Cancer Genome Atlas (TCGA). We found that 30 clusters of transcription factors from a 29-dimensional principal component space were able to distinguish between most cancer types, and could separate tumor samples from normal controls. Moreover, each cluster of transcription factors could be either (i) linked to a tissue-specific expression pattern or (ii) associated with a fundamental biological process such as cell cycle, angiogenesis, apoptosis, or cytoskeleton. Clusters of the second type were more likely also to be associated with embryonically lethal mouse phenotypes.

Conclusions

Using our approach, we have shown that the mRNA expression patterns of transcription factors contain most of the information needed to distinguish different cancer types. The Thresher method is capable of discovering biologically interpretable clusters of genes. It can potentially be applied to other gene sets, such as signaling pathways, to decompose them into simpler, yet biologically meaningful, components.

Electronic supplementary material

The online version of this article (10.1186/s12864-018-5093-z) contains supplementary material, which is available to authorized users.

Baali I, Acar DAE, Aderinwale TW, HafezQorani S, Kazan H. Predicting clinical outcomes in neuroblastoma with genomic data integration. Biol Direct 2018;13:20. [PMID: 30621745 PMCID: PMC6889397 DOI: 10.1186/s13062-018-0223-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/13/2017] [Accepted: 09/03/2018] [Indexed: 11/10/2022] [Open Access]

Abstract

BACKGROUND

Neuroblastoma is a heterogeneous disease with diverse clinical outcomes. Current risk group models require improvement as patients within the same risk group can still show variable prognosis. Recently collected genome-wide datasets provide opportunities to infer neuroblastoma subtypes in a more unified way. Within this context, data integration is critical as different molecular characteristics can contain complementary signals. To this end, we utilized the genomic datasets available for the SEQC cohort patients to develop supervised and unsupervised models that can predict disease prognosis.

RESULTS

Our supervised model trained on the SEQC cohort can accurately predict overall survival and event-free survival profiles of patients in two independent cohorts. We also performed extensive experiments to assess the prediction accuracy of high risk patients and patients without MYCN amplification. Our results from this part suggest that clinical endpoints can be predicted accurately across multiple cohorts. To explore the data in an unsupervised manner, we used an integrative clustering strategy named multi-view kernel k-means (MVKKM) that can effectively integrate multiple high-dimensional datasets with varying weights. We observed that integrating different gene expression datasets results in a better patient stratification compared to using these datasets individually. Also, our identified subgroups provide a better Cox regression model fit compared to the existing risk group definitions.

CONCLUSION

Altogether, our results indicate that integration of multiple genomic characterizations enables the discovery of subtypes that improve over existing definitions of risk groups. Effective prediction of survival times will have a direct impact on choosing the right therapies for patients.

REVIEWERS

This article was reviewed by Susmita Datta, Wenzhong Xiao and Ziv Shkedy.
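
The RESULTS paragraph of the abstract above describes an integrative strategy, multi-view kernel k-means (MVKKM), that combines several high-dimensional datasets with varying weights. The sketch below shows the general idea in plain Python: one kernel per data view, a weighted sum of those kernels, and kernel k-means on the combined kernel. It rests on several assumptions: the view weights are fixed by hand rather than learned as MVKKM does, the kernels are linear, the toy data are invented, and all function names are hypothetical.

```python
def linear_kernel(view):
    """Patient-by-patient linear kernel (dot products) for one data view."""
    return [[sum(a * b for a, b in zip(x, y)) for y in view] for x in view]


def combine_kernels(kernels, weights):
    """Weighted sum of per-view kernels. The weights are fixed here, which
    is a simplification: MVKKM estimates them from the data."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


def kernel_kmeans(kernel, labels, n_clusters, max_iter=50):
    """Kernel k-means starting from the given initial labels.

    The squared distance from patient i to the centre of cluster C is
    K[i][i] - 2 * mean_j K[i][j] + mean_{j,l} K[j][l], with j and l in C.
    """
    n = len(kernel)
    labels = list(labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c]
                   for c in range(n_clusters)]
        within = [
            sum(kernel[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_dist = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                dist = (kernel[i][i]
                        - 2.0 * sum(kernel[i][j] for j in m) / len(m)
                        + within[c])
                if dist < best_dist:
                    best, best_dist = c, dist
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


if __name__ == "__main__":
    # Two invented views of the same four patients.
    view_a = [[0.0], [0.1], [5.0], [5.1]]
    view_b = [[1.0], [1.2], [9.0], [9.1]]
    combined = combine_kernels(
        [linear_kernel(view_a), linear_kernel(view_b)], [0.5, 0.5]
    )
    print(kernel_kmeans(combined, labels=[0, 1, 0, 1], n_clusters=2))
```

Working with kernels rather than raw features means each view can have a different dimensionality, since only the patient-by-patient similarity matrices are combined.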

Parimbelli E, Marini S, Sacchi L, Bellazzi R. Patient similarity for precision medicine: A systematic review. J Biomed Inform 2018;83:87-96. [PMID: 29864490 DOI: 10.1016/j.jbi.2018.06.001] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Received: 01/31/2018] [Revised: 05/16/2018] [Accepted: 06/01/2018] [Indexed: 12/19/2022]

Sorting Five Human Tumor Types Reveals Specific Biomarkers and Background Classification Genes. Sci Rep 2018;8:8180. [PMID: 29802335 PMCID: PMC5970138 DOI: 10.1038/s41598-018-26310-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/01/2017] [Accepted: 05/10/2018] [Indexed: 12/16/2022] [Open Access]