1
Chen J. Timed hazard networks: Incorporating temporal difference for oncogenetic analysis. PLoS One 2023; 18:e0283004. [PMID: 36928529 PMCID: PMC10019724 DOI: 10.1371/journal.pone.0283004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 03/01/2023] [Indexed: 03/18/2023]
Abstract
Oncogenetic graphical models are crucial for understanding cancer progression by analyzing the accumulation of genetic events. These models are used to identify statistical dependencies and temporal order of genetic events, which helps design targeted therapies. However, existing algorithms do not account for temporal differences between samples in oncogenetic analysis. This paper introduces Timed Hazard Networks (TimedHN), a new statistical model that uses temporal differences to improve accuracy and reliability. TimedHN models the accumulation process as a continuous-time Markov chain and includes an efficient gradient computation algorithm for optimization. Our simulation experiments demonstrate that TimedHN outperforms current state-of-the-art graph reconstruction methods. We also compare TimedHN with existing methods on a luminal breast cancer dataset, highlighting its potential utility. The Matlab implementation and data are available at https://github.com/puar-playground/TimedHN.
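As a rough illustration of the modelling idea in this abstract, the sketch below simulates the accumulation of events as a continuous-time Markov chain in Python. It is not the TimedHN implementation (which the abstract says is in Matlab): the multiplicative hazard form, the function name `simulate_accumulation`, and all rate values are assumptions chosen for illustration only.

```python
import random


def simulate_accumulation(base_rates, multipliers, t_obs, rng):
    """Simulate one sample's event history up to observation time t_obs.

    Hypothetical hazard form (an assumption, not taken from the paper):
    the rate of a pending event i is base_rates[i] multiplied by
    multipliers[j][i] for every event j that has already occurred.
    Returns a list of (time, event_index) pairs in order of occurrence.
    """
    n = len(base_rates)
    occurred = set()
    history = []
    t = 0.0
    while len(occurred) < n:
        pending = [i for i in range(n) if i not in occurred]
        rates = []
        for i in pending:
            rate = base_rates[i]
            for j in occurred:
                rate *= multipliers[j][i]
            rates.append(rate)
        total = sum(rates)
        if total <= 0.0:
            break  # no pending event can occur any more
        # The waiting time to the next jump is exponential with the total rate.
        t += rng.expovariate(total)
        if t >= t_obs:
            break  # the sample is observed before the next event happens
        # The event that fires is chosen in proportion to its rate.
        event = rng.choices(pending, weights=rates)[0]
        occurred.add(event)
        history.append((t, event))
    return history


if __name__ == "__main__":
    base = [0.5, 0.2, 0.1]        # illustrative baseline rates
    mult = [[1.0, 3.0, 1.0],      # event 0 speeds up event 1
            [1.0, 1.0, 2.0],      # event 1 speeds up event 2
            [1.0, 1.0, 1.0]]      # event 2 influences nothing
    rng = random.Random(0)
    for t_obs in (1.0, 5.0, 20.0):
        print(t_obs, simulate_accumulation(base, mult, t_obs, rng))
```

In this toy model, samples observed later tend to carry more accumulated events, which is one way to picture the temporal differences between samples that the abstract refers to.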
Affiliation(s)
- Jian Chen
- Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, United States of America
2
Pinheiro D, Santander-Jiménez S, Ilic A. PhyloMissForest: a random forest framework to construct phylogenetic trees with missing data. BMC Genomics 2022; 23:377. [PMID: 35585494 PMCID: PMC9116704 DOI: 10.1186/s12864-022-08540-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2021] [Accepted: 04/01/2022] [Indexed: 11/10/2022]
Abstract
Background In the pursuit of a better understanding of biodiversity, evolutionary biologists rely on the study of phylogenetic relationships to illustrate the course of evolution. The relationships among natural organisms, depicted in the shape of phylogenetic trees, not only help to understand evolutionary history but also have a wide range of additional applications in science. One of the most challenging problems that arise when building phylogenetic trees is the presence of missing biological data. More specifically, the possibility of inferring wrong phylogenetic trees increases proportionally to the amount of missing values in the input data. Although there are methods proposed to deal with this issue, their applicability and accuracy are often restricted by different constraints. Results We propose a framework, called PhyloMissForest, to impute missing entries in phylogenetic distance matrices and infer accurate evolutionary relationships. PhyloMissForest is built upon a random forest structure that infers the missing entries of the input data, based on the known parts of it. PhyloMissForest contributes a robust and configurable framework that incorporates multiple search strategies and machine learning, complemented by phylogenetic techniques, to provide a more accurate inference of lost phylogenetic distances. We evaluate our framework by examining three real-world datasets, two DNA-based sequence alignments and one containing amino acid data, and two additional instances with simulated DNA data. Moreover, we follow a design of experiments methodology to define the hyperparameter values of our algorithm, which is a concise method, preferable in comparison to the well-known exhaustive parameter search. By varying the percentages of missing data from 5% to 60%, we generally outperform the state-of-the-art alternative imputation techniques in the tests conducted on real DNA data. In addition, significant improvements in execution time are observed for the amino acid instance. The results observed on simulated data also denote the attainment of improved imputations when dealing with large percentages of missing data. Conclusions By merging multiple search strategies, machine learning, and phylogenetic techniques, PhyloMissForest provides a highly customizable and robust framework for phylogenetic missing data imputation, with significant topological accuracy and effective speedups over the state of the art. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08540-6.
Affiliation(s)
- Diogo Pinheiro
- INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Rua Alves Redol 9, Lisboa, 1000-029, Portugal
- Sergio Santander-Jiménez
- Department of Computer and Communications Technologies, University of Extremadura, Campus universitario s/n, Cáceres, 10003, Spain
- Aleksandar Ilic
- INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Rua Alves Redol 9, Lisboa, 1000-029, Portugal
3
Sun X, Zhang J, Nie Q. Inferring latent temporal progression and regulatory networks from cross-sectional transcriptomic data of cancer samples. PLoS Comput Biol 2021; 17:e1008379. [PMID: 33667222 PMCID: PMC7968745 DOI: 10.1371/journal.pcbi.1008379] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/03/2020] [Revised: 03/17/2021] [Accepted: 02/15/2021] [Indexed: 12/19/2022]
Abstract
Unraveling molecular regulatory networks underlying disease progression is critically important for understanding disease mechanisms and identifying drug targets. The existing methods for inferring gene regulatory networks (GRNs) rely mainly on time-course gene expression data. However, most available omics data from cross-sectional studies of cancer patients lack sufficient temporal information, which poses a key challenge for GRN inference. By quantifying the latent progression using a random walk-based manifold distance, we propose a latent-temporal progression-based Bayesian method, PROB, for inferring GRNs from the cross-sectional transcriptomic data of tumor samples. The robustness of PROB to the measurement variabilities in the data is mathematically proved and numerically verified. Performance evaluation on real data indicates that PROB outperforms other methods in both pseudotime inference and GRN inference. Applications to bladder cancer and breast cancer demonstrate that our method is effective in identifying key regulators of cancer progression or drug targets. The identified ACSS1 is experimentally validated to promote epithelial-to-mesenchymal transition of bladder cancer cells, and the predicted FOXM1-target interactions are verified and are predictive of relapse in breast cancer. Our study suggests new effective approaches to clinical transcriptomic data modeling for characterizing cancer progression and facilitates the translation of regulatory network-based approaches into precision medicine.
Affiliation(s)
- Xiaoqiang Sun
- Key Laboratory of Tropical Disease Control, Chinese Ministry of Education; Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- School of Mathematics, Sun Yat-sen University, Guangzhou, China
- Ji Zhang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong, China
- Qing Nie
- Department of Mathematics and Department of Developmental & Cell Biology, NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, California, United States of America
4
Tao Y, Lei H, Lee AV, Ma J, Schwartz R. Neural Network Deconvolution Method for Resolving Pathway-Level Progression of Tumor Clonal Expression Programs With Application to Breast Cancer Brain Metastases. Front Physiol 2020; 11:1055. [PMID: 33013452 PMCID: PMC7499245 DOI: 10.3389/fphys.2020.01055] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/31/2020] [Accepted: 07/31/2020] [Indexed: 02/03/2023]
Abstract
Metastasis is the primary mechanism by which cancer results in mortality and there are currently no reliable treatment options once it occurs, making the metastatic process a critical target for new diagnostics and therapeutics. Treating metastasis before it appears is challenging, however, in part because metastases may be quite distinct genomically from the primary tumors from which they presumably emerged. Phylogenetic studies of cancer development have suggested that changes in tumor genomics over stages of progression often result from shifts in the abundance of clonal cellular populations, as late stages of progression may derive from or select for clonal populations rare in the primary tumor. The present study develops computational methods to infer clonal heterogeneity and dynamics across progression stages via deconvolution and clonal phylogeny reconstruction of pathway-level expression signatures in order to reconstruct how these processes might influence average changes in genomic signatures over progression. We show, via application to a study of gene expression in a collection of matched breast primary tumor and metastatic samples, that the method can infer coarse-grained substructure and stromal infiltration across the metastatic transition. The results suggest that genomic changes observed in metastasis, such as gain of the ErbB signaling pathway, are likely caused by early events in clonal evolution followed by expansion of minor clonal populations in metastasis, a finding that may have translational implications for early detection or prevention of metastasis.
Affiliation(s)
- Yifeng Tao
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, United States
- Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA, United States
- Haoyun Lei
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, United States
- Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA, United States
- Adrian V Lee
- Department of Pharmacology and Chemical Biology, UPMC Hillman Cancer Center, Magee-Womens Research Institute, University of Pittsburgh, Pittsburgh, PA, United States
- Jian Ma
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, United States
- Russell Schwartz
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, United States
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, United States
5
Tao Y, Lei H, Fu X, Lee AV, Ma J, Schwartz R. Robust and accurate deconvolution of tumor populations uncovers evolutionary mechanisms of breast cancer metastasis. Bioinformatics 2020; 36:i407-i416. [PMID: 32657393 PMCID: PMC7355293 DOI: 10.1093/bioinformatics/btaa396] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/04/2023]
Abstract
MOTIVATION Cancer develops and progresses through a clonal evolutionary process. Understanding progression to metastasis is of particular clinical importance, but is not easily analyzed by recent methods because it generally requires studying samples gathered years apart, for which modern single-cell sequencing is rarely an option. Revealing the clonal evolution mechanisms in the metastatic transition thus still depends on unmixing tumor subpopulations from bulk genomic data. METHODS We develop a novel toolkit called robust and accurate deconvolution (RAD) to deconvolve biologically meaningful tumor populations from multiple transcriptomic samples spanning the two progression states. RAD uses gene module compression to mitigate considerable noise in RNA, and a hybrid optimizer to achieve a robust and accurate solution. Finally, we apply a phylogenetic algorithm to infer how associated cell populations adapt across the metastatic transition via changes in expression programs and cell-type composition. RESULTS We validated the superior robustness and accuracy of RAD over alternative algorithms on a real dataset, and validated the effectiveness of gene module compression on both simulated and real bulk RNA data. We further applied the methods to a breast cancer metastasis dataset, and discovered common early events that promote tumor progression and migration to different metastatic sites, such as dysregulation of ECM-receptor, focal adhesion and PI3k-Akt pathways. AVAILABILITY AND IMPLEMENTATION The source code of the RAD package, models, experiments and technical details such as parameters, is available at https://github.com/CMUSchwartzLab/RAD. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yifeng Tao
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15213, USA
- Haoyun Lei
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15213, USA
- Xuecong Fu
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Adrian V Lee
- Department of Pharmacology and Chemical Biology, UPMC Hillman Cancer Center, Magee-Womens Research Institute, Pittsburgh, PA 15213, USA
- Jian Ma
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Russell Schwartz
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
6
Deng AC, Sun XQ. Dynamic gene regulatory network reconstruction and analysis based on clinical transcriptomic data of colorectal cancer. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2020; 17:3224-3239. [PMID: 32987526 DOI: 10.3934/mbe.2020183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Inferring dynamic regulatory networks that rewire at different stages is a reasonable way to understand the mechanisms underlying cancer development. In this study, we reconstruct the stage-specific gene regulatory networks (GRNs) for colorectal cancer to understand dynamic changes of gene regulation across disease stages. We combined multiple sets of clinical transcriptomic data of colorectal cancer patients and employed a supervised approach to select an initial gene set for network construction. We then developed a dynamical system-based optimization method to infer dynamic GRNs by incorporating mutual information-based network sparsification and a dynamic cascade technique into an ordinary differential equations model. Dynamic GRNs at four different stages of colorectal cancer were reconstructed and analyzed. Several important genes were revealed based on the rewiring of the reconstructed GRNs. Our study demonstrated that reconstructing dynamic GRNs based on clinical transcriptomic profiling allows us to detect the dynamic trend of gene regulation as well as reveal critical genes for cancer development, which may be important candidate master regulators for further experimental testing.
Affiliation(s)
- An Cheng Deng
- School of Life Science, Sun Yat-sen University, Guangzhou 510275, China
- Xiao Qiang Sun
- Key Laboratory of Tropical Disease Control, Chinese Ministry of Education, Zhong-Shan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
7
Hu YC, Tiwari S, Mishra KK, Trivedi MC. Phylogenetics Algorithms and Applications. AMBIENT COMMUNICATIONS AND COMPUTER SYSTEMS 2018. [PMCID: PMC7123334 DOI: 10.1007/978-981-13-5934-7_17] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022]
Abstract
Phylogenetics is a powerful approach for tracing the evolution of present-day species. By studying phylogenetic trees, scientists gain a better understanding of how species have evolved while explaining the similarities and differences among species. Phylogenetic study can help in analysing the evolution of, and the similarities among, diseases and viruses, and can further help in developing vaccines against them. This paper explores computational solutions for building the phylogeny of species and highlights the benefits of alignment-free methods of phylogenetics. The paper also discusses the application of phylogenetic study to disease diagnosis and evolution.
Affiliation(s)
- Yu-Chen Hu
- Department of Computer Science and Information Management, Providence University, Taichung, Taiwan
- Shailesh Tiwari
- Department of Computer Science and Engineering, ABES Engineering College, Ghaziabad, Uttar Pradesh, India
- Krishn K. Mishra
- Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology, Allahabad, Uttar Pradesh, India
- Munesh C. Trivedi
- Department of Computer Science and Engineering, ABES Engineering College, Ghaziabad, Uttar Pradesh, India
8
Abstract
Rapid advances in high-throughput sequencing and a growing realization of the importance of evolutionary theory to cancer genomics have led to a proliferation of phylogenetic studies of tumour progression. These studies have yielded not only new insights but also a plethora of experimental approaches, sometimes reaching conflicting or poorly supported conclusions. Here, we consider this body of work in light of the key computational principles underpinning phylogenetic inference, with the goal of providing practical guidance on the design and analysis of scientifically rigorous tumour phylogeny studies. We survey the range of methods and tools available to the researcher, their key applications, and the various unsolved problems, closing with a perspective on the prospects and broader implications of this field.
Affiliation(s)
- Russell Schwartz
- Department of Biological Sciences and Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15217, USA
- Alejandro A Schäffer
- Computational Biology Branch, National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20892, USA
9
Riester M, Wu HJ, Zehir A, Gönen M, Moreira AL, Downey RJ, Michor F. Distance in cancer gene expression from stem cells predicts patient survival. PLoS One 2017; 12:e0173589. [PMID: 28333954 PMCID: PMC5363813 DOI: 10.1371/journal.pone.0173589] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/12/2016] [Accepted: 02/23/2017] [Indexed: 12/13/2022]
Abstract
The degree of histologic cellular differentiation of a cancer has been associated with prognosis but is subjectively assessed. We hypothesized that information about tumor differentiation of individual cancers could be derived objectively from cancer gene expression data, and would allow creation of a cancer phylogenetic framework that would correlate with clinical, histologic and molecular characteristics of the cancers, as well as predict prognosis. Here we utilized mRNA expression data from 4,413 patient samples with 7 diverse cancer histologies to explore the utility of ordering samples by their distance in gene expression from that of stem cells. A differentiation baseline was obtained by including expression data of human embryonic stem cells (hESC) and human mesenchymal stem cells (hMSC) for solid tumors, and of hESC and CD34+ cells for liquid tumors. We found that the correlation distance (the degree of similarity) between the gene expression profile of a tumor sample and that of stem cells orients cancers in a clinically coherent fashion. For all histologies analyzed (including carcinomas, sarcomas, and hematologic malignancies), patients with cancers with gene expression patterns most similar to that of stem cells had poorer overall survival. We also found that the genes in all undifferentiated cancers of diverse histologies that were most differentially expressed were associated with up-regulation of specific oncogenes and down-regulation of specific tumor suppressor genes. 
Thus, a stem cell-oriented phylogeny of cancers allows for the derivation of a novel cancer gene expression signature found in all undifferentiated forms of diverse cancer histologies, that is competitive in predicting overall survival in cancer patients compared to previously published prediction models, and is coherent in that gene expression was associated with up-regulation of specific oncogenes and down-regulation of specific tumor suppressor genes associated with regulation of the multicellular state.
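The ordering idea described in this abstract can be sketched in a few lines of Python. This is a toy illustration and not the authors' pipeline: it assumes that "correlation distance" means one minus the Pearson correlation, and the function names and expression values are invented for the example.

```python
import math


def correlation_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("a constant profile has no defined correlation")
    return 1.0 - cov / math.sqrt(var_x * var_y)


def order_by_stem_distance(samples, stem_profile):
    """Sort sample names from most to least stem-like (smallest distance first)."""
    return sorted(
        samples,
        key=lambda name: correlation_distance(samples[name], stem_profile),
    )


if __name__ == "__main__":
    # Made-up expression values over four hypothetical genes.
    stem = [5.0, 1.0, 4.0, 2.0]
    tumors = {
        "tumor_a": [10.0, 2.0, 8.0, 4.0],
        "tumor_b": [2.0, 4.0, 1.0, 5.0],
        "tumor_c": [4.0, 1.0, 2.0, 3.0],
    }
    for name in order_by_stem_distance(tumors, stem):
        print(name, round(correlation_distance(tumors[name], stem), 3))
```

Under the abstract's finding, samples at the front of such an ordering (most similar to stem cells) would be the ones expected to have poorer overall survival.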
Affiliation(s)
- Markus Riester
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, and Department of Biostatistics, Harvard School of Public Health, Boston, MA, United States of America
- Hua-Jun Wu
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, and Department of Biostatistics, Harvard School of Public Health, Boston, MA, United States of America
- Ahmet Zehir
- Cell Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, United States of America
- Mithat Gönen
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, United States of America
- Andre L. Moreira
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, United States of America
- Robert J. Downey
- Thoracic Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, United States of America
- * Corresponding author
- Franziska Michor
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, and Department of Biostatistics, Harvard School of Public Health, Boston, MA, United States of America
- * Corresponding author
10
Somarelli JA, Ware KE, Kostadinov R, Robinson JM, Amri H, Abu-Asab M, Fourie N, Diogo R, Swofford D, Townsend JP. PhyloOncology: Understanding cancer through phylogenetic analysis. Biochim Biophys Acta Rev Cancer 2016; 1867:101-108. [PMID: 27810337 DOI: 10.1016/j.bbcan.2016.10.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/20/2016] [Revised: 10/14/2016] [Accepted: 10/26/2016] [Indexed: 11/30/2022]
Abstract
Despite decades of research and an enormity of resultant data, cancer remains a significant public health problem. New tools and fresh perspectives are needed to obtain fundamental insights, to develop better prognostic and predictive tools, and to identify improved therapeutic interventions. With increasingly common genome-scale data, one suite of algorithms and concepts with potential to shed light on cancer biology is phylogenetics, a scientific discipline used in diverse fields. From grouping subsets of cancer samples to tracing subclonal evolution during cancer progression and metastasis, the use of phylogenetics is a powerful systems biology approach. Well-developed phylogenetic applications provide fast, robust approaches to analyze high-dimensional, heterogeneous cancer data sets. This article is part of a Special Issue entitled: Evolutionary principles - heterogeneity in cancer?, edited by Dr. Robert A. Gatenby.
Affiliation(s)
- Jason A Somarelli
- Duke Cancer Institute and the Department of Medicine, Duke University Medical Center, Durham, NC 27710, United States
- Kathryn E Ware
- Duke Cancer Institute and the Department of Medicine, Duke University Medical Center, Durham, NC 27710, United States
- Rumen Kostadinov
- Pediatric Oncology, School of Medicine, Johns Hopkins University, United States
- Jeffrey M Robinson
- Anatomy Department, College of Medicine, Howard University, Washington, DC 20059, United States; Digestive Disorders Unit, National Institute of Nursing Research, NIH, Bethesda, MD 20892, United States
- Hakima Amri
- Department of Biochemistry and Cellular and Molecular Biology, Georgetown University Medical Center, Washington, DC 20007, United States
- Mones Abu-Asab
- Section of Ultrastructural Biology, National Eye Institute, NIH, Bethesda, MD 20892, United States
- Nicolaas Fourie
- Digestive Disorders Unit, National Institute of Nursing Research, NIH, Bethesda, MD 20892, United States
- Rui Diogo
- Anatomy Department, College of Medicine, Howard University, Washington, DC 20059, United States
- David Swofford
- Department of Biology, Duke University Trinity College of Arts and Sciences, Durham, NC 27710, United States
- Jeffrey P Townsend
- Department of Biostatistics, Yale University, United States; Department of Ecology and Evolutionary Biology, Yale University, United States; Department of Program in Computational Biology and Bioinformatics, Yale University, United States
11
Nalbantoglu S, Abu-Asab M, Tan M, Zhang X, Cai L, Amri H. Study of Clinical Survival and Gene Expression in a Sample of Pancreatic Ductal Adenocarcinoma by Parsimony Phylogenetic Analysis. OMICS : A JOURNAL OF INTEGRATIVE BIOLOGY 2016; 20:442-7. [PMID: 27428255 PMCID: PMC4968342 DOI: 10.1089/omi.2016.0059] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is one of the rapidly growing forms of pancreatic cancer, with a poor prognosis and a 5-year survival rate of less than 5%. In this study, we characterized the genetic signatures and signaling pathways related to survival from PDAC, using a parsimony phylogenetic algorithm. We applied the parsimony phylogenetic algorithm to analyze the publicly available whole-genome in silico array analysis of a gene expression data set in 25 early-stage human PDAC specimens. We explain here that parsimony phylogenetics is an evolutionary analytical method that offers important promise to uncover clonal (driver) and nonclonal (passenger) aberrations in complex diseases. In our analysis, parsimony and statistical analyses did not identify significant correlations between survival times and gene expression values. Thus, the survival rankings did not appear to be significantly different between patients for any specific gene (p > 0.05). Also, we did not find a correlation between gene expression data and tumor stage in the present data set. While the present analysis was unable to identify in this relatively small sample of patients a molecular signature associated with pancreatic cancer prognosis, we suggest that future research and analyses with the parsimony phylogenetic algorithm in larger patient samples are worthwhile, given the devastating nature of pancreatic cancer and its early diagnosis, and the need for novel data analytic approaches. Future research practices might place greater emphasis on phylogenetics as one of the analytical paradigms, as our findings presented here are on the cusp of this shift, especially in the current era of Big Data and innovation policies advocating for greater data sharing and reanalysis.
Affiliation(s)
- Sinem Nalbantoglu
- Department of Biochemistry, Cellular and Molecular Biology, School of Medicine, Georgetown University, Washington, DC
- Mones Abu-Asab
- Laboratory of Immunology, Section of Immunopathology, National Eye Institute, Bethesda, Maryland
- Ming Tan
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC
- Xuemin Zhang
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC
- Ling Cai
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC
- Hakima Amri
- Department of Biochemistry, Cellular and Molecular Biology, School of Medicine, Georgetown University, Washington, DC
12
Xu Y, Qiu P, Roysam B. Unsupervised Discovery of Subspace Trends. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2015; 37:2131-2145. [PMID: 26353189 DOI: 10.1109/tpami.2015.2394475] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
This paper presents unsupervised algorithms for discovering previously unknown subspace trends in high-dimensional data sets without the benefit of prior information. A subspace trend is a sustained pattern of gradual/progressive changes within an unknown subset of feature dimensions. A fundamental challenge to subspace trend discovery is the presence of irrelevant data dimensions, noise, outliers, and confusion from multiple subspace trends driven by independent factors that are mixed in with each other. These factors can obscure the trends in conventional dimension reduction & projection based data visualizations. To overcome these limitations, we propose a novel graph-theoretic neighborhood similarity measure for detecting concordant progressive changes across data dimensions. Using this measure, we present an unsupervised algorithm for trend-relevant feature selection, subspace trend discovery, quantification of trend strength, and validation. Our method successfully identified verifiable subspace trends in diverse synthetic and real-world biomedical datasets. Visualizations derived from the selected trend-relevant features revealed biologically meaningful hidden subspace trend(s) that were obscured by irrelevant features and noise. Although our examples are drawn from the biological domain, the proposed algorithm is broadly applicable to exploratory analysis of high-dimensional data including visualization, hypothesis generation, knowledge discovery, and prediction in diverse other applications.
13
Yuan K, Sakoparnig T, Markowetz F, Beerenwinkel N. BitPhylogeny: a probabilistic framework for reconstructing intra-tumor phylogenies. Genome Biol 2015; 16:36. [PMID: 25786108 PMCID: PMC4359483 DOI: 10.1186/s13059-015-0592-6] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Received: 11/12/2014] [Accepted: 01/21/2015] [Indexed: 11/28/2022]
Abstract
Cancer has long been understood as a somatic evolutionary process, but many details of tumor progression remain elusive. Here, we present BitPhylogeny, a probabilistic framework to reconstruct intra-tumor evolutionary pathways. Using a full Bayesian approach, we jointly estimate the number and composition of clones in the sample as well as the most likely tree connecting them. We validate our approach in the controlled setting of a simulation study and compare it against several competing methods. In two case studies, we demonstrate how BitPhylogeny reconstructs tumor phylogenies from methylation patterns in colon cancer and from single-cell exomes in myeloproliferative neoplasm.
Affiliation(s)
- Ke Yuan
- University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, UK
- Thomas Sakoparnig
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Current address: Biozentrum, University of Basel, Basel, Switzerland
- Florian Markowetz
- University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, UK
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
14
|
Bellido M, Stirewalt DL, Zhao LP, Radich JP. Use of gene expression microarrays for the study of acute leukemia. Expert Rev Mol Diagn 2014; 6:733-47. [PMID: 17009907 DOI: 10.1586/14737159.6.5.733] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
Abstract
Genetic lesions found in acute leukemia drive the pathology of the disease in addition to forming reliable classifications of prognosis. However, there is still a reasonable heterogeneity of response among cases with the same genetic lesion. Moreover, many leukemia cases have no detectable genetic marker, and these cases have marked heterogeneity of response. How can we learn more about the genes and pathways involved with leukemogenesis and response in the midst of such complexity? Gene expression microarrays are experimental platforms that allow for the simultaneous evaluation of thousands of mRNA transcripts (the 'transcriptome'). This technology has revolutionized the study of leukemia, giving insight into genes and pathways involved in disease response and the biology involved in specific translocations.
Affiliation(s)
- Mar Bellido
- Fred Hutchinson Cancer Research Center, Clinical Research Division, Public Health Sciences Division, 1100 Fairview Ave N., Seattle, WA 98109, USA.
|
15
|
A graph spectrum based geometric biclustering algorithm. J Theor Biol 2013; 317:200-11. [DOI: 10.1016/j.jtbi.2012.10.012] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/27/2012] [Revised: 10/04/2012] [Accepted: 10/06/2012] [Indexed: 11/22/2022]
|
16
|
Pennington G, Smith CA, Shackney S, Schwartz R. Reconstructing tumor phylogenies from heterogeneous single-cell data. J Bioinform Comput Biol 2011; 5:407-27. [PMID: 17589968 DOI: 10.1142/s021972000700259x] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 10/02/2006] [Revised: 12/03/2006] [Accepted: 12/11/2006] [Indexed: 01/08/2023]
Abstract
Studies of gene expression in cancerous tumors have revealed that tumors presenting indistinguishable symptoms in the clinic can be substantially different entities at the molecular level. The ability to distinguish between these genetically distinct cancers will make possible more accurate prognoses and more finely targeted therapeutics, provided we can characterize commonly occurring cancer sub-types and the specific molecular abnormalities that produce them. We develop a new method for identifying these common tumor progression pathways by applying phylogeny inference algorithms to single-cell assays, taking advantage of information on tumor heterogeneity lost to prior microarray-based approaches. We combine this approach with expectation maximization to infer unknown parameters used in the phylogeny construction. We further develop new algorithms to merge inferred trees across different assays. We validate the expectation maximization method on simulated data and demonstrate the combined approach on a set of fluorescent in situ hybridization (FISH) data measuring cell-by-cell gene and chromosome copy numbers in a large sample of breast cancers. The results further validate the proposed computational methods by showing consistency with several previous findings on these cancers and provide novel insights into the mechanisms of tumor progression in these patients.
Affiliation(s)
- Gregory Pennington
- Computer Science Department, Carnegie Mellon University, 4400 Fifth Ave., Pittsburgh, PA 15213, USA.
|
17
|
Qiu P, Gentles AJ, Plevritis SK. Discovering biological progression underlying microarray samples. PLoS Comput Biol 2011; 7:e1001123. [PMID: 21533210 PMCID: PMC3077357 DOI: 10.1371/journal.pcbi.1001123] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 07/26/2010] [Accepted: 03/16/2011] [Indexed: 11/27/2022] Open
Abstract
In biological systems that undergo processes such as differentiation, a clear concept of progression exists. We present a novel computational approach, called Sample Progression Discovery (SPD), to discover patterns of biological progression underlying microarray gene expression data. SPD assumes that individual samples of a microarray dataset are related by an unknown biological process (i.e., differentiation, development, cell cycle, disease progression), and that each sample represents one unknown point along the progression of that process. SPD aims to organize the samples in a manner that reveals the underlying progression and to simultaneously identify subsets of genes that are responsible for that progression. We demonstrate the performance of SPD on a variety of microarray datasets that were generated by sampling a biological process at different points along its progression, without providing SPD any information of the underlying process. When applied to a cell cycle time series microarray dataset, SPD was not provided any prior knowledge of samples' time order or of which genes are cell-cycle regulated, yet SPD recovered the correct time order and identified many genes that have been associated with the cell cycle. When applied to B-cell differentiation data, SPD recovered the correct order of stages of normal B-cell differentiation and the linkage between preB-ALL tumor cells with their cell origin preB. When applied to mouse embryonic stem cell differentiation data, SPD uncovered a landscape of ESC differentiation into various lineages and genes that represent both generic and lineage specific processes. When applied to a prostate cancer microarray dataset, SPD identified gene modules that reflect a progression consistent with disease stages. 
SPD may be best viewed as a novel tool for synthesizing biological hypotheses because it provides a likely biological progression underlying a microarray dataset and, perhaps more importantly, the candidate genes that regulate that progression.
We present a novel computational approach, Sample Progression Discovery (SPD), to discover biological progression underlying a microarray dataset. In contrast to the majority of microarray data analysis methods, which identify differences between sample groups (normal vs. cancer, treated vs. control), SPD aims to identify an underlying progression among individual samples, both within and across sample groups. We validated SPD's ability to discover biological progression using datasets of cell cycle, B-cell differentiation, and mouse embryonic stem cell differentiation. We view SPD as a hypothesis generation tool when applied to datasets where the progression is unclear. For example, when applied to a microarray dataset of cancer samples, SPD assumes that the cancer samples collected from individual patients represent different stages during an intrinsic progression underlying cancer development. The inferred relationship among the samples may therefore indicate a trajectory or hierarchy of cancer progression, which serves as a hypothesis to be tested. SPD is not limited to microarray data analysis, and can be applied to a variety of high-dimensional datasets. We implemented SPD as a MATLAB graphical user interface, which is available at http://icbp.stanford.edu/software/SPD/.
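The idea of arranging unordered samples along an unknown progression can be illustrated with a small sketch. This is not the SPD implementation: it assumes that a set of progression-relevant features has already been chosen, builds a minimum spanning tree over the samples with Euclidean distance, and reads a candidate ordering off the longest path in that tree. The function names and toy data are hypothetical, and the direction of the returned path is arbitrary because the tree is undirected.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(samples):
    """Prim's algorithm; returns {index: [(neighbor, weight), ...]}."""
    n = len(samples)
    adjacency = {i: [] for i in range(n)}
    in_tree = {0}
    while len(in_tree) < n:
        # Cheapest edge that connects the current tree to a new sample.
        weight, i, j = min(
            (euclidean(samples[i], samples[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        adjacency[i].append((j, weight))
        adjacency[j].append((i, weight))
        in_tree.add(j)
    return adjacency


def farthest(adjacency, start):
    """Return (node, path) for the node farthest from start along tree edges."""
    best = (0.0, start, [start])
    stack = [(start, None, 0.0, [start])]
    while stack:
        node, parent, dist, path = stack.pop()
        if dist > best[0]:
            best = (dist, node, path)
        for nxt, w in adjacency[node]:
            if nxt != parent:
                stack.append((nxt, node, dist + w, path + [nxt]))
    return best[1], best[2]


def progression_order(samples):
    """Candidate ordering: the longest path (diameter) of the spanning tree."""
    adjacency = minimum_spanning_tree(samples)
    end, _ = farthest(adjacency, 0)
    _, path = farthest(adjacency, end)
    return path


# Toy data (assumed): five samples lying roughly along one trajectory,
# listed in scrambled order.
samples = [[2.0, 2.1], [0.0, 0.0], [3.0, 3.2], [1.0, 0.9], [4.0, 4.1]]
order = progression_order(samples)
print(order)
```

Samples that branch off the longest path are dropped by this simplification; a fuller treatment would keep the whole tree, which is closer to the "trajectory or hierarchy" the abstract describes.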
Affiliation(s)
- Peng Qiu
- Department of Radiology, Stanford University, Stanford, California, USA.
|
18
|
Li X, Chen J, Lü B, Peng S, Desper R, Lai M. -8p12-23 and +20q are predictors of subtypes and metastatic pathways in colorectal cancer: construction of tree models using comparative genomic hybridization data. OMICS: A Journal of Integrative Biology 2010; 15:37-47. [PMID: 21194300 DOI: 10.1089/omi.2010.0101] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
A substantial body of evidence suggests a genetically heterogeneous pattern and multiple pathways in colorectal cancer initiation and progression. In this study, we construct a branching tree and multiple distance-based tree models to elucidate these genetic patterns and pathways in colorectal cancer, using a data set comprising 244 cases of comparative genomic hybridization. We identify the six most common gains of chromosomal regions, 7p (37.0%), 7q11-32 (34.8%), 8q (48.3%), 13q (49.1%), 20p (36.1%), and 20q (50.4%), and the nine most common losses, 1p13-36 (30.9%), 4p15 (24.3%), 4q33-34 (24.3%), 8p12-23 (50.9%), 15q13-14 (23.5%), 15q24-25 (24.3%), 17p (34.8%), 18p (36.5%), and 18q (61.7%), in colorectal cancer. We classify colorectal cancer into two distinct groups: one preceding with -8p12-23, and the other with +20q. The sample-based classification tree also demonstrates that colorectal cancer can be classified into multiple subtypes marked by -8p12-23 and +20q. By comparing chromosomal abnormalities between primary and metastatic colorectal cancer, we identify five potential metastatic pathways: (-18q, -18p), (-8p12-23, -4p15, -4q33-34), (+20q, +20p), (+20q, +7p, +7q11-32), and +8q. -8p12-23 and +20q are inferred to be the two marker events of colorectal cancer metastasis. The current oncogenetic tree models may contribute to our understanding of molecular genetics in colorectal cancer. In particular, the metastatic pathways we describe may provide pivotal clues for metastatic candidate genes, and thus have an impact on the prediction and intervention of metastatic colorectal cancer.
Affiliation(s)
- Xiaobo Li
- Department of Pathology, School of Medicine, Zhejiang University, Hangzhou 310058, People's Republic of China
|
19
|
A differentiation-based phylogeny of cancer subtypes. PLoS Comput Biol 2010; 6:e1000777. [PMID: 20463876 PMCID: PMC2865519 DOI: 10.1371/journal.pcbi.1000777] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 02/17/2009] [Accepted: 04/02/2010] [Indexed: 12/20/2022] Open
Abstract
Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective systematic method to categorize tumor subtypes by maturation. In this paper, we introduce a novel computational algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia, breast cancer and liposarcoma subtypes and then apply it to a broader group of sarcomas. This ranking of tumor subtypes resulting from the application of our methodology allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors.
Gene expression profiling of malignancies is often held to demonstrate genes that are “up-regulated” or “down-regulated”, but the appropriate frame of reference against which observations should be compared has not been determined. Fully differentiated somatic cells arise from stem cells, with changes in gene expression that can be experimentally determined. If cancers arise as the result of an abruption of the differentiation process, then poorly differentiated cancers would have a gene expression more similar to stem cells than to normal differentiated tissue, and well differentiated cancers would have a gene expression more similar to fully differentiated cells than to stem cells. In this paper, we describe a novel computational algorithm that allows orientation of cancer gene expression between the poles of the gene expression of stem cells and of fully differentiated tissue. Our methodology allows the construction of a multi-branched phylogeny of human malignancies and can be used to identify genes related to differentiation as well as novel therapeutic targets.
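The notion of orienting a tumor profile between a stem-cell pole and a fully differentiated pole can be made concrete with a short sketch. This is an illustrative simplification, not the authors' algorithm: it assumes each subtype is summarized by one expression vector, uses Euclidean distance as the dissimilarity, and scores a subtype by its relative distance from the stem-cell profile. All names and values below are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean dissimilarity between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def differentiation_score(profile, stem, differentiated):
    """Score in [0, 1]: near 0 resembles stem cells, near 1 resembles differentiated tissue.

    Assumes the two reference profiles are not identical (otherwise the
    denominator could be zero).
    """
    d_stem = distance(profile, stem)
    d_diff = distance(profile, differentiated)
    return d_stem / (d_stem + d_diff)


def rank_subtypes(subtypes, stem, differentiated):
    """Return subtype names ordered from least to most differentiated."""
    return sorted(
        subtypes,
        key=lambda name: differentiation_score(subtypes[name], stem, differentiated),
    )


# Toy reference profiles and subtype profiles (assumed values, three genes each).
stem = [10.0, 1.0, 1.0]
differentiated = [1.0, 10.0, 10.0]
subtypes = {
    "well_differentiated": [2.0, 9.0, 9.0],
    "poorly_differentiated": [8.0, 3.0, 2.0],
    "moderately_differentiated": [5.0, 5.0, 6.0],
}

ranking = rank_subtypes(subtypes, stem, differentiated)
print(ranking)
```

A one-dimensional ranking like this captures only the ordering between the two poles; the multi-branched phylogeny described in the abstract would need additional structure that this sketch does not attempt.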
|
20
|
Schwartz R, Shackney SE. Applying unmixing to gene expression data for tumor phylogeny inference. BMC Bioinformatics 2010; 11:42. [PMID: 20089185 PMCID: PMC2823708 DOI: 10.1186/1471-2105-11-42] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Received: 07/27/2009] [Accepted: 01/20/2010] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND While in principle a seemingly infinite variety of combinations of mutations could result in tumor development, in practice it appears that most human cancers fall into a relatively small number of "sub-types," each characterized by a roughly equivalent sequence of mutations by which it progresses in different patients. There is currently great interest in identifying the common sub-types and applying them to the development of diagnostics or therapeutics. Phylogenetic methods have shown great promise for inferring common patterns of tumor progression, but suffer from limits of the technologies available for assaying differences between and within tumors. One approach to tumor phylogenetics uses differences between single cells within tumors, gaining valuable information about intra-tumor heterogeneity but allowing only a few markers per cell. An alternative approach uses tissue-wide measures of whole tumors to provide a detailed picture of averaged tumor state but at the cost of losing information about intra-tumor heterogeneity.
RESULTS The present work applies "unmixing" methods, which separate complex data sets into combinations of simpler components, to attempt to gain advantages of both tissue-wide and single-cell approaches to cancer phylogenetics. We develop an unmixing method to infer recurring cell states from microarray measurements of tumor populations and use the inferred mixtures of states in individual tumors to identify possible evolutionary relationships among tumor cells. Validation on simulated data shows the method can accurately separate small numbers of cell states and infer phylogenetic relationships among them. Application to a lung cancer dataset shows that the method can identify cell states corresponding to common lung tumor types and suggest possible evolutionary relationships among them that show good correspondence with our current understanding of lung tumor development.
CONCLUSIONS Unmixing methods provide a way to make use of both intra-tumor heterogeneity and large probe sets for tumor phylogeny inference, establishing a new avenue towards the construction of detailed, accurate portraits of common tumor sub-types and the mechanisms by which they develop. These reconstructions are likely to have future value in discovering and diagnosing novel cancer sub-types and in identifying targets for therapeutic development.
Affiliation(s)
- Russell Schwartz
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- Stanley E Shackney
- Departments of Human Oncology and Human Genetics, Drexel University School of Medicine, Pittsburgh, PA, USA
|
21
|
Mathematical modeling of carcinogenesis based on chromosome aberration data. Chin J Cancer Res 2009. [DOI: 10.1007/s11670-009-0240-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Open
|
22
|
Park Y, Shackney S, Schwartz R. Network-based inference of cancer progression from microarray data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2009; 6:200-212. [PMID: 19407345 DOI: 10.1109/tcbb.2008.126] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 05/27/2023]
Abstract
Cancer cells exhibit a common phenotype of uncontrolled cell growth, but this phenotype may arise from many different combinations of mutations. By inferring how cells evolve in individual tumors, a process called cancer progression, we may be able to identify important mutational events for different tumor types, potentially leading to new therapeutics and diagnostics. Prior work has shown that it is possible to infer frequent progression pathways by using gene expression profiles to estimate "distances" between tumors. Here, we apply gene network models to improve these estimates of evolutionary distance by controlling for correlations among coregulated genes. We test three variants of this approach: one using an optimized best-fit network, another using sampling to infer a high-confidence subnetwork, and one using a modular network inferred from clusters of similarly expressed genes. Application to lung cancer and breast cancer microarray data sets shows small improvements in phylogenies when correcting from the optimized network and more substantial improvements when correcting from the sampled or modular networks. Our results suggest that a network correction approach improves estimates of tumor similarity, but sophisticated network models are needed to control for the large hypothesis space and sparse data currently available.
Affiliation(s)
- Yongjin Park
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
|
23
|
Abu-Asab M, Chaouchi M, Amri H. Evolutionary medicine: A meaningful connection between omics, disease, and treatment. Proteomics Clin Appl 2008; 2:122-134. [PMID: 18458745 PMCID: PMC2367146 DOI: 10.1002/prca.200780047] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 07/01/2007] [Indexed: 12/11/2022]
Abstract
The evolutionary nature of diseases requires that their omics be analyzed by evolution-compatible analytical tools such as parsimony phylogenetics in order to reveal common mutations and pathway modifications. Since the heterogeneity of omics data renders some analytical tools, such as phenetic clustering and Bayesian likelihood, inefficient, a parsimony phylogenetic paradigm seems to provide a connection between omics and medicine. It offers a seamless, dynamic, predictive, and multidimensional analytical approach that reveals biological classes and disease ontogenies; its analysis can be translated into practice for early detection, diagnosis, biomarker identification, prognosis, and assessment of treatment. Parsimony phylogenetics identifies classes of specimens, the clades, by their shared derived expressions, the synapomorphies, which are also the potential biomarkers for the classes that they delimit. Synapomorphies are determined through polarity assessment (ancestral vs. derived) of m/z or gene-expression values and parsimony analysis; this process also permits intra- and interplatform comparability and produces higher concordance between platforms. Furthermore, major trends in the data are interpreted from the graphical representation of the data as a tree diagram termed a cladogram; it depicts directionality of change, identifies the transitional patterns from healthy to diseased, and can be developed into a predictive tool for early detection.
Affiliation(s)
- Mones Abu-Asab
- Laboratory of Pathology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Mohamed Chaouchi
- Department of Physiology and Biophysics, School of Medicine, Georgetown University, Washington, DC, USA
- Hakima Amri
- Department of Physiology and Biophysics, School of Medicine, Georgetown University, Washington, DC, USA
|
24
|
Zhao H, Liew AWC, Xie X, Yan H. A new geometric biclustering algorithm based on the Hough transform for analysis of large-scale microarray data. J Theor Biol 2007; 251:264-74. [PMID: 18199458 DOI: 10.1016/j.jtbi.2007.11.030] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Received: 07/23/2007] [Revised: 10/17/2007] [Accepted: 11/29/2007] [Indexed: 11/30/2022]
Abstract
Biclustering is an important tool in microarray analysis when only a subset of genes is co-regulated in a subset of conditions. Unlike standard clustering analyses, biclustering performs simultaneous classification in both gene and condition directions in a microarray data matrix. However, the biclustering problem is inherently intractable and computationally complex. In this paper, we present a new biclustering algorithm based on the geometrical viewpoint of coherent gene expression profiles. In this method, we perform pattern identification based on the Hough transform in a column-pair space. The algorithm is especially suitable for the biclustering analysis of large-scale microarray data. Our studies show that the approach can discover significant biclusters under increased noise levels and regulatory complexity. Furthermore, we also test the ability of our method to locate biologically verifiable biclusters within an annotated set of genes.
Affiliation(s)
- Hongya Zhao
- Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong.
|
25
|
Jazbec J, Todorovski L, Jereb B. Classification tree analysis of second neoplasms in survivors of childhood cancer. BMC Cancer 2007; 7:27. [PMID: 17270060 PMCID: PMC1802085 DOI: 10.1186/1471-2407-7-27] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 06/16/2006] [Accepted: 02/02/2007] [Indexed: 11/22/2022] Open
Abstract
Background: Reports on childhood cancer survivors estimate that the cumulative probability of developing secondary neoplasms varies from 3.3% to 25% at 25 years from diagnosis, and that the risk of developing another cancer is several times greater than in the general population. Methods: In our retrospective study, we used the classification tree multivariate method on a group of 849 first cancer survivors to identify childhood cancer patients with the greatest risk for development of secondary neoplasms. Results: In the observed group of patients, 34 developed a secondary neoplasm after treatment of the primary cancer. Analysis of parameters present at the treatment of the first cancer exposed two groups of patients at special risk for secondary neoplasm. The first are female patients treated for Hodgkin's disease at an age between 10 and 15 years, whose treatment included radiotherapy. The second group at special risk were male patients with acute lymphoblastic leukemia who were treated at an age between 4.6 and 6.6 years. Conclusion: The risk groups identified in our study are similar to the results of studies that used more conventional approaches. The usefulness of our approach in the study of the occurrence of second neoplasms should be confirmed in a larger sample study, but the user-friendly presentation of results makes it attractive for further studies.
Affiliation(s)
- Janez Jazbec
- Division of Oncology and Hematology, Department of Pediatrics, Medical Center, Vrazov trg 1, Ljubljana, Slovenia
- Berta Jereb
- Institute of Oncology, Zaloška 2, Ljubljana, Slovenia
|
26
|
Foulkes WD. BRCA1 and BRCA2: chemosensitivity, treatment outcomes and prognosis. Fam Cancer 2006; 5:135-42. [PMID: 16736282 DOI: 10.1007/s10689-005-2832-5] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.1] [Received: 08/27/2004] [Accepted: 02/09/2005] [Indexed: 01/04/2023]
Abstract
BRCA1 and BRCA2 are important breast and ovarian cancer susceptibility genes, and mutations in these two genes confer lifetime risks of breast cancer of up to 80% and ovarian cancer risks of up to 40%. Clinico-pathological studies have identified features that are specific to BRCA1-related breast cancer, but this has been more difficult for BRCA2-related breast cancer. Ovarian cancers due to BRCA1 or BRCA2 mutations cannot usually be distinguished from their non-hereditary counterparts on morphological grounds, but micro-array data suggest that differences do exist. Prognostic studies have shown that breast cancer in a BRCA1 mutation carrier is likely to have a similar, or worse, outcome than that occurring in a BRCA2- or non-carrier of the same age. By contrast, most studies indicate that women developing a BRCA1/2-related ovarian cancer have an improved survival compared with non-carriers, particularly if they receive platinum-based therapy. In support of this, in vitro chemo-sensitivity studies have found that human cells lacking BRCA1 may be particularly sensitive to cisplatinum and to other drugs that cause double-strand breaks in DNA. Nevertheless, in breast cancer, little is known regarding clinically important differences in response to chemotherapy between BRCA1/2 mutation carriers and non-carriers, and between different chemotherapeutic regimens within existing series of BRCA1/2 mutation carriers. There are no published prospective studies. It is hoped that, in the near future, randomised controlled trials will be started with the aim of answering these important clinical questions.
Affiliation(s)
- William D Foulkes
- Program in Cancer Genetics, Departments of Oncology and Human Genetics, McGill University, Montreal, Quebec, Canada, H2W 1S6.
|
27
|
Abstract
BRCA2 was identified in 1995, one year after BRCA1. In terms of knowledge of the function of its product, BRCA2 has remained the less well-characterised gene. Both BRCA1 and BRCA2 are closely implicated in the repair of double-strand breaks in DNA by homologous recombination, but beyond that a function for BRCA2 has been hard to discern. A recent study has extended the function of BRCA2 to the regulation of cell cleavage and separation. Other groups have also shown how BRCA2, RAD51 and DSS1 co-exist in a ménage à trois and how the disruption of any one of the three cohabitants can have disastrous consequences for the cell.
Affiliation(s)
- Teresa M Rudkin
- Department of Human Genetics, McGill University, 2300 Tupper Street, A620, Montreal, H3H 1P3, Canada
|
28
|
Okoh AI, . MKB, . OOO, . EO. The Culturable Microbial and Chemical Qualities of Some Waters Used for Drinking and Domestic Purposes in a Typical Rural Setting of Southwestern Nigeria. ACTA ACUST UNITED AC 2005. [DOI: 10.3923/jas.2005.1041.1048] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]
|