1
Fu D, Weng X, Su Y, Hong B, Zhao A, Lin J. Establishing a model composed of immune-related gene-modules to predict tumor immunotherapy response. Sci Rep 2024; 14:16630. [PMID: 39025898 PMCID: PMC11258235 DOI: 10.1038/s41598-024-67742-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 07/15/2024] [Indexed: 07/20/2024] Open
Abstract
At present, tumor immunotherapy has been widely applied to treat various cancers. However, the accuracy of predicting treatment efficacy has not yet achieved a significant breakthrough. This study aimed to construct a prediction model based on the modified WGCNA algorithm to precisely judge the anti-tumor immune response. First, we used a murine colon cancer model to screen corresponding DEGs according to different groups. GSEA was used to analyze the potential mechanisms of the immune-related DEGs (irDEGs) in each group. Subsequently, the intersection of the irDEGs in every group was acquired, and 7 gene-modules were mapped. Finally, 4 gene-modules including cogenes, antiPD-1 immu-genes, chemo immu-genes and comb immu-genes, were selected for subsequent study. Furthermore, a clinical dataset of gastric cancer patients receiving immunotherapy was enrolled, and the irDEGs were identified. A total of 34 vital irDEGs were obtained from the intersections of the vital irDEGs and the four gene-modules. Next, the vital irDEGs were analyzed by the modified WGCNA algorithm, and the correlation coefficients between the 4 gene-modules and the response status to immunotherapy were calculated. Thus, a prediction model based on correlation coefficients was built, and the corresponding model scores were acquired. The AUC calculated according to the model score was 0.727, which was non-inferior to that of the ESTIMATE score and the TIDE score. Meanwhile, the AUC calculated according to the classification of the model scores was 0.705, which was non-inferior to that of the ESTIMATE classification and the TIDE classification. The prediction accuracy of the model was validated in clinical datasets of other cancers.
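The AUC values quoted in this abstract summarise how well a continuous score separates responders from non-responders. As a minimal sketch (not the authors' code), the AUC can be computed as the probability that a randomly chosen responder scores higher than a randomly chosen non-responder, with ties counted as one half. The function name `roc_auc` and the toy scores and labels are illustrative assumptions, not data from the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    labels: 1 = responder, 0 = non-responder.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    # Count responder/non-responder pairs ranked correctly; ties count 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model scores and response labels, made up for illustration.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))
```

In this toy input, 7 of the 9 responder/non-responder pairs are ranked correctly. The pairwise form is quadratic in the sample size, which is acceptable for cohort-sized data; a rank-based formula would be the usual choice for larger inputs.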
Affiliation(s)
- Deqiang Fu
  - Department of Oncology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
- Xiaoyuan Weng
  - Thyroid and Breast Surgery, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
  - Quanzhou Medical College, Quanzhou, China
- Yunxia Su
  - Department of Oncology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
- Binhuang Hong
  - Department of Oncology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
- Aiyue Zhao
  - Department of Oncology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
- Jianqing Lin
  - Thyroid and Breast Surgery, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
2
Sun YE, Zhou HJ, Li JJ. Bipartite tight spectral clustering (BiTSC) algorithm for identifying conserved gene co-clusters in two species. Bioinformatics 2021; 37:1225-1233. [PMID: 32814973 DOI: 10.1093/bioinformatics/btaa741] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/06/2019] [Revised: 05/20/2020] [Accepted: 08/13/2020] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Gene clustering is a widely used technique that has enabled computational prediction of unknown gene functions within a species. However, it remains a challenge to refine gene function prediction by leveraging evolutionarily conserved genes in another species. This challenge calls for a new computational algorithm to identify gene co-clusters in two species, so that genes in each co-cluster exhibit similar expression levels in each species and strong conservation between the species. RESULTS Here, we develop the bipartite tight spectral clustering (BiTSC) algorithm, which identifies gene co-clusters in two species based on gene orthology information and gene expression data. BiTSC novelly implements a formulation that encodes gene orthology as a bipartite network and gene expression data as node covariates. This formulation allows BiTSC to adopt and combine the advantages of multiple unsupervised learning techniques: kernel enhancement, bipartite spectral clustering, consensus clustering, tight clustering and hierarchical clustering. As a result, BiTSC is a flexible and robust algorithm capable of identifying informative gene co-clusters without forcing all genes into co-clusters. Another advantage of BiTSC is that it does not rely on any distributional assumptions. Beyond cross-species gene co-clustering, BiTSC also has wide applications as a general algorithm for identifying tight node co-clusters in any bipartite network with node covariates. We demonstrate the accuracy and robustness of BiTSC through comprehensive simulation studies. In a real data example, we use BiTSC to identify conserved gene co-clusters of Drosophila melanogaster and Caenorhabditis elegans, and we perform a series of downstream analysis to both validate BiTSC and verify the biological significance of the identified co-clusters. AVAILABILITY AND IMPLEMENTATION The Python package BiTSC is open-access and available at https://github.com/edensunyidan/BiTSC. 
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yidan Eden Sun
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA
- Heather J Zhou
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA
- Jingyi Jessica Li
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA
  - Department of Human Genetics, University of California, Los Angeles, CA 90095-7088, USA
  - Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766, USA
3
Lu J, Cao X, Zhong S. A likelihood approach to testing hypotheses on the co-evolution of epigenome and genome. PLoS Comput Biol 2018; 14:e1006673. [PMID: 30586383 PMCID: PMC6324829 DOI: 10.1371/journal.pcbi.1006673] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/26/2018] [Revised: 01/08/2019] [Accepted: 11/26/2018] [Indexed: 01/03/2023] Open
Abstract
Central questions to epigenome evolution include whether interspecies changes of histone modifications are independent of evolutionary changes of DNA, and if there is dependence whether they depend on any specific types of DNA sequence changes. Here, we present a likelihood approach for testing hypotheses on the co-evolution of genome and histone modifications. The gist of this approach is to convert evolutionary biology hypotheses into probabilistic forms, by explicitly expressing the joint probability of multispecies DNA sequences and histone modifications, which we refer to as a class of Joint Evolutionary Model for the Genome and the Epigenome (JEMGE). JEMGE can be summarized as a mixture model of four components representing four evolutionary hypotheses, namely dependence and independence of interspecies epigenomic variations to underlying sequence substitutions and to underlying sequence insertions and deletions (indels). We implemented a maximum likelihood method to fit the models to the data. Based on comparison of likelihoods, we inferred whether interspecies epigenomic variations depended on substitution or indels in local genomic sequences based on DNase hypersensitivity and spermatid H3K4me3 ChIP-seq data from human and rhesus macaque. Approximately 5.5% of homologous regions in the genomes exhibited H3K4me3 modification in either species, among which approximately 67% homologous regions exhibited local-sequence-dependent interspecies H3K4me3 variations. Substitutions accounted for less local-sequence-dependent H3K4me3 variations than indels. Among transposon-mediated indels, ERV1 insertions and L1 insertions were most strongly associated with H3K4me3 gains and losses, respectively. By initiating probabilistic formulation on the co-evolution of genomes and epigenomes, JEMGE helps to bring evolutionary biology principles to comparative epigenomic studies. 
Epigenetic modifications play a significant role in gene regulation and thus heavily influence phenotypic outcomes. Whereas cross-species epigenomic comparisons have been fruitful in revealing the function of epigenetic modifications, it remains unclear how the epigenome changes across species. A central question in epigenome evolution studies is whether interspecies epigenomic variations rely on genomic changes in cis and, if partially so, whether different genomic changes have distinct impacts. To tackle this question, we initiated a likelihood-based approach, in which different hypotheses related to the co-evolution of the genome and the epigenome could be converted into probabilistic models. By fitting the models to actual data, each model yielded a likelihood, and the hypothesis corresponding to the largest likelihood was selected as the one best supported by the observed data. In this work, we focused on the influence of two types of underlying sequence changes: substitutions, and insertions and deletions (indels). We quantitatively assessed the dependence of H3K4me3 variations on substitutions and indels between human and rhesus, and separated their relative impacts within each genomic region with H3K4me3. The methodology presented here provides a framework for modeling the epigenome together with the genome and a quantitative approach to test different evolutionary hypotheses.
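The idea of turning competing hypotheses into probabilistic models and comparing their fitted likelihoods can be illustrated with a deliberately simplified toy, which is not JEMGE itself. The sketch assumes each homologous region is reduced to two binary flags (local sequence change yes/no, epigenomic variation yes/no) and fits two Bernoulli models by maximum likelihood: one where variation is independent of sequence change and one where it depends on it. Because these two toy models are nested, the dependence model can never have a lower maximised likelihood, so the sketch reports a likelihood-ratio statistic instead of simply picking the larger value. The function names, the 2x2 count table, and the counts are all hypothetical.

```python
import math


def bernoulli_loglik(k: int, n: int) -> float:
    """Maximised Bernoulli log-likelihood for k events in n trials (MLE p = k/n)."""
    if k == 0 or k == n:
        return 0.0  # fitted p is 0 or 1, so every observation has probability 1
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)


def compare_hypotheses(table):
    """table[s][e] = number of regions with sequence change s and variation e (0/1).

    Returns (loglik_independent, loglik_dependent, likelihood_ratio_statistic).
    """
    n0 = table[0][0] + table[0][1]  # regions without a sequence change
    n1 = table[1][0] + table[1][1]  # regions with a sequence change
    k0, k1 = table[0][1], table[1][1]  # regions showing epigenomic variation
    ll_indep = bernoulli_loglik(k0 + k1, n0 + n1)  # one shared rate
    ll_dep = bernoulli_loglik(k0, n0) + bernoulli_loglik(k1, n1)  # one rate per group
    return ll_indep, ll_dep, 2.0 * (ll_dep - ll_indep)


# Hypothetical counts, made up for illustration only.
toy_counts = [[90, 10], [40, 60]]
ll_indep, ll_dep, lr = compare_hypotheses(toy_counts)
print(ll_indep, ll_dep, lr)
```

Under the independence hypothesis the statistic is approximately chi-square distributed with one degree of freedom, so a value above about 3.84 would favour dependence at the 5% level in this toy setting.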
Affiliation(s)
- Jia Lu
  - Department of Bioengineering, University of California San Diego, La Jolla, California, United States of America
- Xiaoyi Cao
  - Department of Bioengineering, University of California San Diego, La Jolla, California, United States of America
- Sheng Zhong
  - Department of Bioengineering, University of California San Diego, La Jolla, California, United States of America
4
Chemical compound-based direct reprogramming for future clinical applications. Biosci Rep 2018; 38:BSR20171650. [PMID: 29739872 PMCID: PMC5938430 DOI: 10.1042/bsr20171650] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 12/07/2017] [Revised: 03/29/2018] [Accepted: 04/11/2018] [Indexed: 12/14/2022] Open
Abstract
Recent studies have revealed that a combination of chemical compounds enables direct reprogramming from one somatic cell type into another without the use of transgenes by regulating cellular signaling pathways and epigenetic modifications. The generation of induced pluripotent stem (iPS) cells generally requires virus vector-mediated expression of multiple transcription factors, which might disrupt genomic integrity and proper cell functions. The direct reprogramming is a promising alternative to rapidly prepare different cell types by bypassing the pluripotent state. Because the strategy also depends on forced expression of exogenous lineage-specific transcription factors, the direct reprogramming in a chemical compound-based manner is an ideal approach to further reduce the risk for tumorigenesis. So far, a number of reported research efforts have revealed that combinations of chemical compounds and cell-type specific medium transdifferentiate somatic cells into desired cell types including neuronal cells, glial cells, neural stem cells, brown adipocytes, cardiomyocytes, somatic progenitor cells, and pluripotent stem cells. These desired cells rapidly converted from patient-derived autologous fibroblasts can be applied for their own transplantation therapy to avoid immune rejection. However, complete chemical compound-induced conversions remain challenging particularly in adult human-derived fibroblasts compared with mouse embryonic fibroblasts (MEFs). This review summarizes up-to-date progress in each specific cell type and discusses prospects for future clinical application toward cell transplantation therapy.
5
Bialkowska AB, Yang VW, Mallipattu SK. Krüppel-like factors in mammalian stem cells and development. Development 2017; 144:737-754. [PMID: 28246209 DOI: 10.1242/dev.145441] [Citation(s) in RCA: 79] [Impact Index Per Article: 11.3] [Indexed: 12/27/2022]
Abstract
Krüppel-like factors (KLFs) are a family of zinc-finger transcription factors that are found in many species. Recent studies have shown that KLFs play a fundamental role in regulating diverse biological processes such as cell proliferation, differentiation, development and regeneration. Of note, several KLFs are also crucial for maintaining pluripotency and, hence, have been linked to reprogramming and regenerative medicine approaches. Here, we review the crucial functions of KLFs in mammalian embryogenesis, stem cell biology and regeneration, as revealed by studies of animal models. We also highlight how KLFs have been implicated in human diseases and outline potential avenues for future research.
Affiliation(s)
- Agnieszka B Bialkowska
  - Division of Gastroenterology, Department of Medicine, Stony Brook University School of Medicine, Stony Brook, NY 11794-8176, USA
- Vincent W Yang
  - Division of Gastroenterology, Department of Medicine, Stony Brook University School of Medicine, Stony Brook, NY 11794-8176, USA
  - Department of Physiology and Biophysics, Stony Brook University School of Medicine, Stony Brook, NY 11794-8176, USA
- Sandeep K Mallipattu
  - Division of Nephrology, Department of Medicine, Stony Brook University School of Medicine, Stony Brook, NY 11794-8176, USA
6
Differences in the Early Development of Human and Mouse Embryonic Stem Cells. PLoS One 2015; 10:e0140803. [PMID: 26473594 PMCID: PMC4608779 DOI: 10.1371/journal.pone.0140803] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 01/09/2015] [Accepted: 09/30/2015] [Indexed: 01/22/2023] Open
Abstract
We performed a systematic analysis of gene expression features in early (10–21 days) development of human vs mouse embryonic stem cells (hESCs vs mESCs). Many developmental features were found to be conserved, and a majority of differentially regulated genes show similar expression changes in both organisms. The similarity is especially evident when gene expression profiles are clustered together and the properties of the clustered groups of genes are compared. The first 10 days of mESC development match the features of hESC development within 21 days, in accordance with the differences in population doubling time between human and mouse ESCs. At the same time, several important differences are seen. There is a clear difference in the initial expression change of transcription factors and stimulus-responsive genes, which may be caused by differences in experimental procedures. However, we also found that some biological processes develop differently; this can clearly be shown, for example, for neuron and sensory organ development. Some groups of genes show peaks in expression levels during development, and these peaks cannot be claimed to occur at the same time points in the two organisms, or for the same groups of (orthologous) genes. We also detected a larger number of upregulated genes during development of mESCs than of hESCs. The differences were quantified by comparing promoters of related genes. Most gene groups behave similarly and have similar transcription factor (TF) binding sites in their promoters. A few groups of genes have similar promoters but are expressed differently in the two species. Interestingly, there are groups of genes that are expressed similarly although they have different promoters, which can be shown by comparing their TF binding sites. Namely, a large group of similarly expressed cell cycle-related genes is found to have discrepant TF binding properties in mouse vs human.
7
Callihan P, Ali MW, Salazar H, Quach N, Wu X, Stice SL, Hooks SB. Convergent regulation of neuronal differentiation and Erk and Akt kinases in human neural progenitor cells by lysophosphatidic acid, sphingosine 1-phosphate, and LIF: specific roles for the LPA1 receptor. ASN Neuro 2014; 6:6/6/1759091414558416. [PMID: 25424429 PMCID: PMC4357610 DOI: 10.1177/1759091414558416] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022] Open
Abstract
The bioactive lysophospholipids lysophosphatidic acid (LPA) and sphingosine 1-phosphate (S1P) have diverse effects on the developing nervous system and neural progenitors, but the molecular basis for their pleiotropic effects is poorly understood. We previously defined LPA and S1P signaling in proliferating human neural progenitor (hNP) cells, and the current study investigates their role in neuronal differentiation of these cells. Differentiation in the presence of LPA or S1P significantly enhanced cell survival and decreased expression of neuronal markers. Further, the LPA receptor antagonist Ki16425 fully blocked the effects of LPA, and differentiation in the presence of Ki16425 dramatically enhanced neurite length. LPA and S1P robustly activated Erk, but surprisingly both strongly suppressed Akt activation. Ki16425 and pertussis toxin blocked LPA activation of Erk but not LPA inhibition of Akt, suggesting distinct receptor and G-protein subtypes mediate these effects. Finally, we explored cross talk between lysophospholipid signaling and the cytokine leukemia inhibitory factor (LIF). LPA/S1P effects on neuronal differentiation were amplified in the presence of LIF. Similarly, the ability of LPA/S1P to regulate Erk and Akt was impacted by the presence of LIF; LIF enhanced the inhibitory effect of LPA/S1P on Akt phosphorylation, while LIF blunted the activation of Erk by LPA/S1P. Taken together, our results suggest that LPA and S1P enhance survival and inhibit neuronal differentiation of hNP cells, and LPA1 is critical for the effect of LPA. The pleiotropic effects of LPA may reflect differences in receptor subtype expression or cross talk with LIF receptor signaling.
Affiliation(s)
- Phillip Callihan
  - Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA, USA
- Mourad W Ali
  - Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA, USA
- Hector Salazar
  - Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA, USA
- Nhat Quach
  - Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA, USA
- Xian Wu
  - Department of Animal and Dairy Science, Regenerative Bioscience Center, University of Georgia, Athens, GA, USA
- Steven L Stice
  - Department of Animal and Dairy Science, Regenerative Bioscience Center, University of Georgia, Athens, GA, USA
- Shelley B Hooks
  - Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA, USA
8
Clinical significance of the stem cell gene Oct-4 in cervical cancer. Tumour Biol 2014; 35:5339-45. [DOI: 10.1007/s13277-014-1696-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 12/30/2013] [Accepted: 01/26/2014] [Indexed: 12/11/2022] Open
9
Chambers EV, Bickmore WA, Semple CA. Divergence of mammalian higher order chromatin structure is associated with developmental loci. PLoS Comput Biol 2013; 9:e1003017. [PMID: 23592965 PMCID: PMC3617018 DOI: 10.1371/journal.pcbi.1003017] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 10/08/2012] [Accepted: 02/18/2013] [Indexed: 02/03/2023] Open
Abstract
Several recent studies have examined different aspects of mammalian higher order chromatin structure - replication timing, lamina association and Hi-C inter-locus interactions - and have suggested that most of these features of genome organisation are conserved over evolution. However, the extent of evolutionary divergence in higher order structure has not been rigorously measured across the mammalian genome, and until now little has been known about the characteristics of any divergent loci present. Here, we generate a dataset combining multiple measurements of chromatin structure and organisation over many embryonic cell types for both human and mouse that, for the first time, allows a comprehensive assessment of the extent of structural divergence between mammalian genomes. Comparison of orthologous regions confirms that all measurable facets of higher order structure are conserved between human and mouse, across the vast majority of the detectably orthologous genome. This broad similarity is observed in spite of many loci possessing cell type specific structures. However, we also identify hundreds of regions (from 100 Kb to 2.7 Mb in size) showing consistent evidence of divergence between these species, constituting at least 10% of the orthologous mammalian genome and encompassing many hundreds of human and mouse genes. These regions show unusual shifts in human GC content, are unevenly distributed across both genomes, and are enriched in human subtelomeric regions. Divergent regions are also relatively enriched for genes showing divergent expression patterns between human and mouse ES cells, implying these regions cause divergent regulation. Particular divergent loci are strikingly enriched in genes implicated in vertebrate development, suggesting important roles for structural divergence in the evolution of mammalian developmental programmes. 
These data suggest that, though relatively rare in the mammalian genome, divergence in higher order chromatin structure has played important roles during evolution.
Affiliation(s)
- Emily V. Chambers
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Wendy A. Bickmore
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Colin A. Semple
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
10
Wang J, Wu G, Chen L, Zhang W. Cross-species transcriptional network analysis reveals conservation and variation in response to metal stress in cyanobacteria. BMC Genomics 2013; 14:112. [PMID: 23421563 PMCID: PMC3598940 DOI: 10.1186/1471-2164-14-112] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 10/04/2012] [Accepted: 02/13/2013] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND As one of the most dominant bacterial groups on Earth, cyanobacteria play a pivotal role in the global carbon cycling and the Earth atmosphere composition. Understanding their molecular responses to environmental perturbations has important scientific and environmental values. Since important biological processes or networks are often evolutionarily conserved, the cross-species transcriptional network analysis offers a useful strategy to decipher conserved and species-specific transcriptional mechanisms that cells utilize to deal with various biotic and abiotic disturbances, and it will eventually lead to a better understanding of associated adaptation and regulatory networks. RESULTS In this study, the Weighted Gene Co-expression Network Analysis (WGCNA) approach was used to establish transcriptional networks for four important cyanobacteria species under metal stress, including iron depletion and high copper conditions. Cross-species network comparison led to discovery of several core response modules and genes possibly essential to metal stress, as well as species-specific hub genes for metal stresses in different cyanobacteria species, shedding light on survival strategies of cyanobacteria responding to different environmental perturbations. CONCLUSIONS The WGCNA analysis demonstrated that the application of cross-species transcriptional network analysis will lead to novel insights to molecular response to environmental changes which will otherwise not be achieved by analyzing data from a single species.
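The core step of WGCNA is turning pairwise gene-gene correlations into a weighted network by soft thresholding. A minimal sketch of the unsigned form, a_ij = |cor(x_i, x_j)|^beta, is shown here in plain Python. It is not the WGCNA R package and not the authors' pipeline; the function names, the exponent `beta = 6`, and the toy expression matrix are illustrative assumptions.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expr: List[Sequence[float]], beta: float = 6) -> List[List[float]]:
    """Unsigned weighted adjacency: |correlation| raised to the power beta.

    expr holds one row per gene and one column per sample.
    The diagonal is left at 0 so that genes are not linked to themselves.
    """
    g = len(expr)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            adj[i][j] = adj[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return adj


# Hypothetical expression values (3 genes x 4 samples), made up for illustration.
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 1.0, 3.0, 2.0],
]
adj = soft_threshold_adjacency(expr)
for row in adj:
    print([round(v, 4) for v in row])
```

Raising the correlation to a power keeps strong co-expression links close to 1 while pushing weak ones toward 0, which is why modules can then be found by clustering the weighted network without a hard cutoff.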
Affiliation(s)
- Jiangxin Wang
  - School of Chemical Engineering & Technology, Tianjin University, 300072, Tianjin, People's Republic of China
11
Abstract
Chimeric animals generated from livestock-induced pluripotent stem cells (iPSCs) have opened the door of opportunity to genetically manipulate species for the production of biomedical models, improving traits of agricultural importance and potentially providing a system to test novel iPSC therapies. The potential of pluripotent stem cells in livestock has long been recognized, with many attempts being chronicled to isolate, culture and characterize pluripotent cells from embryos. However, in most cases, livestock stem cells derived from embryonic sources have failed to reach a pluripotent state marked by the inability to form chimeric animals. The in-depth understanding of core pluripotency factors and the realization of how these factors can be harnessed to reprogram adult cells into an induced pluripotent state has changed the paradigm of livestock stem cells. In this review, we will examine the advancements in iPSC technology in mammalian and avian livestock species.
Affiliation(s)
- Y Lu
  - Department of Animal and Dairy Science, Regenerative Bioscience Center, University of Georgia, Athens, GA 30602, USA
12
Zhong W, Zhang T, Zhu Y, Liu JS. Correlation pursuit: forward stepwise variable selection for index models. J R Stat Soc Series B Stat Methodol 2012; 74:849-870. [PMID: 23243388 PMCID: PMC3519449 DOI: 10.1111/j.1467-9868.2011.01026.x] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 11/28/2022]
Abstract
In this article, a stepwise procedure, correlation pursuit (COP), is developed for variable selection under the sufficient dimension reduction framework, in which the response variable Y is influenced by the predictors X(1), X(2), …, X(p) through an unknown function of a few linear combinations of them. Unlike linear stepwise regression, COP does not impose a special form of relationship (such as linear) between the response variable and the predictor variables. The COP procedure selects variables that attain the maximum correlation between the transformed response and the linear combination of the variables. Various asymptotic properties of the COP procedure are established, and in particular, its variable selection performance under a diverging number of predictors and sample size has been investigated. The excellent empirical performance of the COP procedure in comparison with existing methods is demonstrated by both extensive simulation studies and a real example in functional genomics.
Affiliation(s)
- Wenxuan Zhong
  - Department of Statistics, University of Illinois at Urbana Champaign, Champaign, IL 61820
- Tingting Zhang
  - Department of Statistics, University of Virginia, Charlottesville, VA 22904
- Yu Zhu
  - Department of Statistics, Purdue University, West Lafayette, IN 47907
- Jun S. Liu
  - Department of Statistics, Harvard University, Cambridge, MA 02138
13
Palmer NP, Schmid PR, Berger B, Kohane IS. A gene expression profile of stem cell pluripotentiality and differentiation is conserved across diverse solid and hematopoietic cancers. Genome Biol 2012; 13:R71. [PMID: 22909066 PMCID: PMC3491371 DOI: 10.1186/gb-2012-13-8-r71] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.8] [Received: 03/14/2012] [Accepted: 08/21/2012] [Indexed: 01/06/2023] Open
Abstract
Background Understanding the fundamental mechanisms of tumorigenesis remains one of the most pressing problems in modern biology. To this end, stem-like cells with tumor-initiating potential have become a central focus in cancer research. While the cancer stem cell hypothesis presents a compelling model of self-renewal and partial differentiation, the relationship between tumor cells and normal stem cells remains unclear. Results We identify, in an unbiased fashion, mRNA transcription patterns associated with pluripotent stem cells. Using this profile, we derive a quantitative measure of stem cell-like gene expression activity. We show how this 189 gene signature stratifies a variety of stem cell, malignant and normal tissue samples by their relative plasticity and state of differentiation within Concordia, a diverse gene expression database consisting of 3,209 Affymetrix HGU133+ 2.0 microarray assays. Further, the orthologous murine signature correctly orders a time course of differentiating embryonic mouse stem cells. Finally, we demonstrate how this stem-like signature serves as a proxy for tumor grade in a variety of solid tumors, including brain, breast, lung and colon. Conclusions This core stemness gene expression signature represents a quantitative measure of stem cell-associated transcriptional activity. Broadly, the intensity of this signature correlates to the relative level of plasticity and differentiation across all of the human tissues analyzed. The fact that the intensity of this signature is also capable of differentiating histological grade for a variety of human malignancies suggests potential therapeutic and diagnostic implications.
14
Abstract
PURPOSE To explore the expression of stem cell genes in breast cancer and the relationship between stem cell gene expression and the clinical and pathological characteristics and prognosis of breast cancer. BACKGROUND To date, stem cell differentiation-related genes and their relationship with the clinicopathological characteristics and prognosis of breast cancer remain unclear. MATERIALS AND METHODS CD44+/CD24- tumor cells were selected by flow cytometry. The differential expression of genes between CD44+/CD24- tumor cells and non-CD44+/CD24- tumor cells was detected by RT(2) Profiler™ PCR Array. The expression of the stem cell gene Octamer-4 (Oct-4) was analyzed by immunohistochemistry staining, and the relationship between Oct-4 and clinicopathological parameters of breast cancer was determined. RESULTS Seven genes, including stem cell differentiation-related factors (CD44, Oct-4, and nestin), cell cycle regulators (APC and CDC2), and growth factors (HGF and TGF), were found to be significantly differentially expressed between CD44+/CD24- tumor cells and non-CD44+/CD24- tumor cells. Oct-4 protein was expressed at significantly higher levels in cancerous tissues than in tumor-adjacent tissues (P = 0.001). Moreover, we observed that the expression of Oct-4 protein was related to histological type, lymph node status and molecular type of breast cancer (P = 0.001, 0.006, and 0.001, respectively). In survival analysis, cases with high Oct-4 protein expression had significantly poorer postoperative disease-specific survival than those with no or low Oct-4 protein expression (P = 0.001). In the Cox regression test, tumor size, histological type, disease stage, lymph node metastasis, Her-2 and Oct-4 were identified as independent prognostic factors (P = 0.031, 0.012, 0.001, 0.002, 0.030, and 0.003, respectively).
CONCLUSIONS Oct-4 was highly expressed in CD44+/CD24- tumor cells, and may be a potential biomarker for the initiation, progression, and differentiation of breast cancer.
Collapse
|
15
|
Xie D, Chen CC, He X, Cao X, Zhong S. Towards an evolutionary model of transcription networks. PLoS Comput Biol 2011; 7:e1002064. [PMID: 21695281 PMCID: PMC3111474 DOI: 10.1371/journal.pcbi.1002064] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2010] [Accepted: 04/08/2011] [Indexed: 11/18/2022] Open
Abstract
DNA evolution models made invaluable contributions to comparative genomics, although it seemed formidable to include non-genomic features into these models. In order to build an evolutionary model of transcription networks (TNs), we had to forfeit the substitution model used in DNA evolution and to start from modeling the evolution of the regulatory relationships. We present a quantitative evolutionary model of TNs, subjecting the phylogenetic distance and the evolutionary changes of cis-regulatory sequence, gene expression and network structure to one probabilistic framework. Using the genome sequences and gene expression data from multiple species, this model can predict regulatory relationships between a transcription factor (TF) and its target genes in all species, and thus identify TN re-wiring events. Applying this model to analyze the pre-implantation development of three mammalian species, we identified the conserved and re-wired components of the TNs downstream to a set of TFs including Oct4, Gata3/4/6, cMyc and nMyc. Evolutionary events on the DNA sequence that led to turnover of TF binding sites were identified, including a birth of an Oct4 binding site by a 2nt deletion. In contrast to recent reports of large interspecies differences of TF binding sites and gene expression patterns, the interspecies difference in TF-target relationship is much smaller. The data showed increasing conservation levels from genomic sequences to TF-DNA interaction, gene expression, TN, and finally to morphology, suggesting that evolutionary changes are larger at molecular levels and smaller at functional levels. The data also showed that evolutionarily older TFs are more likely to have conserved target genes, whereas younger TFs tend to have larger re-wiring rates.
DNA evolution models made invaluable contributions to comparative genomic studies. Still lacking is an evolutionary model of transcription networks (TNs). To develop such a model, we had to forfeit the substitution model used in DNA evolution and to start from modeling the evolution of the regulatory relationships, and then subject the phylogenetic distance and the multi-species DNA sequence and gene expression data to one probabilistic framework. This model enabled us to infer the evolutionary changes of transcriptional regulatory relationships. Applying this model to analyze three yeast species, we found the anaerobic phenotype in two species was associated with the evolutionary loss of a larger cis-regulatory motif than previously thought. Analyzing three mammalian species, we found increasing conservation levels from genomic sequences to transcription factor-DNA interaction, gene expression, TN, and finally to morphology, suggesting that evolutionary changes are larger at molecular levels and smaller at functional levels. We also found that evolutionarily younger TFs are more likely to regulate different target genes in different species.
Affiliation(s)
- Dan Xie
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Chieh-Chun Chen
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Xin He
- Department of Biochemistry and Biophysics, University of California, San Francisco, California, United States of America
- Xiaoyi Cao
- Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Sheng Zhong
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
|
16
|
Zarrineh P, Fierro AC, Sánchez-Rodríguez A, De Moor B, Engelen K, Marchal K. COMODO: an adaptive coclustering strategy to identify conserved coexpression modules between organisms. Nucleic Acids Res 2010; 39:e41. [PMID: 21149270 PMCID: PMC3074154 DOI: 10.1093/nar/gkq1275] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open
Abstract
Increasingly large-scale expression compendia for different species are becoming available. By exploiting the modularity of the coexpression network, these compendia can be used to identify biological processes for which the expression behavior is conserved over different species. However, comparing module networks across species is not trivial. The definition of a biologically meaningful module is not a fixed one, and changing the distance threshold that defines the degree of coexpression gives rise to different modules. As a result, when comparing modules across species, many different partially overlapping conserved module pairs exist, and deciding which pair is most relevant is hard. Therefore, we developed a method referred to as conserved modules across organisms (COMODO) that uses an objective selection criterion to identify conserved expression modules between two species. The method takes microarray data and a gene homology map as input and outputs pairs of conserved modules; it searches for the pair of modules for which the number of shared homologs is statistically most significant relative to the size of the linked modules. To demonstrate its principle, we applied COMODO to study coexpression conservation between the two well-studied bacteria Escherichia coli and Bacillus subtilis. COMODO is available at: http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Information_Zarrineh_2010/comodo/index.html.
Affiliation(s)
- Peyman Zarrineh
- Department of Electrical Engineering, Katholieke Universiteit Leuven, Kasteelpark Arenberg 20, 3001 Leuven, Belgium
|
17
|
Warsow G, Greber B, Falk SSI, Harder C, Siatkowski M, Schordan S, Som A, Endlich N, Schöler H, Repsilber D, Endlich K, Fuellen G. ExprEssence--revealing the essence of differential experimental data in the context of an interaction/regulation net-work. BMC SYSTEMS BIOLOGY 2010; 4:164. [PMID: 21118483 PMCID: PMC3012047 DOI: 10.1186/1752-0509-4-164] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/05/2010] [Accepted: 11/30/2010] [Indexed: 12/15/2022]
Abstract
Background Experimentalists are overwhelmed by high-throughput data and there is an urgent need to condense information into simple hypotheses. For example, large amounts of microarray and deep sequencing data are becoming available, describing a variety of experimental conditions such as gene knockout and knockdown, the effect of interventions, and the differences between tissues and cell lines. Results To address this challenge, we developed a method, implemented as a Cytoscape plugin called ExprEssence. As input we take a network of interaction, stimulation and/or inhibition links between genes/proteins, and differential data, such as gene expression data, tracking an intervention or development in time. We condense the network, highlighting those links across which the largest changes can be observed. Highlighting is based on a simple formula inspired by the law of mass action. We can interactively modify the threshold for highlighting and instantaneously visualize results. We applied ExprEssence to three scenarios describing kidney podocyte biology, pluripotency and ageing: 1) We identify putative processes involved in podocyte (de-)differentiation and validate one prediction experimentally. 2) We predict and validate the expression level of a transcription factor involved in pluripotency. 3) Finally, we generate plausible hypotheses on the role of apoptosis, cell cycle deregulation and DNA repair in ageing data obtained from the hippocampus. Conclusion Reducing the size of gene/protein networks to the few links affected by large changes makes it possible to screen for putative mechanistic relationships among the genes/proteins that are involved in adaptation to different experimental conditions, yielding important hypotheses, insights and suggestions for new experiments. We note that we do not focus on the identification of 'active subnetworks'. Instead we focus on the identification of single links (which may or may not form subnetworks), and these single links are much easier to validate experimentally than submodules. ExprEssence is available at http://sourceforge.net/projects/expressence/.
Affiliation(s)
- Gregor Warsow
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, Ernst-Heydemann-Strasse 8, Rostock, Germany
|
18
|
Callihan P, Mumaw J, Machacek DW, Stice SL, Hooks SB. Regulation of stem cell pluripotency and differentiation by G protein coupled receptors. Pharmacol Ther 2010; 129:290-306. [PMID: 21073897 DOI: 10.1016/j.pharmthera.2010.10.007] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2010] [Accepted: 10/08/2010] [Indexed: 01/25/2023]
Abstract
Stem cell-based therapeutics have the potential to effectively treat many terminal and debilitating human diseases, but the mechanisms by which their growth and differentiation are regulated are incompletely defined. Recent data from multiple systems suggest major roles for G protein coupled receptor (GPCR) pathways in regulating stem cell function in vivo and in vitro. The goal of this review is to illustrate common ground between the growing field of stem cell therapeutics and the long-established field of G protein coupled receptor signaling. Herein, we briefly introduce basic stem cell biology and discuss how several conserved pathways regulate pluripotency and differentiation in mouse and human stem cells. We further discuss general mechanisms by which GPCR signaling may impact these pluripotency and differentiation pathways, and summarize specific examples of receptors from each of the major GPCR subfamilies that have been shown to regulate stem cell function. Finally, we discuss possible therapeutic implications of GPCR regulation of stem cell function.
Affiliation(s)
- Phillip Callihan
- Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA, United States
|
19
|
Kuo D, Tan K, Zinman G, Ravasi T, Bar-Joseph Z, Ideker T. Evolutionary divergence in the fungal response to fluconazole revealed by soft clustering. Genome Biol 2010; 11:R77. [PMID: 20653936 PMCID: PMC2926788 DOI: 10.1186/gb-2010-11-7-r77] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2010] [Revised: 07/09/2010] [Accepted: 07/23/2010] [Indexed: 11/25/2022] Open
Abstract
Background Fungal infections are an emerging health risk, especially those involving yeast that are resistant to antifungal agents. To understand the range of mechanisms by which yeasts can respond to anti-fungals, we compared gene expression patterns across three evolutionarily distant species - Saccharomyces cerevisiae, Candida glabrata and Kluyveromyces lactis - over time following fluconazole exposure. Results Conserved and diverged expression patterns were identified using a novel soft clustering algorithm that concurrently clusters data from all species while incorporating sequence orthology. The analysis suggests complementary strategies for coping with ergosterol depletion by azoles - Saccharomyces imports exogenous ergosterol, Candida exports fluconazole, while Kluyveromyces does neither, leading to extreme sensitivity. In support of this hypothesis we find that only Saccharomyces becomes more azole resistant in ergosterol-supplemented media; that this depends on sterol importers Aus1 and Pdr11; and that transgenic expression of sterol importers in Kluyveromyces alleviates its drug sensitivity. Conclusions We have compared the dynamic transcriptional responses of three diverse yeast species to fluconazole treatment using a novel clustering algorithm. This approach revealed significant divergence among regulatory programs associated with fluconazole sensitivity. In future, such approaches might be used to survey a wider range of species, drug concentrations and stimuli to reveal conserved and divergent molecular response pathways.
Affiliation(s)
- Dwight Kuo
- Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
|