Reference Citation Analysis

For: Qin LX, Self SG. The clustering of regression models method with applications in gene expression data. Biometrics 2006;62:526-33. [PMID: 16918917 DOI: 10.1111/j.1541-0420.2005.00498.x] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Indexed: 11/29/2022]

Cited by Other Article(s)

Seemingly unrelated clusterwise linear regression for contaminated data. Stat Pap (Berl) 2022. [DOI: 10.1007/s00362-022-01344-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]

Wang T, Yu L, Leurgans SE, Wilson RS, Bennett DA, Boyle PA. Conditional functional clustering for longitudinal data with heterogeneous nonlinear patterns. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]

Mou X, Zhang H, Arshad SH. Identifying intergenerational patterns of correlated methylation sites. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]

Zhang T, Lin G. Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States. Comput Stat Data Anal 2021;159:107217. [PMID: 33723467 PMCID: PMC7943386 DOI: 10.1016/j.csda.2021.107217] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/09/2020] [Revised: 02/28/2021] [Accepted: 02/28/2021] [Indexed: 11/30/2022]

Novoa A, Richardson DM, Pyšek P, Meyerson LA, Bacher S, Canavan S, Catford JA, Čuda J, Essl F, Foxcroft LC, Genovesi P, Hirsch H, Hui C, Jackson MC, Kueffer C, Le Roux JJ, Measey J, Mohanty NP, Moodley D, Müller-Schärer H, Packer JG, Pergl J, Robinson TB, Saul WC, Shackleton RT, Visser V, Weyl OLF, Yannelli FA, Wilson JRU. Invasion syndromes: a systematic approach for predicting biological invasions and facilitating effective management. Biol Invasions 2020. [DOI: 10.1007/s10530-020-02220-w] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Indexed: 01/13/2023]

Han S, Zhang H, Sheng W, Arshad H. The nested joint clustering via Dirichlet process mixture model. J Stat Comput Simul 2019;89:815-830. [DOI: 10.1080/00949655.2019.1572756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]

Mixtures of multivariate contaminated normal regression models. Stat Pap (Berl) 2017. [DOI: 10.1007/s00362-017-0964-y] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 11/25/2022]

Dynamic Type-2 Fuzzy Dependent Dirichlet Regression Mixture clustering model. Appl Soft Comput 2017. [DOI: 10.1016/j.asoc.2017.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]

Han S, Zhang H, Karmaus W, Roberts G, Arshad H. Adjusting background noise in cluster analyses of longitudinal data. Comput Stat Data Anal 2017;109:93-104. [PMID: 28603324 PMCID: PMC5464744 DOI: 10.1016/j.csda.2016.11.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Han S, Zhang H, Lockett GA, Mukherjee N, Holloway JW, Karmaus W. Identifying heterogeneous transgenerational DNA methylation sites via clustering in beta regression. Ann Appl Stat 2015. [DOI: 10.1214/15-aoas865] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]

Yang S, Cui X, Fang Z. BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations. BMC Bioinformatics 2014;15:74. [PMID: 24629125 PMCID: PMC4003822 DOI: 10.1186/1471-2105-15-74] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2013] [Accepted: 03/10/2014] [Indexed: 11/17/2022]

Mankad S, Michailidis G. Biclustering Three-Dimensional Data Arrays With Plaid Models. J Comput Graph Stat 2014. [DOI: 10.1080/10618600.2013.851608] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 10/26/2022]

Ng SK, McLachlan GJ, Wang K, Nagymanyoki Z, Liu S, Ng SW. Inference on differences between classes using cluster-specific contrasts of mixed effects. Biostatistics 2014;16:98-112. [PMID: 24963011 DOI: 10.1093/biostatistics/kxu028] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]

Abstract

The detection of differentially expressed (DE) genes, that is, genes whose expression levels vary between two or more classes representing different experimental conditions (say, diseases), is one of the most commonly studied problems in bioinformatics. For example, the identification of DE genes between distinct disease phenotypes is an important first step in understanding and developing treatment drugs for the disease. We present a novel approach to the problem of detecting DE genes that is based on a test statistic formed as a weighted (normalized) cluster-specific contrast in the mixed effects of the mixture model used in the first instance to cluster the gene profiles into a manageable number of clusters. The key factor in the formation of our test statistic is the use of gene-specific mixed effects in the cluster-specific contrast. It thus means that the (soft) assignment of a given gene to a cluster is not crucial. This is because in addition to class differences between the (estimated) fixed effects terms for a cluster, gene-specific class differences also contribute to the cluster-specific contributions to the final form of the test statistic. The proposed test statistic can be used where the primary aim is to rank the genes in order of evidence against the null hypothesis of no DE. We also show how a P-value can be calculated for each gene for use in multiple hypothesis testing where the intent is to control the false discovery rate (FDR) at some desired level. With the use of publicly available and simulated datasets, we show that the proposed contrast-based approach outperforms other methods commonly used for the detection of DE genes both in a ranking context with lower proportion of false discoveries and in a multiple hypothesis testing context with higher power for a specified level of the FDR.
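As an illustration of the multiple-testing step mentioned in this abstract, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of per-gene P-values. The abstract does not say which FDR-controlling procedure the authors use, so Benjamini-Hochberg is an assumption here, and the function name `benjamini_hochberg` and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if rejected at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract above does
    not name the procedure it uses.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical per-gene P-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p, alpha=0.05))
```

Genes flagged `True` would be declared differentially expressed at the chosen FDR level; ranking genes by the test statistic alone needs no threshold.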

Kim HJ, Luo J, Kim J, Chen HS, Feuer EJ. Clustering of trend data using joinpoint regression models. Stat Med 2014;33:4087-103. [PMID: 24895073 DOI: 10.1002/sim.6221] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 05/28/2013] [Revised: 03/06/2014] [Accepted: 05/07/2014] [Indexed: 11/11/2022]

Coffey N, Hinde J, Holian E. Clustering longitudinal profiles using P-splines and mixed effects models applied to time-course gene expression data. Comput Stat Data Anal 2014. [DOI: 10.1016/j.csda.2013.04.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 11/26/2022]

Eng KH, Hanlon BM. Discrete mixture modeling to address genetic heterogeneity in time-to-event regression. Bioinformatics 2014;30:1690-7. [PMID: 24532723 PMCID: PMC4058947 DOI: 10.1093/bioinformatics/btu065] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 01/21/2023]

Qin LX, Breeden L, Self SG. Finding gene clusters for a replicated time course study. BMC Res Notes 2014;7:60. [PMID: 24460656 PMCID: PMC3906880 DOI: 10.1186/1756-0500-7-60] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/25/2013] [Accepted: 01/15/2014] [Indexed: 11/12/2022]

Shi J, Qin LX. CORM: An R Package Implementing the Clustering of Regression Models Method for Gene Clustering. Cancer Inform 2014;13:11-3. [PMID: 25452684 PMCID: PMC4218679 DOI: 10.4137/cin.s13967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2014] [Revised: 07/21/2014] [Accepted: 07/21/2014] [Indexed: 11/05/2022]

Komárek A, Komárková L. Clustering for multivariate continuous and discrete longitudinal data. Ann Appl Stat 2013. [DOI: 10.1214/12-aoas580] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 11/19/2022]

Robinson JF, Piersma AH. Toxicogenomic approaches in developmental toxicology testing. Methods Mol Biol 2013;947:451-73. [PMID: 23138921 DOI: 10.1007/978-1-62703-131-8_31] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/17/2022]

Wang K, Ng SK, McLachlan GJ. Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects. BMC Bioinformatics 2012;13:300. [PMID: 23151154 PMCID: PMC3574839 DOI: 10.1186/1471-2105-13-300] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 01/30/2012] [Accepted: 11/07/2012] [Indexed: 11/26/2022]

Abstract

Background

Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, partly due to their ignoring the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models.

Results

We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases.

Conclusions

Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enable improved modelling and consequent clustering of time-course data.
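To make the "autocorrelation variance structure" concrete, the sketch below builds the covariance matrix of a first-order autoregressive, AR(1), random effect observed at equally spaced time points. This is a generic textbook form, not code from the EMMIX-WIRE package; the function name `ar1_covariance` and the parameter names `rho` and `sigma2` are hypothetical, and `sigma2` is taken here to be the marginal variance of the random effect.

```python
def ar1_covariance(n_times, rho, sigma2=1.0):
    """Covariance matrix of a stationary AR(1) process at n_times equally
    spaced time points: Cov(u_s, u_t) = sigma2 * rho ** |s - t|.

    Assumptions: sigma2 is the marginal variance and |rho| < 1.
    """
    return [
        [sigma2 * rho ** abs(s - t) for t in range(n_times)]
        for s in range(n_times)
    ]


# Correlation decays geometrically with the lag between time points.
for row in ar1_covariance(3, rho=0.5):
    print(row)
```

Setting `rho` to 0 recovers independent observations over time, which is the case such a model is meant to relax.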

Tarpey T, Petkova E, Lu Y, Govindarajulu U. Optimal Partitioning for Linear Mixed Effects Models: Applications to Identifying Placebo Responders. J Am Stat Assoc 2012;105:968-977. [PMID: 21494314 DOI: 10.1198/jasa.2010.ap08713] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022]

Blackstock AJ, Manatunga AK, Park Y, Jones DP, Yu T. Clustering Based on Periodicity in High-Throughput Time Course Data. Stat Anal Data Min 2011;4:579-589. [PMID: 23762213 DOI: 10.1002/sam.10137] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/04/2023]

Villarroel L, Marshall G, Barón AE. Cluster analysis using multivariate mixed effects models. Stat Med 2009;28:2552-65. [PMID: 19536743 DOI: 10.1002/sim.3632] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 01/09/2023]

Li L, Lu Y, Qin LX, Bar-Joseph Z, Werner-Washburne M, Breeden LL. Budding yeast SSD1-V regulates transcript levels of many longevity genes and extends chronological life span in purified quiescent cells. Mol Biol Cell 2009;20:3851-64. [PMID: 19570907 DOI: 10.1091/mbc.e09-04-0347] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 11/11/2022]

Yuan Y, Li CT, Wilson R. Partial mixture model for tight clustering of gene expression time-course. BMC Bioinformatics 2008;9:287. [PMID: 18564420 PMCID: PMC2492882 DOI: 10.1186/1471-2105-9-287] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 09/28/2007] [Accepted: 06/18/2008] [Indexed: 11/29/2022]

Abstract

Background

Tight clustering arose recently from a desire to obtain tighter and potentially more informative clusters in gene expression studies. Scattered genes with relatively loose correlations should be excluded from the clusters. However, in the literature there is little work dedicated to this area of research. On the other hand, there has been extensive use of maximum likelihood techniques for model parameter estimation. By contrast, the minimum distance estimator has been largely ignored.

Results

In this paper we show the inherent robustness of the minimum distance estimator that makes it a powerful tool for parameter estimation in model-based time-course clustering. To apply minimum distance estimation, a partial mixture model that can naturally incorporate replicate information and allow scattered genes is formulated. We provide experimental results of simulated data fitting, where the minimum distance estimator demonstrates superior performance to the maximum likelihood estimator. Both biological and statistical validations are conducted on a simulated dataset and two real gene expression datasets. Our proposed partial regression clustering algorithm scores top in Gene Ontology driven evaluation, in comparison with four other popular clustering algorithms.

Conclusion

For the first time, the partial mixture model is successfully extended to time-course data analysis. The robustness of our partial regression clustering algorithm demonstrates the suitability of combining the partial mixture model with the minimum distance estimator in this field. We show that tight clustering not only can generate a deeper understanding of the dataset under study, in accordance with established biological knowledge, but also yields interesting new hypotheses during interpretation of the clustering results. In particular, we provide biological evidence that scattered genes can be relevant and are interesting subjects for study, in contrast to prevailing opinion.

Qin LX. An integrative analysis of microRNA and mRNA expression--a case study. Cancer Inform 2008;6:369-79. [PMID: 19259417 PMCID: PMC2623315 DOI: 10.4137/cin.s633] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 12/16/2022]

Testing the significance of cell-cycle patterns in time-course microarray data using nonparametric quadratic inference functions. Comput Stat Data Anal 2008. [DOI: 10.1016/j.csda.2007.03.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]

Turner HL, Bailey TC, Krzanowski WJ, Hemingway CA. Biclustering models for structured microarray data. IEEE/ACM Trans Comput Biol Bioinform 2005;2:316-29. [PMID: 17044169 DOI: 10.1109/tcbb.2005.49] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 05/12/2023]