Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Li Q, Wang S, Huang CC, Yu M, Shao J. Meta-analysis based variable selection for gene expression data. Biometrics 2014;70:872-80. [PMID: 25196635 DOI: 10.1111/biom.12213] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 08/01/2013] [Revised: 05/01/2014] [Accepted: 05/01/2014] [Indexed: 11/28/2022]

Cited by Other Article(s)

Chang C, Dai Z, Oh J, Long Q. Integrative Learning of Structured High-Dimensional Data from Multiple Datasets. Stat Anal Data Min 2023;16:120-134. [PMID: 37213790 PMCID: PMC10195070 DOI: 10.1002/sam.11601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2021] [Accepted: 10/14/2022] [Indexed: 11/11/2022]

Ma X, Kundu S. Multi-task Learning with High-Dimensional Noisy Images. J Am Stat Assoc 2022;119:650-663. [PMID: 38660581 PMCID: PMC11035991 DOI: 10.1080/01621459.2022.2140052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2021] [Accepted: 10/17/2022] [Indexed: 10/31/2022]

Huang HH, Rao H, Miao R, Liang Y. A novel meta-analysis based on data augmentation and elastic data shared lasso regularization for gene expression. BMC Bioinformatics 2022;23:353. [PMID: 35999505 PMCID: PMC9396780 DOI: 10.1186/s12859-022-04887-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/08/2022] [Accepted: 08/10/2022] [Indexed: 12/22/2022]

Abstract

Background

Gene expression analysis can provide useful information for analyzing complex biological mechanisms. However, many reported findings are unrepeatable due to small sample sizes relative to a large number of genes and the low signal-to-noise ratios of most gene expression datasets.

Results

Meta-analysis of multi-data sets is an efficient method for tackling the above problem. To improve the performance of meta-analysis, we propose a novel meta-analysis framework. It consists of two parts: (1) a novel data augmentation strategy. Various cross-platform normalization methods exist, which can preserve original biological information of gene expression datasets from different angles and add different “perturbations” to the dataset. Using such perturbation, we provide a feasible means for gene expression data augmentation; (2) elastic data shared lasso (DSL-L₂). The DSL-L₂ method spans the continuum between individual models for each dataset and one model for all datasets. It also overcomes the shortcomings of the data shared lasso method when dealing with highly correlated features. Comprehensive simulation experiment results show that the proposed method has high prediction and gene selection performance. We then apply the proposed method to non-small cell lung cancer (NSCLC) blood gene expression data in order to identify key tumor-related genes. The outcomes of our experiment indicate that the method could be used for identifying a set of robust disease-related gene signatures that may be used for NSCLC early diagnosis or prognosis or even targeting.

Conclusion

We propose a novel and effective meta-analysis method for biological research, extrapolating and integrating information from multiple gene expression datasets.
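
The continuum between per-dataset models and one pooled model, mentioned in the Results above, is commonly obtained in data shared lasso formulations by writing each dataset's coefficients as a shared part plus a dataset-specific offset and fitting one penalized regression on an augmented design matrix. The sketch below illustrates only that reparameterization. It is an assumption about the general technique, not the authors' implementation; the function names, the weights r_k, and the toy numbers are hypothetical. The L₂ (elastic) term would come from whichever elastic-net solver is run on the augmented matrix.

```python
def augmented_design(blocks, weights):
    """Stack K datasets into one design matrix with shared and offset columns.

    blocks  : list of K datasets, each a list of rows with p features
    weights : list of K positive numbers r_k (hypothetical tuning weights)

    A row x from dataset k becomes [x, 0, ..., x / r_k, ..., 0], so a single
    coefficient vector theta = [beta, r_1*delta_1, ..., r_K*delta_K]
    reproduces the per-dataset prediction x . (beta + delta_k).
    """
    n_sets = len(blocks)
    p = len(blocks[0][0])
    rows = []
    for k, block in enumerate(blocks):
        for x in block:
            row = list(x) + [0.0] * (p * n_sets)
            for j, value in enumerate(x):
                row[p * (k + 1) + j] = value / weights[k]
            rows.append(row)
    return rows


def split_coefficients(theta, p, weights):
    """Recover the shared vector beta and the offsets delta_k from theta."""
    beta = list(theta[:p])
    deltas = []
    for k, r_k in enumerate(weights):
        start = p * (k + 1)
        deltas.append([t / r_k for t in theta[start:start + p]])
    return beta, deltas


# Toy example: two datasets with two features each (made-up numbers).
blocks = [[[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]]]
weights = [2.0, 4.0]
Z = augmented_design(blocks, weights)
beta, deltas = split_coefficients([1.0, 1.0, 2.0, 4.0, 4.0, 8.0], 2, weights)
```

Under an L₁ penalty on theta, each offset is effectively penalized by r_k times its size. Large weights therefore push the offsets to zero (one model for all datasets), while small weights make the offsets cheap (close to individual models per dataset).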

Hu Z, Zhou Y, Tong T. Meta-Analyzing Multiple Omics Data With Robust Variable Selection. Front Genet 2021;12:656826. [PMID: 34290735 PMCID: PMC8288516 DOI: 10.3389/fgene.2021.656826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2021] [Accepted: 05/24/2021] [Indexed: 12/03/2022]

Chen H, He Y, Ji J, Shi Y. The sparse group lasso for high-dimensional integrative linear discriminant analysis with application to Alzheimer's disease prediction. J Stat Comput Sim 2020. [DOI: 10.1080/00949655.2020.1800011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]

Cui J, Shu J. Circulating microRNA trafficking and regulation: computational principles and practice. Brief Bioinform 2020;21:1313-1326. [PMID: 31504144 PMCID: PMC7412956 DOI: 10.1093/bib/bbz079] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/18/2019] [Revised: 06/07/2019] [Accepted: 06/07/2019] [Indexed: 01/18/2023]

Zhang H, Li SJ, Zhang H, Yang ZY, Ren YQ, Xia LY, Liang Y. Meta-Analysis Based on Nonconvex Regularization. Sci Rep 2020;10:5755. [PMID: 32238826 PMCID: PMC7113298 DOI: 10.1038/s41598-020-62473-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/22/2019] [Accepted: 03/06/2020] [Indexed: 01/10/2023]

Xia Y, Li L, Lockhart SN, Jagust WJ. Simultaneous Covariance Inference for Multimodal Integrative Analysis. J Am Stat Assoc 2020;115:1279-1291. [PMID: 33867602 DOI: 10.1080/01621459.2019.1623040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]

Rashid NU, Li Q, Yeh JJ, Ibrahim JG. Modeling Between-Study Heterogeneity for Improved Replicability in Gene Signature Selection and Clinical Prediction. J Am Stat Assoc 2019;115:1125-1138. [PMID: 33012902 DOI: 10.1080/01621459.2019.1671197] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 02/08/2023]

Yang ZY, Liu XY, Shu J, Zhang H, Ren YQ, Xu ZB, Liang Y. Multi-view based integrative analysis of gene expression data for identifying biomarkers. Sci Rep 2019;9:13504. [PMID: 31534156 PMCID: PMC6751173 DOI: 10.1038/s41598-019-49967-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/08/2019] [Accepted: 08/30/2019] [Indexed: 01/05/2023]

Huo Z, Song C, Tseng G. Bayesian latent hierarchical model for transcriptomic meta-analysis to detect biomarkers with clustered meta-patterns of differential expression signals. Ann Appl Stat 2019;13:340-366. [PMID: 31007807 PMCID: PMC6472949 DOI: 10.1214/18-aoas1188] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/12/2023]

Long NP, Park S, Anh NH, Nghi TD, Yoon SJ, Park JH, Lim J, Kwon SW. High-Throughput Omics and Statistical Learning Integration for the Discovery and Validation of Novel Diagnostic Signatures in Colorectal Cancer. Int J Mol Sci 2019;20:E296. [PMID: 30642095 PMCID: PMC6358915 DOI: 10.3390/ijms20020296] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/30/2018] [Revised: 12/31/2018] [Accepted: 01/04/2019] [Indexed: 02/07/2023]

Zhang K, Geng W, Zhang S. Network-based logistic regression integration method for biomarker identification. BMC Systems Biology 2018;12:135. [PMID: 30598085 PMCID: PMC6311907 DOI: 10.1186/s12918-018-0657-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/21/2023]

Abstract

Background

Many mathematical and statistical models and algorithms have been proposed to do biomarker identification in recent years. However, the biomarkers inferred from different datasets suffer a lack of reproducibilities due to the heterogeneity of the data generated from different platforms or laboratories. This motivates us to develop robust biomarker identification methods by integrating multiple datasets.

Methods

In this paper, we developed an integrative method for classification based on logistic regression. Different constant terms are set in the logistic regression model to measure the heterogeneity of the samples. By minimizing the differences of the constant terms within the same dataset, both the homogeneity within the same dataset and the heterogeneity in multiple datasets can be kept. The model is formulated as an optimization problem with a network penalty measuring the differences of the constant terms. The L₁ penalty, elastic penalty and network related penalties are added to the objective function for the biomarker discovery purpose. Algorithms based on proximal Newton method are proposed to solve the optimization problem.

Results

We first applied the proposed method to the simulated datasets. Both the AUC of the prediction and the biomarker identification accuracy are improved. We then applied the method to two breast cancer gene expression datasets. By integrating both datasets, the prediction AUC is improved over directly merging the datasets and MetaLasso. And it’s comparable to the best AUC when doing biomarker identification in an individual dataset. The identified biomarkers using network related penalty for variables were further analyzed. Meaningful subnetworks enriched by breast cancer were identified.

Conclusion

A network-based integrative logistic regression model is proposed in the paper. It improves both the prediction and biomarker identification accuracy.

Electronic supplementary material

The online version of this article (10.1186/s12918-018-0657-8) contains supplementary material, which is available to authorized users.

Li Q, Li L. Integrative linear discriminant analysis with guaranteed error rate improvement. Biometrika 2018;105:917-930. [PMID: 31762476 DOI: 10.1093/biomet/asy047] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/18/2023]

Shu J, Silva BVRE, Gao T, Xu Z, Cui J. Dynamic and Modularized MicroRNA Regulation and Its Implication in Human Cancers. Sci Rep 2017;7:13356. [PMID: 29042600 PMCID: PMC5645395 DOI: 10.1038/s41598-017-13470-5] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 02/08/2017] [Accepted: 09/26/2017] [Indexed: 12/19/2022]

Li Q, Yu M, Wang S. A Statistical Framework for Pathway and Gene Identification from Integrative Analysis. J Multivariate Anal 2017;156:1-17. [PMID: 28943673 PMCID: PMC5606168 DOI: 10.1016/j.jmva.2016.12.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]

Kim S, Jhong JH, Lee J, Koo JY. Meta-analytic support vector machine for integrating multiple omics data. BioData Min 2017;10:2. [PMID: 28149325 PMCID: PMC5270233 DOI: 10.1186/s13040-017-0126-8] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Received: 08/01/2016] [Accepted: 01/11/2017] [Indexed: 11/10/2022]

Richardson S, Tseng GC, Sun W. Statistical Methods in Integrative Genomics. Annual Review of Statistics and Its Application 2016;3:181-209. [PMID: 27482531 PMCID: PMC4963036 DOI: 10.1146/annurev-statistics-041715-033506] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Indexed: 05/28/2023]