Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Tamayo P, Steinhardt G, Liberzon A, Mesirov JP. The limitations of simple gene set enrichment analysis assuming gene independence. Stat Methods Med Res 2016;25:472-87. [PMID: 23070592 PMCID: PMC3758419 DOI: 10.1177/0962280212460441] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Indexed: 11/15/2022]

Cited by Other Article(s)

Vemuri K, Kumar S, Chen L, Verzi MP. Dynamic RNA polymerase II occupancy drives differentiation of the intestine under the direction of HNF4. Cell Rep 2024;43:114242. [PMID: 38768033 PMCID: PMC11264335 DOI: 10.1016/j.celrep.2024.114242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 04/03/2024] [Accepted: 05/02/2024] [Indexed: 05/22/2024] Open

Koopmans F. GOAT: efficient and robust identification of gene set enrichment. Commun Biol 2024;7:744. [PMID: 38898151 PMCID: PMC11187187 DOI: 10.1038/s42003-024-06454-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 06/14/2024] [Indexed: 06/21/2024] Open

Candia J, Ferrucci L. Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks. PLoS One 2024;19:e0302696. [PMID: 38753612 PMCID: PMC11098418 DOI: 10.1371/journal.pone.0302696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 04/09/2024] [Indexed: 05/18/2024] Open

Abstract

Pathway enrichment analysis is a ubiquitous computational biology method to interpret a list of genes (typically derived from the association of large-scale omics data with phenotypes of interest) in terms of higher-level, predefined gene sets that share biological function, chromosomal location, or other common features. Among many tools developed so far, Gene Set Enrichment Analysis (GSEA) stands out as one of the pioneering and most widely used methods. Although originally developed for microarray data, GSEA is nowadays extensively utilized for RNA-seq data analysis. Here, we quantitatively assessed the performance of a variety of GSEA modalities and provide guidance in the practical use of GSEA in RNA-seq experiments. We leveraged harmonized RNA-seq datasets available from The Cancer Genome Atlas (TCGA) in combination with large, curated pathway collections from the Molecular Signatures Database to obtain cancer-type-specific target pathway lists across multiple cancer types. We carried out a detailed analysis of GSEA performance using both gene-set and phenotype permutations combined with four different choices for the Kolmogorov-Smirnov enrichment statistic. Based on our benchmarks, we conclude that the classic/unweighted gene-set permutation approach offered comparable or better sensitivity-vs-specificity tradeoffs across cancer types compared with other, more complex and computationally intensive permutation methods. Finally, we analyzed other large cohorts for thyroid cancer and hepatocellular carcinoma. We utilized a new consensus metric, the Enrichment Evidence Score (EES), which showed a remarkable agreement between pathways identified in TCGA and those from other sources, despite differences in cancer etiology. This finding suggests an EES-based strategy to identify a core set of pathways that may be complemented by an expanded set of pathways for downstream exploratory analysis. 
This work fills the existing gap in current guidelines and benchmarks for the use of GSEA with RNA-seq data and provides a framework to enable detailed benchmarking of other RNA-seq-based pathway analysis tools.
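
The abstract above refers to gene-set permutation combined with the classic (unweighted) Kolmogorov-Smirnov enrichment statistic. The following is a minimal Python sketch of that general idea, not the authors' implementation or the GSEA software itself. The function names, the two-sided comparison on absolute values, and the add-one p-value estimate are illustrative assumptions; the ranked list is assumed to contain unique gene identifiers.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic: the signed maximum deviation
    from zero of P_hit - P_miss while walking down the ranked list."""
    hits = set(gene_set) & set(ranked_genes)
    n_hit = len(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up on a set member, step down otherwise.
        running += 1.0 / n_hit if gene in hits else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def gene_set_permutation_pvalue(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Compare the observed score with scores of random gene sets of the
    same size (gene-set permutation). Returns (score, empirical p-value)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = len(set(gene_set) & set(ranked_genes))
    extreme = 0
    for _ in range(n_perm):
        null_set = rng.sample(ranked_genes, size)
        if abs(enrichment_score(ranked_genes, null_set)) >= abs(observed):
            extreme += 1
    # Add-one estimate so the p-value is never exactly zero (an assumption).
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    ranked = ["g%d" % i for i in range(20)]  # hypothetical ranked gene list
    print(gene_set_permutation_pvalue(ranked, ranked[:4], n_perm=200))
```

Because random gene sets ignore gene-gene correlation, this null model is the kind of independence assumption that the cited article by Tamayo et al. discusses; phenotype permutation would instead require the sample-level expression data.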

Mahzarnia A, Lutz MW, Badea A. A Continuous Extension of Gene Set Enrichment Analysis Using the Likelihood Ratio Test Statistics Identifies Vascular Endothelial Growth Factor as a Candidate Pathway for Alzheimer's Disease via ITGA5. J Alzheimers Dis 2024;97:635-648. [PMID: 38160360 PMCID: PMC10836573 DOI: 10.3233/jad-230934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/01/2023] [Indexed: 01/03/2024]

Vemuri K, Kumar S, Chen L, Verzi MP. Dynamic RNA Polymerase II Recruitment Drives Differentiation of the Intestine under the direction of HNF4. bioRxiv 2023:2023.11.08.566322. [PMID: 37986803 PMCID: PMC10659318 DOI: 10.1101/2023.11.08.566322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]

Chen L, Qiu X, Dupre A, Pellon-Cardenas O, Fan X, Xu X, Rout P, Walton KD, Burclaff J, Zhang R, Fang W, Ofer R, Logerfo A, Vemuri K, Bandyopadhyay S, Wang J, Barbet G, Wang Y, Gao N, Perekatt AO, Hu W, Magness ST, Spence JR, Verzi MP. TGFB1 induces fetal reprogramming and enhances intestinal regeneration. Cell Stem Cell 2023;30:1520-1537.e8. [PMID: 37865088 PMCID: PMC10841757 DOI: 10.1016/j.stem.2023.09.015] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/11/2023] [Revised: 07/03/2023] [Accepted: 09/28/2023] [Indexed: 10/23/2023]

Affiliation(s)

Lei Chen: School of Life Science and Technology, Key Laboratory of Developmental Genes and Human Disease, Southeast University, Nanjing 210096, China
Xia Qiu: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 00854, USA
Abigail Dupre: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 00854, USA
Oscar Pellon-Cardenas: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 00854, USA
Xiaojiao Fan: School of Life Science and Technology, Key Laboratory of Developmental Genes and Human Disease, Southeast University, Nanjing 210096, China
Xiaoting Xu: School of Life Science and Technology, Key Laboratory of Developmental Genes and Human Disease, Southeast University, Nanjing 210096, China
Prateeksha Rout: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 00854, USA
Katherine D Walton: Department of Internal Medicine, Division of Gastroenterology, University of Michigan Medical School, Ann Arbor, MI 48109, USA; Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
Joseph Burclaff: Joint Department of Biomedical Engineering, University of North Carolina at Chapel Hill, and North Carolina State University, Chapel Hill, NC 27695, USA; Center for Gastrointestinal Biology and Disease, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC 27599, USA
Ruolan Zhang: School of Life Science and Technology, Key Laboratory of Developmental Genes and Human Disease, Southeast University, Nanjing 210096, China
Wenxin Fang: School of Life Science and Technology, Key Laboratory of Developmental Genes and Human Disease, Southeast University, Nanjing 210096, China
Rachel Ofer: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 00854, USA
Alexandra Logerfo: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 00854, USA
Kiranmayi Vemuri: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 00854, USA
Sheila Bandyopadhyay: Department of Biological Sciences, Rutgers University-Newark, Newark, NJ 07102, USA
Jianming Wang: Department of Radiation Oncology, Rutgers Cancer Institute of New Jersey, Rutgers University-New Brunswick, New Brunswick, NJ 08903, USA
Gaetan Barbet: Child Health Institute of New Jersey, Rutgers University-New Brunswick, New Brunswick, NJ 08901, USA
Yan Wang: Center for Translation Medicine Research and Development, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
Nan Gao: Department of Biological Sciences, Rutgers University-Newark, Newark, NJ 07102, USA
Ansu O Perekatt: Department of Chemistry and Chemical Biology, Stevens Institute of Technology, Hoboken, NJ 07030, USA
Wenwei Hu: Department of Radiation Oncology, Rutgers Cancer Institute of New Jersey, Rutgers University-New Brunswick, New Brunswick, NJ 08903, USA
Scott T Magness: Joint Department of Biomedical Engineering, University of North Carolina at Chapel Hill, and North Carolina State University, Chapel Hill, NC 27695, USA; Center for Gastrointestinal Biology and Disease, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC 27599, USA
Jason R Spence: Department of Internal Medicine, Division of Gastroenterology, University of Michigan Medical School, Ann Arbor, MI 48109, USA; Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, MI 48109, USA; Department of Biomedical Engineering, University of Michigan College of Engineering, Ann Arbor, MI 48109, USA
Michael P Verzi: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 00854, USA; Rutgers Cancer Institute of New Jersey, Rutgers University-New Brunswick, New Brunswick, NJ 08903, USA; Rutgers Center for Lipid Research, New Jersey Institute for Food, Nutrition, and Health, Rutgers University-New Brunswick, New Brunswick, NJ 08901, USA; NIEHS Center for Environmental Exposures and Disease (CEED), Rutgers EOHSI, Piscataway, NJ 08854, USA

McGovern KC, Nixon MP, Silverman JD. Addressing erroneous scale assumptions in microbe and gene set enrichment analysis. PLoS Comput Biol 2023;19:e1011659. [PMID: 37983251 PMCID: PMC10695402 DOI: 10.1371/journal.pcbi.1011659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 12/04/2023] [Accepted: 11/04/2023] [Indexed: 11/22/2023] Open

Zhu S, Liu N, Gong H, Liu F, Yan G. Identification of biomarkers and sex differences in the placenta of fetal growth restriction. J Obstet Gynaecol Res 2023;49:2324-2336. [PMID: 37553225 DOI: 10.1111/jog.15735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 06/20/2023] [Indexed: 08/10/2023]

Mahzarnia A, Lutz MW, Badea A. A Continuous Extension of Gene Set Enrichment Analysis using the Likelihood Ratio Test Statistics Identifies VEGF as a Candidate Pathway for Alzheimer's disease. bioRxiv 2023:2023.08.22.554319. [PMID: 37662249 PMCID: PMC10473614 DOI: 10.1101/2023.08.22.554319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]

Chen S, Zhou Z, Wang Y, Chen S, Jiang J. Machine learning-based identification of cuproptosis-related markers and immune infiltration in severe community-acquired pneumonia. Clin Respir J 2023;17:618-628. [PMID: 37279744 PMCID: PMC10363779 DOI: 10.1111/crj.13633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 04/24/2023] [Accepted: 05/06/2023] [Indexed: 06/08/2023]

Zhao K, Rhee SY. Interpreting omics data with pathway enrichment analysis. Trends Genet 2023;39:308-319. [PMID: 36750393 DOI: 10.1016/j.tig.2023.01.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/25/2022] [Revised: 11/24/2022] [Accepted: 01/13/2023] [Indexed: 02/09/2023]

Griffin AT, Vlahos LJ, Chiuzan C, Califano A. NaRnEA: An Information Theoretic Framework for Gene Set Analysis. Entropy (Basel) 2023;25:e25030542. [PMID: 36981431 PMCID: PMC10048242 DOI: 10.3390/e25030542] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/08/2022] [Revised: 03/03/2023] [Accepted: 03/13/2023] [Indexed: 05/26/2023]

Chen L, Dupre A, Qiu X, Pellon-Cardenas O, Walton KD, Wang J, Perekatt AO, Hu W, Spence JR, Verzi MP. TGFB1 Induces Fetal Reprogramming and Enhances Intestinal Regeneration. bioRxiv 2023:2023.01.13.523825. [PMID: 36711781 PMCID: PMC9882197 DOI: 10.1101/2023.01.13.523825] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/15/2023]

Affiliation(s)

Lei Chen: School of Life Science and Technology, Key Laboratory of Developmental Genes and Human Disease, Southeast University, Nanjing, China
Abigail Dupre: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ, USA
Xia Qiu: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ, USA
Oscar Pellon-Cardenas: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ, USA
Katherine D. Walton: Department of Internal Medicine, Gastroenterology, University of Michigan Medical School, Ann Arbor, MI, USA; Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, MI, USA
Jianming Wang: Department of Radiation Oncology, Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
Ansu O. Perekatt: Department of Chemistry and Chemical Biology, Stevens Institute of Technology, Hoboken, NJ, USA
Wenwei Hu: Department of Radiation Oncology, Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
Jason R. Spence: Department of Internal Medicine, Gastroenterology, University of Michigan Medical School, Ann Arbor, MI, USA; Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, MI, USA; Department of Biomedical Engineering, University of Michigan College of Engineering, Ann Arbor, MI, USA
Michael P. Verzi: Department of Genetics, Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ, USA; Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA; Rutgers Center for Lipid Research, New Jersey Institute for Food, Nutrition & Health, Rutgers University, New Brunswick, NJ, USA; Member of the NIEHS Center for Environmental Exposures and Disease (CEED), Rutgers EOHSI, Piscataway, NJ, USA; Lead Contact

Aberasturi DT, Piegorsch WW, Bedrick EJ, Lussier YA. Accounting for extra-binomial variability with differentially expressed genetic pathway data: a collaborative bioinformatic study. Stat (Int Stat Inst) 2023;12:e518. [PMID: 37885703 PMCID: PMC10601968 DOI: 10.1002/sta4.518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Accepted: 10/21/2022] [Indexed: 10/28/2023]

Hypergraph geometry reflects higher-order dynamics in protein interaction networks. Sci Rep 2022;12:20879. [PMID: 36463292 PMCID: PMC9719542 DOI: 10.1038/s41598-022-24584-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/18/2022] [Accepted: 11/17/2022] [Indexed: 12/05/2022] Open

Makrooni MA, O’Shea D, Geeleher P, Seoighe C. Random-effects meta-analysis of effect sizes as a unified framework for gene set analysis. PLoS Comput Biol 2022;18:e1010278. [PMID: 36197939 PMCID: PMC9576052 DOI: 10.1371/journal.pcbi.1010278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 10/17/2022] [Accepted: 09/18/2022] [Indexed: 11/06/2022] Open

Abstract

Gene set analysis (GSA) remains a common step in genome-scale studies because it can reveal insights that are not apparent from results obtained for individual genes. Many different computational tools are applied for GSA, which may be sensitive to different types of signals; however, most methods implicitly test whether there are differences in the distribution of the effect of some experimental condition between genes in gene sets of interest. We have developed a unifying framework for GSA that first fits effect size distributions, and then tests for differences in these distributions between gene sets. These differences can be in the proportions of genes that are perturbed or in the sign or size of the effects. Inspired by statistical meta-analysis, we take into account the uncertainty in effect size estimates by reducing the influence of genes with greater uncertainty on the estimation of distribution parameters. We demonstrate, using simulation and by application to real data, that this approach provides significant gains in performance over existing methods. Furthermore, the statistical tests carried out are defined in terms of effect sizes, rather than the results of prior statistical tests measuring these changes, which leads to improved interpretability and greater robustness to variation in sample sizes.

The role of gene set analysis is to identify groups of genes that are perturbed in a genomics experiment. There are many tools available for this task and they do not all test for the same types of changes. Here we propose a new way to carry out gene set analysis that involves first working out the distribution of the group effect in the gene set and then comparing this distribution to the equivalent distribution in other genes. Tests performed by existing tools for gene set analysis can be related to different comparisons in these distributions of group effects. A unified framework for gene set analysis provides for more explicit null hypotheses against which to test sets of genes for different types of responses to the experimental conditions. These results are more interpretable, because the group effect distributions can be compared visually, providing an indication of how the experimental effect differs between the gene sets.

Chen B, Zhang J, Wang T, Shao C, Miao L, Zhang S, Shang X. Investigating the evolution process of lung adenocarcinoma via random walk and dynamic network analysis. Front Genet 2022;13:953801. [PMID: 36246662 PMCID: PMC9559577 DOI: 10.3389/fgene.2022.953801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Accepted: 09/05/2022] [Indexed: 11/30/2022] Open

Nine quick tips for pathway enrichment analysis. PLoS Comput Biol 2022;18:e1010348. [PMID: 35951505 PMCID: PMC9371296 DOI: 10.1371/journal.pcbi.1010348] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023] Open

High-Depth Transcriptome Reveals Differences in Natural Haploid Ginkgo biloba L. Due to the Effect of Reduced Gene Dosage. Int J Mol Sci 2022;23:ijms23168958. [PMID: 36012222 PMCID: PMC9409250 DOI: 10.3390/ijms23168958] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/02/2022] [Revised: 07/31/2022] [Accepted: 08/10/2022] [Indexed: 12/13/2022] Open

Yen NTH, Park SM, Thu VTA, Phat NK, Cho YS, Yoon S, Shin JG, Kim DH, Oh JH, Long NP. Genome-wide gene expression analysis reveals molecular insights into the drug-induced toxicity of nephrotoxic agents. Life Sci 2022;306:120801. [PMID: 35850247 DOI: 10.1016/j.lfs.2022.120801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Revised: 06/30/2022] [Accepted: 07/09/2022] [Indexed: 11/17/2022]

Affiliation(s)

Nguyen Thi Hai Yen: Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 614-735, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan 614-735, Republic of Korea
Se-Myo Park: Department of Predictive Toxicology, Korea Institute of Toxicology, Daejeon 34114, Republic of Korea
Vo Thuy Anh Thu: Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 614-735, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan 614-735, Republic of Korea
Nguyen Ky Phat: Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 614-735, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan 614-735, Republic of Korea
Yong-Soon Cho: Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 614-735, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan 614-735, Republic of Korea
Seokjoo Yoon: Department of Predictive Toxicology, Korea Institute of Toxicology, Daejeon 34114, Republic of Korea
Jae-Gook Shin: Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 614-735, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan 614-735, Republic of Korea
Dong Hyun Kim: Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 614-735, Republic of Korea
Jung-Hwa Oh: Department of Predictive Toxicology, Korea Institute of Toxicology, Daejeon 34114, Republic of Korea
Nguyen Phuoc Long: Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 614-735, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan 614-735, Republic of Korea

Saravanakumar K, Santosh SS, Ahamed MA, Sathiyaseelan A, Sultan G, Irfan N, Ali DM, Wang MH. Bioinformatics strategies for studying the molecular mechanisms of fungal extracellular vesicles with a focus on infection and immune responses. Brief Bioinform 2022;23:6632620. [PMID: 35794708 DOI: 10.1093/bib/bbac250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 05/16/2022] [Accepted: 05/28/2022] [Indexed: 01/19/2023] Open

Pathway importance by graph convolutional network and Shapley additive explanations in gene expression phenotype of diffuse large B-cell lymphoma. PLoS One 2022;17:e0269570. [PMID: 35749395 PMCID: PMC9231717 DOI: 10.1371/journal.pone.0269570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Accepted: 05/09/2022] [Indexed: 11/30/2022] Open

Mubeen S, Tom Kodamullil A, Hofmann-Apitius M, Domingo-Fernández D. On the influence of several factors on pathway enrichment analysis. Brief Bioinform 2022;23:bbac143. [PMID: 35453140 PMCID: PMC9116215 DOI: 10.1093/bib/bbac143] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/17/2022] [Revised: 03/21/2022] [Accepted: 03/30/2022] [Indexed: 02/01/2023] Open

Wijesooriya K, Jadaan SA, Perera KL, Kaur T, Ziemann M. Urgent need for consistent standards in functional enrichment analysis. PLoS Comput Biol 2022;18:e1009935. [PMID: 35263338 PMCID: PMC8936487 DOI: 10.1371/journal.pcbi.1009935] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 12/07/2021] [Revised: 03/21/2022] [Accepted: 02/18/2022] [Indexed: 11/25/2022] Open

Abstract

Gene set enrichment tests (a.k.a. functional enrichment analysis) are among the most frequently used methods in computational biology. Despite this popularity, there are concerns that these methods are being applied incorrectly and the results of some peer-reviewed publications are unreliable. These problems include the use of inappropriate background gene lists, lack of false discovery rate correction and lack of methodological detail. To ascertain the frequency of these issues in the literature, we performed a screen of 186 open-access research articles describing functional enrichment results. We find that 95% of analyses using over-representation tests did not implement an appropriate background gene list or did not describe this in the methods. Failure to perform p-value correction for multiple tests was identified in 43% of analyses. Many studies lacked detail in the methods section about the tools and gene sets used. An extension of this survey showed that these problems are not associated with journal or article level bibliometrics. Using seven independent RNA-seq datasets, we show misuse of enrichment tools alters results substantially. In conclusion, most published functional enrichment studies suffered from one or more major flaws, highlighting the need for stronger standards for enrichment analysis.

Functional enrichment analysis is a commonly used technique to identify trends in large scale biological datasets. In biomedicine, functional enrichment analysis of gene expression data is frequently applied to identify disease and drug mechanisms. While enrichment tests were once primarily conducted with complicated computer scripts, web-based tools are becoming more widely used. Users can paste a list of genes into a website and receive enrichment results in a matter of seconds. Despite the popularity of these tools, there are concerns that statistical problems and incomplete reporting are compromising research quality. In this article, we conducted a systematic examination of published enrichment analyses and assessed whether (i) any statistical flaws were present and (ii) sufficient methodological detail is provided such that the study could be replicated. We found that lack of methodological detail and errors in statistical analysis were widespread, which undermines the reliability and reproducibility of these research articles. A set of best practices is urgently needed to raise the quality of published work.
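
Two of the issues named in the abstract above, the choice of background gene list and correction for multiple tests, can be made concrete with a short example. The following Python sketch is an illustration under stated assumptions, not the procedure used in the cited survey: it runs a one-sided hypergeometric over-representation test against an explicitly supplied background and then applies the Benjamini-Hochberg adjustment. All function and variable names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n_background, n_in_set, n_selected):
    """P(X >= k) when n_selected genes are drawn without replacement
    from a background that contains n_in_set members of the gene set."""
    total = comb(n_background, n_selected)
    upper = min(n_in_set, n_selected)
    favourable = sum(
        comb(n_in_set, i) * comb(n_background - n_in_set, n_selected - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def over_representation(selected, background, gene_sets):
    """Test each gene set for over-representation among `selected`,
    restricting both the selection and the sets to the stated background."""
    background = set(background)
    selected = set(selected) & background
    names = sorted(gene_sets)
    pvals = []
    for name in names:
        members = set(gene_sets[name]) & background
        overlap = len(members & selected)
        pvals.append(
            hypergeom_upper_tail(overlap, len(background), len(members), len(selected))
        )
    return dict(zip(names, zip(pvals, benjamini_hochberg(pvals))))


if __name__ == "__main__":
    genes = ["g%d" % i for i in range(20)]  # hypothetical detected genes
    sets = {"A": genes[:5], "B": genes[10:15]}
    print(over_representation(genes[:5], genes, sets))
```

Passing the background explicitly, for instance the genes actually detected in the experiment rather than the whole genome, is the design choice that matters here: it forces the analyst to state which universe the test assumes.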

Woodward AA, Taylor DM, Goldmuntz E, Mitchell LE, Agopian A, Moore JH, Urbanowicz RJ. Gene-Interaction-Sensitive enrichment analysis in congenital heart disease. BioData Min 2022;15:4. [PMID: 35151364 PMCID: PMC8841104 DOI: 10.1186/s13040-022-00287-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2021] [Accepted: 01/17/2022] [Indexed: 11/24/2022] Open

Hamdaoui Q, Zekri Y, Richard S, Aubert D, Guyot R, Markossian S, Gauthier K, Gaie-Levrel F, Bencsik A, Flamant F. Prenatal exposure to paraquat and nanoscaled TiO₂ aerosols alters the gene expression of the developing brain. Chemosphere 2022;287:132253. [PMID: 34543901 DOI: 10.1016/j.chemosphere.2021.132253] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/19/2021] [Revised: 09/03/2021] [Accepted: 09/13/2021] [Indexed: 06/13/2023]

Wong LM, Li WT, Shende N, Tsai JC, Ma J, Chakladar J, Gnanasekar A, Qu Y, Dereschuk K, Wang-Rodriguez J, Ongkeko WM. Analysis of the immune landscape in virus-induced cancers using a novel integrative mechanism discovery approach. Comput Struct Biotechnol J 2021;19:6240-6254. [PMID: 34900135 PMCID: PMC8636736 DOI: 10.1016/j.csbj.2021.11.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Revised: 11/11/2021] [Accepted: 11/11/2021] [Indexed: 11/17/2022] Open

Affiliation(s)

Lindsay M. Wong: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Wei Tse Li: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Neil Shende: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Joseph C. Tsai: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Jiayan Ma: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Jaideep Chakladar: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Aditi Gnanasekar: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Yuanhao Qu: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Kypros Dereschuk: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Jessica Wang-Rodriguez: Department of Pathology, University of California San Diego, La Jolla, CA 92093, USA; Pathology Service, VA San Diego Healthcare System, San Diego, CA 92161, USA
Weg M. Ongkeko: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA; Research Service, VA San Diego Healthcare System, San Diego, CA 92161, USA; Corresponding author at: Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of California, San Diego, La Jolla, CA 92093, USA


Establishment and Validation of an MTORC1 Signaling-Related Gene Signature to Predict Overall Survival in Patients with Hepatocellular Carcinoma. BIOMED RESEARCH INTERNATIONAL 2021;2021:6299472. [PMID: 34853791 PMCID: PMC8629633 DOI: 10.1155/2021/6299472] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/15/2021] [Revised: 11/01/2021] [Accepted: 11/05/2021] [Indexed: 12/14/2022]

Maleki F, Ovens K, McQuillan I, Kusalik AJ. Silver: Forging almost Gold Standard Datasets. Genes (Basel) 2021;12:genes12101523. [PMID: 34680918 PMCID: PMC8535810 DOI: 10.3390/genes12101523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/20/2021] [Revised: 09/19/2021] [Accepted: 09/22/2021] [Indexed: 11/16/2022] Open

Huang G, Zhang H, Qu Y, Huang K, Gong X, Wei J, Du H. ARMT: An automatic RNA-seq data mining tool based on comprehensive and integrative analysis in cancer research. Comput Struct Biotechnol J 2021;19:4426-4434. [PMID: 34471489 PMCID: PMC8379379 DOI: 10.1016/j.csbj.2021.08.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2021] [Revised: 07/19/2021] [Accepted: 08/06/2021] [Indexed: 11/02/2022] Open

Joly JH, Lowry WE, Graham NA. Differential Gene Set Enrichment Analysis: a statistical approach to quantify the relative enrichment of two gene sets. Bioinformatics 2021;36:5247-5254. [PMID: 32692836 DOI: 10.1093/bioinformatics/btaa658] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/07/2019] [Revised: 06/24/2020] [Accepted: 07/15/2020] [Indexed: 01/30/2023] Open

Bryan J, Mandan A, Kamat G, Gottschalk WK, Badea A, Adams KJ, Thompson JW, Colton CA, Mukherjee S, Lutz MW. Likelihood ratio statistics for gene set enrichment in Alzheimer's disease pathways. Alzheimers Dement 2021;17:561-573. [PMID: 33480182 PMCID: PMC8044005 DOI: 10.1002/alz.12223] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023]

Chen L, Cao W, Aita R, Aldea D, Flores J, Gao N, Bonder EM, Ellison CE, Verzi MP. Three-dimensional interactions between enhancers and promoters during intestinal differentiation depend upon HNF4. Cell Rep 2021;34:108679. [PMID: 33503426 PMCID: PMC7899294 DOI: 10.1016/j.celrep.2020.108679] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/13/2020] [Revised: 10/23/2020] [Accepted: 12/30/2020] [Indexed: 12/20/2022] Open

Application of Transcriptional Gene Modules to Analysis of Caenorhabditis elegans' Gene Expression Data. G3-GENES GENOMES GENETICS 2020;10:3623-3638. [PMID: 32759329 PMCID: PMC7534440 DOI: 10.1534/g3.120.401270] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/24/2022]

Maleki F, Ovens K, Hogan DJ, Kusalik AJ. Gene Set Analysis: Challenges, Opportunities, and Future Research. Front Genet 2020;11:654. [PMID: 32695141 PMCID: PMC7339292 DOI: 10.3389/fgene.2020.00654] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 02/01/2020] [Accepted: 05/29/2020] [Indexed: 12/14/2022] Open

Fifteen Years of Gene Set Analysis for High-Throughput Genomic Data: A Review of Statistical Approaches and Future Challenges. ENTROPY 2020;22:e22040427. [PMID: 33286201 PMCID: PMC7516904 DOI: 10.3390/e22040427] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/24/2020] [Revised: 03/18/2020] [Accepted: 04/03/2020] [Indexed: 12/22/2022]

Pfeil J, Sanders LM, Anastopoulos I, Lyle AG, Weinstein AS, Xue Y, Blair A, Beale HC, Lee A, Leung SG, Dinh PT, Shah AT, Breese MR, Devine WP, Bjork I, Salama SR, Sweet-Cordero EA, Haussler D, Vaske OM. Hydra: A mixture modeling framework for subtyping pediatric cancer cohorts using multimodal gene expression signatures. PLoS Comput Biol 2020;16:e1007753. [PMID: 32275708 PMCID: PMC7176284 DOI: 10.1371/journal.pcbi.1007753] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/02/2019] [Revised: 04/22/2020] [Accepted: 02/28/2020] [Indexed: 01/21/2023] Open

Affiliation(s)

Jacob Pfeil: Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California, United States of America; Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America
Lauren M. Sanders: Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California, United States of America; Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America; Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, California, United States of America
Ioannis Anastopoulos: Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California, United States of America; Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America
A. Geoffrey Lyle: Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America; Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, California, United States of America
Alana S. Weinstein: Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California, United States of America; Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America
Yuanqing Xue: Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California, United States of America; Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America
Andrew Blair: Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California, United States of America; Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America
Holly C. Beale: Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America; Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, California, United States of America
Alex Lee: Department of Pediatrics, Division of Hematology and Oncology, University of California, San Francisco, San Francisco, California, United States of America
Stanley G. Leung: Department of Pediatrics, Division of Hematology and Oncology, University of California, San Francisco, San Francisco, California, United States of America
Phuong T. Dinh: Department of Pediatrics, Division of Hematology and Oncology, University of California, San Francisco, San Francisco, California, United States of America
Avanthi Tayi Shah: Department of Pediatrics, Division of Hematology and Oncology, University of California, San Francisco, San Francisco, California, United States of America
Marcus R. Breese: Department of Pediatrics, Division of Hematology and Oncology, University of California, San Francisco, San Francisco, California, United States of America
W. Patrick Devine: Department of Anatomic Pathology, University of California, San Francisco, San Francisco, California, United States of America
Isabel Bjork: Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America
Sofie R. Salama: Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California, United States of America; Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America; Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America
E. Alejandro Sweet-Cordero: Department of Pediatrics, Division of Hematology and Oncology, University of California, San Francisco, San Francisco, California, United States of America
David Haussler: Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California, United States of America; Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America; Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America
Olena Morozova Vaske: Genomics Institute, University of California, Santa Cruz, Santa Cruz, California, United States of America; Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, California, United States of America


Yuan K, Feng Y, Wang H, Zhao L, Wang W, Wang T, Feng Y, Huang G, Xu A. FGL2 is positively correlated with enhanced antitumor responses mediated by T cells in lung adenocarcinoma. PeerJ 2020;8:e8654. [PMID: 32206449 PMCID: PMC7075367 DOI: 10.7717/peerj.8654] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/09/2019] [Accepted: 01/28/2020] [Indexed: 12/17/2022] Open

Abstract

Lung cancer is the most common malignant tumor, accounting for 25% of cancer-related deaths and 14% of new cancers worldwide. Lung adenocarcinoma is the most common type of pulmonary cancer. Although there have been some improvements in the traditional therapy of lung cancer, the outcome and prognosis of patients remain poor. Lung cancer is the leading cause of cancer-related deaths worldwide, with 1.8 million new cases being diagnosed each year. Precision medicine based on genetic alterations is considered a new strategy of lung cancer treatment that requires highly specific biomarkers for precision diagnosis and treatment. Fibrinogen-like protein 2 (FGL2) plays important roles in both innate and adaptive immunity. However, the diagnostic value of FGL2 in lung cancer is largely unknown. In this study, we systematically investigated the expression profile and potential functions of FGL2 in lung adenocarcinoma. We used the TCGA and Oncomine datasets to compare the FGL2 expression levels between lung adenocarcinoma and adjacent normal tissues. We utilized the GEPIA, PrognoScan and Kaplan-Meier plotter databases to analyze the relationship between FGL2 expression and the survival of lung adenocarcinoma patients. Then, we investigated the potential roles of FGL2 in lung adenocarcinoma with the TIMER database and functional enrichment analyses. We found that FGL2 expression was significantly lower in lung adenocarcinoma tissue compared with adjacent normal tissue. A high expression level of FGL2 was correlated with better prognostic outcomes of lung adenocarcinoma patients, including overall survival and progression-free survival. FGL2 was positively correlated with the infiltration of immune cells, including dendritic cells, CD8+ T cells, macrophages, B cells, and CD4+ T cells, in lung adenocarcinoma. Functional enrichment analyses also showed that a high expression level of FGL2 was positively correlated with enhanced T cell activities, especially CD8+ T cell activation. Thus, we propose that high FGL2 expression, which is positively associated with enhanced antitumor activities mediated by T cells, is a beneficial marker for lung adenocarcinoma treatment outcomes.


Chang HC, Chu CP, Lin SJ, Hsiao CK. Network hub-node prioritization of gene regulation with intra-network association. BMC Bioinformatics 2020;21:101. [PMID: 32164570 PMCID: PMC7069025 DOI: 10.1186/s12859-020-3444-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/15/2019] [Accepted: 03/06/2020] [Indexed: 11/10/2022] Open

Abstract

Background

To identify and prioritize the influential hub genes in a gene-set or biological pathway, most analyses rely on calculation of marginal effects or tests of statistical significance. These procedures may be inappropriate since hub nodes are common connection points and therefore may interact with other nodes more often than non-hub nodes do. Such dependence among gene nodes can be conjectured based on the topology of the pathway network or the correlation between them.

Results

Here we develop a pathway activity score incorporating the marginal (local) effects of gene nodes as well as intra-network affinity measures. This score summarizes the expression levels in a gene-set/pathway for each sample, with weights on local and network information, respectively. The score is next used to examine the impact of each node through a leave-one-out evaluation. To illustrate the procedure, two cancer studies, one involving RNA-Seq from breast cancer patients with high-grade ductal carcinoma in situ and one microarray expression data from ovarian cancer patients, are used to assess the performance of the procedure, and to compare with existing methods, both ones that do and do not take into consideration correlation and network information. The hub nodes identified by the proposed procedure in the two cancer studies are known influential genes; some have been included in standard treatments and some are currently considered in clinical trials for target therapy. The results from simulation studies show that when marginal effects are mild or weak, the proposed procedure can still identify causal nodes, whereas methods relying only on marginal effect size cannot.

Conclusions

The NetworkHub procedure proposed in this research can effectively utilize the network information in combination with local effects derived from marker values, and provide a useful and complementary list of recommendations for prioritizing causal hubs.


Geistlinger L, Csaba G, Santarelli M, Ramos M, Schiffer L, Turaga N, Law C, Davis S, Carey V, Morgan M, Zimmer R, Waldron L. Toward a gold standard for benchmarking gene set enrichment analysis. Brief Bioinform 2020;22:545-556. [PMID: 32026945 PMCID: PMC7820859 DOI: 10.1093/bib/bbz158] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 08/12/2019] [Revised: 10/11/2019] [Accepted: 11/09/2019] [Indexed: 12/22/2022] Open

Lauria A, Peirone S, Giudice MD, Priante F, Rajan P, Caselle M, Oliviero S, Cereda M. Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles. Nucleic Acids Res 2020;48:1730-1747. [PMID: 31889184 PMCID: PMC7038995 DOI: 10.1093/nar/gkz1208] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/16/2019] [Revised: 12/05/2019] [Accepted: 12/17/2019] [Indexed: 12/31/2022] Open

Park HW, Weiss ST. Understanding the Molecular Mechanisms of Asthma through Transcriptomics. ALLERGY, ASTHMA & IMMUNOLOGY RESEARCH 2020;12:399-411. [PMID: 32141255 PMCID: PMC7061151 DOI: 10.4168/aair.2020.12.3.399] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/08/2019] [Revised: 01/01/2020] [Accepted: 01/11/2020] [Indexed: 12/18/2022]

Zyla J, Marczyk M, Domaszewska T, Kaufmann SHE, Polanska J, Weiner J. Gene set enrichment for reproducible science: comparison of CERNO and eight other algorithms. Bioinformatics 2019;35:5146-5154. [PMID: 31165139 PMCID: PMC6954644 DOI: 10.1093/bioinformatics/btz447] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 12/10/2018] [Revised: 05/08/2019] [Accepted: 06/10/2019] [Indexed: 01/12/2023] Open

Mandelboum S, Manber Z, Elroy-Stein O, Elkon R. Recurrent functional misinterpretation of RNA-seq data caused by sample-specific gene length bias. PLoS Biol 2019;17:e3000481. [PMID: 31714939 PMCID: PMC6850523 DOI: 10.1371/journal.pbio.3000481] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 06/30/2019] [Accepted: 10/08/2019] [Indexed: 11/19/2022] Open

Abstract

Data normalization is a critical step in RNA sequencing (RNA-seq) analysis, aiming to remove systematic effects from the data to ensure that technical biases have minimal impact on the results. Analyzing numerous RNA-seq datasets, we detected a prevalent sample-specific length effect that leads to a strong association between gene length and fold-change estimates between samples. This stochastic sample-specific effect is not corrected by common normalization methods, including reads per kilobase of transcript length per million reads (RPKM), Trimmed Mean of M values (TMM), relative log expression (RLE), and quantile and upper-quartile normalization. Importantly, we demonstrate that this bias causes recurrent false positive calls by gene-set enrichment analysis (GSEA) methods, thereby leading to frequent functional misinterpretation of the data. Gene sets characterized by markedly short genes (e.g., ribosomal protein genes) or long genes (e.g., extracellular matrix genes) are particularly prone to such false calls. This sample-specific length bias is effectively removed by the conditional quantile normalization (cqn) and EDASeq methods, which allow the integration of gene length as a sample-specific covariate. Consequently, using these normalization methods led to substantial reduction in GSEA false results while retaining true ones. In addition, we found that application of gene-set tests that take into account gene–gene correlations attenuates false positive rates caused by the length bias, but statistical power is reduced as well. Our results advocate the inspection and correction of sample-specific length biases as default steps in RNA-seq analysis pipelines and reiterate the need to account for intergene correlations when performing gene-set enrichment tests to lessen false interpretation of transcriptomic data.

Analysis of numerous RNA-seq datasets reveals a recurrent sample-specific length bias that causes frequent false positive calls by gene-set enrichment analyses, leading to functional misinterpretation of the data. Its removal requires methods that allow the integration of gene length as a sample-specific covariate.


Maleki F, Ovens K, McQuillan I, Kusalik AJ. Size matters: how sample size affects the reproducibility and specificity of gene set analysis. Hum Genomics 2019;13:42. [PMID: 31639047 PMCID: PMC6805317 DOI: 10.1186/s40246-019-0226-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 11/10/2022] Open

Glazko G, Zybailov B, Emmert-Streib F, Baranova A, Rahmatallah Y. Proteome-transcriptome alignment of molecular portraits achieved by self-contained gene set analysis: Consensus colon cancer subtypes case study. PLoS One 2019;14:e0221444. [PMID: 31437237 PMCID: PMC6705791 DOI: 10.1371/journal.pone.0221444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/01/2019] [Accepted: 08/06/2019] [Indexed: 01/10/2023] Open

Sun Y, Ling C. Analysis of the long non-coding RNA LINC01614 in non-small cell lung cancer. Medicine (Baltimore) 2019;98:e16437. [PMID: 31348244 PMCID: PMC6708815 DOI: 10.1097/md.0000000000016437] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022] Open

Improving the power of gene set enrichment analyses. BMC Bioinformatics 2019;20:257. [PMID: 31101008 PMCID: PMC6525372 DOI: 10.1186/s12859-019-2850-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/26/2018] [Accepted: 04/25/2019] [Indexed: 01/06/2023] Open

Abstract

BACKGROUND

Set enrichment methods are commonly used to analyze high-dimensional molecular data and gain biological insight into molecular or clinical phenotypes. One important category of analysis methods employs an enrichment score, which is created from ranked univariate correlations between phenotype and each molecular attribute. Estimates of the significance of the associations are determined via a null distribution generated from phenotype permutation. We investigate some statistical properties of this method and demonstrate how alternative assessments of enrichment can be used to increase the statistical power of such analyses to detect associations between phenotype and biological processes and pathways.

RESULTS

For this category of set enrichment analysis, the null distribution is largely independent of the number of samples with available molecular data. Hence, providing the sample cohort is not too small, we show that increased statistical power to identify associations between biological processes and phenotype can be achieved by splitting the cohort into two halves and using the average of the enrichment scores evaluated for each half as an alternative test statistic. Further, we demonstrate that this principle can be extended by averaging over multiple random splits of the cohort into halves. This enables the calculation of an enrichment statistic and associated p value of arbitrary precision, independent of the exact random splits used.

CONCLUSIONS

It is possible to increase the statistical power of gene set enrichment analyses that employ enrichment scores created from running sums of univariate phenotype-attribute correlations and phenotype-permutation generated null distributions. This increase can be achieved by using alternative test statistics that average enrichment scores calculated for splits of the dataset. Apart from the special case of a close balance between up- and down-regulated genes within a gene set, statistical power can be improved, or at least maintained, by this method down to small sample sizes, where accurate assessment of univariate phenotype-gene correlations becomes unfeasible.
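
The split-and-average idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Pearson correlation, and the unweighted running-sum score are all assumptions made for illustration, and the phenotype-permutation null distribution that the abstract relies on for p values is omitted.

```python
import numpy as np

def gene_phenotype_correlations(expr, phenotype):
    """Pearson correlation of each gene (row of expr) with the phenotype vector."""
    x = expr - expr.mean(axis=1, keepdims=True)
    y = phenotype - phenotype.mean()
    denom = np.sqrt((x ** 2).sum(axis=1) * (y ** 2).sum())
    # Guard against constant genes; their correlation is reported as 0.
    return (x @ y) / np.where(denom == 0, 1.0, denom)

def enrichment_score(corr, in_set):
    """Unweighted running-sum score over genes ranked by correlation.

    Assumes in_set is a boolean array marking a non-empty, proper subset of genes.
    """
    order = np.argsort(-corr)            # most positively correlated genes first
    hits = in_set[order]
    n_hit = hits.sum()
    n_miss = hits.size - n_hit
    steps = np.where(hits, 1.0 / n_hit, -1.0 / n_miss)
    running = np.cumsum(steps)
    # Signed maximum deviation of the running sum from zero.
    return running[np.argmax(np.abs(running))]

def split_half_score(expr, phenotype, in_set, n_splits=10, seed=0):
    """Average the per-half enrichment scores over several random half-splits."""
    rng = np.random.default_rng(seed)
    n_samples = expr.shape[1]
    per_split = []
    for _ in range(n_splits):
        perm = rng.permutation(n_samples)
        halves = (perm[: n_samples // 2], perm[n_samples // 2:])
        scores = [
            enrichment_score(
                gene_phenotype_correlations(expr[:, h], phenotype[h]), in_set
            )
            for h in halves
        ]
        per_split.append(np.mean(scores))
    return float(np.mean(per_split))
```

Here `expr` is a genes-by-samples array, `phenotype` a per-sample numeric vector, and `in_set` a boolean mask over genes. To obtain a p value, the same statistic would need to be recomputed on permuted phenotypes, which this sketch leaves out.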


Chen L, Toke NH, Luo S, Vasoya RP, Fullem RL, Parthasarathy A, Perekatt AO, Verzi MP. A reinforcing HNF4-SMAD4 feed-forward module stabilizes enterocyte identity. Nat Genet 2019;51:777-785. [PMID: 30988513 PMCID: PMC6650150 DOI: 10.1038/s41588-019-0384-0] [Citation(s) in RCA: 92] [Impact Index Per Article: 18.4] [Received: 06/11/2018] [Accepted: 02/28/2019] [Indexed: 12/30/2022]

Qin W, Wang X, Zhao H, Lu H. A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis. Front Genet 2019;10:293. [PMID: 31031796 PMCID: PMC6473067 DOI: 10.3389/fgene.2019.00293] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/21/2018] [Accepted: 03/19/2019] [Indexed: 12/25/2022] Open