1
Martínez-García G, Estrada K, Lira-Amaya JJ, Santamaria-Epinosa RM, Lopez-Arellano ME, Sciutto-Conde EL, Rojas-Martinez C, Alvarez-Martínez JA, Sánchez-Flores A, Figueroa-Millán JV. Comparative Analysis of Immune Response Genes Induced by a Virulent or Attenuated Strain of Babesia bigemina. Int J Mol Sci 2025; 26:487. [PMID: 39859202 PMCID: PMC11764604 DOI: 10.3390/ijms26020487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2024] [Revised: 12/30/2024] [Accepted: 01/07/2025] [Indexed: 01/27/2025]
Abstract
RNA-seq technology has been widely used to characterize the transcriptome profiles induced by several diseases in both humans and animals. In the present study, RNA-seq was used to identify differentially expressed genes associated with the immune response in cattle infected with two strains of Babesia bigemina, both derived from the same Mexican field isolate but with distinct phenotypic characteristics: the virulent strain, capable of producing acute clinical signs, and the attenuated strain, capable of stimulating a protective immune response when used as an immunogen, with an efficacy greater than 80%. The differential gene expression analysis revealed a total of 620 differentially expressed genes (DEGs). However, the intersection of the edgeR and DESeq2 programs used in the bioinformatics analysis identified only 247 DEGs, of which 108 were associated with the bovine immune response based on gene ontology term enrichment; most of these DEGs encode proteins associated with the major histocompatibility complex, immunoglobulins, and T-cell surface receptors. Infection with the attenuated strain induced higher transcription of immune response genes than infection with the virulent strain; nonetheless, in both infections, greater down-regulation than up-regulation was observed. Different immunoglobulin-associated genes were up-regulated in the group inoculated with the attenuated strain, whereas they were down-regulated in the group inoculated with the virulent strain. In addition, up-regulation of the HSPA6, CD163, and SLC11a1 genes, previously reported in other apicomplexan infections, was observed in the group inoculated with the virulent strain. The findings provide relevant information that could help clarify the immune response associated with acute bovine babesiosis caused by B. bigemina.
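The consensus step mentioned in this abstract (keeping only genes called by both edgeR and DESeq2) amounts to a set intersection. The sketch below is only an illustration under stated assumptions: the gene identifiers are hypothetical placeholders, `consensus_degs` is a helper name invented here, and the authors' actual thresholds and pipeline are not described in the abstract.

```python
def consensus_degs(*calls):
    """Return the genes flagged by every tool, sorted for stable output."""
    return sorted(set.intersection(*map(set, calls)))


# Hypothetical placeholder identifiers, not results from the study.
edger_calls = {"gene_a", "gene_b", "gene_c", "gene_d"}
deseq2_calls = {"gene_b", "gene_c", "gene_e"}

consensus = consensus_degs(edger_calls, deseq2_calls)
print(consensus)
```

Requiring agreement between tools trades sensitivity for specificity, which is consistent with the consensus list in the abstract being smaller than the total number of DEGs.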
Affiliation(s)
- Grecia Martínez-García
  - Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62550, Mexico
  - Facultad de Medicina Veterinaria y Zootecnia, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Karel Estrada
  - Unidad Universitaria de Secuenciación Masiva y Bioinformática, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca 62209, Mexico
- José J. Lira-Amaya
  - Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62550, Mexico
- Rebeca M. Santamaria-Epinosa
  - Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62550, Mexico
- María E. Lopez-Arellano
  - Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62550, Mexico
- Edda L. Sciutto-Conde
  - Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Carmen Rojas-Martinez
  - Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62550, Mexico
- Jesus A. Alvarez-Martínez
  - Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62550, Mexico
- Alejandro Sánchez-Flores
  - Unidad Universitaria de Secuenciación Masiva y Bioinformática, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca 62209, Mexico
- Julio V. Figueroa-Millán
  - Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62550, Mexico
2
Rahmatallah Y, Glazko G. Improving data interpretability with new differential sample variance gene set tests. Research Square 2024:rs.3.rs-4888767. [PMID: 39315246 PMCID: PMC11419169 DOI: 10.21203/rs.3.rs-4888767/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
Background Gene set analysis methods have played a major role in generating biological interpretations from omics data such as gene expression datasets. However, most methods focus on detecting homogeneous pattern changes in mean expression, and methods that detect pattern changes in variance remain poorly explored. While a few studies have attempted to use gene-level variance analysis, such an approach remains under-utilized. When comparing two phenotypes, gene sets with distinct changes in subgroups under one phenotype are overlooked by available methods, although they reflect meaningful biological differences between the two phenotypes. Multivariate sample-level variance analysis methods are needed to detect such pattern changes. Results We use ranking schemes based on the minimum spanning tree to generalize the Cramér-von Mises and Anderson-Darling univariate statistics into multivariate gene set analysis methods that detect differential sample variance or mean. We characterize these methods, in addition to two methods developed earlier, using simulation results with different parameters. We apply the developed methods to a microarray gene expression dataset of prednisolone-resistant and prednisolone-sensitive children diagnosed with B-lineage acute lymphoblastic leukemia and to a bulk RNA-sequencing gene expression dataset of benign hyperplastic polyps and potentially malignant sessile serrated adenoma/polyps. One or both of the two compared phenotypes in each of these datasets have distinct molecular subtypes that contribute to heterogeneous differences. Our results show that methods designed to detect differential sample variance are able to detect specific hallmark signaling pathways associated with the two compared phenotypes, as documented in the available literature.
Conclusions The results in this study demonstrate the usefulness of methods designed to detect differential sample variance in providing biological interpretations when biologically relevant but heterogeneous changes between two phenotypes are prevalent in specific signaling pathways. A software implementation of the developed methods is available, with detailed documentation, in the Bioconductor package GSAR. The available methods are applicable to gene expression datasets in a normalized matrix form and could be used with other omics datasets in a normalized matrix form for which a collection of feature sets is available.
Affiliation(s)
- Yasir Rahmatallah
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
- Galina Glazko
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
3
Wang X, Lian Q, Dong H, Xu S, Su Y, Wu X. Benchmarking Algorithms for Gene Set Scoring of Single-cell ATAC-seq Data. Genomics, Proteomics & Bioinformatics 2024; 22:qzae014. [PMID: 39049508 PMCID: PMC11423854 DOI: 10.1093/gpbjnl/qzae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2022] [Revised: 06/20/2023] [Accepted: 06/25/2023] [Indexed: 07/27/2024]
Abstract
Gene set scoring (GSS) has been routinely conducted for gene expression analysis of bulk or single-cell RNA sequencing (RNA-seq) data; it helps to decipher single-cell heterogeneity and cell type-specific variability by incorporating prior knowledge from functional gene sets. Single-cell assay for transposase accessible chromatin using sequencing (scATAC-seq) is a powerful technique for interrogating single-cell chromatin-based gene regulation, and genes or gene sets with dynamic regulatory potentials can be regarded as cell type-specific markers, as in single-cell RNA-seq (scRNA-seq). However, there are few GSS tools specifically designed for scATAC-seq, and the applicability and performance of RNA-seq GSS tools on scATAC-seq data remain to be investigated. Here, we systematically benchmarked ten GSS tools, including four bulk RNA-seq tools, five scRNA-seq tools, and one scATAC-seq method. First, using matched scATAC-seq and scRNA-seq datasets, we found that the performance of GSS tools on scATAC-seq data was comparable to that on scRNA-seq, suggesting their applicability to scATAC-seq. Then, the performance of different GSS tools was extensively evaluated using up to ten scATAC-seq datasets. Moreover, we evaluated the impact of gene activity conversion, dropout imputation, and gene set collections on the results of GSS. Results show that dropout imputation can significantly improve the performance of almost all GSS tools, while the impact of gene activity conversion methods or gene set collections on GSS performance depends more on the GSS tool or dataset. Finally, we provided practical guidelines for choosing appropriate preprocessing methods and GSS tools in different application scenarios.
Affiliation(s)
- Xi Wang
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
  - Department of Automation, Xiamen University, Xiamen 361005, China
- Qiwei Lian
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
  - Department of Automation, Xiamen University, Xiamen 361005, China
- Haoyu Dong
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
- Shuo Xu
  - Department of Automation, Xiamen University, Xiamen 361005, China
- Yaru Su
  - College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
- Xiaohui Wu
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
4
Candia J, Ferrucci L. Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks. PLoS One 2024; 19:e0302696. [PMID: 38753612 PMCID: PMC11098418 DOI: 10.1371/journal.pone.0302696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 04/09/2024] [Indexed: 05/18/2024]
Abstract
Pathway enrichment analysis is a ubiquitous computational biology method to interpret a list of genes (typically derived from the association of large-scale omics data with phenotypes of interest) in terms of higher-level, predefined gene sets that share biological function, chromosomal location, or other common features. Among many tools developed so far, Gene Set Enrichment Analysis (GSEA) stands out as one of the pioneering and most widely used methods. Although originally developed for microarray data, GSEA is nowadays extensively utilized for RNA-seq data analysis. Here, we quantitatively assessed the performance of a variety of GSEA modalities and provide guidance in the practical use of GSEA in RNA-seq experiments. We leveraged harmonized RNA-seq datasets available from The Cancer Genome Atlas (TCGA) in combination with large, curated pathway collections from the Molecular Signatures Database to obtain cancer-type-specific target pathway lists across multiple cancer types. We carried out a detailed analysis of GSEA performance using both gene-set and phenotype permutations combined with four different choices for the Kolmogorov-Smirnov enrichment statistic. Based on our benchmarks, we conclude that the classic/unweighted gene-set permutation approach offered comparable or better sensitivity-vs-specificity tradeoffs across cancer types compared with other, more complex and computationally intensive permutation methods. Finally, we analyzed other large cohorts for thyroid cancer and hepatocellular carcinoma. We utilized a new consensus metric, the Enrichment Evidence Score (EES), which showed a remarkable agreement between pathways identified in TCGA and those from other sources, despite differences in cancer etiology. This finding suggests an EES-based strategy to identify a core set of pathways that may be complemented by an expanded set of pathways for downstream exploratory analysis. 
This work fills the existing gap in current guidelines and benchmarks for the use of GSEA with RNA-seq data and provides a framework to enable detailed benchmarking of other RNA-seq-based pathway analysis tools.
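The classic (unweighted) Kolmogorov-Smirnov-style enrichment statistic referred to in this abstract can be sketched as a running sum over a ranked gene list. This is a minimal illustration, not the GSEA software itself: the function name `enrichment_score` and the toy gene names are hypothetical, and the permutation step that turns the score into a p-value is omitted.

```python
def enrichment_score(ranked_genes, gene_set):
    """Classic (unweighted) running-sum enrichment score.

    Walk down the ranked list, stepping up by 1/n_hits at each gene in the
    set and down by 1/n_miss otherwise; return the signed maximum deviation
    from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (most up-regulated first); names are placeholders.
ranked = ["g1", "g2", "g3", "g4"]
print(enrichment_score(ranked, {"g1", "g2"}))  # set concentrated at the top
print(enrichment_score(ranked, {"g3", "g4"}))  # set concentrated at the bottom
```

In a gene-set permutation scheme, the same function would be re-evaluated on randomly drawn gene sets of equal size to build a null distribution for the observed score.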
Affiliation(s)
- Julián Candia
  - Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, United States of America
- Luigi Ferrucci
  - Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD, United States of America
5
Lin Q, Cai B, Ke R, Chen L, Ni X, Liu H, Lin X, Wang B, Shan X. Integrative bioinformatics and experimental validation of hub genetic markers in acne vulgaris: Toward personalized diagnostic and therapeutic strategies. J Cosmet Dermatol 2024; 23:1777-1799. [PMID: 38268224 DOI: 10.1111/jocd.16152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 12/10/2023] [Indexed: 01/26/2024]
Abstract
BACKGROUND Acne vulgaris is a widespread chronic inflammatory dermatological condition. The precise molecular and genetic mechanisms of its pathogenesis remain incompletely understood. This research synthesizes existing databases, targeting a comprehensive exploration of core genetic markers. METHODS Gene expression datasets (GSE6475, GSE108110, and GSE53795) were retrieved from the GEO. Differentially expressed genes (DEGs) were identified using the limma package. Enrichment analyses were conducted using GSVA for pathway assessment and clusterProfiler for GO and KEGG analyses. PPI networks and immune cell infiltration were analyzed using the STRING database and ssGSEA, respectively. We investigated the correlation between hub gene biomarkers and immune cell infiltration using Spearman's rank analysis. ROC curve analysis validated the hub genes' diagnostic accuracy. miRNet, TarBase v8.0, and ChEA3 identified miRNA/transcription factor-gene interactions, while DrugBank delineated drug-gene interactions. Experiments utilized HaCaT cells stimulated with Propionibacterium acnes, treated with retinoic acid and methotrexate, and evaluated using RT-qPCR, ELISA, western blot, lentiviral transduction, CCK-8, wound-healing, and transwell assays. RESULTS There were 104 genes with consistent differences across the three datasets of paired acne and normal skin. Functional analyses emphasized the significant enrichment of these DEGs in immune-related pathways. PPI network analysis pinpointed hub genes PTPRC, CXCL8, ITGB2, and MMP9 as central players in acne pathogenesis. Elevated levels of specific immune cell infiltration in acne lesions corroborated the inflammatory nature of the disease. ROC curve analysis identified the acne diagnostic potential of four hub genes. Key miRNAs, particularly hsa-mir-124-3p, and central transcription factors like TFEC were noted as significant regulators. 
In vitro validation using HaCaT cells confirmed the upregulation of hub genes following Propionibacterium acnes exposure, while CXCL8 knockdown reduced pro-inflammatory cytokines, cell proliferation, and migration. DrugBank insights led to the exploration of retinoic acid and methotrexate, both of which mitigated gene expression upsurge and inflammatory mediator secretion. CONCLUSION This comprehensive study elucidated pivotal genes associated with acne pathogenesis, notably PTPRC, CXCL8, ITGB2, and MMP9. The findings underscore potential biomarkers, therapeutic targets, and the therapeutic potential of agents like retinoic acid and methotrexate. The congruence between bioinformatics and experimental validations suggests promising avenues for personalized acne treatments.
Affiliation(s)
- Qian Lin
  - Department of Plastic Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
  - Department of Plastic Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fujian Medical University, Fuzhou, Fujian, China
  - Fujian Key Laboratory of Translational Research in Cancer and Neurodegenerative Diseases, Institute for Translational Medicine, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian, China
- Beichen Cai
  - Department of Plastic Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
  - Department of Plastic Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fujian Medical University, Fuzhou, Fujian, China
  - Fujian Key Laboratory of Translational Research in Cancer and Neurodegenerative Diseases, Institute for Translational Medicine, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian, China
- Ruonan Ke
  - Department of Plastic Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
  - Department of Plastic Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fujian Medical University, Fuzhou, Fujian, China
  - Key Laboratory of Gastrointestinal Cancer (Fujian Medical University), Ministry of Education, Fuzhou, Fujian, China
- Lu Chen
  - Department of Plastic Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
  - Department of Plastic Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fujian Medical University, Fuzhou, Fujian, China
- Xuejun Ni
  - Department of Plastic Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
  - Department of Plastic Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fujian Medical University, Fuzhou, Fujian, China
- Hekun Liu
  - Fujian Key Laboratory of Translational Research in Cancer and Neurodegenerative Diseases, Institute for Translational Medicine, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian, China
- Xinjian Lin
  - Department of Plastic Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
  - Department of Plastic Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fujian Medical University, Fuzhou, Fujian, China
  - Key Laboratory of Gastrointestinal Cancer (Fujian Medical University), Ministry of Education, Fuzhou, Fujian, China
- Biao Wang
  - Department of Plastic Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
  - Department of Plastic Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fujian Medical University, Fuzhou, Fujian, China
  - Fujian Key Laboratory of Translational Research in Cancer and Neurodegenerative Diseases, Institute for Translational Medicine, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian, China
- Xiuying Shan
  - Department of Plastic Surgery, The First Affiliated Hospital of Fujian Medical University, Fuzhou, Fujian, China
  - Department of Plastic Surgery, National Regional Medical Center, Binhai Campus of the First Affiliated Hospital, Fujian Medical University, Fuzhou, Fujian, China
6
Wei S, Shen H, Zhang Y, Liu C, Li S, Yao J, Jin Z, Yu H. Integrative analysis of single-cell and bulk transcriptome data reveal the significant role of macrophages in lupus nephritis. Arthritis Res Ther 2024; 26:84. [PMID: 38610007 PMCID: PMC11010324 DOI: 10.1186/s13075-024-03311-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 03/18/2024] [Indexed: 04/14/2024]
Abstract
OBJECTIVE We attempted to identify abnormal immune cell components and signaling pathways in lupus nephritis (LN) and to identify potential therapeutic targets. METHODS Differentially expressed genes (DEGs) between LN and normal kidney tissues were identified from bulk transcriptome data, and functional annotation was performed. The phenotypic changes in macrophages and aberrant intercellular signaling communications within immune cells were imputed from LN scRNA-seq data using trajectory analysis and verified using immunofluorescence staining. Finally, lentivirus-mediated overexpression of LGALS9, the gene encoding Galectin 9, in THP-1 cells was used to study the functional effect of this gene on monocytic cells. RESULTS From bulk transcriptome data, a significant activation of interferon (IFN) signaling was observed, and its intensity showed a significantly positive correlation with the abundance of infiltrating macrophages in LN. Analysis of scRNA-seq data revealed 17 immune cell clusters, with macrophages showing the highest enrichment of intercellular signal communication in LN. Trajectory analysis revealed macrophages in LN undergo a phenotypic change from inflammatory patrolling macrophages to phagocytic and then to antigen-presenting macrophages, and secrete various pro-inflammatory factors and complement components. LGALS9 was found significantly upregulated in macrophages in LN, which was confirmed by the immunofluorescence assay. Gene functional study showed that LGALS9 overexpression in THP-1 cells significantly elicited pro-inflammatory activation, releasing multiple immune cell chemoattractants. CONCLUSION Our results present an important pathophysiological role for macrophages in LN, and our preliminary results demonstrate significant pro-inflammatory effects of LGALS9 gene in LN macrophages.
Affiliation(s)
- Shuping Wei
  - Department of Ultrasound, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, 321 Zhongshan Road, Nanjing, 210008, Jiangsu, PR China
- Haiyun Shen
  - Department of Ultrasound, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, 321 Zhongshan Road, Nanjing, 210008, Jiangsu, PR China
- Yidan Zhang
  - Department of Ultrasound, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, 321 Zhongshan Road, Nanjing, 210008, Jiangsu, PR China
- Chunrui Liu
  - Department of Ultrasound, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, 321 Zhongshan Road, Nanjing, 210008, Jiangsu, PR China
- Shoushan Li
  - Department of Oncology, The Siyang Hospital of Chinese Traditional Medicine, 15 Jiefangbei Road, Zhongxing District, Siyang County, Suqian, 223798, Jiangsu, PR China
- Jing Yao
  - Department of Ultrasound, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, 321 Zhongshan Road, Nanjing, 210008, Jiangsu, PR China
- Zhibin Jin
  - Department of Ultrasound, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, 321 Zhongshan Road, Nanjing, 210008, Jiangsu, PR China
- Hongliang Yu
  - Department of Oncology, The Siyang Hospital of Chinese Traditional Medicine, 15 Jiefangbei Road, Zhongxing District, Siyang County, Suqian, 223798, Jiangsu, PR China
  - Department of Radiation Oncology, The Affiliated Cancer Hospital of Nanjing Medical University & Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research, 42 Baiziting Road, Nanjing, 210007, Jiangsu, PR China
7
Deng T, Liang M, Du L, Li K, Li J, Qian L, Xue Q, Qiu S, Xu L, Zhang L, Gao X, Li J, Lan X, Gao H. Transcriptome Analysis of Compensatory Growth and Meat Quality Alteration after Varied Restricted Feeding Conditions in Beef Cattle. Int J Mol Sci 2024; 25:2704. [PMID: 38473950 DOI: 10.3390/ijms25052704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 02/17/2024] [Accepted: 02/23/2024] [Indexed: 03/14/2024]
Abstract
Compensatory growth (CG) is a physiological response that accelerates growth following a period of nutrient limitation, with the potential to improve growth efficiency and meat quality in cattle. However, the underlying molecular mechanisms remain poorly understood. In this study, 60 Huaxi cattle were divided into one ad libitum feeding (ALF) group and two restricted feeding groups (75% restricted, RF75; 50% restricted, RF50) undergoing a short-term restriction period followed by evaluation of CG. Detailed comparisons of growth performance during the experimental period, as well as carcass and meat quality traits, were conducted, complemented by a comprehensive transcriptome analysis of the longissimus dorsi muscle using differential expression analysis, gene set enrichment analysis (GSEA), gene set variation analysis (GSVA), and weighted correlation network analysis (WGCNA). The results showed that irrespective of the restriction degree, the restricted animals exhibited CG, achieving final body weights comparable to the ALF group. Compensating animals showed differences in meat quality traits, such as pH, cooking loss, and fat content, compared to the ALF group. Transcriptomic analysis revealed 57 genes and 31 pathways differentially regulated during CG, covering immune response, acid-lipid metabolism, and protein synthesis. Notably, complement-coagulation-fibrinolytic system synergy was identified as potentially responsible for meat quality optimization in RF75. This study provides novel and valuable genetic insights into the regulatory mechanisms of CG in beef cattle.
Affiliation(s)
- Tianyu Deng
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
  - Shaanxi Key Laboratory of Molecular Biology for Agriculture, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Mang Liang
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Lili Du
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Keanning Li
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Jinnan Li
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Li Qian
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Qingqing Xue
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Shiyuan Qiu
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Lingyang Xu
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Lupei Zhang
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Xue Gao
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Junya Li
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Xianyong Lan
  - Shaanxi Key Laboratory of Molecular Biology for Agriculture, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Huijiang Gao
  - Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
8
Caballé-Mestres A, Berenguer-Llergo A, Stephan-Otto Attolini C. Roastgsa: a comparison of rotation-based scores for gene set enrichment analysis. BMC Bioinformatics 2023; 24:408. [PMID: 37904108 PMCID: PMC10617084 DOI: 10.1186/s12859-023-05510-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 10/02/2023] [Indexed: 11/01/2023]
Abstract
BACKGROUND Gene-wise differential expression is usually the first major step in the statistical analysis of high-throughput data obtained from techniques such as microarrays or RNA-sequencing. The analysis at gene level is often complemented by interrogating the data in a broader biological context that considers as unit of measure groups of genes that may have a common function or biological trait. Among the vast number of publications about gene set analysis (GSA), the rotation test for gene set analysis, also referred to as roast, is a general sample randomization approach that maintains the integrity of the intra-gene set correlation structure in defining the null distribution of the test. RESULTS We present roastgsa, an R package that contains several enrichment score functions that feed the roast algorithm for hypothesis testing. These implemented methods are evaluated using both simulated and benchmarking data in microarray and RNA-seq datasets. We find that computationally intensive measures based on Kolmogorov-Smirnov (KS) statistics fail to improve the rates of simpler measures of GSA like mean and maxmean scores. We also show the importance of accounting for the gene linear dependence structure of the testing set, which is linked to the loss of effective signature size. Complete graphical representation of the results, including an approximation for the effective signature size, can be obtained as part of the roastgsa output. CONCLUSIONS We encourage the usage of the absmean (non-directional), mean (directional) and maxmean (directional) scores for roast GSA analysis as these are simple measures of enrichment that have presented dominant results in all provided analyses in comparison to the more complex KS measures.
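The simple set-level scores recommended in this abstract can be illustrated on a vector of gene-level statistics. The sketch below is written under stated assumptions and is not the roastgsa implementation: the function names are invented here, the maxmean score follows the usual definition (the larger of the averaged positive parts and the averaged negative parts, both divided by the full set size), signing it by the dominant direction is an assumption made to match the "directional" label, and the rotation step that supplies the null distribution is not shown.

```python
def mean_score(stats):
    """Directional score: plain average of the gene-level statistics."""
    return sum(stats) / len(stats)


def absmean_score(stats):
    """Non-directional score: average absolute gene-level statistic."""
    return sum(abs(s) for s in stats) / len(stats)


def maxmean_score(stats):
    """Directional score: dominant one-sided average, signed by direction.

    Positive and negative parts are each averaged over the whole set, so a
    few strong genes on one side are not diluted by the opposite side.
    """
    n = len(stats)
    positive_part = sum(s for s in stats if s > 0) / n
    negative_part = sum(-s for s in stats if s < 0) / n
    return positive_part if positive_part >= negative_part else -negative_part


# Hypothetical moderated t-statistics for the genes in one set.
gene_stats = [2.0, -1.0, 1.0, -4.0]
print(mean_score(gene_stats), absmean_score(gene_stats), maxmean_score(gene_stats))
```

With mixed-direction sets such as this toy example, the mean partly cancels while absmean and maxmean retain the signal, which is one reason the three scores are usually reported together.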
Affiliation(s)
- Adrià Caballé-Mestres
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Baldiri Reixac, 10, 08028, Barcelona, Spain
- Antoni Berenguer-Llergo
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Baldiri Reixac, 10, 08028, Barcelona, Spain
- Camille Stephan-Otto Attolini
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Baldiri Reixac, 10, 08028, Barcelona, Spain
| |
|
9
|
Integrative pathway and network analysis provide insights on flooding-tolerance genes in soybean. Sci Rep 2023; 13:1980. [PMID: 36737640 PMCID: PMC9898312 DOI: 10.1038/s41598-023-28593-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/01/2022] [Accepted: 01/20/2023] [Indexed: 02/05/2023] Open
Abstract
Soybean is highly sensitive to flooding and extreme rainfall. The phenotypic variation of flooding tolerance is a complex quantitative trait controlled by many genes and their interaction with environmental factors. We previously constructed a gene pool relevant to soybean flooding-tolerant responses from integrated multiple omics and non-omics databases, and selected 144 prioritized flooding tolerance genes (FTgenes). In this study, we proposed a comprehensive framework at the systems level, using competitive (hypergeometric test) and self-contained (sum-statistic, sum-square-statistic) pathway-based approaches to identify biologically enriched pathways by evaluating the joint effects of the FTgenes within annotated pathways. These FTgenes were significantly enriched in 36 pathways in the Gene Ontology database. These pathways were related to plant hormones, defense, primary metabolic processes, and system development, which play key roles in soybean flooding-induced responses. We further identified nine key FTgenes from important subnetworks extracted from several gene networks of enriched pathways. The nine key FTgenes were significantly expressed in soybean root under flooding stress in a qRT-PCR analysis. We demonstrated that this systems biology framework is promising for uncovering key genes underlying the molecular mechanisms of flooding-tolerant responses in soybean. This result provides a good foundation for gene function analysis in further work.
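The competitive approach named in this abstract, the hypergeometric test, can be sketched as follows. This is a generic over-representation test under stated assumptions, not the authors' pipeline: the function name, the background size, and the pathway counts in the usage lines are hypothetical, and only the 144 selected FTgenes come from the abstract.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    X counts pathway members among n_selected genes drawn without
    replacement from n_background genes, n_pathway of which belong
    to the pathway.
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 annotated background genes, a pathway with
# 150 members, 144 selected genes (as in the abstract), 8 of them in the pathway.
p_value = hypergeom_enrichment_p(20000, 150, 144, 8)
```

When many pathways are tested, the resulting p-values would normally be corrected for multiple testing before declaring enrichment.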
|
10
|
Lu Y, Pang Z, Xia J. Comprehensive investigation of pathway enrichment methods for functional interpretation of LC-MS global metabolomics data. Brief Bioinform 2023; 24:bbac553. [PMID: 36572652 PMCID: PMC9851290 DOI: 10.1093/bib/bbac553] [Citation(s) in RCA: 55] [Impact Index Per Article: 27.5] [Received: 09/23/2022] [Revised: 10/31/2022] [Accepted: 11/15/2022] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Global or untargeted metabolomics is widely used to comprehensively investigate metabolic profiles under various pathophysiological conditions such as inflammations, infections, responses to exposures or interactions with microbial communities. However, biological interpretation of global metabolomics data remains a daunting task. Recent years have seen growing applications of pathway enrichment analysis based on putative annotations of liquid chromatography coupled with mass spectrometry (LC-MS) peaks for functional interpretation of LC-MS-based global metabolomics data. However, due to intricate peak-metabolite and metabolite-pathway relationships, considerable variations are observed among results obtained using different approaches. There is an urgent need to benchmark these approaches to inform the best practices. RESULTS We have conducted a benchmark study of common peak annotation approaches and pathway enrichment methods in current metabolomics studies. Representative approaches, including three peak annotation methods and four enrichment methods, were selected and benchmarked under different scenarios. Based on the results, we have provided a set of recommendations regarding peak annotation, ranking metrics and feature selection. The overall better performance was obtained for the mummichog approach. We have observed that a ~30% annotation rate is sufficient to achieve high recall (~90% based on mummichog), and using semi-annotated data improves functional interpretation. Based on the current platforms and enrichment methods, we further propose an identifiability index to indicate the possibility of a pathway being reliably identified. Finally, we evaluated all methods using 11 COVID-19 and 8 inflammatory bowel diseases (IBD) global metabolomics datasets.
Affiliation(s)
- Yao Lu
- Department of Microbiology and Immunology, McGill University, Quebec, Canada
| | - Zhiqiang Pang
- Institute of Parasitology, McGill University, Quebec, Canada
| | - Jianguo Xia
- Department of Microbiology and Immunology, McGill University, Quebec, Canada
- Institute of Parasitology, McGill University, Quebec, Canada
| |
|
11
|
Chen JW, Shrestha L, Green G, Leier A, Marquez-Lago TT. The hitchhikers' guide to RNA sequencing and functional analysis. Brief Bioinform 2023; 24:bbac529. [PMID: 36617463 PMCID: PMC9851315 DOI: 10.1093/bib/bbac529] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/11/2022] [Revised: 10/18/2022] [Accepted: 11/07/2022] [Indexed: 01/10/2023] Open
Abstract
DNA and RNA sequencing technologies have revolutionized biology and biomedical sciences, sequencing full genomes and transcriptomes at very high speeds and reasonably low costs. RNA sequencing (RNA-Seq) enables transcript identification and quantification, but once sequencing has concluded researchers can be easily overwhelmed with questions such as how to go from raw data to differential expression (DE), pathway analysis and interpretation. Several pipelines and procedures have been developed to this effect. Even though there is no unique way to perform RNA-Seq analysis, it usually follows these steps: 1) raw reads quality check, 2) alignment of reads to a reference genome, 3) aligned reads' summarization according to an annotation file, 4) DE analysis and 5) gene set analysis and/or functional enrichment analysis. Each step requires researchers to make decisions, and the wide variety of options and resulting large volumes of data often lead to interpretation challenges. There also seems to be insufficient guidance on how best to obtain relevant information and derive actionable knowledge from transcription experiments. In this paper, we explain RNA-Seq steps in detail and outline differences and similarities of different popular options, as well as advantages and disadvantages. We also discuss non-coding RNA analysis, multi-omics, meta-transcriptomics and the use of artificial intelligence methods complementing the arsenal of tools available to researchers. Lastly, we perform a complete analysis from raw reads to DE and functional enrichment analysis, visually illustrating how results are not absolute truths and how algorithmic decisions can greatly impact results and interpretation.
Affiliation(s)
- Jiung-Wen Chen
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Lisa Shrestha
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
| | - George Green
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - André Leier
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
| | - Tatiana T Marquez-Lago
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Microbiology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
| |
|
12
|
Han D, Zulewska J, Xiong K, Yang Z. Synergy between oligosaccharides and probiotics: From metabolic properties to beneficial effects. Crit Rev Food Sci Nutr 2022; 64:4078-4100. [PMID: 36315042 DOI: 10.1080/10408398.2022.2139218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
A synbiotic is defined as a dietary mixture that comprises both probiotic microorganisms and prebiotic substrates. The concept has been steadily gaining attention owing to the rising recognition of probiotics, prebiotics, and gut health. Among prebiotic substances, oligosaccharides have demonstrated considerable health benefits in a variety of food products, and their combination with probiotics has been subjected to a full range of evaluations. This review delineates the landscape of studies using microbial cultures, cell lines, animal models, and human subjects to explore the functional properties and host impacts of these combinations. Overall, the results suggest that these combinations possess respective metabolic properties that could facilitate beneficial activities and therefore could be employed as dietary interventions for human health improvement and therapeutic purposes. However, uncertainties, such as applicational practicalities, underutilized analytical tools, contradictory results in studies, unclear mechanisms, and legislation hurdles, still challenge the broad utilization of these combinations. Future studies addressing these issues may not only advance current knowledge on the probiotic-prebiotic-host interrelationship but also promote respective applications in food and nutrition.
Affiliation(s)
- Dong Han
- Beijing Advanced Innovation Center for Food Nutrition and Human Health, Beijing Engineering and Technology Research Center of Food Additives, School of Food and Health, Beijing Technology and Business University, Beijing, China
- Key Laboratory of Food Bioengineering (China National Light Industry), College of Food Science and Nutritional Engineering, China Agricultural University, Beijing, China
| | - Justyna Zulewska
- Department of Dairy Science and Quality Management, Faculty of Food Sciences, University of Warmia and Mazury in Olsztyn, Olsztyn, Poland
| | - Ke Xiong
- Beijing Advanced Innovation Center for Food Nutrition and Human Health, Beijing Engineering and Technology Research Center of Food Additives, School of Food and Health, Beijing Technology and Business University, Beijing, China
| | - Zhennai Yang
- Beijing Advanced Innovation Center for Food Nutrition and Human Health, Beijing Engineering and Technology Research Center of Food Additives, School of Food and Health, Beijing Technology and Business University, Beijing, China
| |
|
13
|
Topological Distribution of Wound Stiffness Modulates Wound-Induced Hair Follicle Neogenesis. Pharmaceutics 2022; 14:pharmaceutics14091926. [PMID: 36145674 PMCID: PMC9504897 DOI: 10.3390/pharmaceutics14091926] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/04/2022] [Revised: 09/02/2022] [Accepted: 09/06/2022] [Indexed: 11/17/2022] Open
Abstract
In the large full-thickness mouse skin regeneration model, wound-induced hair neogenesis (WIHN) occurs in the wound center. This implies a spatial regulation of hair regeneration. The role of mechanotransduction during tissue regeneration is poorly understood. Here, we created wounds with equal area but different shapes to understand if perturbing mechanical forces change the area and quantity of de novo hair regeneration. Atomic force microscopy of wound stiffness demonstrated a stiffness gradient across the wound with the wound center softer than the margin. Reducing mechanotransduction signals using FAK or myosin II inhibitors significantly increased WIHN and, conversely, enhancing these signals with an actin stabilizer reduced WIHN. Here, α-SMA was downregulated in FAK inhibitor-treated wounds and lowered wound stiffness. Wound center epithelial cells exhibited a spherical morphology relative to wound margin cells. Differential gene expression analysis of FAK inhibitor-treated wound RNAseq data showed that cytoskeleton-, integrin-, and matrix-associated genes were downregulated, while hair follicular neogenesis, cell proliferation, and cell signaling genes were upregulated. Immunohistochemistry staining showed that FAK inhibition increased pSTAT3 nuclear staining in the regenerative wound center, implying enhanced signaling for hair follicular neogenesis. These findings suggest that controlling wound stiffness modulates tissue regeneration encompassing epithelial competence, tissue patterning, and regeneration during wound healing.
|
14
|
Soutschek M, Germade T, Germain PL, Schratt G. enrichMiR predicts functionally relevant microRNAs based on target collections. Nucleic Acids Res 2022; 50:W280-W289. [PMID: 35609985 PMCID: PMC9252831 DOI: 10.1093/nar/gkac395] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/24/2022] [Revised: 04/25/2022] [Accepted: 05/05/2022] [Indexed: 11/12/2022] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs that are among the main post-transcriptional regulators of gene expression. A number of data collections and prediction tools have gathered putative or confirmed targets of these regulators. It is often useful, for discovery and validation, to harness such collections to perform target enrichment analysis in given transcriptional signatures or gene-sets in order to predict involved miRNAs. While several methods have been proposed to this end, a flexible and user-friendly interface for such analyses using various approaches and collections is lacking. enrichMiR (https://ethz-ins.org/enrichMiR/) addresses this gap by enabling users to perform a series of enrichment tests, based on several target collections, to rank miRNAs according to their likely involvement in the control of a given transcriptional signature or gene-set. enrichMiR results can furthermore be visualised through interactive and publication-ready plots. To guide the choice of the appropriate analysis method, we benchmarked various tests across a panel of experiments involving the perturbation of known miRNAs. Finally, we showcase enrichMiR functionalities in a pair of use cases.
Affiliation(s)
- Michael Soutschek
- Lab of Systems Neuroscience, D-HEST Institute for Neuroscience, ETH Zürich, Switzerland; Neuroscience Center Zurich, ETH Zurich and University of Zurich, Switzerland
| | - Tomás Germade
- Lab of Systems Neuroscience, D-HEST Institute for Neuroscience, ETH Zürich, Switzerland
| | - Pierre-Luc Germain
- Lab of Systems Neuroscience, D-HEST Institute for Neuroscience, ETH Zürich, Switzerland; Lab of Statistical Bioinformatics, DMLS, University of Zürich, Switzerland; Swiss Institute of Bioinformatics, Switzerland
| | - Gerhard Schratt
- Lab of Systems Neuroscience, D-HEST Institute for Neuroscience, ETH Zürich, Switzerland; Neuroscience Center Zurich, ETH Zurich and University of Zurich, Switzerland
| |
|
15
|
Mubeen S, Tom Kodamullil A, Hofmann-Apitius M, Domingo-Fernández D. On the influence of several factors on pathway enrichment analysis. Brief Bioinform 2022; 23:bbac143. [PMID: 35453140 PMCID: PMC9116215 DOI: 10.1093/bib/bbac143] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/17/2022] [Revised: 03/21/2022] [Accepted: 03/30/2022] [Indexed: 02/01/2023] Open
Abstract
Pathway enrichment analysis has become a widely used knowledge-based approach for the interpretation of biomedical data. Its popularity has led to an explosion of both enrichment methods and pathway databases. While the elegance of pathway enrichment lies in its simplicity, multiple factors can impact the results of such an analysis, which may not be accounted for. Researchers may fail to give influential aspects their due, resorting instead to popular methods and gene set collections, or default settings. Despite ongoing efforts to establish set guidelines, meaningful results are still hampered by a lack of consensus or gold standards around how enrichment analysis should be conducted. Nonetheless, such concerns have prompted a series of benchmark studies specifically focused on evaluating the influence of various factors on pathway enrichment results. In this review, we organize and summarize the findings of these benchmarks to provide a comprehensive overview on the influence of these factors. Our work covers a broad spectrum of factors, spanning from methodological assumptions to those related to prior biological knowledge, such as pathway definitions and database choice. In doing so, we aim to shed light on how these aspects can lead to insignificant, uninteresting or even contradictory results. Finally, we conclude the review by proposing future benchmarks as well as solutions to overcome some of the challenges, which originate from the outlined factors.
Affiliation(s)
- Sarah Mubeen
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
- Fraunhofer Center for Machine Learning, Germany
| | - Alpha Tom Kodamullil
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
| | - Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
| | - Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Fraunhofer Center for Machine Learning, Germany
- Enveda Biosciences, Boulder, CO, 80301, USA
| |
|
16
|
Ding J, Zhang B, Li Y, André D, Nilsson O. Phytochrome B and PHYTOCHROME INTERACTING FACTOR8 modulate seasonal growth in trees. New Phytol 2021; 232:2339-2352. [PMID: 33735450 DOI: 10.1111/nph.17350] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 01/13/2021] [Revised: 03/04/2021] [Accepted: 03/10/2021] [Indexed: 05/27/2023]
Abstract
The seasonally synchronized annual growth cycle that is regulated mainly by photoperiod and temperature cues is a crucial adaptive strategy for perennial plants in boreal and temperate ecosystems. Phytochrome B (phyB), as a light and thermal sensor, has been extensively studied in Arabidopsis. However, the specific mechanisms for how the phytochrome photoreceptors control the phenology in tree species remain poorly understood. We characterized the functions of PHYB genes and their downstream PHYTOCHROME INTERACTING FACTOR (PIF) targets in the regulation of shade avoidance and seasonal growth in hybrid aspen trees. We show that while phyB1 and phyB2, as phyB in other plants, act as suppressors of shoot elongation during vegetative growth, they act as promoters of tree seasonal growth. Furthermore, while the Populus homologs of both PIF4 and PIF8 are involved in the shade avoidance syndrome (SAS), only PIF8 plays a major role as a suppressor of seasonal growth. Our data suggest that the PHYB-PIF8 regulon controls seasonal growth through the regulation of FT and CENL1 expression while a genome-wide transcriptome analysis suggests how, in Populus trees, phyB coordinately regulates SAS responses and seasonal growth cessation.
Affiliation(s)
- Jihua Ding
- College of Horticulture and Forestry, Huazhong Agricultural University, Wuhan, 430070, China
| | - Bo Zhang
- Umeå Plant Science Centre, Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, Umeå, 901 83, Sweden
| | - Yue Li
- College of Horticulture and Forestry, Huazhong Agricultural University, Wuhan, 430070, China
| | - Domenique André
- Umeå Plant Science Centre, Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, Umeå, 901 83, Sweden
| | - Ove Nilsson
- Umeå Plant Science Centre, Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, Umeå, 901 83, Sweden
| |
|
17
|
Das S, Rai SN. Statistical Approach of Gene Set Analysis with Quantitative Trait Loci for Crop Gene Expression Studies. Entropy (Basel) 2021; 23:945. [PMID: 34441085 PMCID: PMC8391627 DOI: 10.3390/e23080945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2021] [Revised: 07/19/2021] [Accepted: 07/21/2021] [Indexed: 11/16/2022]
Abstract
Genome-wide expression study is a powerful genomic technology to quantify expression dynamics of genes in a genome. In gene expression study, gene set analysis has become the first choice to gain insights into the underlying biology of diseases or stresses in plants. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results from the primary downstream differential expression analysis. The gene set analysis approaches are well developed in microarrays and RNA-seq gene expression data analysis. These approaches mainly focus on analyzing the gene sets with gene ontology or pathway annotation data. However, in plant biology, such methods may not establish any formal relationship between the genotypes and the phenotypes, as most of the traits are quantitative and controlled by polygenes. The existing Quantitative Trait Loci (QTL)-based gene set analysis approaches only focus on the over-representation analysis of the selected genes while ignoring their associated gene scores. Therefore, we developed an innovative statistical approach, GSQSeq, to analyze the gene sets with trait enriched QTL data. This approach considers the associated differential expression scores of genes while analyzing the gene sets. The performance of the developed method was tested on five different crop gene expression datasets obtained from real crop gene expression studies. Our analytical results indicated that the trait-specific analysis of gene sets was more robust and successful through the proposed approach than existing techniques. Further, the developed method provides a valuable platform for integrating the gene expression data with QTL data.
Affiliation(s)
- Samarendra Das
- Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Biostatistics and Bioinformatics Facility, JG Brown Cancer Center, University of Louisville, Louisville, KY 40202, USA
- School of Interdisciplinary and Graduate Studies, University of Louisville, Louisville, KY 40292, USA
| | - Shesh N. Rai
- Biostatistics and Bioinformatics Facility, JG Brown Cancer Center, University of Louisville, Louisville, KY 40202, USA
- School of Interdisciplinary and Graduate Studies, University of Louisville, Louisville, KY 40292, USA
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY 40202, USA
- Alcohol Research Center, University of Louisville, Louisville, KY 40202, USA
- Hepatobiology and Toxicology Center, University of Louisville, Louisville, KY 40202, USA
- Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY 40202, USA
| |
|
18
|
Gilhooley MJ, Owen N, Moosajee M, Yu Wai Man P. From Transcriptomics to Treatment in Inherited Optic Neuropathies. Genes (Basel) 2021; 12:147. [PMID: 33499292 PMCID: PMC7912133 DOI: 10.3390/genes12020147] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/21/2020] [Revised: 01/13/2021] [Accepted: 01/20/2021] [Indexed: 02/06/2023] Open
Abstract
Inherited optic neuropathies, including Leber Hereditary Optic Neuropathy (LHON) and Dominant Optic Atrophy (DOA), are monogenetic diseases with a final common pathway of mitochondrial dysfunction leading to retinal ganglion cell (RGC) death and ultimately loss of vision. They are, therefore, excellent models with which to investigate this ubiquitous disease process, which is implicated in both common polygenetic ocular diseases (e.g., glaucoma) and late-onset central nervous system neurodegenerative diseases (e.g., Parkinson disease). In recent years, cellular and animal models of LHON and DOA have matured in parallel with techniques (such as RNA-seq) to determine and analyze the transcriptomes of affected cells. This confluence leaves us at a particularly exciting time with the potential for the identification of novel pathogenic players and therapeutic targets. Here, we present a discussion of the importance of inherited optic neuropathies and how transcriptomic techniques can be exploited in the development of novel mutation-independent, neuroprotective therapies.
Affiliation(s)
- Michael James Gilhooley
- Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, UK
- Moorfields Eye Hospital NHS Foundation Trust, 162 City Road, London EC1V 2PD, UK
| | - Nicholas Owen
- Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, UK
| | - Mariya Moosajee
- Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, UK
- Moorfields Eye Hospital NHS Foundation Trust, 162 City Road, London EC1V 2PD, UK
- The Francis Crick Institute, 1 Midland Road, Somers Town, London NW1 1AT, UK
- Great Ormond Street Hospital for Children NHS Foundation Trust, London WC1N 3JH, UK
| | - Patrick Yu Wai Man
- Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, UK
- Moorfields Eye Hospital NHS Foundation Trust, 162 City Road, London EC1V 2PD, UK
- Department of Clinical Neurosciences, University of Cambridge, Robinson Way, Cambridge CB2 0PY, UK
- MRC Mitochondrial Biology Unit, University of Cambridge, Robinson Way, Cambridge CB2 0PY, UK
- Cambridge Eye Unit, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK
| |
|
19
|
Morse CB, Voillet V, Bates BM, Chiu EY, Garcia NM, Gottardo R, Greenberg PD, Anderson KG. Development of a clinically relevant ovarian cancer model incorporating surgical cytoreduction to evaluate treatment of micro-metastatic disease. Gynecol Oncol 2020; 160:427-437. [PMID: 33229044 DOI: 10.1016/j.ygyno.2020.11.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/25/2020] [Accepted: 11/08/2020] [Indexed: 12/16/2022]
Abstract
OBJECTIVES Mouse models of ovarian cancer commonly transfer large numbers of tumor cells into the peritoneal cavity to establish experimental metastatic disease, which may not adequately model early metastatic spread from a primary tumor site. We hypothesized we could develop an ovarian cancer model that predictably represents micro-metastatic disease. METHODS Murine ID8VEGF ovarian cancer cells were transduced to express enhanced luciferase (eLuc) to enable intravital detection of microscopic disease burden and injected beneath the ovarian bursa of C57Bl/6 mice. At 6 or 10 weeks after orthotopic injection, when mice had detectable metastases, hysterectomy and bilateral salpingo-oophorectomy were performed to remove all macroscopic disease, and survival was monitored. Immunohistochemistry and gene expression profiling were performed on primary and metastatic tumors. RESULTS eLuc-transduced ID8VEGF cells were brighter than cells transduced with standard luciferase, enabling in vivo visualization of microscopic intra-abdominal metastases developing after orthotopic injection. Primary surgical cytoreduction removed the primary tumor mass but left minimal residual disease in all mice. Metastatic sites that developed following orthotopic injection were similar to metastatic human ovarian cancer sites. Gene expression and immune infiltration were similar between primary and metastatic mouse tumors. Surgical cytoreduction prolonged survival compared to no surgery, with earlier cytoreduction more beneficial than delayed, despite micro-metastatic disease in both settings. CONCLUSIONS Mice with primary ovarian tumors established through orthotopic injection develop progressively fatal metastatic ovarian cancer, and benefit from surgical cytoreduction to remove bulky disease. This model enables the analysis of therapeutic regimens designed to target and potentially eradicate established minimal residual disease.
Affiliation(s)
- Christopher B Morse
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Washington, Seattle, WA 98195, United States of America; Program in Immunology, Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States of America; Division of Gynecologic Oncology, Allegheny Health Network, West Penn Hospital, Mellon Pavilion, Suite 310, 4815 Liberty Avenue, Pittsburgh, PA 15224, United States of America.
| | - Valentin Voillet
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States of America
| | - Breanna M Bates
- Program in Immunology, Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States of America
| | - Edison Y Chiu
- Program in Immunology, Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States of America
| | - Nicolas M Garcia
- Program in Immunology, Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States of America
| | - Raphael Gottardo
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States of America
| | - Philip D Greenberg
- Program in Immunology, Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States of America; Division of Medical Oncology, Department of Medicine, Department of Immunology, University of Washington, Seattle, WA 98195, United States of America.
| | - Kristin G Anderson
- Program in Immunology, Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States of America.
| |
|
20
|
Ye J, Xu B, Fan B, Zhang J, Yuan F, Chen Y, Sun Z, Yan X, Song Y, Song S, Yang M, Yu JK. Discovery of Selenocysteine as a Potential Nanomedicine Promotes Cartilage Regeneration With Enhanced Immune Response by Text Mining and Biomedical Databases. Front Pharmacol 2020; 11:1138. [PMID: 32792959 PMCID: PMC7394085 DOI: 10.3389/fphar.2020.01138] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/24/2020] [Accepted: 07/13/2020] [Indexed: 12/21/2022] Open
Abstract
Background In contrast to bone tissue, little progress has been made in cartilage regeneration, and many challenges remain. Furthermore, the key roles of cartilage lesions caused by trauma, focal lesions, or articular overstress remain unclear. Traumatic injuries to the meniscus, as well as its degeneration, are important risk factors for long-term joint dysfunction, degenerative joint lesions, and knee osteoarthritis (OA), a chronic joint disease characterized by degeneration of articular cartilage and hyperosteogeny. Nearly 50% of individuals with meniscus injuries develop OA over time. Because cartilage lesions have a limited inherent capacity for self-repair, biomaterial drug-nanomedicine is considered a promising alternative. It is therefore important to elucidate the genes and potential mechanisms involved in regeneration and to discover novel, precise medications; this study identifies such candidates and investigates their function and role in pathogenesis. Methods We downloaded the mRNA microarray dataset GSE117999, involving paired cartilage lesion tissue samples from 12 OA patients and 12 patients from a control group. First, we analyzed these data to identify the differentially expressed genes (DEGs). We then performed gene ontology (GO) annotation and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway enrichment analyses for these DEGs. Protein-protein interaction (PPI) networks were then constructed, from which we obtained eight significant genes after a functional interaction analysis. Finally, we identified a potential nanomedicine from this set, using a wide range of inhibitor information archived in the Search Tool for the Retrieval of Interacting Genes (STRING) database. Results Sixty-six DEGs were identified using our significance criteria (adjusted P-value < 0.01, |log2 FC| ≥ 1.2). Furthermore, we identified eight hub genes and one potential nanomedicine, selenocysteine, based on these integrative data.
Conclusion We identified eight hub genes that could serve as prospective biomarkers for the diagnosis and biomaterial drug treatment of cartilage lesions, including the novel genes CAMP, DEFA3, TOLLIP, HLA-DQA2, SLC38A6, SLC3A1, FAM20A, and ANO8. These genes were mainly associated with immune response, immune mediator induction, and cell chemotaxis. The results provide significant support for a series of novel gene targets and point to potential mechanisms for cartilage regeneration and, ultimately, nanomedicine immunotherapy in regenerative medicine.
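The DEG selection rule quoted in the Results (adjusted P-value < 0.01 together with an absolute log2 fold change of at least 1.2) can be sketched as a simple two-condition filter. This is only an illustration under stated assumptions: the `GeneResult` record, the `filter_degs` helper, and the gene names and numbers are hypothetical and are not taken from GSE117999 or from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log2_fc: float  # log2 fold change, case vs. control
    adj_p: float    # multiple-testing adjusted P-value


def filter_degs(results, adj_p_cutoff=0.01, abs_log2fc_cutoff=1.2):
    """Keep genes that pass both the significance and the effect-size cutoff."""
    return [
        r for r in results
        if r.adj_p < adj_p_cutoff and abs(r.log2_fc) >= abs_log2fc_cutoff
    ]


# Made-up values for illustration only; not data from the study.
example = [
    GeneResult("GENE_A", 2.10, 0.0004),   # passes both cutoffs
    GeneResult("GENE_B", -1.50, 0.0080),  # passes both cutoffs (down-regulated)
    GeneResult("GENE_C", 0.90, 0.0001),   # fails the fold-change cutoff
    GeneResult("GENE_D", -3.00, 0.0400),  # fails the adjusted P-value cutoff
]

degs = filter_degs(example)
print([r.gene for r in degs])
```

Taking the absolute value of the fold change keeps both up- and down-regulated genes, which is the usual reading of a |log2 FC| threshold.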
Affiliation(s)
- Jing Ye
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Bingbing Xu
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Baoshi Fan
- School of Clinical Medicine, Weifang Medical University, Weifang, China
| | - Jiying Zhang
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Fuzhen Yuan
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Yourong Chen
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Zewen Sun
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Xin Yan
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Yifan Song
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Shitang Song
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
| | - Meng Yang
- School of Clinical Medicine, Weifang Medical University, Weifang, China
| | - Jia-Kuo Yu
- Knee Surgery Department of the Institution of Sports Medicine, Peking University Third Hospital, Beijing Key Laboratory of Sports Injuries, Beijing, China
|
21
|
Maleki F, Ovens K, Hogan DJ, Kusalik AJ. Gene Set Analysis: Challenges, Opportunities, and Future Research. Front Genet 2020; 11:654. [PMID: 32695141 PMCID: PMC7339292 DOI: 10.3389/fgene.2020.00654] [Citation(s) in RCA: 106] [Impact Index Per Article: 21.2] [Received: 02/01/2020] [Accepted: 05/29/2020] [Indexed: 12/14/2022] Open
Abstract
Gene set analysis methods are widely used to provide insight into high-throughput gene expression data. There are many gene set analysis methods available. These methods rely on various assumptions and have different requirements, strengths and weaknesses. In this paper, we classify gene set analysis methods based on their components, describe the underlying requirements and assumptions for each class, and provide directions for future research in developing and evaluating gene set analysis methods.
|
22
|
Federico A, Serra A, Ha MK, Kohonen P, Choi JS, Liampa I, Nymark P, Sanabria N, Cattelani L, Fratello M, Kinaret PAS, Jagiello K, Puzyn T, Melagraki G, Gulumian M, Afantitis A, Sarimveis H, Yoon TH, Grafström R, Greco D. Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data. NANOMATERIALS 2020; 10:nano10050903. [PMID: 32397130 PMCID: PMC7279140 DOI: 10.3390/nano10050903] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 03/10/2020] [Revised: 04/29/2020] [Accepted: 05/04/2020] [Indexed: 12/28/2022]
Abstract
Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, allows the employment of toxicogenomics (TGx) approaches for a thorough characterisation of the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been suggested. However, in most cases, building the optimal analytical workflow is not straightforward. A careful selection of the right tools must be carried out, since it will affect the downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps such as quality check, filtering, normalization, and batch effect detection and correction. Currently, there is a lack of standard guidelines for data preprocessing in the TGx field. Defining the optimal tools and procedures to be employed in transcriptomics data preprocessing will lead to the generation of homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for the preprocessing of three main transcriptomic technologies: microarray, bulk RNA-Sequencing (RNA-Seq), and single cell RNA-Sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and for performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics.
Affiliation(s)
- Antonio Federico
- Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland; (A.F.); (A.S.); (L.C.); (M.F.); (P.A.S.K.)
- BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland
| | - Angela Serra
- Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland; (A.F.); (A.S.); (L.C.); (M.F.); (P.A.S.K.)
- BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland
| | - My Kieu Ha
- Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea; (M.K.H.); (J.-S.C.); (T.-H.Y.)
- Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, Korea
- Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Korea
| | - Pekka Kohonen
- Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; (P.K.); (P.N.); (R.G.)
- Division of Toxicology, Misvik Biology, 20520 Turku, Finland
| | - Jang-Sik Choi
- Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea; (M.K.H.); (J.-S.C.); (T.-H.Y.)
- Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, Korea
- Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Korea
| | - Irene Liampa
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece; (I.L.); (H.S.)
| | - Penny Nymark
- Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; (P.K.); (P.N.); (R.G.)
- Division of Toxicology, Misvik Biology, 20520 Turku, Finland
| | - Natasha Sanabria
- National Institute for Occupational Health, Johannesburg 30333, South Africa; (N.S.); (M.G.)
| | - Luca Cattelani
- Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland; (A.F.); (A.S.); (L.C.); (M.F.); (P.A.S.K.)
- BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland
| | - Michele Fratello
- Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland; (A.F.); (A.S.); (L.C.); (M.F.); (P.A.S.K.)
- BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland
| | - Pia Anneli Sofia Kinaret
- Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland; (A.F.); (A.S.); (L.C.); (M.F.); (P.A.S.K.)
- BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland
- Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
| | - Karolina Jagiello
- QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland; (K.J.); (T.P.)
- Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
| | - Tomasz Puzyn
- QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland; (K.J.); (T.P.)
- Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
| | - Georgia Melagraki
- Nanoinformatics Department, NovaMechanics Ltd., Nicosia 1065, Cyprus; (G.M.); (A.A.)
| | - Mary Gulumian
- National Institute for Occupational Health, Johannesburg 30333, South Africa; (N.S.); (M.G.)
- Haematology and Molecular Medicine Department, School of Pathology, University of the Witwatersrand, Johannesburg 2050, South Africa
| | - Antreas Afantitis
- Nanoinformatics Department, NovaMechanics Ltd., Nicosia 1065, Cyprus; (G.M.); (A.A.)
| | - Haralambos Sarimveis
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece; (I.L.); (H.S.)
| | - Tae-Hyun Yoon
- Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea; (M.K.H.); (J.-S.C.); (T.-H.Y.)
- Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, Korea
- Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Korea
| | - Roland Grafström
- Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; (P.K.); (P.N.); (R.G.)
- Division of Toxicology, Misvik Biology, 20520 Turku, Finland
| | - Dario Greco
- Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland; (A.F.); (A.S.); (L.C.); (M.F.); (P.A.S.K.)
- BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland
- Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
|
23
|
Fifteen Years of Gene Set Analysis for High-Throughput Genomic Data: A Review of Statistical Approaches and Future Challenges. ENTROPY 2020; 22:e22040427. [PMID: 33286201 PMCID: PMC7516904 DOI: 10.3390/e22040427] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 02/24/2020] [Revised: 03/18/2020] [Accepted: 04/03/2020] [Indexed: 12/22/2022]
Abstract
Over the last decade, gene set analysis has become the first choice for gaining insights into underlying complex biology of diseases through gene expression and gene association studies. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Although gene set analysis approaches are extensively used in gene expression and genome wide association data analysis, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. In this article, we provide a comprehensive overview, statistical structure and steps of gene set analysis approaches used for microarrays, RNA-sequencing and genome wide association data analysis. Further, we also classify the gene set analysis approaches and tools by the type of genomic study, null hypothesis, sampling model and nature of the test statistic, etc. Rather than reviewing the gene set analysis approaches individually, we provide the generation-wise evolution of such approaches for microarrays, RNA-sequencing and genome wide association studies and discuss their relative merits and limitations. Here, we identify the key biological and statistical challenges in current gene set analysis, which will be addressed by statisticians and biologists collectively in order to develop the next generation of gene set analysis approaches. Further, this study will serve as a catalog and provide guidelines to genome researchers and experimental biologists for choosing the proper gene set analysis approach based on several factors.
|
24
|
Lauria A, Peirone S, Giudice MD, Priante F, Rajan P, Caselle M, Oliviero S, Cereda M. Identification of altered biological processes in heterogeneous RNA-sequencing data by discretization of expression profiles. Nucleic Acids Res 2020; 48:1730-1747. [PMID: 31889184 PMCID: PMC7038995 DOI: 10.1093/nar/gkz1208] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/16/2019] [Revised: 12/05/2019] [Accepted: 12/17/2019] [Indexed: 12/31/2022] Open
Abstract
Heterogeneity is a fundamental feature of complex phenotypes. So far, genomic screenings have profiled thousands of samples providing insights into the transcriptome of the cell. However, disentangling the heterogeneity of these transcriptomic Big Data to identify defective biological processes remains challenging. Here we present GSECA, a method exploiting the bimodal behavior of RNA-sequencing gene expression profiles to identify altered gene sets in heterogeneous patient cohorts. Using simulated and experimental RNA-sequencing data sets, we show that GSECA provides higher performances than other available algorithms in detecting truly altered biological processes in large cohorts. Applied to 5941 samples from 14 different cancer types, GSECA correctly identified the alteration of the PI3K/AKT signaling pathway driven by the somatic loss of PTEN and verified the emerging role of PTEN in modulating immune-related processes. In particular, we showed that, in prostate cancer, PTEN loss appears to establish an immunosuppressive tumor microenvironment through the activation of STAT3, and low PTEN expression levels have a detrimental impact on patient disease-free survival. GSECA is available at https://github.com/matteocereda/GSECA.
Affiliation(s)
- Andrea Lauria
- Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy
- IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
| | - Serena Peirone
- IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
- Department of Physics and INFN, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
| | - Marco Del Giudice
- IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
- Candiolo Cancer Institute, FPO - IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
| | - Francesca Priante
- IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
- Candiolo Cancer Institute, FPO - IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
| | - Prabhakar Rajan
- Centre for Cell and Molecular Biology, Barts Cancer Institute, Cancer Research UK Barts Centre, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
- The Alan Turing Institute, British Library, 96 Euston Road, London, NW1 2DB, UK
| | - Michele Caselle
- Department of Physics and INFN, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
| | - Salvatore Oliviero
- Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy
- IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
| | - Matteo Cereda
- IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
- Candiolo Cancer Institute, FPO - IRCCS, Str. Prov.le 142, km 3.95, Candiolo (TO) 10060, Italy
|
25
|
Housekeeping gene validation for RT-qPCR studies on synovial fibroblasts derived from healthy and osteoarthritic patients with focus on mechanical loading. PLoS One 2019; 14:e0225790. [PMID: 31809510 PMCID: PMC6897414 DOI: 10.1371/journal.pone.0225790] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/18/2019] [Accepted: 11/12/2019] [Indexed: 12/13/2022] Open
Abstract
Selection of appropriate housekeeping genes is essential for the validity of data normalization in reverse transcription quantitative PCR (RT-qPCR). Synovial fibroblasts (SF) play a mediating role in the development and progression of osteoarthritis (OA), but no information on reliable housekeeping genes is available. Therefore, the goal of this study was to identify a set of reliable housekeeping genes suitable for studies of mechanical loading on SF from healthy and OA patients. Nine genes were evaluated for expression stability and ranked according to their relative stability as determined by four different mathematical procedures (geNorm, NormFinder, BestKeeper and comparative ΔCq). We observed that RPLP0 (ribosomal protein, large, P0) and EEF1A1 (eukaryotic translation elongation factor 1 alpha 1) were the genes with the most stable expression in SF from non-OA or OA patients treated with or without mechanical loading. According to geNorm, two genes are sufficient for normalization throughout. Expression of one tested target gene varied considerably when normalized to different candidate housekeeping genes. Our study provides a tool for accurate and valid housekeeping gene selection in gene expression experiments on SF from healthy and OA patients with and without mechanical loading, consistent with the MIQE (Minimum Information for Publication of Quantitative Real-Time PCR Experiments) guidelines, and additionally demonstrates the impact of proper housekeeping gene selection on the expression of the gene of interest.
|
26
|
Khodayari Moez E, Hajihosseini M, Andrews JL, Dinu I. Longitudinal linear combination test for gene set analysis. BMC Bioinformatics 2019; 20:650. [PMID: 31822265 PMCID: PMC6902471 DOI: 10.1186/s12859-019-3221-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2019] [Accepted: 11/13/2019] [Indexed: 11/12/2022] Open
Abstract
Background Although microarray studies have greatly contributed to recent genetic advances, lack of replication has been a continuing concern in this area. Complex study designs have the potential to address this concern, though they remain undervalued by investigators due to the lack of proper analysis methods. The primary challenge in the analysis of complex microarray study data is handling the correlation structure within data while also dealing with the combination of large number of genetic measurements and small number of subjects that are ubiquitous even in standard microarray studies. Motivated by the lack of available methods for analysis of repeatedly measured phenotypic or transcriptomic data, herein we develop a longitudinal linear combination test (LLCT). Results LLCT is a two-step method to analyze multiple longitudinal phenotypes when there is high dimensionality in response and/or explanatory variables. Alternating between calculating within-subjects and between-subjects variations in two steps, LLCT examines if the maximum possible correlation between a linear combination of the time trends and a linear combination of the predictors given by the gene expressions is statistically significant. A generalization of this method can handle family-based study designs when the subjects are not independent. This method is also applicable to time-course microarray, with the ability to identify gene sets that exhibit significantly different expression patterns over time. Based on the results from a simulation study, LLCT outperformed its alternative: pathway analysis via regression. LLCT was shown to be very powerful in the analysis of large gene sets even when the sample size is small. Conclusions This self-contained pathway analysis method is applicable to a wide range of longitudinal genomics, proteomics, metabolomics (OMICS) data, allows adjusting for potentially time-dependent covariates and works well with unbalanced and incomplete data. 
An important potential application of this method could be time-course linkage of OMICS, an attractive possibility for future genetic researchers. Availability: R package of LLCT is available at: https://github.com/its-likeli-jeff/LLCT
|
27
|
Glazko G, Zybailov B, Emmert-Streib F, Baranova A, Rahmatallah Y. Proteome-transcriptome alignment of molecular portraits achieved by self-contained gene set analysis: Consensus colon cancer subtypes case study. PLoS One 2019; 14:e0221444. [PMID: 31437237 PMCID: PMC6705791 DOI: 10.1371/journal.pone.0221444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/01/2019] [Accepted: 08/06/2019] [Indexed: 01/10/2023] Open
Abstract
Gene set analysis (GSA) has become the common methodology for analyzing transcriptomics data. However, self-contained GSA techniques are rarely, if ever, used for proteomics data analysis. Here we present a self-contained proteome-level GSA of four consensus molecular subtypes (CMSs) previously established by transcriptome dissection of colon carcinoma specimens. Despite notable differences in the structure of proteomics and transcriptomics data, many pathway-wide characteristic features of CMSs found at the mRNA level were reproduced at the protein level. In particular, CMS1 features show heavy involvement of the immune system as well as pathways related to mismatch repair, DNA replication and proteasome function, while CMS4 tumors upregulate the complement pathway and proteins participating in epithelial-to-mesenchymal transition (EMT). In addition, protein-level GSA yielded a set of novel observations visible at the proteome but not at the transcriptome level, including possible involvement of major histocompatibility complex II (MHC-II) antigens in the known immunogenicity of CMS1 and a connection between cholesterol trafficking and the regulation of Integrin-linked kinase (ILK) in CMS3. Overall, this study demonstrates the utility of self-contained GSA approaches as a critical tool for analyzing proteomics data in general and dissecting protein-level molecular portraits of human tumors in particular.
Affiliation(s)
- Galina Glazko
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
| | - Boris Zybailov
- Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
| | - Frank Emmert-Streib
- Computational Medicine and Statistical Learning Laboratory, Tampere University of Technology, Korkeakoulunkatu, Tampere, Finland FI
| | - Ancha Baranova
- School of Systems Biology, George Mason University, Manassas VA, United States of America
- Research Center for Medical Genetics, Moscow, Russia
| | - Yasir Rahmatallah
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
|
28
|
Liu C, Wang L, Wang T, Tian S. Construction of subtype-specific prognostic gene signatures for early-stage non-small cell lung cancer using meta feature selection methods. Oncol Lett 2019; 18:2366-2375. [PMID: 31402939 PMCID: PMC6676737 DOI: 10.3892/ol.2019.10563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2018] [Accepted: 06/05/2019] [Indexed: 11/06/2022] Open
Abstract
Feature selection in the framework of meta-analyses (meta feature selection) combines meta-analysis with a feature selection process and thus allows feature selection across multiple datasets. In the present study, a meta feature selection procedure was proposed that fits a multiple Cox regression model to estimate the effect size of a gene in individual studies and then identifies the overall effect of the gene using a meta-analysis model. The method was used to identify prognostic gene signatures for lung adenocarcinoma and lung squamous cell carcinoma. Furthermore, redundant gene elimination (RGE) is of crucial importance during feature selection, and is also essential for a meta feature selection process. The current study demonstrated that the proposed meta feature selection procedure with RGE outperforms that without RGE in terms of predictive ability, model parsimony and biological interpretation.
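The step of combining per-study effect sizes into an overall effect can be illustrated with a fixed-effect, inverse-variance pooled estimate. The abstract does not say which meta-analysis model the authors used, so this is a generic stand-in rather than their method; the function name `pool_fixed_effect` and the example coefficients are hypothetical.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling of per-study effect estimates.

    estimates:  per-study effect sizes, e.g. Cox log hazard ratios for one gene
    std_errors: matching standard errors
    Returns (pooled estimate, pooled standard error, z statistic).
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se, pooled / pooled_se


# Made-up log hazard ratios for one gene in two studies (illustration only).
pooled, pooled_se, z = pool_fixed_effect([0.5, 0.3], [0.1, 0.2])
print(round(pooled, 3), round(pooled_se, 4), round(z, 2))
```

Genes could then be ranked by the magnitude of the pooled z statistic before any redundancy filtering; a random-effects model would be the natural alternative when the studies are heterogeneous.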
Affiliation(s)
- Chunshui Liu
- Department of Hematology, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
| | - Linlin Wang
- Department of Ultrasound, China-Japan Union Hospital of Jilin University, Changchun, Jilin 130033, P.R. China
| | - Tianjiao Wang
- The State Key Laboratory of Special Economic Animal Molecular Biology, Institute of Special Wild Economic Animal and Plant Science, Chinese Academy Agricultural Science, Changchun, Jilin 130133, P.R. China
| | - Suyan Tian
- Division of Clinical Research, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
|
29
|
Anderson KG, Voillet V, Bates BM, Chiu EY, Burnett MG, Garcia NM, Oda SK, Morse CB, Stromnes IM, Drescher CW, Gottardo R, Greenberg PD. Engineered Adoptive T-cell Therapy Prolongs Survival in a Preclinical Model of Advanced-Stage Ovarian Cancer. Cancer Immunol Res 2019; 7:1412-1425. [PMID: 31337659 DOI: 10.1158/2326-6066.cir-19-0258] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/09/2019] [Revised: 05/29/2019] [Accepted: 07/19/2019] [Indexed: 01/01/2023]
Abstract
Adoptive T-cell therapy using high-affinity T-cell receptors (TCR) to target tumor antigens has potential for improving outcomes in high-grade serous ovarian cancer (HGSOC) patients. Ovarian tumors develop a hostile, multicomponent tumor microenvironment containing suppressive cells, inhibitory ligands, and soluble factors that facilitate evasion of antitumor immune responses. Developing and validating an immunocompetent mouse model of metastatic ovarian cancer that shares antigenic and immunosuppressive qualities of human disease would facilitate establishing effective T-cell therapies. We used deep transcriptome profiling and IHC analysis of human HGSOC tumors and disseminated mouse ID8VEGF tumors to compare immunologic features. We then evaluated the ability of CD8 T cells engineered to express a high-affinity TCR specific for mesothelin, an ovarian cancer antigen, to infiltrate advanced ID8VEGF murine ovarian tumors and control tumor growth. Human CD8 T cells engineered to target mesothelin were also evaluated for ability to kill HLA-A2+ HGSOC lines. IHC and gene-expression profiling revealed striking similarities between tumors of both species, including processing/presentation of a leading candidate target antigen, suppressive immune cell infiltration, and expression of molecules that inhibit T-cell function. Engineered T cells targeting mesothelin infiltrated mouse tumors but became progressively dysfunctional and failed to persist. Treatment with repeated doses of T cells maintained functional activity, significantly prolonging survival of mice harboring late-stage disease at treatment onset. Human CD8 T cells engineered to target mesothelin were tumoricidal for three HGSOC lines. Treatment with engineered T cells may have clinical applicability in patients with advanced-stage HGSOC.
MESH Headings
- Animals
- Antigens, Neoplasm/genetics
- Antigens, Neoplasm/immunology
- CD8-Positive T-Lymphocytes/immunology
- CD8-Positive T-Lymphocytes/metabolism
- Cell Line, Tumor
- Cytotoxicity, Immunologic
- Disease Models, Animal
- Female
- GPI-Linked Proteins/genetics
- GPI-Linked Proteins/immunology
- Gene Expression
- Gene Expression Profiling
- Genetic Engineering
- HLA-A Antigens/genetics
- HLA-A Antigens/immunology
- Humans
- Immunophenotyping
- Immunotherapy, Adoptive/adverse effects
- Immunotherapy, Adoptive/methods
- Mesothelin
- Mice
- Neoplasm Grading
- Neoplasm Staging
- Ovarian Neoplasms/genetics
- Ovarian Neoplasms/mortality
- Ovarian Neoplasms/pathology
- Ovarian Neoplasms/therapy
- Prognosis
- Receptors, Antigen, T-Cell/genetics
- Receptors, Antigen, T-Cell/metabolism
- Receptors, Chimeric Antigen/genetics
- Receptors, Chimeric Antigen/metabolism
- T-Lymphocytes/immunology
- T-Lymphocytes/metabolism
- Treatment Outcome
- Xenograft Model Antitumor Assays
Affiliation(s)
- Kristin G Anderson
- Department of Immunology, University of Washington School of Medicine, Seattle, Washington
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Valentin Voillet
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Breanna M Bates
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Edison Y Chiu
- Department of Immunology, University of Washington School of Medicine, Seattle, Washington
| | - Madison G Burnett
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Nicolas M Garcia
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Shannon K Oda
- Department of Immunology, University of Washington School of Medicine, Seattle, Washington
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Christopher B Morse
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Department of Obstetrics and Gynecology, Division of Gynecologic Oncology, University of Washington School of Medicine, Seattle, Washington
| | - Ingunn M Stromnes
- Department of Immunology, University of Washington School of Medicine, Seattle, Washington
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Charles W Drescher
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Raphael Gottardo
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Philip D Greenberg
- Department of Immunology, University of Washington School of Medicine, Seattle, Washington.
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| |
Collapse
|
30
|
Ebrahimpoor M, Spitali P, Hettne K, Tsonaka R, Goeman J. Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods. Brief Bioinform 2019; 21:1302-1312. [PMID: 31297505 PMCID: PMC7373179 DOI: 10.1093/bib/bbz074] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/08/2019] [Revised: 05/28/2019] [Accepted: 05/28/2019] [Indexed: 01/23/2023]
Abstract
Studying sets of genomic features is increasingly popular in genomics, proteomics and metabolomics since analyzing at set level not only creates a natural connection to biological knowledge but also offers more statistical power. Currently, there are two gene-set testing approaches, self-contained and competitive, both of which have their advantages and disadvantages, but neither offers the final solution. We introduce simultaneous enrichment analysis (SEA), a new approach for analysis of feature sets in genomics and other omics based on a new unified null hypothesis, which includes the self-contained and competitive null hypotheses as special cases. We employ closed testing using Simes tests to test this new hypothesis. For every feature set, the proportion of active features is estimated, and a confidence bound is provided. Also, for every unified null hypothesis, a P-value is calculated, which is adjusted for family-wise error rate. SEA does not need to assume that the features are independent. Moreover, users are allowed to choose the feature set(s) of interest after observing the data. We develop a novel pipeline and apply it on RNA-seq data of dystrophin-deficient mdx mice, showcasing the flexibility of the method. Finally, the power properties of the method are evaluated through simulation studies.
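As an illustration of the Simes test named in the abstract above, the sketch below computes the Simes combined P-value for a single set of per-feature P-values: with n sorted values p(1) ≤ ... ≤ p(n), it is the minimum over i of n·p(i)/i. This is only the local test for one intersection hypothesis. It is not the SEA pipeline, it does not perform the closed-testing step or estimate the proportion of active features, and the function name `simes_p_value` is a hypothetical one chosen for this example.

```python
def simes_p_value(p_values):
    """Simes combined p-value for the hypothesis that all features are null.

    Hypothetical helper for illustration; not part of any published package.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    ordered = sorted(p_values)  # p(1) <= p(2) <= ... <= p(n)
    n = len(ordered)
    # Simes statistic: min over ranks i of n * p(i) / i, capped at 1.
    combined = min(n * p / rank for rank, p in enumerate(ordered, start=1))
    return min(1.0, combined)


if __name__ == "__main__":
    # Three per-gene p-values for one made-up gene set.
    print(simes_p_value([0.01, 0.04, 0.03]))
```

A small combined value suggests that at least one feature in the set is active; a full analysis in the spirit of the abstract would apply such local tests inside a closed testing procedure.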
Affiliation(s)
- Mitra Ebrahimpoor
  - Medical statistics, Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
- Pietro Spitali
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Kristina Hettne
  - Medical statistics, Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
- Roula Tsonaka
  - Medical statistics, Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
- Jelle Goeman
  - Medical statistics, Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
|
31
|
Qin W, Wang X, Zhao H, Lu H. A Novel Joint Gene Set Analysis Framework Improves Identification of Enriched Pathways in Cross Disease Transcriptomic Analysis. Front Genet 2019; 10:293. [PMID: 31031796 PMCID: PMC6473067 DOI: 10.3389/fgene.2019.00293] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/21/2018] [Accepted: 03/19/2019] [Indexed: 12/25/2022]
Abstract
Motivation: Gene set enrichment analysis is a widely accepted expression analysis tool which aims at detecting coordinated expression change within pre-defined gene sets rather than individual genes. The benefits of gene set analysis over individual differentially expressed (DE) gene analysis include more reproducible and interpretable results and the detection of small but consistent changes within a gene set which could not be detected by DE gene analysis. There have been many successful gene set analysis applications in human diseases. However, when the sample size of a disease study is small and no other public data sets of the same disease are available, there is a lack of power to detect pathways of importance to the disease. Results: We have developed a novel joint gene set analysis statistical framework which aims at improving the power of identifying enriched gene sets through integrating multiple similar disease data sets. Through comprehensive simulation studies, we demonstrated that our proposed framework obtained much better AUC scores than single data set analysis and another meta-analysis method in identification of enriched pathways. When applied to two real data sets, the proposed framework could retain the enriched gene sets identified by single data set analysis and exclusively obtained up to 200% more disease-related gene sets, demonstrating the improved identification power through information shared between similar diseases. We expect that the proposed framework would enable researchers to better explore public data sets when the sample size of their study is limited.
Affiliation(s)
- Wenyi Qin
  - Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai Jiaotong University, Shanghai, China
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, IL, United States
  - Department of Genetics, School of Medicine, Yale University, New Haven, CT, United States
- Xujun Wang
  - Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China
- Hongyu Zhao
  - Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China
  - Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, United States
- Hui Lu
  - Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai Jiaotong University, Shanghai, China
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, IL, United States
  - Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, Shanghai Jiaotong University, Shanghai, China
  - Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, United States
|
32
|
Lim S, Lee S, Jung I, Rhee S, Kim S. Comprehensive and critical evaluation of individualized pathway activity measurement tools on pan-cancer data. Brief Bioinform 2018; 21:36-46. [PMID: 30462155 DOI: 10.1093/bib/bby097] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/26/2018] [Revised: 08/20/2018] [Accepted: 09/09/2018] [Indexed: 12/11/2022]
Abstract
Motivation: Biological pathways are extensively used for the analysis of transcriptome data to characterize biological mechanisms underlying various phenotypes. There are a number of computational tools that summarize transcriptome data at the pathway level. However, there is no comparative study on how well these tools produce useful information at the cohort level, enabling comparison of many samples or patients. Results: In this study, we systematically compared and evaluated 13 different pathway activity inference tools based on 5 comparison criteria using a pan-cancer data set. This study has two major contributions. First, our study provides a comprehensive survey on computational techniques used by existing pathway activity inference tools. The tools use different strategies and assume different requirements on data: input transformation, use of labels, necessity of cohort-level input data, use of gene relations and scoring metric. Second, we performed extensive evaluations on the performance of these tools. Because different tools use different methods to map samples to the pathway dimension, the tools are evaluated at the pathway level using five comparison criteria. Starting from measuring how well a tool maintains the characteristics of original gene expression values, robustness was also investigated by adding noise into gene expression data. Classification tasks on three clinical variables (tumor versus normal, survival and cancer subtypes) were performed to evaluate the utility of tools for their clinical applications. In addition, the inferred activity values were compared between the tools to see how similar they are along with the scoring schemes they use.
Affiliation(s)
- Sangsoo Lim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
- Sangseon Lee
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Korea
- Inuk Jung
  - Bioinformatics Institute, Seoul National University, Seoul, Korea
- Sungmin Rhee
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Korea
- Sun Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Korea
  - Bioinformatics Institute, Seoul National University, Seoul, Korea
|
33
|
Lin SJ, Lu TP, Yu QY, Hsiao CK. Probabilistic prioritization of candidate pathway association with pathway score. BMC Bioinformatics 2018; 19:391. [PMID: 30355338 PMCID: PMC6201593 DOI: 10.1186/s12859-018-2411-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/11/2018] [Accepted: 10/05/2018] [Indexed: 01/12/2023]
Abstract
Background: Current methods for gene-set or pathway analysis are usually designed to test the enrichment of a single gene-set. Once the analysis is carried out for each of the sets under study, a list of significant sets can be obtained. However, if one wishes to further prioritize the importance or strength of association of these sets, no such quantitative measure is available. Using the magnitude of p-value to rank the pathways may not be appropriate because p-value is not a measure for strength of significance. In addition, when testing each pathway, these analyses are often implicitly affected by the number of differentially expressed genes included in the set and/or affected by the dependence among genes. Results: Here we propose a two-stage procedure to prioritize the pathways/gene-sets. In the first stage we develop a pathway-level measure with three properties. First, it contains all genes (differentially expressed or not) in the same set, and summarizes the collective effect of all genes per sample. Second, this pathway score accounts for the correlation between genes by synchronizing their correlation directions. Third, the score includes a rank transformation to enhance the variation among samples as well as to avoid the influence of extreme heterogeneity among genes. In the second stage, all scores are included simultaneously in a Bayesian logistic regression model which can evaluate the strength of association for each set and rank the sets based on posterior probabilities. Simulations from Gaussian distributions and human microarray data, and a breast cancer study with RNA-Seq are considered for demonstration and comparison with other existing methods. Conclusions: The proposed summary pathway score provides for each sample an overall evaluation of gene expression in a gene-set. It demonstrates the advantages of including all genes in the set and the synchronization of correlation direction. The simultaneous utilization of all pathway-level scores in a Bayesian model not only offers a probabilistic evaluation and ranking of the pathway association but also presents good accuracy in identifying the top-ranking pathways. The resulting recommendation list of ranked pathways can be a reference for potential target therapy or for future allocation of research resources. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2411-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Shu-Ju Lin
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, 10055, Taiwan
- Tzu-Pin Lu
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, 10055, Taiwan
  - Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, 10055, Taiwan
- Qi-You Yu
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, 10055, Taiwan
- Chuhsing Kate Hsiao
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, 10055, Taiwan
  - Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, 10055, Taiwan
|
34
|
CGPS: A machine learning-based approach integrating multiple gene set analysis tools for better prioritization of biologically relevant pathways. J Genet Genomics 2018; 45:489-504. [PMID: 30292791 DOI: 10.1016/j.jgg.2018.08.002] [Citation(s) in RCA: 67] [Impact Index Per Article: 9.6] [Received: 02/26/2018] [Revised: 08/11/2018] [Accepted: 08/13/2018] [Indexed: 12/20/2022]
Abstract
Gene set enrichment (GSE) analyses play an important role in the interpretation of large-scale transcriptome datasets. Multiple GSE tools can be integrated into a single method as obtaining optimal results is challenging due to the plethora of GSE tools and their discrepant performances. Several existing ensemble methods lead to different scores in sorting pathways as integrated results; furthermore, it is difficult for users to choose a single ensemble score to obtain optimal final results. Here, we develop an ensemble method using a machine learning approach called Combined Gene set analysis incorporating Prioritization and Sensitivity (CGPS) that integrates the results provided by nine prominent GSE tools into a single ensemble score (R score) to sort pathways as integrated results. Moreover, to the best of our knowledge, CGPS is the first GSE ensemble method built based on a priori knowledge of pathways and phenotypes. Compared with 10 widely used individual methods and five types of ensemble scores from two ensemble methods, we demonstrate that sorting pathways based on the R score can better prioritize relevant pathways, as established by an evaluation of 120 simulated datasets and 45 real datasets. Additionally, CGPS is applied to expression data involving the drug panobinostat, which is an anticancer treatment against multiple myeloma. The results identify cell processes associated with cancer, such as the p53 signaling pathway (hsa04115); by contrast, according to two ensemble methods (EnrichmentBrowser and EGSEA), this pathway has a rank higher than 20, which may cause users to miss the pathway in their analyses. We show that this method, which is based on a priori knowledge, can capture valuable biological information from numerous types of gene set collections, such as KEGG pathways, GO terms, Reactome, and BioCarta. CGPS is publicly available as a standalone source code at ftp://ftp.cbi.pku.edu.cn/pub/CGPS_download/cgps-1.0.0.tar.gz.
|
35
|
Glaab E. Computational systems biology approaches for Parkinson's disease. Cell Tissue Res 2018; 373:91-109. [PMID: 29185073 PMCID: PMC6015628 DOI: 10.1007/s00441-017-2734-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 07/24/2017] [Accepted: 11/06/2017] [Indexed: 12/26/2022]
Abstract
Parkinson's disease (PD) is a prime example of a complex and heterogeneous disorder, characterized by multifaceted and varied motor- and non-motor symptoms and different possible interplays of genetic and environmental risk factors. While investigations of individual PD-causing mutations and risk factors in isolation are providing important insights to improve our understanding of the molecular mechanisms behind PD, there is a growing consensus that a more complete understanding of these mechanisms will require an integrative modeling of multifactorial disease-associated perturbations in molecular networks. Identifying and interpreting the combinatorial effects of multiple PD-associated molecular changes may pave the way towards an earlier and reliable diagnosis and more effective therapeutic interventions. This review provides an overview of computational systems biology approaches developed in recent years to study multifactorial molecular alterations in complex disorders, with a focus on PD research applications. Strengths and weaknesses of different cellular pathway and network analyses, and multivariate machine learning techniques for investigating PD-related omics data are discussed, and strategies proposed to exploit the synergies of multiple biological knowledge and data sources. A final outlook provides an overview of specific challenges and possible next steps for translating systems biology findings in PD to new omics-based diagnostic tools and targeted, drug-based therapeutic approaches.
Affiliation(s)
- Enrico Glaab
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 7 avenue des Hauts Fourneaux, L-4362, Esch-sur-Alzette, Luxembourg
|
36
|
Agniel D, Hejblum BP. Variance component score test for time-course gene set analysis of longitudinal RNA-seq data. Biostatistics 2018; 18:589-604. [PMID: 28334305 DOI: 10.1093/biostatistics/kxx005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 07/18/2016] [Accepted: 01/04/2017] [Indexed: 01/28/2023]
Abstract
As gene expression measurement technology is shifting from microarrays to sequencing, the statistical tools available for their analysis must be adapted since RNA-seq data are measured as counts. It has been proposed to model RNA-seq counts as continuous variables using nonparametric regression to account for their inherent heteroscedasticity. In this vein, we propose tcgsaseq, a principled, model-free, and efficient method for detecting longitudinal changes in RNA-seq gene sets defined a priori. The method identifies those gene sets whose expression varies over time, based on an original variance component score test accounting for both covariates and heteroscedasticity without assuming any specific parametric distribution for the (transformed) counts. We demonstrate that despite the presence of a nonparametric component, our test statistic has a simple form and limiting distribution, and both may be computed quickly. A permutation version of the test is additionally proposed for very small sample sizes. Applied to both simulated data and two real datasets, tcgsaseq is shown to exhibit very good statistical properties, with an increase in stability and power when compared to state-of-the-art methods ROAST (rotation gene set testing), edgeR, and DESeq2, which can fail to control the type I error under certain realistic settings. We have made the method available for the community in the R package tcgsaseq.
Affiliation(s)
- Denis Agniel
  - Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck St, Boston, MA 02115, USA
- Boris P Hejblum
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - University of Bordeaux, ISPED, INSERM U1219, INRIA SISTM, 146 rue Léo Saignat, 33076 Bordeaux, France
  - Vaccine Research Institute, Créteil, France
|
37
|
Abstract
The analysis of gene sets (in the form of functionally related genes or pathways) has become the method of choice for extracting the strongest signals from omics data. The motivation behind using gene sets instead of individual genes is two-fold. First, this approach incorporates pre-existing biological knowledge into the analysis and facilitates the interpretation of experimental results. Second, it employs a statistical hypothesis testing framework. Here, we briefly review the main Gene Set Analysis (GSA) approaches for testing differential expression of gene sets and several GSA approaches for testing statistical hypotheses beyond differential expression that allow extracting additional biological information from the data. We distinguish three major types of GSA approaches, testing: (1) differential expression (DE), (2) differential variability (DV), and (3) differential co-expression (DC) of gene sets between two phenotypes. We also present comparative power analysis and Type I error rates for different approaches in each major type of GSA on simulated data. Our evaluation presents a concise guideline for selecting the GSA approaches that perform best under particular experimental settings. The value of the three major types of GSA approaches is illustrated with a real data example. When applied to the same data set, the major types of GSA approaches yield complementary biological information.
|
38
|
Song T, Cao S, Tao S, Liang S, Du W, Liang Y. A Novel Unsupervised Algorithm for Biological Process-based Analysis on Cancer. Sci Rep 2017; 7:4671. [PMID: 28680165 PMCID: PMC5498659 DOI: 10.1038/s41598-017-04961-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/20/2017] [Accepted: 05/30/2017] [Indexed: 12/04/2022]
Abstract
The aberrant alterations of biological functions are well known in tumorigenesis and cancer development. Hence, with advances in high-throughput sequencing technologies, capturing and quantifying the functional alterations in cancers based on expression profiles to explore the malignant process of cancer is highlighted as one of the important topics in cancer research. In this article, we propose an algorithm for quantifying biological processes by using gene expression profiles over a sample population, which involves the idea of constructing principal curves to condense the information of each biological process by a novel scoring scheme in an individualized manner. After applying our method to several large-scale breast cancer datasets in survival analysis, a subset of these biological processes extracted from the corresponding survival model is found to have significant associations with clinical outcomes. Further analyses of these biological processes enable the study of the interplay between biological processes and cancer phenotypes of interest, provide valuable insights into cancer biology at the biological process level and guide precision treatment for cancer patients. Notably, prognosis predictions based on our method are consistently superior to those of existing state-of-the-art methods with the same intention.
Affiliation(s)
- Tianci Song
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
- Sha Cao
  - Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
- Sheng Tao
  - Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
- Sen Liang
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
  - Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
- Wei Du
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
  - Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
- Yanchun Liang
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
  - Zhuhai Laboratory of Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai, 519041, China
|
39
|
Kohonen P, Parkkinen JA, Willighagen EL, Ceder R, Wennerberg K, Kaski S, Grafström RC. A transcriptomics data-driven gene space accurately predicts liver cytopathology and drug-induced liver injury. Nat Commun 2017; 8:15932. [PMID: 28671182 PMCID: PMC5500850 DOI: 10.1038/ncomms15932] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 09/22/2016] [Accepted: 05/15/2017] [Indexed: 01/17/2023]
Abstract
Predicting unanticipated harmful effects of chemicals and drug molecules is a difficult and costly task. Here we utilize a 'big data compacting and data fusion'-concept to capture diverse adverse outcomes on cellular and organismal levels. The approach generates from transcriptomics data set a 'predictive toxicogenomics space' (PTGS) tool composed of 1,331 genes distributed over 14 overlapping cytotoxicity-related gene space components. Involving ∼2.5 × 10^8 data points and 1,300 compounds to construct and validate the PTGS, the tool serves to: explain dose-dependent cytotoxicity effects, provide a virtual cytotoxicity probability estimate intrinsic to omics data, predict chemically-induced pathological states in liver resulting from repeated dosing of rats, and furthermore, predict human drug-induced liver injury (DILI) from hepatocyte experiments. Analysing 68 DILI-annotated drugs, the PTGS tool outperforms and complements existing tests, leading to a hereto-unseen level of DILI prediction accuracy.
Affiliation(s)
- Pekka Kohonen
  - Institute of Environmental Medicine, Karolinska Institutet, Nobels väg 13, Box 210, SE-17177 Stockholm, Sweden
- Juuso A Parkkinen
  - Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Konemiehentie 2, P.O. Box 15400, 00076 Aalto, Finland
- Egon L Willighagen
  - Institute of Environmental Medicine, Karolinska Institutet, Nobels väg 13, Box 210, SE-17177 Stockholm, Sweden
  - Department of Bioinformatics-BiGCaT, Maastricht University, Universiteitssingel 50, P.O. Box 616, UNS 50 Box19, NL-6200 MD Maastricht, The Netherlands
- Rebecca Ceder
  - Institute of Environmental Medicine, Karolinska Institutet, Nobels väg 13, Box 210, SE-17177 Stockholm, Sweden
- Krister Wennerberg
  - Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Tukholmankatu 8, P.O. Box 20, FI-00014 Helsinki, Finland
- Samuel Kaski
  - Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Konemiehentie 2, P.O. Box 15400, 00076 Aalto, Finland
  - Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Gustaf Hällströmin katu 2b, P.O. Box 68, FI-00014 Helsinki, Finland
- Roland C Grafström
  - Institute of Environmental Medicine, Karolinska Institutet, Nobels väg 13, Box 210, SE-17177 Stockholm, Sweden
|
40
|
Alhamdoosh M, Ng M, Wilson NJ, Sheridan JM, Huynh H, Wilson MJ, Ritchie ME. Combining multiple tools outperforms individual methods in gene set enrichment analyses. Bioinformatics 2017; 33:414-424. [PMID: 27694195 PMCID: PMC5408797 DOI: 10.1093/bioinformatics/btw623] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Received: 03/10/2016] [Accepted: 09/23/2016] [Indexed: 12/22/2022]
Abstract
Motivation: Gene set enrichment (GSE) analysis allows researchers to efficiently extract biological insight from long lists of differentially expressed genes by interrogating them at a systems level. In recent years, there has been a proliferation of GSE analysis methods and hence it has become increasingly difficult for researchers to select an optimal GSE tool based on their particular dataset. Moreover, the majority of GSE analysis methods do not allow researchers to simultaneously compare gene set level results between multiple experimental conditions. Results: The ensemble of genes set enrichment analyses (EGSEA) is a method developed for RNA-sequencing data that combines results from twelve algorithms and calculates collective gene set scores to improve the biological relevance of the highest ranked gene sets. EGSEA's gene set database contains around 25 000 gene sets from sixteen collections. It has multiple visualization capabilities that allow researchers to view gene sets at various levels of granularity. EGSEA has been tested on simulated data and on a number of human and mouse datasets and, based on biologists' feedback, consistently outperforms the individual tools that have been combined. Our evaluation demonstrates the superiority of the ensemble approach for GSE analysis, and its utility to effectively and efficiently extrapolate biological functions and potential involvement in disease processes from lists of differentially regulated genes. Availability and Implementation: EGSEA is available as an R package at http://www.bioconductor.org/packages/EGSEA/. The gene sets collections are available in the R package EGSEAdata from http://www.bioconductor.org/packages/EGSEAdata/. Supplementary information: Supplementary data are available at Bioinformatics online.
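To make the idea of combining several tools into a collective gene set score more concrete, the sketch below averages each gene set's rank across tools and sorts by that average. It is an assumption-laden toy, not EGSEA's scoring: the abstract does not specify how the collective scores are computed, the function name `average_rank_ensemble` is hypothetical, the tool names and P-values are invented, and every tool is assumed to report a P-value for the same gene sets.

```python
def average_rank_ensemble(tool_p_values):
    """Rank gene sets by their mean rank across tools (1 = most significant).

    tool_p_values maps a tool name to a dict of {gene_set: p_value}.
    Assumes every tool scored the same gene sets (illustrative simplification).
    """
    gene_sets = sorted(next(iter(tool_p_values.values())))
    rank_totals = {name: 0.0 for name in gene_sets}
    for scores in tool_p_values.values():
        # Order gene sets by this tool's p-value; break ties by name.
        ordered = sorted(gene_sets, key=lambda name: (scores[name], name))
        for rank, name in enumerate(ordered, start=1):
            rank_totals[name] += rank
    n_tools = len(tool_p_values)
    averages = {name: total / n_tools for name, total in rank_totals.items()}
    # Best (lowest) average rank first; ties broken alphabetically.
    return sorted(averages.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Invented p-values from two hypothetical tools for three gene sets.
    results = {
        "tool_a": {"set_x": 0.01, "set_y": 0.20, "set_z": 0.05},
        "tool_b": {"set_x": 0.03, "set_y": 0.50, "set_z": 0.02},
    }
    for name, avg_rank in average_rank_ensemble(results):
        print(name, avg_rank)
```

Averaging ranks rather than raw P-values keeps tools with differently calibrated statistics on a comparable scale, which is one plausible reason an ensemble might favour rank-based summaries.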
Collapse
Affiliation(s)
- Milica Ng
- CSL Limited, Bio21 Institute, Parkville, Australia
- Julie M Sheridan
- ACRF Stem Cells and Cancer Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia; Department of Medical Biology, The University of Melbourne, Parkville, Australia
- Huy Huynh
- CSL Limited, Bio21 Institute, Parkville, Australia
- Matthew E Ritchie
- Molecular Medicine Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia; School of Mathematics and Statistics, The University of Melbourne, Parkville, Australia
|
41
|
Wright C, Shin JH, Rajpurohit A, Deep-Soboslay A, Collado-Torres L, Brandon NJ, Hyde TM, Kleinman JE, Jaffe AE, Cross AJ, Weinberger DR. Altered expression of histamine signaling genes in autism spectrum disorder. Transl Psychiatry 2017; 7:e1126. [PMID: 28485729 PMCID: PMC5534955 DOI: 10.1038/tp.2017.87] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 10/18/2016] [Revised: 03/17/2017] [Accepted: 03/21/2017] [Indexed: 12/18/2022] Open
Abstract
The histaminergic system (HS) has a critical role in cognition, sleep and other behaviors. Although not well studied in autism spectrum disorder (ASD), the HS is implicated in many neurological disorders, some of which share comorbidity with ASD, including Tourette syndrome (TS). Preliminary studies suggest that antagonism of histamine receptors 1-3 reduces symptoms and specific behaviors in ASD patients and relevant animal models. In addition, the HS mediates neuroinflammation, which may be heightened in ASD. Together, this suggests that the HS may also be altered in ASD. Using RNA sequencing (RNA-seq), we investigated genome-wide expression, as well as a focused gene set analysis of key HS genes (HDC, HNMT, HRH1, HRH2, HRH3 and HRH4) in postmortem dorsolateral prefrontal cortex (DLPFC) initially in 13 subjects with ASD and 39 matched controls. At the genome level, eight transcripts were differentially expressed (false discovery rate <0.05), six of which were small nucleolar RNAs (snoRNAs). There was no significant diagnosis effect on any of the individual HS genes but expression of the gene set of HNMT, HRH1, HRH2 and HRH3 was significantly altered. Curated HS gene sets were also significantly differentially expressed. Differential expression analysis of these gene sets in an independent RNA-seq ASD data set from DLPFC of 47 additional subjects confirmed these findings. Understanding the physiological relevance of an altered HS may suggest new therapeutic options for the treatment of ASD.
Affiliation(s)
- C Wright
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA; AstraZeneca Postdoc Program, Innovative Medicines and Early Development, Waltham, MA, USA
- J H Shin
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- A Rajpurohit
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- A Deep-Soboslay
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- L Collado-Torres
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- N J Brandon
- AstraZeneca Neuroscience, Innovative Medicines and Early Development, Waltham, MA, USA
- T M Hyde
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD, USA
- J E Kleinman
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- A E Jaffe
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA; Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- A J Cross
- AstraZeneca Neuroscience, Innovative Medicines and Early Development, Waltham, MA, USA
- D R Weinberger
- Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD, USA; The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Lieber Institute for Brain Development, Clinical Sciences, Johns Hopkins School of Medicine, Johns Hopkins Medical Campus, 855 North Wolfe Street, Suite 300, 3rd Floor, Baltimore, MD 21205, USA. E-mail:
|
42
|
Domínguez Á, Muñoz E, López MC, Cordero M, Martínez JP, Viñas M. Transcriptomics as a tool to discover new antibacterial targets. Biotechnol Lett 2017; 39:819-828. [PMID: 28289911 DOI: 10.1007/s10529-017-2319-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 01/13/2017] [Accepted: 03/07/2017] [Indexed: 12/20/2022]
Abstract
The emergence of antibiotic-resistant pathogens, multiple drug-resistance, and extremely drug-resistant strains demonstrates the need for improved strategies to discover new drug-based compounds. The development of transcriptomics, proteomics, and metabolomics has provided new tools for global studies of living organisms. However, the compendium of expression profiles produced by these methods has introduced new scientific challenges into antimicrobial research. In this review, we discuss the practical value of transcriptomic techniques as well as their difficulties and pitfalls. We advocate the construction of new databases of transcriptomic data, using standardized formats in addition to standardized models of bacterial and yeast similar to those used in systems biology. The inclusion of proteomic and metabolomic data is also essential, as the resulting networks can provide a landscape to rationally predict and exploit new drug targets and to understand drug synergies.
Affiliation(s)
- Ángel Domínguez
- Department of Microbiology and Genetics, Universidad de Salamanca, Plaza de los Drs. de la Reina s/n, 37007, Salamanca, Spain
- Elisa Muñoz
- Department of Cell Biology & Pathology, Universidad de Salamanca, Salamanca, Spain
- M Carmen López
- Department of Microbiology and Genetics, Universidad de Salamanca, Plaza de los Drs. de la Reina s/n, 37007, Salamanca, Spain
- Miguel Cordero
- Department of Medicine, Universidad de Salamanca, Salamanca, Spain
- José Pedro Martínez
- Department of Microbiology & Ecology, Universitat de Valencia/Estudi General (UVEG), Valencia, Spain
- Miguel Viñas
- Department of Pathology and Experimental Therapeutics, Universitat de Barcelona, Barcelona, Spain
|
43
|
Abstract
This research is motivated by the analysis of a real gene expression data set in which the aim is to identify a subset of "interesting" or "significant" genes for further study. When we blindly applied the standard false discovery rate (FDR) methods, our biology collaborators were suspicious or confused, as the selected list of significant genes was highly unbalanced: there were ten times more under-expressed genes than over-expressed genes. Their concerns led us to realize that the observed two-sample t-statistics were highly skewed and asymmetric, and thus the standard FDR methods might be inappropriate. To tackle this case, we propose a symmetric directional FDR control method that categorizes the genes into "over-expressed" and "under-expressed" genes, pairs "over-expressed" and "under-expressed" genes, defines the p-values for gene pairs via column permutations, and then applies the standard FDR method to select "significant" gene pairs instead of "significant" individual genes. We compare our proposed symmetric directional FDR method with the standard FDR method by applying them to simulated data and several well-known real data sets.
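For context, the "standard FDR method" used as the baseline here is commonly the Benjamini-Hochberg step-up procedure. The sketch below assumes that is the method meant, which the abstract does not state. It does not implement the proposed symmetric directional method; it only shows the baseline selection and how the over/under imbalance in a selected list could be counted. Function names and the example values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up rule.

    Finds the largest rank k with p_(k) <= alpha * k / m and rejects the
    k hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


def direction_counts(tstats, selected):
    """Count selected genes with positive (over) and negative (under) t-statistics."""
    over = sum(1 for i in selected if tstats[i] > 0)
    under = sum(1 for i in selected if tstats[i] < 0)
    return over, under


# Hypothetical per-gene p-values and two-sample t-statistics.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
tstats = [-4.2, -3.1, 2.2, -2.1, -2.1, 1.9, -1.8, 1.3]

selected = benjamini_hochberg(pvals, alpha=0.05)
print(selected)                            # [0, 1]
print(direction_counts(tstats, selected))  # (0, 2)
```

In this toy input both selected genes are under-expressed, a small-scale version of the imbalance the authors describe.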
Affiliation(s)
- Sarah E Holte
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Eva K Lee
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Yajun Mei
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
|
44
|
Abstract
The transcriptome is a powerful proxy for the physiological state of a cell, healthy or diseased. As a result, transcriptome analysis has become a key tool in understanding the molecular changes that accompany bacterial infections of eukaryotic cells. Until recently, such transcriptomic studies have been technically limited to analyzing mRNA expression changes in either the bacterial pathogen or the infected eukaryotic host cell. However, the increasing sensitivity of high-throughput RNA sequencing now enables "dual RNA-seq" studies, simultaneously capturing all classes of coding and noncoding transcripts in both the pathogen and the host. In the five years since the concept of dual RNA-seq was introduced, the technique has been applied to a range of infection models. This has not only led to a better understanding of the physiological changes in pathogen and host during the course of an infection but has also revealed hidden molecular phenotypes of virulence-associated small noncoding RNAs that were not visible in standard infection assays. Here, we use the knowledge gained from these recent studies to suggest experimental and computational guidelines for the design of future dual RNA-seq studies. We conclude this review by discussing prospective applications of the technique.
Affiliation(s)
- Alexander J. Westermann
- RNA Biology Group, Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany
- Lars Barquist
- RNA Biology Group, Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany
- Jörg Vogel
- RNA Biology Group, Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany
- Helmholtz Institute for RNA-based Infection Research (HIRI), Würzburg, Germany
- * E-mail:
|
45
|
Rahmatallah Y, Zybailov B, Emmert-Streib F, Glazko G. GSAR: Bioconductor package for Gene Set analysis in R. BMC Bioinformatics 2017; 18:61. [PMID: 28118818 PMCID: PMC5259853 DOI: 10.1186/s12859-017-1482-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 07/29/2016] [Accepted: 01/10/2017] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND: Gene set analysis (in the form of functionally related genes or pathways) has become the method of choice for analyzing omics data in general and gene expression data in particular. There are many statistical methods that either summarize gene-level statistics for a gene set or apply a multivariate statistic that accounts for intergene correlations. Most available methods detect complex departures from the null hypothesis but lack the ability to identify the specific alternative hypothesis that rejects the null. RESULTS: GSAR (Gene Set Analysis in R) is an open-source R/Bioconductor software package for gene set analysis (GSA). It implements self-contained multivariate non-parametric statistical methods testing a complex null hypothesis against specific alternatives, such as differences in mean (shift), variance (scale), or net correlation structure. The package also provides a graphical visualization tool, based on the union of two minimum spanning trees, for correlation networks to examine the change in the correlation structures of a gene set between two conditions and highlight influential genes (hubs). CONCLUSIONS: Package GSAR provides a set of multivariate non-parametric statistical methods that test a complex null hypothesis against specific alternatives. The methods in package GSAR are applicable to any type of omics data that can be represented in a matrix format. The package, with detailed instructions and examples, is freely available under the GPL (>= 2) license from the Bioconductor web site.
Affiliation(s)
- Yasir Rahmatallah
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA
- Boris Zybailov
- Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA
- Frank Emmert-Streib
- Computational Medicine and Statistical Learning Laboratory, Tampere University of Technology, Korkeakoulunkatu 1, Tampere, FI-33720, Finland
- Galina Glazko
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA
|
46
|
Kojima T, Kunitake E, Ihara K, Kobayashi T, Nakano H. A Robust Analytical Pipeline for Genome-Wide Identification of the Genes Regulated by a Transcription Factor: Combinatorial Analysis Performed Using gSELEX-Seq and RNA-Seq. PLoS One 2016; 11:e0159011. [PMID: 27411092 PMCID: PMC4943734 DOI: 10.1371/journal.pone.0159011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 05/19/2016] [Accepted: 06/25/2016] [Indexed: 11/19/2022] Open
Abstract
For identifying the genes that are regulated by a transcription factor (TF), we have established an analytical pipeline that combines genomic systematic evolution of ligands by exponential enrichment (gSELEX)-Seq and RNA-Seq. Here, SELEX was used to select DNA fragments from an Aspergillus nidulans genomic library that bound specifically to AmyR, a TF from A. nidulans. High-throughput sequencing data were obtained for the DNAs enriched through the selection, following which various in silico analyses were performed. Mapping reads to the genome revealed the binding motifs including the canonical AmyR-binding motif, CGGN8CGG, as well as the candidate promoters controlled by AmyR. In parallel, differentially expressed genes related to AmyR were identified by using RNA-Seq analysis with samples from A. nidulans WT and amyR deletant. By obtaining the intersecting set of genes detected using both gSELEX-Seq and RNA-Seq, the genes directly regulated by AmyR in A. nidulans can be identified with high reliability. This analytical pipeline is a robust platform for comprehensive genome-wide identification of the genes that are regulated by a target TF.
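The two computational steps named in this abstract, locating the CGGN8CGG motif and intersecting the gSELEX-Seq and RNA-Seq gene lists, can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it scans only the forward strand, reads N8 as any eight bases from A, C, G, T, and uses hypothetical function names, sequence, and gene identifiers.

```python
import re

# CGG, then any eight bases, then CGG. The lookahead wrapper lets
# finditer report overlapping occurrences as well.
MOTIF = re.compile(r"(?=(CGG[ACGT]{8}CGG))")


def motif_positions(sequence):
    """Return 0-based start positions of CGGN8CGG on the given strand."""
    return [m.start() for m in MOTIF.finditer(sequence.upper())]


def directly_regulated(bound_genes, de_genes):
    """Genes supported by both evidence types: binding and differential expression."""
    return sorted(set(bound_genes) & set(de_genes))


# Hypothetical promoter fragment with one motif occurrence at position 2.
print(motif_positions("aaCGGATATATATCGGtt"))  # [2]

# Hypothetical gene lists from the two assays.
bound = ["geneA", "geneB", "geneC"]
differential = ["geneB", "geneC", "geneD"]
print(directly_regulated(bound, differential))  # ['geneB', 'geneC']
```

Requiring both kinds of evidence trades sensitivity for specificity: a gene bound but not differentially expressed, or the reverse, is excluded from the final set.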
Affiliation(s)
- Takaaki Kojima
- Department of Bioengineering Sciences, Graduate School of Bioagricultural Sciences, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
- * E-mail: (TK); (HN)
- Emi Kunitake
- Department of Biological Mechanisms and Functions, Graduate School of Bioagricultural Sciences, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
- Kunio Ihara
- Center for Gene Research, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8602, Japan
- Tetsuo Kobayashi
- Department of Biological Mechanisms and Functions, Graduate School of Bioagricultural Sciences, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
- Hideo Nakano
- Department of Bioengineering Sciences, Graduate School of Bioagricultural Sciences, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
- * E-mail: (TK); (HN)
|
47
|
Hackl H, Charoentong P, Finotello F, Trajanoski Z. Computational genomics tools for dissecting tumour–immune cell interactions. Nat Rev Genet 2016; 17:441-58. [DOI: 10.1038/nrg.2016.67] [Citation(s) in RCA: 188] [Impact Index Per Article: 20.9] [Indexed: 02/07/2023]
|