1. Rosati D, Palmieri M, Brunelli G, Morrione A, Iannelli F, Frullanti E, Giordano A. Differential gene expression analysis pipelines and bioinformatic tools for the identification of specific biomarkers: A review. Comput Struct Biotechnol J 2024; 23:1154-1168. [PMID: 38510977 PMCID: PMC10951429 DOI: 10.1016/j.csbj.2024.02.018] [Received: 10/23/2023] [Revised: 02/20/2024] [Accepted: 02/20/2024] [Indexed: 03/22/2024]
Abstract
In recent years, bioinformatics and computational biology, together with omics techniques and transcriptomics, have gained tremendous importance in biomedicine and healthcare, particularly for the identification of biomarkers for precision medicine and drug discovery. Differential gene expression (DGE) analysis is one of the most used techniques for RNA-sequencing (RNA-seq) data analysis. This technique, which is typically used in various RNA-seq data processing applications, allows the identification of differentially expressed genes across two or more sample sets. Functional enrichment analyses can then be performed to annotate and contextualize the resulting gene lists. These studies provide valuable information about disease-causing biological processes and can help in identifying molecular targets for novel therapies. This review focuses on DGE analysis pipelines and bioinformatic techniques commonly used to identify specific biomarkers and discusses the advantages and disadvantages of these techniques.
Affiliation(s)
- Diletta Rosati
- Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Cancer Genomics & Systems Biology Lab, Dept. of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, Italy
- Maria Palmieri
- Cancer Genomics & Systems Biology Lab, Dept. of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, Italy
- Giulia Brunelli
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, Italy
- Andrea Morrione
- Sbarro Institute for Cancer Research and Molecular Medicine, Center for Biotechnology, Department of Biology, College of Science and Technology, Temple University, Philadelphia, PA 19122, USA
- Francesco Iannelli
- Laboratory of Molecular Microbiology and Biotechnology, Department of Medical Biotechnologies, University of Siena, Siena, Italy
- Elisa Frullanti
- Cancer Genomics & Systems Biology Lab, Dept. of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, Italy
- Antonio Giordano
- Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Sbarro Institute for Cancer Research and Molecular Medicine, Center for Biotechnology, Department of Biology, College of Science and Technology, Temple University, Philadelphia, PA 19122, USA
2. Beird HC, Cloutier JM, Gokgoz N, Eeles C, Griffin AM, Ingram DR, Wani KM, Segura RL, Cohen L, Ho C, Wunder JS, Andrulis IL, Futreal PA, Haibe-Kains B, Lazar AJ, Wang WL, Przybyl J, Demicco EG. Epigenomic and transcriptomic profiling of solitary fibrous tumors identifies site-specific patterns and candidate genes regulated by DNA methylation. J Transl Med 2024:102146. [PMID: 39357799 DOI: 10.1016/j.labinv.2024.102146] [Received: 07/15/2024] [Revised: 09/11/2024] [Accepted: 09/24/2024] [Indexed: 10/04/2024]
Abstract
Solitary fibrous tumor (SFT) is a rare mesenchymal neoplasm which can arise at any anatomic site and is characterized by recurrent NAB2::STAT6 fusions and metastatic progression in 10-30%. The cell of origin has not been identified. Despite some progress in understanding the contribution of heterogeneous fusion types and secondary mutations to SFT biology, epigenetic alterations in extrameningeal SFT remain largely unexplored, and most sarcoma research to date has focused on the use of methylation profiling for tumor classification. We interrogated genome-wide DNA methylation in 79 SFTs to identify informative epigenetic changes. RNA-seq data from targeted panels and data from the Cancer Genome Atlas (TCGA) were used for orthogonal validation of selected findings. In unsupervised clustering analysis, the top 500 most variable CpGs segregated SFTs by primary anatomic site. Differentially methylated genes (DMGs) associated with primary SFT site included EGFR, TBX15, multiple HOX genes and their cofactors EBF1, EBF3, and PBX1, as well as RUNX1 and MEIS1. Of the 20 DMGs that were interrogated on the RNA-seq panel, twelve were significantly differentially expressed according to site. However, with the exception of TBX15, most of these also showed differential expression according to NAB2::STAT6 fusion type, suggesting that the fusion oncogene contributes to transcriptional regulation of these genes. Transcriptomic data confirmed an inverse correlation between gene methylation and the expression of TBX15 in both SFT and TCGA sarcomas. TBX15 also showed differential mRNA expression and 5' UTR methylation between tumors located in different anatomic sites in TCGA data. In all analyses, TBX15 methylation and mRNA expression retained the strongest association with tissue of origin in SFT and other sarcomas, suggesting a possible marker to distinguish metastatic tumors from new primaries without genomic profiling. 
Epigenetic signatures may further help to identify SFT progenitor cells at different anatomic sites.
Affiliation(s)
- Hannah C Beird
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jeffrey M Cloutier
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Nalan Gokgoz
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital Toronto, ON, Canada
- Christopher Eeles
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada
- Anthony M Griffin
- University of Toronto Musculoskeletal Oncology Unit, Mount Sinai Hospital, Toronto, Canada
- Davis R Ingram
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Khalida M Wani
- Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Rossana Lazcano Segura
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Luca Cohen
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Carl Ho
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jay S Wunder
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital Toronto, ON, Canada; University of Toronto Musculoskeletal Oncology Unit, Mount Sinai Hospital, Toronto, Canada
- Irene L Andrulis
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital Toronto, ON, Canada; Department of Molecular Genetics Canada & Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- P Andrew Futreal
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2C1, Canada; Department of Medical Biophysics, University of Toronto, & Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Alexander J Lazar
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Wei-Lien Wang
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Joanna Przybyl
- Department of Surgery, McGill University & Cancer Research Program, The Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Elizabeth G Demicco
- Department of Pathology & Laboratory Medicine, Mount Sinai Hospital & Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada.
3. Wang S, Di Y, Yang Y, Salovska B, Li W, Hu L, Yin J, Shao W, Zhou D, Cheng J, Liu D, Yang H, Liu Y. PTMoreR-enabled cross-species PTM mapping and comparative phosphoproteomics across mammals. Cell Rep Methods 2024; 4:100859. [PMID: 39255793 PMCID: PMC11440062 DOI: 10.1016/j.crmeth.2024.100859] [Received: 12/04/2023] [Revised: 05/13/2024] [Accepted: 08/15/2024] [Indexed: 09/12/2024]
Abstract
To support PTM proteomic analysis and annotation in different species, we developed PTMoreR, a user-friendly tool that considers the surrounding amino acid sequences of PTM sites during BLAST, enabling a motif-centric analysis across species. By controlling sequence window similarity, PTMoreR can map phosphoproteomic results between any two species, perform site-level functional enrichment analysis, and generate kinase-substrate networks. We demonstrate that the majority of real P-sites in mice can be inferred from experimentally derived human P-sites with PTMoreR mapping. Furthermore, the compositions of 129 mammalian phosphoproteomes can also be predicted using PTMoreR. The method also identifies cross-species phosphorylation events that occur on proteins with an increased tendency to respond to the environmental factors. Moreover, the classic kinase motifs can be extracted across mammalian species, offering an evolutionary angle for refining current motifs. PTMoreR supports PTM proteomics in non-human species and facilitates quantitative phosphoproteomic analysis.
Affiliation(s)
- Shisheng Wang
- Department of Pulmonary and Critical Care Medicine, Proteomics-Metabolomics Analysis Platform, and NHC Key Lab of Transplant Engineering and Immunology, West China Hospital, Sichuan University, Chengdu 610041, China
- Yi Di
- Yale Cancer Biology Institute, Yale University, West Haven, CT 06516, USA
- Yin Yang
- Department of Pulmonary and Critical Care Medicine, Proteomics-Metabolomics Analysis Platform, and NHC Key Lab of Transplant Engineering and Immunology, West China Hospital, Sichuan University, Chengdu 610041, China
- Barbora Salovska
- Yale Cancer Biology Institute, Yale University, West Haven, CT 06516, USA
- Wenxue Li
- Yale Cancer Biology Institute, Yale University, West Haven, CT 06516, USA
- Liqiang Hu
- Department of Pulmonary and Critical Care Medicine, Proteomics-Metabolomics Analysis Platform, and NHC Key Lab of Transplant Engineering and Immunology, West China Hospital, Sichuan University, Chengdu 610041, China
- Jiahui Yin
- Information Research Institute, Tongji University, Shanghai 200092, China
- Wenguang Shao
- State Key Laboratory of Microbial Metabolism, School of Life Science & Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Dong Zhou
- Department of Medicine, Division of Nephrology, University of Connecticut School of Medicine, Farmington, CT 06030, USA
- Jingqiu Cheng
- Department of Pulmonary and Critical Care Medicine, Proteomics-Metabolomics Analysis Platform, and NHC Key Lab of Transplant Engineering and Immunology, West China Hospital, Sichuan University, Chengdu 610041, China
- Dan Liu
- Department of Pulmonary and Critical Care Medicine, Proteomics-Metabolomics Analysis Platform, and NHC Key Lab of Transplant Engineering and Immunology, West China Hospital, Sichuan University, Chengdu 610041, China; State Key Laboratory of Respiratory Health and Multimorbidity, West China Hospital, Sichuan University, Chengdu 610041, China.
- Hao Yang
- Department of Pulmonary and Critical Care Medicine, Proteomics-Metabolomics Analysis Platform, and NHC Key Lab of Transplant Engineering and Immunology, West China Hospital, Sichuan University, Chengdu 610041, China.
- Yansheng Liu
- Yale Cancer Biology Institute, Yale University, West Haven, CT 06516, USA; Department of Pharmacology, Yale University School of Medicine, New Haven, CT 06520, USA; Department of Biomedical Informatics & Data Science, Yale University School of Medicine, New Haven, CT 06510, USA.
4. Zanfardino M, Franzese M, Geraci F. DeClUt: Decluttering differentially expressed genes through clustering of their expression profiles. Comput Methods Programs Biomed 2024; 254:108258. [PMID: 38851122 DOI: 10.1016/j.cmpb.2024.108258] [Received: 06/13/2023] [Revised: 04/26/2024] [Accepted: 05/29/2024] [Indexed: 06/10/2024]
Abstract
BACKGROUND AND OBJECTIVE: Differential expression analysis is one of the most popular activities in transcriptomic studies based on next-generation sequencing technologies. In fact, differentially expressed genes (DEGs) between two conditions represent ideal prognostic and diagnostic candidate biomarkers for many pathologies. As a result, several algorithms, such as DESeq2 and edgeR, have been developed to identify DEGs. Despite their widespread use, there is no consensus on which model performs best for different types of data, and many existing methods suffer from high False Discovery Rates (FDR). METHODS: We present a new algorithm, DeClUt, based on the intuition that the expression profile of a differentially expressed gene should form two reasonably compact and well-separated clusters. This, in turn, implies that the bipartition induced by the two conditions being compared should overlap with the clustering. The clustering algorithm underlying DeClUt was designed to be robust to the outliers typical of RNA-seq data. In particular, we used the average silhouette function to enforce membership assignment of samples to the most appropriate condition. RESULTS: DeClUt was tested on real RNA-seq datasets and benchmarked against four of the most widely used methods (edgeR, DESeq2, NOISeq, and SAMseq). Experiments showed a higher self-consistency of results than the competitors as well as a significantly lower False Positive Rate (FPR). Moreover, tested on a real prostate cancer RNA-seq dataset, DeClUt highlighted 8 DE genes, linked to neoplastic processes according to the DisGeNET database, that none of the other methods had identified. CONCLUSIONS: Our work presents a novel algorithm that builds upon basic concepts of data clustering and exhibits greater consistency and a significantly lower False Positive Rate than state-of-the-art methods. Additionally, DeClUt is able to highlight relevant differentially expressed genes not otherwise identified by other tools, contributing to improving the efficacy of differential expression analyses in various biological applications.
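The clustering intuition in this abstract can be illustrated with a small sketch. It is not the DeClUt implementation. It only assumes one gene's expression values across samples and a two-condition labelling, and it computes the average silhouette of that labelling using absolute differences as the distance. The function name `average_silhouette` and the toy values are hypothetical.

```python
from statistics import mean


def average_silhouette(values, labels):
    """Average silhouette width of 1-D values under a two-group labelling.

    Illustrative only: distance is the absolute difference between values.
    """
    if len(set(labels)) != 2:
        raise ValueError("expected exactly two conditions")
    pairs = list(zip(values, labels))
    scores = []
    for i, (x, g) in enumerate(pairs):
        # Distances to the other samples of the same condition.
        same = [abs(x - y) for j, (y, h) in enumerate(pairs) if h == g and j != i]
        # Distances to the samples of the other condition.
        other = [abs(x - y) for y, h in pairs if h != g]
        if not same:
            scores.append(0.0)  # singleton group: silhouette taken as 0
            continue
        a, b = mean(same), mean(other)
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return mean(scores)


if __name__ == "__main__":
    conditions = ["A", "A", "A", "B", "B", "B"]
    # Hypothetical gene whose values split cleanly by condition.
    print(average_silhouette([1.0, 1.2, 0.9, 5.0, 5.1, 4.8], conditions))
    # Hypothetical gene whose values do not follow the conditions.
    print(average_silhouette([1.0, 5.0, 1.2, 5.1, 0.9, 4.8], conditions))
```

A value close to 1 means the two conditions form compact, well-separated groups for that gene, while a value near or below 0 means the condition labels do not match the structure of the data.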
Affiliation(s)
- Monica Franzese
- IRCCS Synlab SDN, Via E. Gianturco, 113, Naples, 80143, Italy.
- Filippo Geraci
- Institute for Informatics and Telematics, CNR, Via G. Moruzzi 1, Pisa, 56124, Italy
5. Filomena A, Giovanni S, Ginevra S, Santiago N, Di Fasano Miriam S, Peppino M, Alessandra C, Antonia DM, Giuliana B, Rosanna P, Marco S, Lorena B. Identification of a circular RNA isoform of WASHC2A as a prognostic factor for high-risk paediatric B-ALL patients. Biomed Pharmacother 2024; 177:116903. [PMID: 38917755 DOI: 10.1016/j.biopha.2024.116903] [Received: 03/26/2024] [Revised: 05/24/2024] [Accepted: 06/06/2024] [Indexed: 06/27/2024]
Abstract
Pediatric B-cell acute lymphoblastic leukemia (B-ALL) is a serious disease for which a better understanding of prognostic factors and new therapeutic targets is needed. Circular RNAs (circRNAs) are promising markers due to their stability and differential expression patterns in various diseases. However, their role in pediatric B-ALL patients, particularly in risk stratification and relapse prediction, remains poorly understood. In this study, we comprehensively examined the circRNA landscape in pediatric B-ALL patients, focusing on both high-risk and standard-risk patients. Using advanced sequencing techniques and sophisticated bioinformatics tools, we identified thousands of circRNAs, including a novel circRNA derived from the WASHC2A gene, termed circWASHC2A. CircWASHC2A showed differential expression between high-risk and standard-risk patients and exhibited potential for predicting relapse in high-risk patients. Functional experiments highlighted a role for circWASHC2A in regulating cell cycle progression and mitochondrial respiratory activity in leukaemic cells. Transcriptomic analysis further supported these findings, suggesting the involvement of circWASHC2A in signalling pathways relevant to leukaemia pathogenesis. This study provides in-depth insights into the circRNA landscape of pediatric B-ALL patients and identifies circWASHC2A as a potential biomarker for risk stratification and relapse prediction, with significant implications for tailoring diagnostic and therapeutic strategies in this patient population.
Affiliation(s)
- Mirabelli Peppino
- Department of Paediatric Hemato-Oncology, Santobono-Pausilipon Children's Hospital, AORN, Naples 80122, Italy
- Cianflone Alessandra
- Department of Paediatric Hemato-Oncology, Santobono-Pausilipon Children's Hospital, AORN, Naples 80122, Italy
- De Matteo Antonia
- Department of Paediatric Hemato-Oncology, Santobono-Pausilipon Children's Hospital, AORN, Naples 80122, Italy
- Beneduce Giuliana
- Department of Paediatric Hemato-Oncology, Santobono-Pausilipon Children's Hospital, AORN, Naples 80122, Italy
- Parasole Rosanna
- Department of Paediatric Hemato-Oncology, Santobono-Pausilipon Children's Hospital, AORN, Naples 80122, Italy
- Buono Lorena
- IRCCS SYNLAB SDN, Via E. Gianturco 113, Naples 80143, Italy.
6. Ando Y, Shimokawa A. Detecting differentially expressed genes from RNA-seq data using fuzzy clustering. Int J Biostat 2024; 0:ijb-2023-0125. [PMID: 39069791 DOI: 10.1515/ijb-2023-0125] [Received: 10/05/2023] [Accepted: 06/02/2024] [Indexed: 07/30/2024]
Abstract
A two-group comparison test is generally performed on RNA sequencing data to detect differentially expressed genes (DEGs). However, the accuracy of this method is low due to the small sample size. To address this, we propose a method using fuzzy clustering that artificially generates data with expression patterns similar to those of DEGs to identify genes that are highly likely to be classified into the same cluster as the initial cluster data. The proposed method is advantageous in that it does not perform any test. Furthermore, a certain level of accuracy can be maintained even when the sample size is biased, and we show that such a situation may improve the accuracy of the proposed method. We compared the proposed method with the conventional method using simulations. In the simulations, we changed the sample size and difference between the expression levels of group 1 and group 2 in the DEGs to obtain the desired accuracy of the proposed method. The results show that the proposed method is superior in all cases under the conditions simulated. We also show that the effect of the difference between group 1 and group 2 on the accuracy is more prominent when the sample size is biased.
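Fuzzy clustering assigns each gene a degree of membership in every cluster rather than a single label. The abstract does not say which fuzzy clustering algorithm is used, so the sketch below assumes the standard fuzzy c-means membership formula with fuzzifier `m`. The function name, the cluster centers, and the example point are hypothetical and are not taken from the paper.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means style membership of one point in each cluster center.

    Assumes Euclidean distance and a fuzzifier m > 1. Memberships sum to 1.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    dists = [
        sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
        for center in centers
    ]
    # A point that coincides with a center belongs fully to that center
    # (shared equally if several centers coincide with it).
    if any(d == 0 for d in dists):
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


if __name__ == "__main__":
    # Hypothetical 2-D summaries of a gene and two cluster centers.
    print(fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)]))
```

Under this assumption, a gene would be flagged when its membership in the cluster seeded with DEG-like patterns is high, so no hypothesis test is involved. The threshold for "high" would be a separate design choice.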
Affiliation(s)
- Yuki Ando
- Tokyo University of Science, Shinjuku-ku, 162-8601, Tokyo, Japan
- Asanao Shimokawa
- Department of Mathematics, Tokyo University of Science, 1-3 Kagurazaka, Shinjuku-ku, 162-8601, Tokyo, Japan
7. Jiang G, Zheng JY, Ren SN, Yin W, Xia X, Li Y, Wang HL. A comprehensive workflow for optimizing RNA-seq data analysis. BMC Genomics 2024; 25:631. [PMID: 38914930 PMCID: PMC11197194 DOI: 10.1186/s12864-024-10414-y] [Received: 02/26/2024] [Accepted: 05/15/2024] [Indexed: 06/26/2024]
Abstract
BACKGROUND: Current RNA-seq analysis software tends to use similar parameters across different species without considering species-specific differences. However, the suitability and accuracy of these tools may vary when analyzing data from different species, such as humans, animals, plants, fungi, and bacteria. For most laboratory researchers lacking a background in information science, determining how to construct an analysis workflow that meets their specific needs from the array of complex analytical tools available poses a significant challenge. RESULTS: Using RNA-seq data from plants, animals, and fungi, we observed that different analytical tools vary in performance when applied to different species. A comprehensive experiment was conducted specifically for analyzing plant pathogenic fungal data, with differential gene analysis as the ultimate goal. In this study, 288 pipelines using different tools were applied to analyze five fungal RNA-seq datasets, and the performance of their results was evaluated based on simulation. This led to the establishment of a relatively universal and superior fungal RNA-seq analysis pipeline that can serve as a reference, along with certain standards for selecting analysis tools. Additionally, we compared various tools for alternative splicing analysis. The results based on simulated data indicated that rMATS remained the optimal choice, although consideration could be given to supplementing it with tools such as SpliceWiz. CONCLUSION: The experimental results demonstrate that, compared with the default software parameter configurations, tuned analysis combinations can provide more accurate biological insights. It is beneficial to carefully select suitable analysis software based on the data, rather than choosing tools indiscriminately, in order to achieve high-quality analysis results more efficiently.
Affiliation(s)
- Gao Jiang
- School of Information Science and Technology, School of Artificial Intelligence, Beijing Forestry University, Beijing, 100083, People's Republic of China
- Juan-Yu Zheng
- School of Information Science and Technology, School of Artificial Intelligence, Beijing Forestry University, Beijing, 100083, People's Republic of China
- Shu-Ning Ren
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, People's Republic of China
- Weilun Yin
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, People's Republic of China
- Xinli Xia
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, People's Republic of China
- Yun Li
- School of Information Science and Technology, School of Artificial Intelligence, Beijing Forestry University, Beijing, 100083, People's Republic of China.
- Hou-Ling Wang
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, People's Republic of China.
8. Liu Z, Petinrin OO, Toseef M, Chen N, Wong KC. Construction of Immune Infiltration-Related LncRNA Signatures Based on Machine Learning for the Prognosis in Colon Cancer. Biochem Genet 2024; 62:1925-1952. [PMID: 37792224 DOI: 10.1007/s10528-023-10516-4] [Received: 07/10/2023] [Accepted: 09/05/2023] [Indexed: 10/05/2023]
Abstract
Colon cancer is a malignant tumor with high morbidity, lethality, and prevalence worldwide. Molecular biomarkers play key roles in its prognosis. In particular, immune-related lncRNAs (IRL) have attracted enormous interest in diagnosis and treatment, but less is known about their potential functions. We aimed to investigate dysfunctional IRL and construct a risk model for improving the outcomes of patients. Nineteen immune cell types were collected for identifying house-keeping lncRNAs (HKLncRNA). GSE39582 and TCGA-COAD were treated as the discovery and validation datasets, respectively. Four machine learning algorithms (LASSO, Random Forest, Boruta, and Xgboost) and a Gaussian mixture model were utilized to mine the optimal combination of lncRNAs. Univariate and multivariate Cox regression was utilized to construct the risk score model. We distinguished the functional difference from an immune perspective between low- and high-risk cohorts calculated by this scoring system. Finally, we provided a nomogram. By leveraging the microarray, sequencing, and clinical data for immune cells and colon cancer patients, we identified 221 HKLncRNAs with a low cell type-specificity index. Eighty-seven lncRNAs were up-regulated in immune cells compared to cancer cells. Twelve lncRNAs were beneficial in improving performance. A risk score model with three lncRNAs (CYB561D2, LINC00638, and DANCR) was proposed with robust ROC performance on an independent dataset. According to immune-related analysis, the risk score is strongly associated with the tumor immune microenvironment. Our results emphasize that IRL has the potential to be a powerful and effective therapy for improving the prognosis of colon cancer.
Affiliation(s)
- Zhe Liu
- Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Muhammad Toseef
- Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Nanjun Chen
- Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong, China.
9. Guan Q, Zhang Z, Zhao P, Huang L, Lu R, Liu C, Zhao Y, Shao X, Tian Y, Li J. Identification of idiopathic pulmonary fibrosis hub genes and exploration of the mechanisms of action of Jinshui Huanxian formula. Int Immunopharmacol 2024; 132:112048. [PMID: 38593509 DOI: 10.1016/j.intimp.2024.112048] [Received: 02/19/2024] [Revised: 03/27/2024] [Accepted: 04/06/2024] [Indexed: 04/11/2024]
Abstract
Idiopathic pulmonary fibrosis (IPF) is a common and heterogeneous chronic disease, and the mechanism of Jinshui Huanxian formula (JHF) on IPF remains unclear. For a total of 385 lung normal tissue samples from the Gene Expression Omnibus database, 37,777,639 gene pairs were identified through microarray and RNA-seq platforms. Using the individualized differentially expressed gene (DEG) analysis algorithm RankComp (FDR < 0.01), we identified 344 genes as DEGs in at least 95 % (n = 81) of the IPF samples. Of these genes, IGF1, IFNGR1, GLI2, HMGCR, DNM1, KIF4A, and TNFRSF11A were identified as hub genes. These genes were verified using quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) in mice with pulmonary fibrosis (PF) and MRC-5 cells, and they were highly effective at classifying IPF samples in the independent dataset GSE134692 (AUC = 0.587-0.788) and mice with PF (AUC = 0.806-1.000). Moreover, JHF ameliorated the pathological changes in mice with PF and significantly reversed the changes in hub gene expression (KIF4A, IFNGR1, and HMGCR). In conclusion, a series of IPF hub genes was identified, and validated in an independent dataset, mice with PF, and MRC-5 cells. Moreover, the abnormal gene expression was normalized by JHF. These findings provide guidance for further exploration of the pathogenesis and treatment of IPF.
Affiliation(s)
- Qingzhou Guan
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China
- Zhenzhen Zhang
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China
- Peng Zhao
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China
- Lidong Huang
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China
- Ruilong Lu
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China
- Chunlei Liu
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China
| | - Yakun Zhao
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China
| | - Xuejie Shao
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China
| | - Yange Tian
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou 450046, China; Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China.
| | - Jiansheng Li
- Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases Co-constructed by Henan Province and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou 450046, China; Department of Respiratory Diseases, The First Affiliated Hospital of Henan University of Chinese Medicine, Zhengzhou 450000, China.
| |
|
10
|
Li H, Khang TF. SIEVE: One-stop differential expression, variability, and skewness analyses using RNA-Seq data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.09.588804. [PMID: 38645120 PMCID: PMC11030344 DOI: 10.1101/2024.04.09.588804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Motivation: RNA-Seq data analysis is commonly biased towards detecting differentially expressed genes and insufficiently conveys the complexity of gene expression changes between biological conditions. This bias arises because discrete models of RNA-Seq count data cannot fully characterize the mean, variance, and skewness of gene expression distribution using independent model parameters. A unified framework that simultaneously tests for differential expression, variability, and skewness is needed to realize the full potential of RNA-Seq data analysis in a systems biology context. Results: We present SIEVE, a statistical methodology that provides the desired unified framework. SIEVE embraces a compositional data analysis framework that transforms discrete RNA-Seq counts to a continuous form with a distribution that is well-fitted by a skew-normal distribution. Simulation results show that SIEVE controls the false discovery rate and probability of Type II error better than existing methods for differential expression analysis. Analysis of the Mayo RNA-Seq dataset for Alzheimer's disease using SIEVE reveals that a gene set with significant expression difference in mean, standard deviation and skewness between the control and the Alzheimer's disease group strongly predicts a subject's disease state. Furthermore, functional enrichment analysis shows that relying solely on differentially expressed genes detects only a segment of a much broader spectrum of biological aspects associated with Alzheimer's disease. The latter aspects can only be revealed using genes that show differential variability and skewness. Thus, SIEVE enables fresh perspectives for understanding the intricate changes in gene expression that occur in complex diseases. Availability: The SIEVE R package and source codes are available at https://github.com/Divo-Lee/SIEVE.
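To make the idea in this abstract more concrete, the sketch below transforms a toy count matrix with a centered log-ratio (CLR) transform, one common compositional-data transform, and then reports the per-gene mean, standard deviation and skewness. The choice of CLR, the pseudo-count of 0.5, the toy counts and the function names `clr_transform` and `moments` are all assumptions made for illustration; this is not the SIEVE package's API and may not match its exact transformation or tests.

```python
import math

def clr_transform(sample_counts, pseudo_count=0.5):
    """Centered log-ratio transform of one sample's gene counts.

    The pseudo-count (an assumed value) avoids log(0) for unobserved genes.
    """
    logs = [math.log(c + pseudo_count) for c in sample_counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def moments(values):
    """Return (mean, sample standard deviation, moment-based skewness)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    sd = math.sqrt(m2 * n / (n - 1))
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return mean, sd, skew

# Hypothetical toy data: rows are samples, columns are genes.
counts = [
    [10, 200, 35, 0],
    [12, 180, 40, 3],
    [8, 220, 30, 1],
    [50, 190, 38, 2],
]

clr = [clr_transform(row) for row in counts]
for g in range(len(counts[0])):
    per_gene = [sample[g] for sample in clr]
    mean, sd, skew = moments(per_gene)
    print(f"gene {g}: mean={mean:.3f} sd={sd:.3f} skew={skew:.3f}")
```

In a two-group comparison, each of the three summaries would be computed per group and then compared; how that comparison is tested is left out of this sketch.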
|
11
|
Erdogdu B, Varabyou A, Hicks SC, Salzberg SL, Pertea M. Detecting differential transcript usage in complex diseases with SPIT. CELL REPORTS METHODS 2024; 4:100736. [PMID: 38508189 PMCID: PMC10985272 DOI: 10.1016/j.crmeth.2024.100736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 12/21/2023] [Accepted: 02/27/2024] [Indexed: 03/22/2024]
Abstract
Differential transcript usage (DTU) plays a crucial role in determining how gene expression differs among cells, tissues, and developmental stages, contributing to the complexity and diversity of biological systems. In abnormal cells, it can also lead to deficiencies in protein function and underpin disease pathogenesis. Analyzing DTU via RNA sequencing (RNA-seq) data is vital, but the genetic heterogeneity in populations with complex diseases presents an intricate challenge due to diverse causal events and undetermined subtypes. Although the majority of common diseases in humans are categorized as complex, state-of-the-art DTU analysis methods often overlook this heterogeneity in their models. We therefore developed SPIT, a statistical tool that identifies predominant subgroups in transcript usage within a population along with their distinctive sets of DTU events. This study provides comprehensive assessments of SPIT's methodology and applies it to analyze brain samples from individuals with schizophrenia, revealing previously unreported DTU events in six candidate genes.
Affiliation(s)
- Beril Erdogdu
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA.
| | - Ales Varabyou
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Stephanie C Hicks
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA; Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, USA
| | - Steven L Salzberg
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA; Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Mihaela Pertea
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA; Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA.
| |
|
12
|
Ricker CA, Meli K, Van Allen EM. Historical perspective and future directions: computational science in immuno-oncology. J Immunother Cancer 2024; 12:e008306. [PMID: 38191244 PMCID: PMC10826578 DOI: 10.1136/jitc-2023-008306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/07/2023] [Indexed: 01/10/2024] Open
Abstract
Immuno-oncology holds promise for transforming patient care, having achieved durable clinical response rates across a variety of advanced and metastatic cancers. Despite these achievements, only a minority of patients respond to immunotherapy, underscoring the importance of elucidating molecular mechanisms responsible for response and resistance to inform the development and selection of treatments. Breakthroughs in molecular sequencing technologies have led to the generation of an immense amount of genomic and transcriptomic sequencing data that can be mined to uncover complex tumor-immune interactions using computational tools. In this review, we discuss existing and emerging computational methods that contextualize the composition and functional state of the tumor microenvironment, infer the reactivity and clonal dynamics from reconstructed immune cell receptor repertoires, and predict the antigenic landscape for immune cell recognition. We further describe the advantage of multi-omics analyses for capturing multidimensional relationships and artificial intelligence techniques for integrating omics data with histopathological and radiological images to encapsulate patterns of treatment response and tumor-immune biology. Finally, we discuss key challenges impeding their widespread use and clinical application and conclude with future perspectives. We are hopeful that this review will both serve as a guide for prospective researchers seeking to use existing tools for scientific discoveries and inspire the optimization or development of novel tools to enhance precision, ultimately expediting advancements in immunotherapy that improve patient survival and quality of life.
Affiliation(s)
- Cora A Ricker
- Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
| | - Kevin Meli
- Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
| | | |
|
13
|
Yi H, Lin Y, Chang Q, Jin W. A fast and globally optimal solution for RNA-seq quantification. Brief Bioinform 2023; 24:bbad298. [PMID: 37595963 DOI: 10.1093/bib/bbad298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 07/25/2023] [Accepted: 07/31/2023] [Indexed: 08/20/2023] Open
Abstract
Alignment-based RNA-seq quantification methods typically involve a time-consuming alignment process prior to estimating transcript abundances. In contrast, alignment-free RNA-seq quantification methods bypass this step, resulting in significant speed improvements. Existing alignment-free methods rely on the Expectation-Maximization (EM) algorithm for estimating transcript abundances. However, EM algorithms only guarantee locally optimal solutions, leaving room for further accuracy improvement by finding a globally optimal solution. In this study, we present TQSLE, the first alignment-free RNA-seq quantification method that provides a globally optimal solution for transcript abundances estimation. TQSLE adopts a two-step approach: first, it constructs a k-mer frequency matrix $A$ for the reference transcriptome and a k-mer frequency vector $b$ for the RNA-seq reads; then, it directly estimates transcript abundances by solving the linear equation $A^{T}Ax = A^{T}b$. We evaluated the performance of TQSLE using simulated and real RNA-seq data sets and observed that, despite comparable speed to other alignment-free methods, TQSLE outperforms them in terms of accuracy. TQSLE is freely available at https://github.com/yhg926/TQSLE.
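The two-step approach described in this abstract can be sketched on toy data: count k-mers per reference transcript to form the matrix A, count k-mers in the reads to form the vector b, then solve the normal equations (A^T A) x = (A^T b). Everything below (the function names, the toy sequences, k = 3, and the plain Gaussian elimination solver) is a hypothetical illustration of the normal-equations idea, not TQSLE's actual implementation or interface.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def solve_linear_system(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(M[i]) + [v[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: transcripts are not distinguishable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def estimate_abundances(transcripts, reads, k):
    """Least-squares abundance estimate from k-mer frequencies."""
    per_tx = [kmer_counts(t, k) for t in transcripts]
    kmers = sorted(set().union(*per_tx))
    # A: one row per reference k-mer, one column per transcript.
    A = [[tx[km] for tx in per_tx] for km in kmers]
    read_counts = Counter()
    for r in reads:
        read_counts.update(kmer_counts(r, k))
    # b: observed count of each reference k-mer in the reads.
    b = [read_counts[km] for km in kmers]
    n, m = len(transcripts), len(kmers)
    AtA = [[sum(A[r][i] * A[r][j] for r in range(m)) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[r][i] * b[r] for r in range(m)) for i in range(n)]
    return solve_linear_system(AtA, Atb)

# Toy example: three copies of the first transcript, one of the second.
transcripts = ["ACGTAC", "ACGGTT"]
reads = ["ACGTAC"] * 3 + ["ACGGTT"]
print([round(x, 6) for x in estimate_abundances(transcripts, reads, k=3)])
```

Because the least-squares objective is convex, any solution of the normal equations is a global minimizer, which is the property the abstract contrasts with EM. Note that this unconstrained sketch can return negative values on noisy data; a practical method would need to handle that.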
Affiliation(s)
- Huiguang Yi
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, 97 Buxin Rd, Shenzhen, 518000, Guangdong, China
- School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Blvd, Shenzhen 518055, Guangdong, China
| | - Yanling Lin
- School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Blvd, Shenzhen 518055, Guangdong, China
| | - Qing Chang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, 97 Buxin Rd, Shenzhen, 518000, Guangdong, China
| | - Wenfei Jin
- School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Blvd, Shenzhen 518055, Guangdong, China
| |
|
14
|
Wu EY, Singh NP, Choi K, Zakeri M, Vincent M, Churchill GA, Ackert-Bicknell CL, Patro R, Love MI. SEESAW: detecting isoform-level allelic imbalance accounting for inferential uncertainty. Genome Biol 2023; 24:165. [PMID: 37438847 PMCID: PMC10337143 DOI: 10.1186/s13059-023-03003-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/01/2022] [Accepted: 06/29/2023] [Indexed: 07/14/2023] Open
Abstract
Detecting allelic imbalance at the isoform level requires accounting for inferential uncertainty, caused by multi-mapping of RNA-seq reads. Our proposed method, SEESAW, uses Salmon and Swish to offer analysis at various levels of resolution, including gene, isoform, and aggregating isoforms to groups by transcription start site. The aggregation strategies strengthen the signal for transcripts with high uncertainty. The SEESAW suite of methods is shown to have higher power than other allelic imbalance methods when there is isoform-level allelic imbalance. We also introduce a new test for detecting imbalance that varies across a covariate, such as time.
Affiliation(s)
- Euphy Y Wu
- Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA
| | - Noor P Singh
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | | | - Mohsen Zakeri
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | | | | | - Cheryl L Ackert-Bicknell
- Department of Orthopedics, School of Medicine, University of Colorado, Anschutz Campus, Aurora, CO, USA
| | - Rob Patro
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | - Michael I Love
- Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA.
- Department of Genetics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA.
| |
|
15
|
Erdogdu B, Varabyou A, Hicks SC, Salzberg SL, Pertea M. Detecting differential transcript usage in complex diseases with SPIT. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.10.548289. [PMID: 37503064 PMCID: PMC10369883 DOI: 10.1101/2023.07.10.548289] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/29/2023]
Abstract
Differential transcript usage (DTU) plays a crucial role in determining how gene expression differs among cells, tissues, and different developmental stages, thereby contributing to the complexity and diversity of biological systems. In abnormal cells, it can also lead to deficiencies in protein function, potentially leading to pathogenesis of diseases. Detecting such events for single-gene genetic traits is relatively uncomplicated; however, the heterogeneity of populations with complex diseases presents an intricate challenge due to the presence of diverse causal events and undetermined subtypes. SPIT is the first statistical tool that quantifies the heterogeneity in transcript usage within a population and identifies predominant subgroups along with their distinctive sets of DTU events. We provide comprehensive assessments of SPIT's methodology in both single-gene and complex traits and report the results of applying SPIT to analyze brain samples from individuals with schizophrenia. Our analysis reveals previously unreported DTU events in six candidate genes.
Affiliation(s)
- Beril Erdogdu
- Center for Computational Biology, Johns Hopkins University; Baltimore, MD, United States
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering; Baltimore, MD, United States
| | - Ales Varabyou
- Center for Computational Biology, Johns Hopkins University; Baltimore, MD, United States
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, United States
| | - Stephanie C Hicks
- Center for Computational Biology, Johns Hopkins University; Baltimore, MD, United States
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, MD, USA
- Malone Center for Engineering in Healthcare, Johns Hopkins University, MD, USA
| | - Steven L Salzberg
- Center for Computational Biology, Johns Hopkins University; Baltimore, MD, United States
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering; Baltimore, MD, United States
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, United States
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, MD, USA
- Department of Genetic Medicine, Johns Hopkins School of Medicine; Baltimore, MD, United States
| | - Mihaela Pertea
- Center for Computational Biology, Johns Hopkins University; Baltimore, MD, United States
- Department of Biomedical Engineering, Johns Hopkins School of Medicine and Whiting School of Engineering; Baltimore, MD, United States
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, United States
- Department of Genetic Medicine, Johns Hopkins School of Medicine; Baltimore, MD, United States
| |
|
16
|
Boshuizen HC, Te Beest DE. Pitfalls in the statistical analysis of microbiome amplicon sequencing data. Mol Ecol Resour 2023; 23:539-548. [PMID: 36330663 DOI: 10.1111/1755-0998.13730] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/16/2022] [Accepted: 10/27/2022] [Indexed: 11/06/2022]
Abstract
Microbiome data are characterized by several aspects that make them challenging to analyse statistically: they are compositional, high dimensional and rich in zeros. A large array of statistical methods exist to analyse these data. Some are borrowed from other fields, such as ecology or RNA-sequencing, while others are custom-made for microbiome data. The large and continuously expanding range of available methods means that researchers have to invest considerable effort in choosing what method(s) to apply. In this paper we list 14 statistical methods or approaches that we think should be generally avoided. In several cases this is because we believe the assumptions behind the method are unlikely to be met for microbiome data. In other cases we see methods that are used in ways for which they were not intended. We believe researchers would be helped by more critical evaluations of existing methods, as not all methods in use are suitable or have been sufficiently reviewed. We hope this paper contributes to a critical discussion on what methods are appropriate to use in the analysis of microbiome data.
Affiliation(s)
| | - Dennis E Te Beest
- Biometris, Wageningen University and Research, Wageningen, The Netherlands
| |
|
17
|
Revealing the History and Mystery of RNA-Seq. Curr Issues Mol Biol 2023; 45:1860-1874. [PMID: 36975490 PMCID: PMC10047236 DOI: 10.3390/cimb45030120] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 01/23/2023] [Revised: 02/16/2023] [Accepted: 02/22/2023] [Indexed: 03/03/2023] Open
Abstract
Advances in RNA-sequencing technologies have led to the development of intriguing experimental setups, a massive accumulation of data, and high demand for tools to analyze it. To answer this demand, computational scientists have developed a myriad of data analysis pipelines, but which one is most appropriate is less often considered. The RNA-sequencing data analysis pipeline can be divided into three major parts: data pre-processing, followed by the main and downstream analyses. Here, we present an overview of the tools used both for bulk RNA-seq and at the single-cell level, with a particular focus on alternative splicing and active RNA synthesis analysis. A crucial part of data pre-processing is quality control, which determines whether the next steps are needed: adapter removal, trimming, and filtering. After pre-processing, the data are finally analyzed using a variety of tools: differential gene expression, alternative splicing, and assessment of active synthesis, the latter requiring dedicated sample preparation. In brief, we describe the commonly used tools in the sample preparation and analysis of RNA-seq data.
|
18
|
Zhong Y, Yang F, Su T, Wu X, Zheng W, Zhang L, Liang G, Wang L, Wang L, Wang S, Yang H. Proteome and phosphoproteome profiling of non-small cell lung cancer cell line A549 treated with TRAIL. Proteomics 2023; 23:e2200248. [PMID: 36222260 DOI: 10.1002/pmic.202200248] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/10/2022] [Revised: 09/21/2022] [Accepted: 09/29/2022] [Indexed: 11/07/2022]
Abstract
Tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) is recognized for its promising therapeutic effects against cancer. However, mechanisms underlying the effect of TRAIL on protein expression, signal transduction, and apoptosis induction remain unclear. We surmised that a systematic analysis of the proteome and phosphoproteome associated with TRAIL signaling may help elucidate the mechanisms involved and facilitate the development of therapeutics. Therefore, we investigated the proteome and phosphoproteome of non-small cell lung cancer cell line A549 treated with TRAIL. Our results indicated that 126 proteins and 1684 phosphosites were markedly differentially expressed between the phosphate-buffered saline- and TRAIL-treated groups. Expression at the protein and phosphosite levels was not completely consistent. Gene ontology functional analysis revealed that metal ion (zinc) binding was highly affected by TRAIL treatment. Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis showed that almost all pathways that involved differentially expressed phosphosites were associated with apoptosis. We also identified an important kinase, AKT1, and its series of substrates in TRAIL signaling. The results of this study may provide guidance for future research on tumor therapy using TRAIL.
Affiliation(s)
- Yi Zhong
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| | - Fen Yang
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| | - Tao Su
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| | - Xiyu Wu
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| | - Wen Zheng
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| | - Lu Zhang
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| | - Ge Liang
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| | - Lian Wang
- Chengdu Centre for Disease Control and Prevention, Chengdu, China
| | - Lijun Wang
- Department of Ophthalmology, The Third People's Hospital of Chengdu, The Affiliated Hospital of Southwest Jiaotong University, Chengdu, China
| | - Shisheng Wang
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| | - Hao Yang
- Proteomics-Metabolomics Platform of Core Facilities, Key Lab of Transplant Engineering and Immunology, MOH, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, China
| |
|
19
|
Buratin A, Bortoluzzi S, Gaffo E. Systematic benchmarking of statistical methods to assess differential expression of circular RNAs. Brief Bioinform 2023; 24:6966517. [PMID: 36592056 PMCID: PMC9851295 DOI: 10.1093/bib/bbac612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 11/28/2022] [Accepted: 12/11/2022] [Indexed: 01/03/2023] Open
Abstract
Circular RNAs (circRNAs) are covalently closed transcripts involved in critical regulatory axes, cancer pathways and disease mechanisms. CircRNA expression measured with RNA-seq has particular characteristics that might hamper the performance of standard biostatistical differential expression assessment methods (DEMs). We compared 38 DEM pipelines configured to fit circRNA expression data's statistical properties, including bulk RNA-seq, single-cell RNA-seq (scRNA-seq) and metagenomics DEMs. The DEMs performed poorly on data sets of typical size. Widely used DEMs, such as DESeq2, edgeR and Limma-Voom, gave scarce results, unreliable predictions or even contravened the expected behaviour with some parameter configurations. Limma-Voom achieved the most consistent performance throughout different benchmark data sets and, as well as SAMseq, reasonably balanced false discovery rate (FDR) and recall rate. Interestingly, a few scRNA-seq DEMs obtained results comparable with the best-performing bulk RNA-seq tools. Almost all DEMs' performance improved when increasing the number of replicates. CircRNA expression studies require careful design, choice of DEM and DEM configuration. This analysis can guide scientists in selecting the appropriate tools to investigate circRNA differential expression with RNA-seq experiments.
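The two benchmark criteria named in this abstract, false discovery rate (FDR) and recall, can be computed on a simulated data set where the truly differentially expressed circRNAs are known. The sketch below uses the standard definitions; the function name `fdr_and_recall` and the circRNA identifiers are hypothetical and are not taken from the benchmarked pipelines.

```python
def fdr_and_recall(called, truly_de):
    """Observed FDR and recall of a set of calls against a known truth set.

    FDR    = false positives / all calls
    recall = true positives / all truly differentially expressed items
    """
    called, truly_de = set(called), set(truly_de)
    true_pos = len(called & truly_de)
    false_pos = len(called - truly_de)
    fdr = false_pos / len(called) if called else 0.0
    recall = true_pos / len(truly_de) if truly_de else 0.0
    return fdr, recall

# Hypothetical simulation: four circRNAs are truly differentially expressed,
# and a method calls three, one of which is a false positive.
truth = {"circ1", "circ2", "circ3", "circ4"}
calls = {"circ1", "circ2", "circ5"}

fdr, recall = fdr_and_recall(calls, truth)
print(f"FDR={fdr:.3f} recall={recall:.3f}")
```

A method that makes very few calls can show a low FDR with poor recall, which is why the two quantities are reported together.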
Affiliation(s)
- Alessia Buratin
- Department of Molecular Medicine, University of Padova, Padova, Italy
| | | | - Enrico Gaffo
- Corresponding author: Enrico Gaffo, Department of Molecular Medicine, University of Padova - Via G. Colombo, 3—35131 Padova, Italy. Phone +39 049 827 6502; Fax +39 049 827 6209; E-mail:
| |
|
20
|
Singh A, Hermann BP. Bulk and Single-Cell RNA-Seq Analyses for Studies of Spermatogonia. Methods Mol Biol 2023; 2656:37-70. [PMID: 37249866 DOI: 10.1007/978-1-0716-3139-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Robust methods have been developed that leverage next-generation sequencing (NGS) to measure abundance of all mRNAs (RNA-seq) in samples as small as individual cells in order to study the testicular transcriptome in mammals. In this chapter, we present robust options for implementing bioinformatics workflows for the analysis of bulk RNA-seq from aggregate samples of hundreds to millions of cells and single-cell RNA-seq from individual cells. We also provide detailed protocols for using the R packages DESeq2 and Seurat, important parameters for successful implementation, and considerations for drawing conclusions from the results.
Affiliation(s)
- Anukriti Singh
- Department of Neuroscience, Developmental and Regenerative Biology, The University of Texas at San Antonio, San Antonio, TX, USA
| | - Brian P Hermann
- Department of Neuroscience, Developmental and Regenerative Biology, University of Texas at San Antonio, San Antonio, TX, USA.
| |
|
21
|
Costa-Silva J, Domingues DS, Menotti D, Hungria M, Lopes FM. Temporal progress of gene expression analysis with RNA-Seq data: A review on the relationship between computational methods. Comput Struct Biotechnol J 2022; 21:86-98. [PMID: 36514333 PMCID: PMC9730150 DOI: 10.1016/j.csbj.2022.11.051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/15/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/03/2022] Open
Abstract
Analysis of differential gene expression from RNA-seq data has become a standard for several research areas. The steps for the computational analysis include many data types and file formats, and a wide variety of computational tools that can be applied alone or together as pipelines. This paper presents a review of the differential expression analysis pipeline, addressing its steps and the respective objectives, the principal methods available in each step, and their properties, therefore introducing an organized overview to this context. This review aims to address mainly the aspects involved in the differentially expressed gene (DEG) analysis from RNA sequencing data (RNA-seq), considering the computational methods. In addition, a timeline of the computational methods for DEG is shown and discussed, and the relationships existing between the most important computational tools are presented by an interaction network. A discussion on the challenges and gaps in DEG analysis is also highlighted in this review. This paper will serve as a tutorial for new entrants into the field and help established users update their analysis pipelines.
Affiliation(s)
- Juliana Costa-Silva
- Department of Informatics – Federal University of Paraná, Rua Coronel Francisco Heráclito dos Santos, 100, 81531-990 Curitiba, Paraná, Brazil
| | - Douglas S. Domingues
- Department of Genetics, “Luiz de Queiroz” College of Agriculture, University of São Paulo, Av. Pádua Dias, 11, 13418-900 Piracicaba, São Paulo, Brazil
| | - David Menotti
- Department of Informatics – Federal University of Paraná, Rua Coronel Francisco Heráclito dos Santos, 100, 81531-990 Curitiba, Paraná, Brazil
| | - Mariangela Hungria
- Department of Soil Biotechnology - Embrapa Soybean, Cx. Postal 231, 86000-970 Londrina, Paraná, Brazil
| | - Fabrício Martins Lopes
- Department of Computer Science, Universidade Tecnológica Federal do Paraná – UTFPR, Av. Alberto Carazzai, 1640, 86300-000, Cornélio Procópio, Paraná, Brazil
| |
|
22
|
Shah I, Bundy J, Chambers B, Everett LJ, Haggard D, Harrill J, Judson RS, Nyffeler J, Patlewicz G. Navigating Transcriptomic Connectivity Mapping Workflows to Link Chemicals with Bioactivities. Chem Res Toxicol 2022; 35:1929-1949. [PMID: 36301716 PMCID: PMC10483698 DOI: 10.1021/acs.chemrestox.2c00245] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/09/2023]
Abstract
Screening new compounds for potential bioactivities against cellular targets is vital for drug discovery and chemical safety. Transcriptomics offers an efficient approach for assessing global gene expression changes, but interpreting chemical mechanisms from these data is often challenging. Connectivity mapping is a potential data-driven avenue for linking chemicals to mechanisms based on the observation that many biological processes are associated with unique gene expression signatures (gene signatures). However, mining the effects of a chemical on gene signatures for biological mechanisms is challenging because transcriptomic data contain thousands of noisy genes. New connectivity mapping approaches seeking to distinguish signal from noise continue to be developed, spurred by the promise of discovering chemical mechanisms, new drugs, and disease targets from burgeoning transcriptomic data. Here, we analyze these approaches in terms of diverse transcriptomic technologies, public databases, gene signatures, pattern-matching algorithms, and statistical evaluation criteria. To navigate the complexity of connectivity mapping, we propose a harmonized scheme to coherently organize and compare published workflows. We first standardize concepts underlying transcriptomic profiles and gene signatures based on various transcriptomic technologies such as microarrays, RNA-Seq, and L1000 and discuss the widely used data sources such as Gene Expression Omnibus, ArrayExpress, and MSigDB. Next, we generalize connectivity mapping as a pattern-matching task for finding similarity between a query (e.g., transcriptomic profile for new chemical) and a reference (e.g., gene signature of known target). Published pattern-matching approaches fall into two main categories: vector-based use metrics like correlation, Jaccard index, etc., and aggregation-based use parametric and nonparametric statistics (e.g., gene set enrichment analysis). 
The statistical methods for evaluating the performance of different approaches are described, along with comparisons reported in the literature on benchmark transcriptomic data sets. Lastly, we review connectivity mapping applications in toxicology and offer guidance on evaluating chemical-induced toxicity with concentration-response transcriptomic data. In addition to serving as a high-level guide and tutorial for understanding and implementing connectivity mapping workflows, we hope this review will stimulate new algorithms for evaluating chemical safety and drug discovery using transcriptomic data.
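The vector-based family of pattern-matching approaches named in this abstract can be illustrated with a short sketch. The Python below is only a minimal illustration, assuming a query and a reference that are either gene sets (scored with the Jaccard index) or gene-to-value profiles (scored with Pearson correlation over shared genes). The function names, gene labels, and data layout are hypothetical and are not taken from the review.

```python
from math import sqrt
from typing import Dict, Set


def jaccard(query_genes: Set[str], reference_genes: Set[str]) -> float:
    """Jaccard index between two gene sets: |intersection| / |union|."""
    union = query_genes | reference_genes
    if not union:
        # Two empty signatures share nothing; define the score as 0.
        return 0.0
    return len(query_genes & reference_genes) / len(union)


def pearson(query: Dict[str, float], reference: Dict[str, float]) -> float:
    """Pearson correlation between two profiles, restricted to shared genes."""
    shared = sorted(query.keys() & reference.keys())
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared genes")
    x = [query[g] for g in shared]
    y = [reference[g] for g in shared]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)
```

A score near 1 from either function would suggest that the query resembles the reference. Aggregation-based approaches such as gene set enrichment analysis follow a different scheme and are not covered by this sketch.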
Affiliation(s)
- Imran Shah
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Joseph Bundy
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Bryant Chambers
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Logan J. Everett
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Derik Haggard
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Joshua Harrill
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Richard S. Judson
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Johanna Nyffeler
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Oak Ridge Institute for Science and Education (ORISE) Postdoctoral Fellow, Oak Ridge, Tennessee 37831, USA
- Grace Patlewicz
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
|
23
|
Bang D, Gu J, Park J, Jeong D, Koo B, Yi J, Shin J, Jung I, Kim S, Lee S. A Survey on Computational Methods for Investigation on ncRNA-Disease Association through the Mode of Action Perspective. Int J Mol Sci 2022; 23:ijms231911498. [PMID: 36232792 PMCID: PMC9570358 DOI: 10.3390/ijms231911498] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/26/2022] [Revised: 09/18/2022] [Accepted: 09/26/2022] [Indexed: 02/01/2023]
Abstract
Molecular and sequencing technologies have been successfully used in decoding biological mechanisms of various diseases. As revealed by many novel discoveries, the role of non-coding RNAs (ncRNAs) in understanding disease mechanisms is becoming increasingly important. Since ncRNAs primarily act as regulators of transcription, associating ncRNAs with diseases involves multiple inference steps. Leveraging the fast-accumulating high-throughput screening results, a number of computational models predicting ncRNA-disease associations have been developed. These tools suggest novel disease-related biomarkers or therapeutic targetable ncRNAs, contributing to the realization of precision medicine. In this survey, we first introduce the biological roles of different ncRNAs and summarize the databases containing ncRNA-disease associations. Then, we suggest a new trend in recent computational prediction of ncRNA-disease association, which is the mode of action (MoA) network perspective. This perspective includes integrating ncRNAs with mRNA, pathway and phenotype information. In the next section, we describe computational methodologies widely used in this research domain. Existing computational studies are then summarized in terms of their coverage of the MoA network. Lastly, we discuss the potential applications and future roles of the MoA network in terms of integrating biological mechanisms for ncRNA-disease associations.
Affiliation(s)
- Dongmin Bang
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Jeonghyeon Gu
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Korea
- Joonhyeong Park
- Department of Computer Science and Engineering, Seoul National University, Seoul 08826, Korea
- Dabin Jeong
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Bonil Koo
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Jungseob Yi
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Korea
- Jihye Shin
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Inuk Jung
- Department of Computer Science and Engineering, Kyungpook National University, Daegu 41566, Korea
- Sun Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Korea
- Department of Computer Science and Engineering, Seoul National University, Seoul 08826, Korea
- MOGAM Institute for Biomedical Research, Yongin-si 16924, Korea
- Sunho Lee
- AIGENDRUG Co., Ltd., Seoul 08826, Korea
- Correspondence:
|
24
|
Del Giudice M, Foster JG, Peirone S, Rissone A, Caizzi L, Gaudino F, Parlato C, Anselmi F, Arkell R, Guarrera S, Oliviero S, Basso G, Rajan P, Cereda M. FOXA1 regulates alternative splicing in prostate cancer. Cell Rep 2022; 40:111404. [PMID: 36170835 PMCID: PMC9532847 DOI: 10.1016/j.celrep.2022.111404] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/30/2021] [Revised: 05/28/2022] [Accepted: 09/01/2022] [Indexed: 11/25/2022]
Abstract
Dysregulation of alternative splicing in prostate cancer is linked to transcriptional programs activated by AR, ERG, FOXA1, and MYC. Here, we show that FOXA1 functions as the primary orchestrator of alternative splicing dysregulation across 500 primary and metastatic prostate cancer transcriptomes. We demonstrate that FOXA1 binds to the regulatory regions of splicing-related genes, including HNRNPK and SRSF1. By controlling trans-acting factor expression, FOXA1 exploits an "exon definition" mechanism calibrating alternative splicing toward dominant isoform production. This regulation especially impacts splicing factors themselves and leads to a reduction of nonsense-mediated decay (NMD)-targeted isoforms. Inclusion of the NMD-determinant FLNA exon 30 by FOXA1-controlled oncogene SRSF1 promotes cell growth in vitro and predicts disease recurrence. Overall, we report a role for FOXA1 in rewiring the alternative splicing landscape in prostate cancer through a cascade of events from chromatin access, to splicing factor regulation, and, finally, to alternative splicing of exons influencing patient survival.
Affiliation(s)
- Marco Del Giudice
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy
- John G Foster
- Centre for Cancer Cell and Molecular Biology, Barts Cancer Institute, Cancer Research UK Barts Centre, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
- Serena Peirone
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Department of Biosciences, Università degli Studi di Milano, Via Celoria 26, 20133 Milan, Italy
- Alberto Rissone
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy
- Livia Caizzi
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy
- Federica Gaudino
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy
- Caterina Parlato
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy
- Francesca Anselmi
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy
- Rebecca Arkell
- Centre for Cancer Cell and Molecular Biology, Barts Cancer Institute, Cancer Research UK Barts Centre, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
- Simonetta Guarrera
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy
- Salvatore Oliviero
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy
- Giuseppe Basso
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy
- Prabhakar Rajan
- Centre for Cancer Cell and Molecular Biology, Barts Cancer Institute, Cancer Research UK Barts Centre, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK; Division of Surgery and Interventional Science, University College London, Charles Bell House, 3rd Floor, 43-45 Foley Street, London W1W 7TS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK; Department of Urology, Barts Health NHS Trust, the Royal London Hospital, Whitechapel Road, London E1 1BB, UK; Department of Uro-oncology, University College London NHS Foundation Trust, 47 Wimpole Street, London W1G 8SE, UK.
- Matteo Cereda
- Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov. le 142, km 3.95, 10060 Candiolo (TO), Italy; Department of Biosciences, Università degli Studi di Milano, Via Celoria 26, 20133 Milan, Italy.
|
25
|
Zehetmayer S, Posch M, Graf A. Impact of adaptive filtering on power and false discovery rate in RNA-seq experiments. BMC Bioinformatics 2022; 23:388. [PMID: 36153479 PMCID: PMC9509565 DOI: 10.1186/s12859-022-04928-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2020] [Accepted: 09/13/2022] [Indexed: 11/10/2022]
Abstract
Background: In RNA-sequencing studies a large number of hypothesis tests are performed to compare the differential expression of genes between several conditions. Filtering has been proposed to remove candidate genes with a low expression level which may not be relevant and have little or no chance of showing a difference between conditions. This step may reduce the multiple testing burden and increase power. Results: We show in a simulation study that filtering can lead to some increase in power for RNA-sequencing data; too aggressive filtering, however, can lead to a decline. No uniformly optimal filter in terms of power exists. Depending on the scenario, different filters may be optimal. We propose an adaptive filtering strategy which selects one of several filters to maximise the number of rejections. No additional adjustment for multiplicity has to be included, but a rule has to be considered if the number of rejections is too small. Conclusions: For a large range of simulation scenarios, the adaptive filter maximises the power while the simulated False Discovery Rate is bounded by the pre-defined significance level. Using the adaptive filter, it is not necessary to pre-specify a single individual filtering method optimised for a specific scenario. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04928-z.
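The selection step described in this abstract (try several filters, keep the one that yields the most rejections) can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the candidate filters are thresholds on mean read count, that rejections come from the Benjamini-Hochberg procedure, and that per-gene p-values are already available. The names `bh_reject` and `adaptive_filter` are hypothetical, and the abstract's additional rule for cases with too few rejections is omitted.

```python
from typing import List, Optional, Sequence, Tuple


def bh_reject(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure; one rejection flag per p-value."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


def adaptive_filter(
    pvalues: Sequence[float],
    mean_counts: Sequence[float],
    thresholds: Sequence[float],
    alpha: float = 0.05,
) -> Tuple[Optional[float], List[int]]:
    """Return the threshold giving the most BH rejections and the rejected gene indices."""
    best_threshold: Optional[float] = None
    best_rejected: List[int] = []
    for t in thresholds:
        # Assumed filter: keep genes whose mean count reaches the threshold.
        kept = [i for i, c in enumerate(mean_counts) if c >= t]
        if not kept:
            continue
        flags = bh_reject([pvalues[i] for i in kept], alpha)
        rejected = [i for i, flag in zip(kept, flags) if flag]
        if best_threshold is None or len(rejected) > len(best_rejected):
            best_threshold, best_rejected = t, rejected
    return best_threshold, best_rejected
```

Because low-count genes are dropped before the multiple testing correction, a stricter threshold shrinks the number of tests and can raise the number of rejections, which is the trade-off the abstract describes.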
|
26
|
Saddozai UAK, Wang F, Khattak S, Akbar MU, Badar M, Khan NH, Zhang L, Zhu W, Xie L, Li Y, Ji X, Guo X. Define the Two Molecular Subtypes of Epithelioid Malignant Pleural Mesothelioma. Cells 2022; 11:cells11182924. [PMID: 36139498 PMCID: PMC9497219 DOI: 10.3390/cells11182924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Revised: 08/25/2022] [Accepted: 09/08/2022] [Indexed: 11/20/2022]
Abstract
Malignant pleural mesothelioma (MPM) is a fatal disease of the respiratory system. Despite the availability of invasive biomarkers with promising results, there are still significant diagnostic and therapeutic challenges in the treatment of MPM. Epithelioid mesothelioma, one of the three main mesothelioma cell types, makes up approximately 70% of all mesothelioma cases. Various observational studies are under way, but the molecular heterogeneity and pathogenesis of epithelioid malignant pleural mesothelioma (eMPM) are still not well understood. Through molecular analysis, expression profiling data were used to determine the possibility and optimal number of eMPM molecular subtypes. Next, clinicopathological characteristics and different molecular pathways of each subtype were analyzed to explore the clinical applications and underlying mechanisms of eMPM. In this study, we identified two distinct epithelioid malignant pleural mesothelioma subtypes with distinct gene expression patterns. Subtype I eMPMs were involved in steroid hormone biosynthesis, porphyrin and chlorophyll metabolism, and drug metabolism, while subtype II eMPMs were involved in rational metabolism, tyrosine metabolism, and chemical carcinogenesis pathways. Additionally, we identified potential subtype-specific therapeutic targets, including CCNE1, EPHA3, RNF43, ROS1, and RSPO2 for subtype I and CDKN2A and RET for subtype II. Considering the need for potent diagnostic and therapeutic biomarkers for eMPM, we anticipate that our findings will help both in exploring underlying mechanisms in the development of eMPM and in designing targeted therapy for eMPM.
Affiliation(s)
- Umair Ali Khan Saddozai
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Fengling Wang
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Saadullah Khattak
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Muhammad Usman Akbar
- Gomal Center of Biochemistry and Biotechnology, Gomal University, Dera Ismail Khan 29050, Pakistan
- Muhammad Badar
- Gomal Center of Biochemistry and Biotechnology, Gomal University, Dera Ismail Khan 29050, Pakistan
- Nazeer Hussain Khan
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Lu Zhang
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Wan Zhu
- Department of Anesthesia, Stanford University, 300 Pasteur Drive, Stanford, CA 94305, USA
- Longxiang Xie
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Yongqiang Li
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Xinying Ji
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Correspondence: (X.J.); (X.G.)
- Xiangqian Guo
- Department of Preventive Medicine, Institute of Bioinformatics Center, Henan Provincial Engineering Center for Tumor Molecular Medicine, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China
- Correspondence: (X.J.); (X.G.)
|
27
|
Li D, Zand MS, Dye TD, Goniewicz ML, Rahman I, Xie Z. An evaluation of RNA-seq differential analysis methods. PLoS One 2022; 17:e0264246. [PMID: 36112652 PMCID: PMC9480998 DOI: 10.1371/journal.pone.0264246] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 02/05/2022] [Accepted: 08/30/2022] [Indexed: 11/19/2022]
Abstract
RNA-seq is a high-throughput sequencing technology widely used for gene transcript discovery and quantification under different biological or biomedical conditions. A fundamental research question in most RNA-seq experiments is the identification of differentially expressed genes among experimental conditions or sample groups. Numerous statistical methods for RNA-seq differential analysis have been proposed since the emergence of the RNA-seq assay. To evaluate popular differential analysis methods used in the open source R and Bioconductor packages, we conducted multiple simulation studies to compare the performance of eight RNA-seq differential analysis methods used in RNA-seq data analysis (edgeR, DESeq, DESeq2, baySeq, EBSeq, NOISeq, SAMSeq, Voom). The comparisons were across different scenarios with either equal or unequal library sizes, different distribution assumptions and sample sizes. We measured performance using false discovery rate (FDR) control, power, and stability. No significant differences were observed for FDR control, power, or stability across methods, whether with equal or unequal library sizes. For RNA-seq count data with negative binomial distribution, when sample size is 3 in each group, EBSeq performed better than the other methods as indicated by FDR control, power, and stability. When sample sizes increase to 6 or 12 in each group, DESeq2 performed slightly better than other methods. All methods have improved performance when sample size increases to 12 in each group except DESeq. For RNA-seq count data with log-normal distribution, both DESeq and DESeq2 methods performed better than other methods in terms of FDR control, power, and stability across all sample sizes. Real RNA-seq experimental data were also used to compare the total number of discoveries and stability of discoveries for each method. 
For RNA-seq data analysis, the EBSeq method is recommended for studies with sample size as small as 3 in each group, and the DESeq2 method is recommended for sample size of 6 or higher in each group when the data follow the negative binomial distribution. Both DESeq and DESeq2 methods are recommended when the data follow the log-normal distribution.
Affiliation(s)
- Dongmei Li
- Clinical and Translational Science Institute, School of Medicine and Dentistry, University of Rochester, Rochester, NY, United States of America
- * E-mail:
- Martin S. Zand
- Clinical and Translational Science Institute, School of Medicine and Dentistry, University of Rochester, Rochester, NY, United States of America
- Department of Medicine, Division of Nephrology, School of Medicine and Dentistry, University of Rochester, Rochester, NY, United States of America
- Timothy D. Dye
- Department of Obstetrics and Gynecology, School of Medicine and Dentistry, University of Rochester, Rochester, NY, United States of America
- Maciej L. Goniewicz
- Department of Health Behavior, Roswell Park Comprehensive Cancer Center, Buffalo, NY, United States of America
- Irfan Rahman
- Department of Environmental Medicine, School of Medicine and Dentistry, University of Rochester, Rochester, NY, United States of America
- Zidian Xie
- Clinical and Translational Science Institute, School of Medicine and Dentistry, University of Rochester, Rochester, NY, United States of America
|
28
|
NBBt-test: a versatile method for differential analysis of multiple types of RNA-seq data. Sci Rep 2022; 12:12833. [PMID: 35896555 PMCID: PMC9329447 DOI: 10.1038/s41598-022-15762-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2021] [Accepted: 06/29/2022] [Indexed: 11/25/2022]
Abstract
Rapid development of transcriptome sequencing technologies has resulted in a data revolution and the emergence of new approaches to study transcriptomic regulation, such as alternative splicing, alternative polyadenylation, and CRISPR knockout screening, in addition to regular gene expression. A full characterization of the transcriptional landscape of different groups of cells or tissues holds enormous potential for both basic science and clinical applications. Although many methods have been developed in the realm of differential gene expression analysis, they are all geared towards a particular type of sequencing data and fail to perform well when applied to different types of transcriptomic data. To fill this gap, we offer a negative beta binomial t-test (NBBt-test). NBBt-test provides multiple functions to perform differential analyses of alternative splicing, polyadenylation, CRISPR knockout screening, and gene expression datasets. Both real and large-scale simulation data show superior performance of NBBt-test, with higher efficiency and lower type I error rate and FDR, in identifying differential isoforms, differentially expressed genes, and differential CRISPR knockout screening genes with different sample sizes when compared against currently popular statistical methods. An R package implementing NBBt-test is available for download from CRAN (https://CRAN.R-project.org/package=NBBttest).
|
29
|
Garg T, Weiss CR, Sheth RA. Techniques for Profiling the Cellular Immune Response and Their Implications for Interventional Oncology. Cancers (Basel) 2022; 14:3628. [PMID: 35892890 PMCID: PMC9332307 DOI: 10.3390/cancers14153628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 07/19/2022] [Accepted: 07/20/2022] [Indexed: 12/07/2022]
Abstract
In recent years there has been increased interest in using the immune contexture of the primary tumors to predict the patient's prognosis. The tumor microenvironment of patients with cancers consists of different types of lymphocytes, tumor-infiltrating leukocytes, dendritic cells, and others. Different technologies can be used for the evaluation of the tumor microenvironment, all of which require a tissue or cell sample. Image-guided tissue sampling is a cornerstone in the diagnosis, stratification, and longitudinal evaluation of therapeutic efficacy for cancer patients receiving immunotherapies. Therefore, interventional radiologists (IRs) play an essential role in the evaluation of patients treated with systemically administered immunotherapies. This review provides a detailed description of different technologies used for immune assessment and analysis of the data collected from the use of these technologies. The detailed approach provided herein is intended to provide the reader with the knowledge necessary to not only interpret studies containing such data but also design and apply these tools for clinical practice and future research studies.
Affiliation(s)
- Tushar Garg
- Division of Vascular and Interventional Radiology, Russell H. Morgan Department of Radiology and Radiological Science, The Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; (T.G.); (C.R.W.)
- Clifford R. Weiss
- Division of Vascular and Interventional Radiology, Russell H. Morgan Department of Radiology and Radiological Science, The Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; (T.G.); (C.R.W.)
- Rahul A. Sheth
- Department of Interventional Radiology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
|
30
|
Zou J, Deng F, Wang M, Zhang Z, Liu Z, Zhang X, Hua R, Chen K, Zou X, Hao J. scCODE: an R package for data-specific differentially expressed gene detection on single-cell RNA-sequencing data. Brief Bioinform 2022; 23:6590434. [PMID: 35598331 DOI: 10.1093/bib/bbac180] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 01/26/2022] [Revised: 04/06/2022] [Accepted: 04/22/2022] [Indexed: 12/13/2022]
Abstract
Differential expression (DE) gene detection in single-cell ribonucleic acid (RNA)-sequencing (scRNA-seq) data is a key step to understand the biological question investigated. Filtering genes is suggested to improve the performance of DE methods, but the influence of filtering genes has not been demonstrated. Furthermore, the optimal methods for different scRNA-seq datasets are divergent, and different datasets should benefit from data-specific DE gene detection strategies. However, existing tools did not take gene filtering into consideration. There is a lack of metrics for evaluating the optimal method on experimental datasets. Based on two new metrics, we propose single-cell Consensus Optimization of Differentially Expressed gene detection, an R package to automatically optimize DE gene detection for each experimental scRNA-seq dataset.
Affiliation(s)
- Jiawei Zou
- School of Life Sciences and Biotechnology, Shanghai Centre for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
- Institute of Clinical Science, Zhongshan Hospital, Fudan University, Shanghai, China
- Fulan Deng
- School of Materials Science and Engineering, Shanghai Institute of Technology, Shanghai 201418, China
- Miaochen Wang
- Department of Oral and Maxillofacial-Head & Neck Oncology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases; Shanghai Key Laboratory of Stomatology
- Zhen Zhang
- Department of Oral and Maxillofacial-Head & Neck Oncology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases; Shanghai Key Laboratory of Stomatology
- Zheqi Liu
- Department of Oral and Maxillofacial-Head & Neck Oncology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases; Shanghai Key Laboratory of Stomatology
- Xiaobin Zhang
- Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Department of Cardiovascular Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Rong Hua
- Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Ke Chen
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
- Xin Zou
- Jinshan Hospital Center for Tumor Diagnosis & Therapy, Jinshan Hospital, Fudan University, Shanghai, 201508, China
- Jie Hao
- Institute of Clinical Science, Zhongshan Hospital, Fudan University, Shanghai, China
|
31
|
Zhu M, Lai Y. Improvements Achieved by Multiple Imputation for Single-Cell RNA-Seq Data in Clustering Analysis and Differential Expression Analysis. J Comput Biol 2022; 29:634-649. [PMID: 35575729 DOI: 10.1089/cmb.2021.0597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
In a single-cell RNA-seq (scRNA-seq) data set, a high proportion of missing values (or an excessive number of zeroes) are frequently observed. For the related follow-up tasks, such as clustering analysis and differential expression analysis, a data set without missing values is generally required. Many imputation approaches have been proposed for this purpose. Multiple imputation (MI) is a well-established approach to address possible biases in a follow-up analysis result based on one-time imputed data. There is a lack of investigation on this in the analysis of scRNA-seq data. In this study, we have investigated how to efficiently apply the MI approach to the clustering analysis and the differential expression analysis of scRNA-seq data. We proposed an MI procedure for clustering analysis and an MI procedure for differential expression analysis. To demonstrate the improvements achieved by MI in clustering analysis and differential expression analysis of scRNA-seq data, we analyzed three well-known scRNA-seq data sets. scIGANs, an scRNA-seq imputation method based on the generative adversarial networks (GANs), has been recently proposed for scRNA-seq data imputation. Multiple randomly imputed data sets can be conveniently generated by this method. We implemented our MI procedures based on scIGANs. We demonstrated that MI yielded improved performances on the clustering analysis and differential expression analysis results. Our applications to experimental scRNA-seq data illustrated the advantages of MI over one-time imputation of missing values in scRNA-seq data.
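The abstract above does not say how results from the multiply imputed data sets are combined. As a minimal sketch, assuming the classical Rubin's rules for pooling a per-gene estimate (for example, a log fold change) and its variance across m imputed data sets, the pooling step could look like the following. The function name `pool_rubin` and the input values are hypothetical, and this is not the authors' scIGANs-based procedure.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates: point estimate from each imputed data set (hypothetical input)
    variances: squared standard error from each imputed data set
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)        # average of the per-imputation estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance between imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Illustrative values only: three imputed data sets for one gene.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what a one-time imputation cannot capture: it reflects how much the estimate moves when the missing values are filled in differently.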
Affiliation(s)
- Mengqiu Zhu
- Department of Statistics, The George Washington University, Washington, District of Columbia, USA
- Yinglei Lai
- School of Mathematical Science, University of Science and Technology of China, Hefei, China

32
Comprehensive characterization of pre- and post-treatment samples of breast cancer reveal potential mechanisms of chemotherapy resistance. NPJ Breast Cancer 2022; 8:60. [PMID: 35523804 PMCID: PMC9076915 DOI: 10.1038/s41523-022-00428-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 09/06/2021] [Accepted: 04/12/2022] [Indexed: 12/23/2022] Open
Abstract
When locally advanced breast cancer is treated with neoadjuvant chemotherapy, the recurrence risk is significantly higher if no complete pathologic response is achieved. Identification of the underlying resistance mechanisms is essential to select treatments with maximal efficacy and minimal toxicity. Here we employed gene expression profiles derived from 317 HER2-negative treatment-naïve breast cancer biopsies of patients who underwent neoadjuvant chemotherapy, deep whole exome and RNA-sequencing profiles of 22 matched pre- and post-treatment tumors, and treatment outcome data to identify biomarkers of response and resistance mechanisms. Molecular profiling of treatment-naïve breast cancer samples revealed that expression levels of proliferation, immune response, and extracellular matrix (ECM) organization combined predict response to chemotherapy. Triple negative patients with high proliferation, high immune response and low ECM expression had a significantly better treatment response and survival benefit (HR 0.29, 95% CI 0.10–0.85; p = 0.02), while in ER+ patients the opposite was seen (HR 4.73, 95% CI 1.51–14.8; p = 0.008). The characterization of paired pre- and post-treatment samples revealed that aberrations of known cancer genes were present either only in the pre-treatment sample (CDKN1B) or only in the post-treatment sample (TP53, APC, CTNNB1). Proliferation-associated genes were frequently down-regulated in post-treatment ER+ tumors, but not in triple negative tumors. Genes involved in ECM were upregulated in the majority of post-chemotherapy samples. Genomic and transcriptomic differences between pre- and post-chemotherapy samples are common and may reveal potential mechanisms of therapy resistance. Our results show a wide range of distinct, but related mechanisms, with a prominent role for proliferation- and ECM-related genes.

33
Lin MH, Wu PS, Wong TH, Lin IY, Lin J, Cox J, Yu SH. Benchmarking differential expression, imputation and quantification methods for proteomics data. Brief Bioinform 2022; 23:6566001. [PMID: 35397162 DOI: 10.1093/bib/bbac138] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/10/2022] [Revised: 03/22/2022] [Accepted: 03/25/2022] [Indexed: 11/14/2022] Open
Abstract
Data analysis is a critical part of quantitative proteomics studies in interpreting biological questions. Numerous computational tools for protein quantification, imputation and differential expression (DE) analysis were generated in the past decade and the search for optimal tools is still going on. Moreover, due to the rapid development of RNA sequencing (RNA-seq) technology, a vast number of DE analysis methods were created for that purpose. The applicability of these newly developed RNA-seq-oriented tools to proteomics data remains in doubt. In order to benchmark these analysis methods, a proteomics dataset consisting of proteins derived from humans, yeast and drosophila, in defined ratios, was generated in this study. Based on this dataset, DE analysis tools, including microarray- and RNA-seq-based ones, imputation algorithms and protein quantification methods were compared and benchmarked. Furthermore, applying these approaches to two public datasets showed that RNA-seq-based DE tools achieved higher accuracy (ACC) in identifying DEPs. This study provides useful guidelines for analyzing quantitative proteomics datasets. All the methods used in this study were integrated into the Perseus software, version 2.0.3.0, which is available at https://www.maxquant.org/perseus.
Affiliation(s)
- Miao-Hsia Lin
- Graduate Institute and Department of Microbiology, College of Medicine, National Taiwan University, No. 1 Jen Ai Road, Section 1, Taipei 100, Taiwan
- Pei-Shan Wu
- Genome and Systems Biology Degree Program, College of Life Science, National Taiwan University, Taipei, Taiwan
- Tzu-Hsuan Wong
- Graduate Institute and Department of Microbiology, College of Medicine, National Taiwan University, No. 1 Jen Ai Road, Section 1, Taipei 100, Taiwan
- I-Ying Lin
- Graduate Institute and Department of Microbiology, College of Medicine, National Taiwan University, No. 1 Jen Ai Road, Section 1, Taipei 100, Taiwan
- Johnathan Lin
- Institute of Precision Medicine, National Sun Yat-sen University, No. 70 Lien-hai Rd., Kaohsiung 80424, Taiwan
- Jürgen Cox
- Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Sung-Huan Yu
- Institute of Precision Medicine, National Sun Yat-sen University, No. 70 Lien-hai Rd., Kaohsiung 80424, Taiwan

34
Novel Gene Signatures as Prognostic Biomarkers for Predicting the Recurrence of Hepatocellular Carcinoma. Cancers (Basel) 2022; 14:cancers14040865. [PMID: 35205612 PMCID: PMC8870597 DOI: 10.3390/cancers14040865] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/27/2021] [Revised: 02/04/2022] [Accepted: 02/07/2022] [Indexed: 12/10/2022] Open
Abstract
Simple Summary: A high percentage of patients who undergo surgical resection for hepatocellular carcinoma (HCC) experience recurrence. Therefore, identification of accurate molecular markers for predicting recurrence of HCC is important. We analyzed recurrent and non-recurrent HCC tissues using two public omics datasets comprising microarray and RNA-sequencing data and found novel gene signatures associated with recurrent HCC. These molecules might be used not only to predict recurrence of HCC but also as potential prognostic indicators for patients with HCC.

Abstract: Hepatocellular carcinoma (HCC) has a high rate of cancer recurrence (up to 70%) in patients who undergo surgical resection. We investigated prognostic gene signatures for predicting HCC recurrence using in silico gene expression analysis. Recurrence-associated gene candidates were chosen by a comparative analysis of gene expression profiles from two independent whole-transcriptome datasets in patients with HCC who underwent surgical resection. Five promising candidate genes, CETN2, HMGA1, MPZL1, RACGAP1, and SNRPB, were identified, and the expression of these genes was evaluated using quantitative reverse transcription PCR in the validation set (n = 57). The genes CETN2, HMGA1, RACGAP1, and SNRPB, but not MPZL1, were upregulated in patients with recurrent HCC. In addition, the combination of HMGA1 and MPZL1 demonstrated the best area under the curve (0.807, 95% confidence interval [CI] = 0.681–0.899) for predicting HCC recurrence. In terms of clinicopathological correlation, CETN2, MPZL1, RACGAP1, and SNRPB were upregulated in patients with microvascular invasion, and the expression of MPZL1 and SNRPB increased in proportion to the Edmondson tumor differentiation grade. Additionally, overexpression of CETN2, HMGA1, and RACGAP1 correlated with poor overall survival (OS) and disease-free survival (DFS) in the validation set. Finally, Cox regression analysis showed that serum alpha-fetoprotein and RACGAP1 expression significantly affected OS, whereas platelet count, microvascular invasion, and HMGA1 expression significantly affected DFS. In conclusion, HMGA1 and RACGAP1 may be potential prognostic biomarkers for predicting the recurrence of HCC after surgical resection.

35
Qu H, Qu M, Wang S, Yu L, Jia Q, Wang X, Jia Z. Differential Expression Analysis: Simple Pair, Interaction, Time-series. Bio Protoc 2022. [DOI: 10.21769/bioprotoc.4455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022] Open

36
Yang S, Zhang K, Fang Z. Robust RNA-seq data analysis using an integrated method of ROC curve and Kolmogorov-Smirnov test. Commun Stat Simul Comput 2022; 51:7444-7457. [PMID: 36583130 PMCID: PMC9793859 DOI: 10.1080/03610918.2020.1837165] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]
Abstract
It is common practice to dichotomize a continuous biomarker in clinical settings for convenience of application. Analytically, results from a dichotomized biomarker are often more reliable and more resistant to outliers, bimodality, and other unknown distributions. Two methods are commonly used to select the best cut-off value for dichotomizing a continuous biomarker: the maximally selected chi-square statistic and the ROC curve, specifically the Youden Index. In this paper, we explain that in many situations it is inappropriate to use the former. Using the Maximum Absolute Youden Index (MAYI), we demonstrate that integrating the MAYI with the Kolmogorov-Smirnov test is not only a robust non-parametric method but also provides a more meaningful p value for selecting the cut-off value than a Mann-Whitney test. In addition, our method can be applied directly in clinical settings.
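The link between the Youden Index and the Kolmogorov-Smirnov test can be made concrete with a small sketch. For the rule "call a sample positive when its value exceeds the cut-off", the Youden Index J = sensitivity + specificity - 1 equals the difference between the two empirical distribution functions at that cut-off, so its maximum absolute value coincides with the two-sample KS statistic. The helper names and the toy values below are hypothetical, no p value is computed, and this is not the authors' implementation.

```python
def youden_curve(cases, controls):
    """Return (cutoff, J) pairs with J = sensitivity + specificity - 1.

    A sample is called positive when its value is strictly above the cutoff.
    """
    curve = []
    for cutoff in sorted(set(cases) | set(controls)):
        sensitivity = sum(x > cutoff for x in cases) / len(cases)
        specificity = sum(x <= cutoff for x in controls) / len(controls)
        curve.append((cutoff, sensitivity + specificity - 1))
    return curve


def max_abs_youden(cases, controls):
    """Cutoff and J value at which |J| is largest."""
    return max(youden_curve(cases, controls), key=lambda pair: abs(pair[1]))


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    points = sorted(set(a) | set(b))
    return max(
        abs(sum(x <= t for x in a) / len(a) - sum(x <= t for x in b) / len(b))
        for t in points
    )


# Toy expression values (illustrative only).
cases = [5, 6, 7, 8]
controls = [1, 2, 3, 6]
best_cutoff, best_j = max_abs_youden(cases, controls)
```

Taking the absolute value lets the same code handle biomarkers that are lower, rather than higher, in cases.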
Affiliation(s)
- Shengping Yang
- Department of Biostatistics, Pennington Biomedical Research Center, Baton Rouge, LA, USA
- Kun Zhang
- Department of Computer Science, Xavier University of Louisiana, New Orleans, LA, USA
- Zhide Fang
- Biostatistics Program, School of Public Health, LSU Health Sciences Center, New Orleans, LA, USA

37
Marques-Pereira C, Pires M, Moreira IS. Discovery of Virus-Host interactions using bioinformatic tools. Methods Cell Biol 2022; 169:169-198. [DOI: 10.1016/bs.mcb.2022.02.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]

38
Ujifuku K, Morofuji Y, Masumoto H. RNA Sequencing Data Analysis on the Maser Platform and the Tag-Count Comparison Graphical User Interface. Methods Mol Biol 2022; 2535:157-170. [PMID: 35867230 DOI: 10.1007/978-1-0716-2513-2_13] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
The RNA sequencing (RNA-seq) process that allows for comprehensive transcriptome analysis has become increasingly simple. Analysis and interpretation of RNA-seq output data are indispensable for research, but bioinformatics experts are not always available to assist. Currently, however, even a wet-lab specialist can perform the pipeline analysis of RNA-seq described in this chapter using the Maser platform and the Tag-Count Comparison Graphical User Interface (TCC-GUI). These are free of charge for scientific use.
Affiliation(s)
- Kenta Ujifuku
- Department of Neurosurgery, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Yoichi Morofuji
- Department of Neurosurgery, Nagasaki University Hospital, Nagasaki, Japan
- Hiroshi Masumoto
- Biomedical Research Support Center, Nagasaki University School of Medicine, Nagasaki, Japan

39
Guo X, Wang M, Wang X, Guo M, Xue T, Wang Z, Li H, Xu T, He B, Cui D, Tong S. Progressive Increase of High-Frequency EEG Oscillations during Meditation is Associated with its Trait Effects on Heart Rate and Proteomics: A Study on the Tibetan Buddhist. Cereb Cortex 2021; 32:3865-3877. [PMID: 34974617 DOI: 10.1093/cercor/bhab453] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/13/2021] [Revised: 11/07/2021] [Accepted: 11/09/2021] [Indexed: 11/12/2022] Open
Abstract
Meditation has been a spiritual and healing practice in the East for thousands of years. However, the neurophysiologic mechanisms underlying its traditional form remain unclear. In this study, we recruited a large sample of monks (n = 73) who practice Tibetan Buddhist meditation and compared them with meditation-naive local controls (n = 30). Their electroencephalography (EEG) and electrocardiogram signals were recorded simultaneously, and blood samples were collected to investigate the integrative effects of Tibetan Buddhist meditation on brain, heart, and proteomics. We found that EEG activity in monks shifted to a higher frequency from resting to meditation. Meditation starts with a decrease of (pre)frontal delta activity and an increase of (pre)frontal high beta and gamma activity; at the deep meditative state, posterior high-frequency activity also increased and could serve as a biomarker of deep meditation. The state increase of posterior high-frequency EEG activity was significantly correlated with the trait effects on heart rate and neuropilin-1 in monks, with the source of the brain-heart correlation located mainly in the attention and emotion networks. Our study revealed that the effects of Tibetan Buddhist meditation on brain, heart, and proteomics were highly correlated, demonstrating that meditation is an integrative body-mind training.
Affiliation(s)
- Xiaoli Guo
- School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China
- Meiyun Wang
- School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China
- Xu Wang
- School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China
- Menglin Guo
- School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China
- Ting Xue
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
- Zhuo Wang
- School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China
- Han Li
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
- Tianjiao Xu
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Bin He
- Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, USA; Center for Neuroscience Institute, Carnegie Mellon University, Pittsburgh, USA
- Donghong Cui
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
- Shanbao Tong
- School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China

40
Das S, Rai A, Merchant ML, Cave MC, Rai SN. A Comprehensive Survey of Statistical Approaches for Differential Expression Analysis in Single-Cell RNA Sequencing Studies. Genes (Basel) 2021; 12:1947. [PMID: 34946896 PMCID: PMC8701051 DOI: 10.3390/genes12121947] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/29/2021] [Revised: 11/27/2021] [Accepted: 11/27/2021] [Indexed: 12/13/2022] Open
Abstract
Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput sequencing technique for studying gene expression at the cell level. Differential expression (DE) analysis is a major downstream analysis of scRNA-seq data. DE analysis in the presence of noise from different sources remains a key challenge in scRNA-seq. Earlier practices for addressing this involved borrowing methods from bulk RNA-seq, which are based on non-zero differences in average expression of genes across cell populations. Later, several methods specifically designed for scRNA-seq were developed. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to comprehensively study the performance of DE analysis methods. Here, we provide a review and classification of different DE approaches adapted from bulk RNA-seq practice as well as those specifically designed for scRNA-seq. We also evaluate the performance of 19 widely used methods in terms of 13 performance metrics on 11 real scRNA-seq datasets. Our findings suggest that some bulk RNA-seq methods are quite competitive with the single-cell methods and that their performance depends on the underlying models, DE test statistic(s), and data characteristics. Further, it is difficult to identify a method that performs best globally on the basis of any individual performance criterion. However, the multi-criteria and combined-data analysis indicates that DECENT and EBSeq are the best options for DE analysis. The results also reveal the similarities among the tested methods in terms of detecting common DE genes. Our evaluation provides guidelines for selecting the tool that performs best under particular experimental settings in the context of scRNA-seq.
Affiliation(s)
- Samarendra Das
- Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Biostatistics and Bioinformatics Facility, JG Brown Cancer Center, University of Louisville, Louisville, KY 40202, USA
- School of Interdisciplinary and Graduate Studies, University of Louisville, Louisville, KY 40292, USA
- Anil Rai
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Michael L. Merchant
- Department of Medicine, School of Medicine, University of Louisville, Louisville, KY 40202, USA
- Hepatobiology and Toxicology Center, University of Louisville, Louisville, KY 40202, USA
- Matthew C. Cave
- Biostatistics and Informatics Facility, Center for Integrative Environmental Health Sciences, University of Louisville, Louisville, KY 40202, USA
- Shesh N. Rai
- Biostatistics and Bioinformatics Facility, JG Brown Cancer Center, University of Louisville, Louisville, KY 40202, USA
- School of Interdisciplinary and Graduate Studies, University of Louisville, Louisville, KY 40292, USA
- Hepatobiology and Toxicology Center, University of Louisville, Louisville, KY 40202, USA
- Biostatistics and Informatics Facility, Center for Integrative Environmental Health Sciences, University of Louisville, Louisville, KY 40202, USA
- Christina Lee Brown Envirome Institute, University of Louisville, Louisville, KY 40202, USA
- Department of Bioinformatics and Biostatistics, School of Public Health and Information Science, University of Louisville, Louisville, KY 40202, USA

41
Siavoshi A, Taghizadeh M, Dookhe E, Piran M. Gene expression profiles and pathway enrichment analysis to identification of differentially expressed gene and signaling pathways in epithelial ovarian cancer based on high-throughput RNA-seq data. Genomics 2021; 114:161-170. [PMID: 34839022 DOI: 10.1016/j.ygeno.2021.11.031] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/16/2018] [Accepted: 11/23/2021] [Indexed: 12/11/2022]
Abstract
Epithelial ovarian cancer (EOC) is a challenging disease for women worldwide; it is associated with a poor prognosis, and its molecular pathogenesis remains unclear. In recent years, RNA sequencing (RNA-seq) has become a powerful technology for profiling gene expression. In the present study, raw RNA-seq data from six tumor and normal ovarian samples were extracted from the Sequence Read Archive (SRA), and analysis and statistical interpretation were performed on Linux with R packages from the open-source Bioconductor project. Gene Ontology (GO) term enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were applied to identify key genes and pathways involved in EOC. We identified 1091 differentially expressed genes (DEGs) that have been reported in various studies of ovarian cancer as well as other types of cancer. Among them, 333 genes were up-regulated and 273 genes were down-regulated. In addition, DEGs including RPL41, ALDH3A2, ERBB2, MIEN1, RBM25, ATF4, UPF2, DDIT3, HOXB8 and IL17D, as well as the Ribosome and Glycolysis/Gluconeogenesis pathways, have the potential to be used as targets for EOC diagnosis and treatment. In this study, unlike in other studies on various cancers, ALDH3A2 was the most down-regulated gene in most KEGG pathways, and ATF4 was the most up-regulated gene in the leucine zipper domain binding term. On the other hand, RPL41, a regulator of the cellular ATF4 level, was up-regulated in many terms and pathways, and augmentation of ATF4 could explain the increase of RPL41 in EOC. The pivotal pathways and significant genes identified in the present study can be adapted for use in other EOC studies. However, further molecular biological experiments and computational analyses are required to confirm the function of the identified genes associated with EOC.
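The abstract above does not state which statistical test underlies its GO and KEGG enrichment. As a minimal sketch, assuming the widely used over-representation approach based on the hypergeometric distribution, the p value for one term or pathway could be computed as follows. The function name `enrichment_p_value` and the counts in the example are hypothetical and are not taken from the study.

```python
from math import comb


def enrichment_p_value(universe, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe:  number of genes in the background set
    annotated: background genes annotated to the term or pathway
    selected:  number of differentially expressed genes
    overlap:   selected genes that carry the annotation
    """
    upper = min(annotated, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Illustrative counts only: 10 background genes, 4 in the pathway,
# 3 DEGs of which 2 fall in the pathway.
p = enrichment_p_value(universe=10, annotated=4, selected=3, overlap=2)
```

In practice such p values are computed for many terms at once and then corrected for multiple testing.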
Affiliation(s)
- A Siavoshi
- Department of Animal Sciences, Ramin University of Agriculture and Natural Resources, Ahvaz, Iran
- M Taghizadeh
- Department of Medical Genetic, Tarbiat Modares University, Tehran, Iran
- E Dookhe
- Department of Biology, Research and Science Branch, Islamic Azad University, Tehran, Iran
- M Piran
- Department of Medical Biotechnology, Drug Design and Bioinformatics Unit, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran

42
Angelescu R, Dobrescu R. MIDGET: Detecting differential gene expression on microarray data. Comput Methods Programs Biomed 2021; 211:106418. [PMID: 34555591 DOI: 10.1016/j.cmpb.2021.106418] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Accepted: 09/11/2021] [Indexed: 06/13/2023]
Abstract
Background and Objective: Detecting differentially expressed genes is an important step in genome-wide analysis and expression profiling. A wide array of algorithms based on statistical approaches is used in today's research. Even though the current algorithms work, they sometimes mispredict. There is no framework available for measuring the quality of current algorithms, and new machine learning methods (such as gradient boosting and deep neural networks) have not been used to solve this problem. The Gene-Bench open source Python package addresses these issues by providing an evaluation and data handling system for algorithms that detect differentially expressed genes on microarray data. We also provide MIDGET, a new group of algorithms based on state-of-the-art machine learning approaches.

Methods: The Gene-Bench package provides data collected from real experiments, consisting of 73 transcription-factor perturbation experiments with validation data from ChIP-seq experiments and 129 drug perturbation experiments; synthetic data generated with our own method; and three evaluation metrics (Kolmogorov, F1 and AUC/ROC). Besides the data and metrics, Gene-Bench also contains well-known algorithms and a new method for identifying differentially expressed genes, called MIDGET (Machine learning Identification Differential Gene Expression Tool), which uses big data and machine learning methods. The two new groups of machine learning algorithms provided in our package use extreme gradient boosting and deep neural networks to achieve their results.

Results: The Gene-Bench package is highly flexible, allows fast prototyping and evaluation of new and old algorithms, and provides multiple new machine learning algorithms (called MIDGET) that perform better on all evaluation metrics than all the other tested alternatives. While everything provided in Gene-Bench is algorithm independent, the user can also use algorithms implemented in the R language even though the package is written in Python.

Conclusions: The Gene-Bench package fills a gap in evaluating and benchmarking differential gene detection algorithms. It also provides machine learning methods that perform detection with higher accuracy in all tested metrics. It is available at https://github.com/raduangelescu/GeneBench/ and can be directly installed from the Python Package Index using pip install genebench.
Affiliation(s)
- Radu Angelescu
- Department of Automatic Control and Industrial Informatics, Faculty of Automatic Control and Computer Science, University "Politehnica" of Bucharest, Splaiul Independentei nr. 313, Sector 6, Bucuresti, 060042, Romania
- Radu Dobrescu
- Department of Automatic Control and Industrial Informatics, Faculty of Automatic Control and Computer Science, University "Politehnica" of Bucharest, Splaiul Independentei nr. 313, Sector 6, Bucuresti, 060042, Romania

43
Wu J, Fang Z, Liu T, Hu W, Wu Y, Li S. Maximizing the Utility of Transcriptomics Data in Inflammatory Skin Diseases. Front Immunol 2021; 12:761890. [PMID: 34777377 PMCID: PMC8586455 DOI: 10.3389/fimmu.2021.761890] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/20/2021] [Accepted: 10/15/2021] [Indexed: 12/13/2022] Open
Abstract
Inflammatory skin diseases are induced by disorders of the host defense system of the skin, which is composed of a barrier, innate and acquired immunity, and the cutaneous microbiome. These disorders are characterized by recurrent cutaneous lesions and intense itch, which seriously affect the quality of life of people across all ages and ethnicities. To elucidate the molecular factors underlying typical inflammatory skin diseases (such as psoriasis and atopic dermatitis), transcriptomic profiling assays have been widely performed. Additionally, single-cell RNA sequencing (scRNA-seq) as well as spatial transcriptomic profiling have revealed multiple potential translational targets and offered guidance for improving diagnosis and treatment strategies for inflammatory skin diseases. High-throughput transcriptomics data have shown unprecedented power to disclose the complex pathophysiology of inflammatory skin diseases. Here, we summarize discoveries from transcriptomics data and discuss how to make the most of these data to propel the development of diagnostic biomarkers and therapeutic targets in inflammatory skin diseases.
Affiliation(s)
- Jingni Wu
- Precision Research Center for Refractory Diseases, Institute for Clinical Research, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Zhixiao Fang
- Precision Research Center for Refractory Diseases, Institute for Clinical Research, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Teng Liu
- Precision Research Center for Refractory Diseases, Institute for Clinical Research, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Wei Hu
- Precision Research Center for Refractory Diseases, Institute for Clinical Research, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yangjun Wu
- Department of Gynecologic Oncology, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China
- Shengli Li
- Precision Research Center for Refractory Diseases, Institute for Clinical Research, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

44
Huang G, Zhang H, Qu Y, Huang K, Gong X, Wei J, Du H. ARMT: An automatic RNA-seq data mining tool based on comprehensive and integrative analysis in cancer research. Comput Struct Biotechnol J 2021; 19:4426-4434. [PMID: 34471489 PMCID: PMC8379379 DOI: 10.1016/j.csbj.2021.08.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2021] [Revised: 07/19/2021] [Accepted: 08/06/2021] [Indexed: 11/02/2022] Open
Abstract
The comprehensive and integrative analysis of RNA-seq data, across different molecular layers and diverse samples, holds promise for addressing the full-scale complexity of biological systems. Recent advances in gene set variation analysis (GSVA) provide exciting opportunities for revealing the specific biological processes of cancer samples. However, a tool is still urgently needed that combines GSVA with analyses of different molecular characteristics and the prognostic characteristics of cancer patients to reveal the biological processes of disease comprehensively. Here, we develop ARMT, an automatic tool for RNA-seq data analysis. ARMT is an efficient, integrative tool with a user-friendly interface for comprehensively analyzing the molecular characteristics of single genes and gene sets based on transcriptomic and genomic data; it bridges information between genes and pathways to further accelerate scientific findings. ARMT can be installed easily from https://github.com/Dulab2020/ARMT.
Affiliation(s)
- Guanda Huang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Haibo Zhang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Yimo Qu
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Kaitang Huang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Xiaocheng Gong
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Jinfen Wei
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Hongli Du
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
45
Cui W, Xue H, Geng Y, Zhang J, Liang Y, Tian X, Wang Q. Effect of high variation in transcript expression on identifying differentially expressed genes in RNA-seq analysis. Ann Hum Genet 2021; 85:235-244. [PMID: 34341986 DOI: 10.1111/ahg.12441] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/07/2021] [Revised: 07/04/2021] [Accepted: 07/15/2021] [Indexed: 12/13/2022]
Abstract
Great efforts have been made on the algorithms that deal with RNA-seq data to enhance the accuracy and efficiency of differential expression (DE) analysis. However, no consensus has been reached on the proper threshold values of fold change and adjusted p-value for filtering differentially expressed genes (DEGs). It is generally believed that the more stringent the filtering threshold, the more reliable the result of a DE analysis. Nevertheless, by analyzing the impact of both adjusted p-value and fold change thresholds on DE analyses, with RNA-seq data obtained for three different cancer types from the Cancer Genome Atlas (TCGA) database, we found that, for a given sample size, the reproducibility of DE results became poorer when more stringent thresholds were applied. No matter which threshold level was applied, the overlap rates of DEGs were generally lower for small sample sizes than for large sample sizes. The raw read count analysis demonstrated that the transcript expression of the same gene in different samples, whether in tumor groups or in normal groups, showed high variations, which resulted in a drastic fluctuation in fold change values and adjusted p-values when different sets of samples were used. Overall, more stringent thresholds did not yield more reliable DEGs due to high variations in transcript expression; the reliability of DEGs obtained with small sample sizes was more susceptible to these variations. Therefore, less stringent thresholds are recommended for screening DEGs. Moreover, large sample sizes should be considered in RNA-seq experimental designs to reduce the interfering effect of variations in transcript expression on DEG identification.
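As a rough illustration of the screening step this abstract discusses, the sketch below filters genes by fold-change and adjusted p-value cutoffs and then compares the DEG sets obtained from two sample subsets. The function names, the toy values, and the use of a Jaccard index as the "overlap rate" are assumptions made for illustration only; the paper's own definition and data are not reproduced here.

```python
from typing import Dict, Set, Tuple

# gene -> (log2 fold change, adjusted p-value); a hypothetical input layout
GeneStats = Dict[str, Tuple[float, float]]


def filter_degs(stats: GeneStats, lfc_cutoff: float = 1.0,
                padj_cutoff: float = 0.05) -> Set[str]:
    """Keep genes whose |log2FC| and adjusted p-value pass both cutoffs."""
    return {
        gene
        for gene, (lfc, padj) in stats.items()
        if abs(lfc) >= lfc_cutoff and padj < padj_cutoff
    }


def overlap_rate(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two DEG sets (an assumed stand-in for 'overlap rate')."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up results for the same four genes from two different sample subsets.
run_a: GeneStats = {"G1": (2.5, 0.001), "G2": (1.2, 0.03),
                    "G3": (0.4, 0.2), "G4": (-1.6, 0.04)}
run_b: GeneStats = {"G1": (1.8, 0.002), "G2": (0.8, 0.06),
                    "G3": (0.5, 0.3), "G4": (-1.1, 0.045)}

lenient = overlap_rate(filter_degs(run_a), filter_degs(run_b))
strict = overlap_rate(filter_degs(run_a, 2.0, 0.01),
                      filter_degs(run_b, 2.0, 0.01))
print(f"lenient overlap: {lenient:.2f}, strict overlap: {strict:.2f}")
```

The toy numbers are chosen only to show how the comparison is set up; they are not evidence for or against any particular threshold.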
Affiliation(s)
- Weitong Cui
- Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, P. R. China
- Huaru Xue
- Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, P. R. China
- Yifan Geng
- Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, P. R. China; Xuzhou Medical University, Xuzhou, P. R. China
- Jing Zhang
- Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, P. R. China
- Yajun Liang
- Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, P. R. China
- Xuewen Tian
- Shandong Sport University, Jinan, P. R. China
- Qinglu Wang
- Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, P. R. China; Shandong Sport University, Jinan, P. R. China
46
Hildebrand KM, Singla AK, McNeil R, Marritt KL, Hildebrand KN, Zemp F, Rajwani J, Itani D, Bose P, Mahoney DJ, Jirik FR, Monument MJ. The KrasG12D;Trp53fl/fl murine model of undifferentiated pleomorphic sarcoma is macrophage dense, lymphocyte poor, and resistant to immune checkpoint blockade. PLoS One 2021; 16:e0253864. [PMID: 34242269 PMCID: PMC8270133 DOI: 10.1371/journal.pone.0253864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2021] [Accepted: 06/15/2021] [Indexed: 11/19/2022] Open
Abstract
Sarcomas are rare, difficult to treat, mesenchymal lineage tumours that affect children and adults. Immunologically-based therapies have improved outcomes for numerous adult cancers, however, these therapeutic strategies have been minimally effective in sarcoma so far. Clinically relevant, immunologically-competent, and transplantable pre-clinical sarcoma models are essential to advance sarcoma immunology research. Herein we show that Cre-mediated activation of KrasG12D, and deletion of Trp53, in the hindlimb muscles of C57Bl/6 mice results in the highly penetrant, rapid onset undifferentiated pleomorphic sarcomas (UPS), one of the most common human sarcoma subtypes. Cell lines derived from spontaneous UPS tumours can be reproducibly transplanted into the hindlimbs or lungs of naïve, immune competent syngeneic mice. Immunological characterization of both spontaneous and transplanted UPS tumours demonstrates an immunologically-‘quiescent’ microenvironment, characterized by a paucity of lymphocytes, limited spontaneous adaptive immune pathways, and dense macrophage infiltrates. Macrophages are the dominant immune population in both spontaneous and transplanted UPS tumours, although compared to spontaneous tumours, transplanted tumours demonstrate increased spontaneous lymphocytic infiltrates. The growth of transplanted UPS tumours is unaffected by host lymphocyte deficiency, and despite strong expression of PD-1 on tumour infiltrating lymphocytes, tumours are resistant to immunological checkpoint blockade. This spontaneous and transplantable immune competent UPS model will be an important experimental tool in the pre-clinical development and evaluation of novel immunotherapeutic approaches for immunologically cold soft tissue sarcomas.
Affiliation(s)
- Karys M. Hildebrand
- Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Arvind K. Singla
- Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Reid McNeil
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Microbiology, Immunology and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Kayla L. Marritt
- Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Kurt N. Hildebrand
- Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Franz Zemp
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Microbiology, Immunology and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
- Jahanara Rajwani
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Microbiology, Immunology and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
- Doha Itani
- Department of Pathology and Laboratory Medicine, Medical College of Wisconsin, Milwaukee, WI, United States of America
- Pinaki Bose
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Microbiology, Immunology and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Douglas J. Mahoney
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Department of Microbiology, Immunology and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
- Frank R. Jirik
- McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Michael J. Monument
- Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
47
Statistical Modeling of High Dimensional Counts. Methods in Molecular Biology (Clifton, N.J.) 2021; 2284:97-134. [PMID: 33835440 DOI: 10.1007/978-1-0716-1307-8_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Statistical modeling of count data from RNA sequencing (RNA-seq) experiments is important for proper interpretation of results. Here I will describe how count data can be modeled using count distributions, or alternatively analyzed using nonparametric methods. I will focus on basic routines for performing data input, scaling/normalization, visualization, and statistical testing to determine sets of features where the counts reflect differences in gene expression across samples. Finally, I discuss limitations and possible extensions to the models presented here.
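To make the scaling/normalization step mentioned in this abstract concrete, here is a minimal sketch of counts-per-million (CPM) scaling, one common way to put raw counts from libraries of different depth on a comparable scale. CPM is an illustrative choice and is not necessarily the method presented in the chapter; the function name, the nested-dictionary input layout, and the toy counts are hypothetical.

```python
from typing import Dict

# sample -> gene -> raw read count; a hypothetical input layout
CountTable = Dict[str, Dict[str, int]]


def counts_per_million(counts: CountTable) -> Dict[str, Dict[str, float]]:
    """Scale each sample's counts by its library size (total reads)."""
    scaled: Dict[str, Dict[str, float]] = {}
    for sample, gene_counts in counts.items():
        library_size = sum(gene_counts.values())
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        scaled[sample] = {
            gene: count * 1e6 / library_size
            for gene, count in gene_counts.items()
        }
    return scaled


# Toy data: s2 is sequenced ten times deeper than s1 but has the same
# relative composition, so both samples get identical CPM values.
toy: CountTable = {
    "s1": {"geneA": 10, "geneB": 30},
    "s2": {"geneA": 100, "geneB": 300},
}
cpm = counts_per_million(toy)
print(cpm["s1"]["geneA"], cpm["s2"]["geneA"])
```

Library-size scaling like this only corrects for sequencing depth; composition effects between samples call for other normalization approaches.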
48
Saddozai UAK, Wang F, Akbar MU, Zhang L, An Y, Zhu W, Xie L, Li Y, Ji X, Guo X. Identification of Clinical Relevant Molecular Subtypes of Pheochromocytoma. Front Endocrinol (Lausanne) 2021; 12:605797. [PMID: 34234737 PMCID: PMC8256389 DOI: 10.3389/fendo.2021.605797] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/13/2020] [Accepted: 05/10/2021] [Indexed: 12/30/2022] Open
Abstract
Pheochromocytoma (PCC) is a rare neuroendocrine tumor of the adrenal gland with a high rate of mortality if diagnosed at a late stage. Common symptoms of pheochromocytoma include headache, anxiety, palpitation, and diaphoresis. Different treatments are under observation for PCC but there is still no effective treatment option. Recently, the gene expression profiling of various tumors has provided new subtype-specific options for targeted therapies. In this study, using data sets from TCGA and the GSE19422 cohorts, we identified two distinct PCC subtypes with distinct gene expression patterns. Genes enriched in Subtype I PCCs were involved in the dopaminergic synapse, nicotine addiction, and long-term depression pathways, while genes enriched in subtype II PCCs were involved in protein digestion and absorption, vascular smooth muscle contraction, and ECM receptor interaction pathways. We further identified subtype specific genes such as ALK, IGF1R, RET, and RSPO2 for subtype I and EGFR, ESR1, and SMO for subtype II, the overexpression of which led to cell invasion and tumorigenesis. These genes identified in the present research may serve as potential subtype-specific therapeutic targets to understand the underlying mechanisms of tumorigenesis. Our findings may further guide towards the development of targeted therapies and potential molecular biomarkers against PCC.
Affiliation(s)
- Umair Ali Khan Saddozai
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics Center, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Fengling Wang
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics Center, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Muhammad Usman Akbar
- Gomal Center of Biochemistry and Biotechnology, Gomal University, Dera Ismail Khan, Pakistan
- Lu Zhang
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics Center, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Yang An
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics Center, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Wan Zhu
- Department of Anesthesia, Stanford University, Stanford, CA, United States
- Longxiang Xie
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics Center, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Yongqiang Li
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics Center, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Xinying Ji
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics Center, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Xiangqian Guo
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics Center, School of Basic Medical Sciences, Henan University, Kaifeng, China
49
Moll JM, Myers PN, Zhang C, Eriksen C, Wolf J, Appelberg KS, Lindberg G, Bahl MI, Zhao H, Pan-Hammarström Q, Cai K, Jia H, Borte S, Nielsen HB, Kristiansen K, Brix S, Hammarström L. Gut Microbiota Perturbation in IgA Deficiency Is Influenced by IgA-Autoantibody Status. Gastroenterology 2021; 160:2423-2434.e5. [PMID: 33662387 DOI: 10.1053/j.gastro.2021.02.053] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 05/26/2020] [Revised: 02/01/2021] [Accepted: 02/22/2021] [Indexed: 12/22/2022]
Abstract
BACKGROUND & AIMS: IgA exerts its primary function at mucosal surfaces, where it binds microbial antigens to regulate bacterial growth and epithelial attachment. One third of individuals with IgA deficiency (IgAD) suffers from recurrent mucosal infections, possibly related to an altered microbiota. We aimed to delineate the impact of IgAD and the IgA-autoantibody status on the composition and functional capacity of the gut microbiota. METHODS: We performed a paired, lifestyle-balanced analysis of the effect of IgA on the gut microbiota composition and functionality based on fecal samples from individuals with IgAD and IgA-sufficient household members (n = 100), involving quantitative shotgun metagenomics, species-centric functional annotation of gut bacteria, and strain-level analyses. We supplemented the data set with 32 individuals with IgAD and examined the influence of IgA-autoantibody status on the composition and functionality of the gut microbiota. RESULTS: The gut microbiota of individuals with IgAD exhibited decreased richness and diversity and was enriched for bacterial species encoding pathogen-related functions including multidrug and antimicrobial peptide resistance, virulence factors, and type III and VI secretion systems. These functional changes were largely attributed to Escherichia coli but were independent of E coli strain variations and most prominent in individuals with IgAD with IgA-specific autoreactive antibodies. CONCLUSIONS: The microbiota of individuals with IgAD is enriched for species holding increased proinflammatory potential, thereby potentially decreasing the resistance to gut barrier-perturbing events. This phenotype is especially pronounced in individuals with IgAD with IgA-specific autoreactive antibodies, thus warranting a screening for IgA-specific autoreactive antibodies in IgAD to identify patients with IgAD with increased risk for gastrointestinal implications.
Affiliation(s)
- Janne Marie Moll
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Pernille Neve Myers
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Carsten Eriksen
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Johannes Wolf
- ImmunoDeficiencyCenter Leipzig, Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies at the Municipal Hospital St. Georg Leipzig, Leipzig, Germany
- K Sofia Appelberg
- Division of Clinical Immunology, Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital, Huddinge, Stockholm, Sweden
- Greger Lindberg
- Department of Medicine, Karolinska Institutet and Department of Gastroenterology at Karolinska University Hospital, Huddinge, Stockholm, Sweden
- Martin Iain Bahl
- National Food Institute, Technical University of Denmark, Kongens Lyngby, Denmark
- Kaiye Cai
- BGI-Shenzhen, Shenzhen, China; Shenzhen Engineering Laboratory for Detection and Intervention of Human Intestinal Microbiome, BGI-Shenzhen, Shenzhen, China
- Huijue Jia
- BGI-Shenzhen, Shenzhen, China; Shenzhen Key Laboratory for Human Commensals and Health Research, BGI-Shenzhen, Shenzhen, China
- Stephan Borte
- ImmunoDeficiencyCenter Leipzig, Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies at the Municipal Hospital St. Georg Leipzig, Leipzig, Germany; Division of Clinical Immunology, Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital, Huddinge, Stockholm, Sweden
- Karsten Kristiansen
- BGI-Shenzhen, Shenzhen, China; Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Qingdao-Europe Advanced Institute for Life Sciences, Qingdao, Shandong, China.
- Susanne Brix
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark; Qingdao-Europe Advanced Institute for Life Sciences, Qingdao, Shandong, China.
- Lennart Hammarström
- Division of Clinical Immunology, Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital, Huddinge, Stockholm, Sweden.
50
Stupnikov A, McInerney CE, Savage KI, McIntosh SA, Emmert-Streib F, Kennedy R, Salto-Tellez M, Prise KM, McArt DG. Robustness of differential gene expression analysis of RNA-seq. Comput Struct Biotechnol J 2021; 19:3470-3481. [PMID: 34188784 PMCID: PMC8214188 DOI: 10.1016/j.csbj.2021.05.040] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 12/03/2020] [Revised: 05/25/2021] [Accepted: 05/25/2021] [Indexed: 01/05/2023] Open
Abstract
RNA-sequencing (RNA-seq) is a relatively new technology that lacks standardisation. RNA-seq can be used for Differential Gene Expression (DGE) analysis, however, no consensus exists as to which methodology ensures robust and reproducible results. Indeed, it is broadly acknowledged that DGE methods provide disparate results. Despite obstacles, RNA-seq assays are in advanced development for clinical use but further optimisation will be needed. Herein, five DGE models (DESeq2, voom + limma, edgeR, EBSeq, NOISeq) for gene-level detection were investigated for robustness to sequencing alterations using a controlled analysis of fixed count matrices. Two breast cancer datasets were analysed with full and reduced sample sizes. DGE model robustness was compared between filtering regimes and for different expression levels (high, low) using unbiased metrics. Test sensitivity estimated as relative False Discovery Rate (FDR), concordance between model outputs and comparisons of a ’population’ of slopes of relative FDRs across different library sizes, generated using linear regressions, were examined. Patterns of relative DGE model robustness proved dataset-agnostic and reliable for drawing conclusions when sample sizes were sufficiently large. Overall, the non-parametric method NOISeq was the most robust followed by edgeR, voom, EBSeq and DESeq2. Our rigorous appraisal provides information for method selection for molecular diagnostics. Metrics may prove useful towards improving the standardisation of RNA-seq for precision medicine.
Affiliation(s)
- A Stupnikov
- Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, Russian Federation; Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK
- C E McInerney
- Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK
- K I Savage
- Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK
- S A McIntosh
- Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK
- F Emmert-Streib
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
- R Kennedy
- Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK
- M Salto-Tellez
- Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK
- K M Prise
- Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK
- D G McArt
- Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK