1
Zanfardino M, Franzese M, Geraci F. DeClUt: Decluttering differentially expressed genes through clustering of their expression profiles. Computer Methods and Programs in Biomedicine 2024; 254:108258. [PMID: 38851122 DOI: 10.1016/j.cmpb.2024.108258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 04/26/2024] [Accepted: 05/29/2024] [Indexed: 06/10/2024]
Abstract
BACKGROUND AND OBJECTIVE: Differential expression analysis is one of the most popular activities in transcriptomic studies based on next-generation sequencing technologies. Differentially expressed genes (DEGs) between two conditions are ideal candidate prognostic and diagnostic biomarkers for many pathologies. As a result, several algorithms, such as DESeq2 and edgeR, have been developed to identify DEGs. Despite their widespread use, there is no consensus on which model performs best for different types of data, and many existing methods suffer from high False Discovery Rates (FDR). METHODS: We present a new algorithm, DeClUt, based on the intuition that the expression profiles of differentially expressed genes should form two reasonably compact and well-separated clusters. This, in turn, implies that the bipartition induced by the two conditions being compared should overlap with the clustering. The clustering algorithm underlying DeClUt was designed to be robust to the outliers typical of RNA-seq data. In particular, we used the average silhouette function to enforce membership assignment of samples to the most appropriate condition. RESULTS: DeClUt was tested on real RNA-seq datasets and benchmarked against four of the most widely used methods (edgeR, DESeq2, NOISeq, and SAMseq). Experiments showed higher self-consistency of results than the competitors, as well as a significantly lower False Positive Rate (FPR). Moreover, on a real prostate cancer RNA-seq dataset, DeClUt highlighted 8 DE genes, linked to neoplastic processes according to the DisGeNET database, that none of the other methods had identified. CONCLUSIONS: Our work presents a novel algorithm that builds upon basic concepts of data clustering and exhibits greater consistency and a significantly lower False Positive Rate than state-of-the-art methods. Additionally, DeClUt is able to highlight relevant differentially expressed genes not identified by other tools, helping to improve the efficacy of differential expression analyses in various biological applications.
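The intuition in the METHODS part of this abstract can be illustrated with a small sketch. It is not the DeClUt implementation, which the abstract does not describe in detail. It only computes the standard average silhouette width of one gene's expression values when the two experimental conditions are used as the cluster labels. The function name average_silhouette, the use of absolute difference as the distance, and the toy values are all assumptions made for illustration.

```python
from statistics import mean


def average_silhouette(values, labels):
    """Average silhouette width of 1-D values under a two-group labelling.

    Assumption: distance is the absolute difference between expression
    values, and there are exactly two groups (the two conditions).
    """
    scores = []
    for i, (x, lab) in enumerate(zip(values, labels)):
        # Distances to the other samples of the same condition.
        same = [abs(x - y) for j, (y, l) in enumerate(zip(values, labels))
                if l == lab and j != i]
        # Distances to the samples of the other condition.
        other = [abs(x - y) for y, l in zip(values, labels) if l != lab]
        if not same or not other:
            scores.append(0.0)  # usual convention for singleton groups
            continue
        a, b = mean(same), mean(other)
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return mean(scores)


# Hypothetical expression values for one gene across six samples.
expression = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
by_condition = [0, 0, 0, 1, 1, 1]   # labelling agrees with the two clusters
interleaved = [0, 1, 0, 1, 0, 1]    # labelling cuts across the clusters

print(average_silhouette(expression, by_condition))
print(average_silhouette(expression, interleaved))
```

A value close to 1 means the condition labels split the samples into compact, well-separated groups, which is the pattern the abstract associates with a differentially expressed gene. A value near or below 0 means the labels do not match the cluster structure.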
Affiliation(s)
- Monica Franzese
  - IRCCS Synlab SDN, Via E. Gianturco, 113, Naples, 80143, Italy
- Filippo Geraci
  - Institute for Informatics and Telematics, CNR, Via G. Moruzzi 1, Pisa, 56124, Italy
2
Paton V, Ramirez Flores RO, Gabor A, Badia-I-Mompel P, Tanevski J, Garrido-Rodriguez M, Saez-Rodriguez J. Assessing the impact of transcriptomics data analysis pipelines on downstream functional enrichment results. Nucleic Acids Res 2024; 52:8100-8111. [PMID: 38943333 DOI: 10.1093/nar/gkae552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 06/03/2024] [Accepted: 06/19/2024] [Indexed: 07/01/2024] Open
Abstract
Transcriptomics is widely used to assess the state of biological systems. There are many tools for the different steps, such as normalization, differential expression, and enrichment. While numerous studies have examined the impact of method choices on differential expression results, little attention has been paid to their effects on further downstream functional analysis, which typically provides the basis for interpretation and follow-up experiments. To address this, we introduce FLOP, a comprehensive nextflow-based workflow combining methods to perform end-to-end analyses of transcriptomics data. We illustrate FLOP on datasets ranging from end-stage heart failure patients to cancer cell lines. We discovered effects not noticeable at the gene-level, and observed that not filtering the data had the highest impact on the correlation between pipelines in the gene set space. Moreover, we performed three benchmarks to evaluate the 12 pipelines included in FLOP, and confirmed that filtering is essential in scenarios of expected moderate-to-low biological signal. Overall, our results underscore the impact of carefully evaluating the consequences of the choice of preprocessing methods on downstream enrichment analyses. We envision FLOP as a valuable tool to measure the robustness of functional analyses, ultimately leading to more reliable and conclusive biological findings.
Affiliation(s)
- Victor Paton
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
- Ricardo Omar Ramirez Flores
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
- Attila Gabor
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
- Pau Badia-I-Mompel
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
- Jovan Tanevski
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
- Martin Garrido-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Julio Saez-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg, Germany
  - European Bioinformatics Institute, European Molecular Biology Laboratory (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
3
Carels N. Assessing RNA-Seq Workflow Methodologies Using Shannon Entropy. Biology 2024; 13:482. [PMID: 39056677 PMCID: PMC11274087 DOI: 10.3390/biology13070482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 06/20/2024] [Accepted: 06/27/2024] [Indexed: 07/28/2024]
Abstract
RNA-seq faces persistent challenges due to the expanding array of data processing workflows, none of which has yet been standardized. It is imperative to determine which method most effectively preserves biological facts. Here, we used Shannon entropy as a tool for depicting the biological status of a system. We assessed the measurement of Shannon entropy by several RNA-seq workflow approaches, such as DESeq2 and edgeR, and also by combining nine normalization methods with log2 fold change on paired samples from TCGA RNA-seq datasets representing 515 patients and spanning 12 different cancer types, with 5-year overall survival rates ranging from 20% to 98%. Our analysis revealed that TPM, RLE, and TMM normalization, coupled with a threshold of log2 fold change ≥1 for identifying differentially expressed genes, yielded the best results. We propose that Shannon entropy can serve as an objective metric for optimizing RNA-seq workflows and mRNA sequencing technologies.
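The two quantities named in this abstract, Shannon entropy and a log2 fold change threshold of 1, can be sketched as follows. This is an illustration under assumptions, not the paper's workflow: the abstract does not say how the probability distribution for the entropy is built, so the sketch simply normalizes non-negative values to sum to 1. The pseudocount and all function names are hypothetical.

```python
import math


def shannon_entropy(weights):
    """H = -sum(p * log2(p)) after normalizing non-negative weights to sum to 1."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for w in weights:
        if w > 0:  # terms with p = 0 contribute nothing
            p = w / total
            entropy -= p * math.log2(p)
    return entropy


def log2_fold_change(tumor, control, pseudocount=1.0):
    """log2 ratio of paired normalized values; the pseudocount (an assumption) avoids log of zero."""
    return math.log2((tumor + pseudocount) / (control + pseudocount))


def upregulated(tumor, control, threshold=1.0):
    """Genes whose log2 fold change in the paired sample meets the threshold."""
    return [gene for gene in tumor
            if log2_fold_change(tumor[gene], control[gene]) >= threshold]


# Hypothetical normalized expression for one tumor/control pair.
tumor = {"GENE_A": 3.0, "GENE_B": 1.0}
control = {"GENE_A": 1.0, "GENE_B": 1.0}
print(upregulated(tumor, control))
print(shannon_entropy([1, 1, 1, 1]))
```

A uniform distribution over four items gives the maximum entropy of 2 bits, and a distribution concentrated on a single item gives 0.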
Affiliation(s)
- Nicolas Carels
  - Laboratory of Biological System Modeling, Center of Technological Development in Health (CDTS), Oswaldo Cruz Foundation (Fiocruz), Rio de Janeiro 21040-900, RJ, Brazil
4
Au Yeung VPW, Obrezanova O, Zhou J, Yang H, Bowen TJ, Ivanov D, Saffadi I, Carter AS, Subramanian V, Dillmann I, Hall A, Corrigan A, Viant MR, Pointon A. Computational approaches identify a transcriptomic fingerprint of drug-induced structural cardiotoxicity. Cell Biol Toxicol 2024; 40:50. [PMID: 38940987 PMCID: PMC11213733 DOI: 10.1007/s10565-024-09880-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 05/15/2024] [Indexed: 06/29/2024]
Abstract
Structural cardiotoxicity (SCT) presents a high-impact risk that is poorly tolerated in drug discovery unless significant benefit is anticipated. Therefore, we aimed to improve the mechanistic understanding of SCT. First, we combined machine learning methods with a modified calcium transient assay in human-induced pluripotent stem cell-derived cardiomyocytes to identify nine parameters that could predict SCT. Next, we applied transcriptomic profiling to human cardiac microtissues exposed to structural and non-structural cardiotoxins. Fifty-two genes expressed across the three main cell types in the heart (cardiomyocytes, endothelial cells, and fibroblasts) were prioritised in differential expression and network clustering analyses and could be linked to known mechanisms of SCT. This transcriptomic fingerprint may prove useful for generating strategies to mitigate SCT risk in early drug discovery.
Affiliation(s)
- Victoria P W Au Yeung
  - Safety Sciences, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge, UK
  - Phenomics, Data Sciences & Quantitative Biology, R&D AstraZeneca, Cambridge, UK
- Olga Obrezanova
  - Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge, UK
- Jiarui Zhou
  - School of Biosciences, University of Birmingham, Edgbaston, Birmingham, UK
- Hongbin Yang
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Tara J Bowen
  - School of Biosciences, University of Birmingham, Edgbaston, Birmingham, UK
- Delyan Ivanov
  - High-Throughput Screening, R&D, AstraZeneca, Alderley Park, UK
- Izzy Saffadi
  - Safety Sciences, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge, UK
- Alfie S Carter
  - Safety Sciences, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge, UK
- Vigneshwari Subramanian
  - Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Inken Dillmann
  - Disease Molecular Profiling, Discovery Biology, R&D AstraZeneca, Gothenburg, Sweden
- Andrew Hall
  - Safety Sciences, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge, UK
- Adam Corrigan
  - Phenomics, Data Sciences & Quantitative Biology, R&D AstraZeneca, Cambridge, UK
- Mark R Viant
  - School of Biosciences, University of Birmingham, Edgbaston, Birmingham, UK
  - Phenome Centre Birmingham, University of Birmingham, Edgbaston, Birmingham, UK
- Amy Pointon
  - Safety Sciences, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge, UK
5
Sarker B, Matiur Rahaman M, Alamin MH, Ariful Islam M, Nurul Haque Mollah M. Boosting edgeR (Robust) by dealing with missing observations and gene-specific outliers in RNA-Seq profiles and its application to explore biomarker genes for diagnosis and therapies of ovarian cancer. Genomics 2024; 116:110834. [PMID: 38527595 DOI: 10.1016/j.ygeno.2024.110834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 02/09/2024] [Accepted: 03/20/2024] [Indexed: 03/27/2024]
Abstract
edgeR (Robust) is a popular approach for identifying differentially expressed genes (DEGs) from RNA-Seq profiles. However, it performs poorly in the presence of gene-specific outliers and cannot handle missing observations. To address these issues, we propose a pre-processing approach for RNA-Seq count data that combines iLOO-based outlier detection with random forest-based missing-value imputation to boost the performance of edgeR (Robust). Results on both simulated and real RNA-Seq count data showed that the proposed edgeR (Robust) outperformed the conventional edgeR (Robust). To investigate the effectiveness of the identified DEGs for the diagnosis and therapy of ovarian cancer (OC), we selected the 12 top-ranked DEGs (IL6, XCL1, CXCL8, C1QC, C1QB, SNAI2, TYROBP, COL1A2, SNAP25, NTS, CXCL2, and AGT) and suggested the 10 top-ranked candidate drug molecules, guided by the hub DEGs, for the treatment of OC. Hence, our proposed procedure might be an effective computational tool for exploring potential DEGs from RNA-Seq profiles for the diagnosis and therapy of any disease.
Affiliation(s)
- Bandhan Sarker
  - Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Md Matiur Rahaman
  - Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh; Zhejiang University-University of Edinburgh Institute, Zhejiang University School of Medicine, Haining 314400, China
- Muhammad Habibulla Alamin
  - Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Md Ariful Islam
  - Bioinformatics Laboratory (Dry), Department of Statistics, University of Rajshahi, Rajshahi 6205, Bangladesh
- Md Nurul Haque Mollah
  - Bioinformatics Laboratory (Dry), Department of Statistics, University of Rajshahi, Rajshahi 6205, Bangladesh
6
Fisher JL, Wilk EJ, Oza VH, Gary SE, Howton TC, Flanary VL, Clark AD, Hjelmeland AB, Lasseigne BN. Signature reversion of three disease-associated gene signatures prioritizes cancer drug repurposing candidates. FEBS Open Bio 2024; 14:803-830. [PMID: 38531616 PMCID: PMC11073506 DOI: 10.1002/2211-5463.13796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2024] [Revised: 03/13/2024] [Accepted: 03/14/2024] [Indexed: 03/28/2024] Open
Abstract
Drug repurposing is promising because approving a drug for a new indication requires fewer resources than approving a new drug. Signature reversion detects drug perturbations most inversely related to the disease-associated gene signature to identify drugs that may reverse that signature. We assessed the performance and biological relevance of three approaches for constructing disease-associated gene signatures (i.e., limma, DESeq2, and MultiPLIER) and prioritized the resulting drug repurposing candidates for four low-survival human cancers. Our results were enriched for candidates that had been used in clinical trials or performed well in the PRISM drug screen. Additionally, we found that pamidronate and nimodipine, drugs predicted to be efficacious against the brain tumor glioblastoma (GBM), inhibited the growth of a GBM cell line and cells isolated from a patient-derived xenograft (PDX). Our results demonstrate that by applying multiple disease-associated gene signature methods, we prioritized several drug repurposing candidates for low-survival cancers.
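The signature reversion idea in this abstract, finding drug perturbations most inversely related to a disease signature, can be sketched in a few lines. The abstract does not state which scoring method the authors used, so the Pearson correlation below is a stand-in chosen for simplicity. The gene names, drug names, and function names are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no direction
    return cov / sqrt(var_x * var_y)


def rank_reversing_drugs(disease_signature, drug_profiles):
    """Sort drugs so the most inversely correlated profile comes first."""
    ranked = []
    for drug, profile in drug_profiles.items():
        shared = sorted(set(disease_signature) & set(profile))
        if len(shared) < 2:
            continue  # too few shared genes to correlate
        r = pearson([disease_signature[g] for g in shared],
                    [profile[g] for g in shared])
        ranked.append((drug, r))
    return sorted(ranked, key=lambda item: item[1])


# Hypothetical log fold changes: disease vs. normal, and drug vs. vehicle.
disease = {"G1": 2.0, "G2": -1.5, "G3": 0.5}
drugs = {
    "drug_A": {"G1": -2.0, "G2": 1.5, "G3": -0.5},  # mirrors the disease signature
    "drug_B": {"G1": 1.0, "G2": -1.0, "G3": 0.2},   # mimics the disease signature
}
print(rank_reversing_drugs(disease, drugs))
```

In this toy case drug_A ranks first because its profile is the mirror image of the disease signature, which is the property a reversion candidate is expected to have.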
Affiliation(s)
- Jennifer L. Fisher
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
- Elizabeth J. Wilk
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
- Vishal H. Oza
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
- Sam E. Gary
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
- Timothy C. Howton
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
- Victoria L. Flanary
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
- Amanda D. Clark
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
- Anita B. Hjelmeland
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
- Brittany N. Lasseigne
  - Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, AL, USA
7
Brooks TG, Lahens NF, Mrčela A, Sarantopoulou D, Nayak S, Naik A, Sengupta S, Choi PS, Grant GR. BEERS2: RNA-Seq simulation through high fidelity in silico modeling. Brief Bioinform 2024; 25:bbae164. [PMID: 38605641 PMCID: PMC11009461 DOI: 10.1093/bib/bbae164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 01/26/2024] [Accepted: 03/26/2024] [Indexed: 04/13/2024] Open
Abstract
Simulation of RNA-seq reads is critical in the assessment, comparison, benchmarking and development of bioinformatics tools. Yet the field of RNA-seq simulators has progressed little in the last decade. To address this need we have developed BEERS2, which combines a flexible and highly configurable design with detailed simulation of the entire library preparation and sequencing pipeline. BEERS2 takes input transcripts (typically full-length messenger RNA transcripts with polyA tails) from either customizable input or from CAMPAREE simulated RNA samples. It produces realistic reads of these transcripts in FASTQ, SAM or BAM format, with the SAM and BAM formats containing the true alignment to the reference genome. It also produces true transcript-level quantification values. BEERS2 is designed to include the effects of polyA selection and RiboZero for ribosomal depletion, hexamer priming sequence biases, GC-content biases in polymerase chain reaction (PCR) amplification, barcode read errors and errors during PCR amplification. These characteristics combine to make BEERS2 the most complete simulation of RNA-seq to date. Finally, we demonstrate the use of BEERS2 by measuring the effect of several settings on the popular Salmon pseudoalignment algorithm.
Affiliation(s)
- Thomas G Brooks
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Nicholas F Lahens
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Antonijo Mrčela
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Dimitra Sarantopoulou
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
  - Current address: National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
- Soumyashant Nayak
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
  - Current address: Statistics and Mathematics Unit, Indian Statistical Institute, Bengaluru, Karnataka, India
- Amruta Naik
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
  - Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Shaon Sengupta
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
  - Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Peter S Choi
  - Division of Cancer Pathobiology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pathology & Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Gregory R Grant
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
8
El‐Ayoubi A, Arakelyan A, Klawitter M, Merk L, Hakobyan S, Gonzalez‐Menendez I, Quintanilla Fend L, Holm PS, Mikulits W, Schwab M, Danielyan L, Naumann U. Development of an optimized, non-stem cell line for intranasal delivery of therapeutic cargo to the central nervous system. Mol Oncol 2024; 18:528-546. [PMID: 38115217 PMCID: PMC10920084 DOI: 10.1002/1878-0261.13569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 10/23/2023] [Accepted: 12/13/2023] [Indexed: 12/21/2023] Open
Abstract
Neural stem cells (NSCs) are considered to be valuable candidates for delivering a variety of anti-cancer agents, including oncolytic viruses, to brain tumors. However, owing to the previously reported tumorigenic potential of NSC cell lines after intranasal administration (INA), here we identified the human hepatic stellate cell line LX-2 as a cell type that resists replication of oncolytic adenoviruses (OAVs), used as a therapeutic cargo, for longer and that is non-tumorigenic after INA. Our data show that LX-2 cells can withstand replication of, and oncolysis by, the OAV XVir-N-31 for longer than NSCs. By selecting the highly migratory cell population out of LX-2, an offspring cell line with a higher and more stable capability to migrate was generated. Additionally, as a safety backup, we applied genomic herpes simplex virus thymidine kinase (HSV-TK) integration into LX-2, leading to high vulnerability to ganciclovir (GCV). Histopathological analyses confirmed the absence of neoplasia in the respiratory tracts and brains of immuno-compromised mice 3 months after INA of LX-2 cells. Our data suggest that LX-2 is a novel, robust, and safe cell line for delivering anti-cancer and other therapeutic agents to the brain.
Affiliation(s)
- Ali El‐Ayoubi
  - Molecular Neurooncology, Department of Vascular Neurology, Hertie Institute for Clinical Brain Research and Center Neurology, University Hospital of Tübingen, Germany
- Arsen Arakelyan
  - Research Group of Bioinformatics, Institute of Molecular Biology NAS RA, Yerevan, Armenia
- Moritz Klawitter
  - Molecular Neurooncology, Department of Vascular Neurology, Hertie Institute for Clinical Brain Research and Center Neurology, University Hospital of Tübingen, Germany
- Luisa Merk
  - Molecular Neurooncology, Department of Vascular Neurology, Hertie Institute for Clinical Brain Research and Center Neurology, University Hospital of Tübingen, Germany
- Siras Hakobyan
  - Research Group of Bioinformatics, Institute of Molecular Biology NAS RA, Yerevan, Armenia
  - Armenian Institute of Bioinformatics, Yerevan, Armenia
- Irene Gonzalez‐Menendez
  - Institute for Pathology, Department of General and Molecular Pathology, University Hospital Tübingen, Germany
  - Cluster of Excellence iFIT (EXC 2180) "Image‐Guided and Functionally Instructed Tumor Therapies", Eberhard Karls University of Tübingen, Germany
- Leticia Quintanilla Fend
  - Institute for Pathology, Department of General and Molecular Pathology, University Hospital Tübingen, Germany
  - Cluster of Excellence iFIT (EXC 2180) "Image‐Guided and Functionally Instructed Tumor Therapies", Eberhard Karls University of Tübingen, Germany
- Per Sonne Holm
  - Department of Urology, Klinikum rechts der Isar, Technical University of Munich, Germany
  - Department of Oral and Maxillofacial Surgery, Medical University Innsbruck, Austria
  - XVir Therapeutics GmbH, Munich, Germany
- Wolfgang Mikulits
  - Center for Cancer Research, Comprehensive Cancer Center, Medical University of Vienna, Austria
- Matthias Schwab
  - Cluster of Excellence iFIT (EXC 2180) "Image‐Guided and Functionally Instructed Tumor Therapies", Eberhard Karls University of Tübingen, Germany
  - Dr. Margarete Fischer‐Bosch Institute of Clinical Pharmacology, Stuttgart, Germany
  - Department of Pharmacy and Biochemistry, University of Tübingen, Germany
  - Department of Clinical Pharmacology, University Hospital Tübingen, Germany
  - Neuroscience Laboratory and Departments of Biochemistry and Clinical Pharmacology, Yerevan State Medical University, Armenia
- Lusine Danielyan
  - Department of Pharmacy and Biochemistry, University of Tübingen, Germany
  - Department of Clinical Pharmacology, University Hospital Tübingen, Germany
  - Neuroscience Laboratory and Departments of Biochemistry and Clinical Pharmacology, Yerevan State Medical University, Armenia
- Ulrike Naumann
  - Molecular Neurooncology, Department of Vascular Neurology, Hertie Institute for Clinical Brain Research and Center Neurology, University Hospital of Tübingen, Germany
  - Gene and RNA Therapy Center (GRTC), Faculty of Medicine University Tübingen, Germany
9
Warden CD, Wu X. Critical Differential Expression Assessment for Individual Bulk RNA-Seq Projects. bioRxiv 2024:2024.02.10.579728. [PMID: 38405814 PMCID: PMC10888899 DOI: 10.1101/2024.02.10.579728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Finding the right balance of quality and quantity can be important, and it is essential that project quality does not drop below the level where important main conclusions are missed or misstated. We use knock-out and over-expression studies as a simplification to test recovery of a known causal gene in RNA-Seq cell line experiments. When single-end RNA-Seq reads are aligned with STAR and quantified with htseq-count, we found potential value in testing the use of the Generalized Linear Model (GLM) implementation of edgeR with robust dispersion estimation more frequently for either single-variate or multi-variate 2-group comparisons (with the possibility of defining criteria less stringent than |fold-change| > 1.5 and FDR < 0.05). When considering a limited number of patient sample comparisons with larger sample size, there might be some decreased variability between methods (except for DESeq1). However, at the same time, the ranking of the gene identified using immunohistochemistry (for ER/PR/HER2 in breast cancer samples from The Cancer Genome Atlas) showed a possible shift in performance compared to the cell line comparisons, potentially highlighting utility for standard statistical tests and/or limma-based analysis with larger sample sizes. If this continues to be true in additional studies and comparisons, then that could be consistent with the possibility that it may be important to allocate time for potential methods troubleshooting for genomics projects. Analysis of public data presented in this study does not consider all experimental designs, and presentation of downstream analysis is limited. So, any estimate from this simplification would be an underestimation of the true need for some methods testing for every project. Additionally, this set of independent cell line experiments has a limitation in being able to determine the frequency of missing a highly important gene if the problem is rare (such as 10% or lower). For example, if there was an assumption that only one method can be tested for "initial" analysis, then it is not completely clear to what extent edgeR-robust might perform better than DESeq2 in the cell line experiments. Importantly, we do not wish to cause undue concern, and we believe that it should often be possible to define a differential expression workflow that is suitable for some purposes for many samples. Nevertheless, at the same time, we provide a variety of measures that we believe emphasize the need to critically assess every individual project and maximize confidence in published results.
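The selection criteria quoted in this abstract (|fold-change| > 1.5 and FDR < 0.05) can be made concrete with a short sketch. It rests on two assumptions not stated in the abstract: FDR values are obtained with the Benjamini-Hochberg adjustment, and fold change is supplied on the log2 scale, so the cutoff becomes |log2FC| > log2(1.5). The function names and the example values are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(results, fc_cutoff=1.5, fdr_cutoff=0.05):
    """results: list of (gene, log2_fold_change, pvalue) tuples."""
    fdr = benjamini_hochberg([p for _, _, p in results])
    log2_cutoff = math.log2(fc_cutoff)
    return [gene for (gene, lfc, _), q in zip(results, fdr)
            if abs(lfc) > log2_cutoff and q < fdr_cutoff]


# Hypothetical per-gene test results.
results = [
    ("GENE_A", 1.0, 0.01),
    ("GENE_B", 2.0, 0.04),
    ("GENE_C", -1.2, 0.03),
    ("GENE_D", 0.1, 0.20),
]
print(benjamini_hochberg([p for _, _, p in results]))
print(select_genes(results))
```

Loosening either cutoff, as the abstract suggests may sometimes be appropriate, only requires passing different fc_cutoff or fdr_cutoff values.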
Affiliation(s)
- Charles D Warden
  - Integrative Genomics Core, Department of Molecular and Cellular Biology, City of Hope National Medical Center, Duarte, CA
- Xiwei Wu
  - Integrative Genomics Core, Department of Molecular and Cellular Biology, City of Hope National Medical Center, Duarte, CA
10
Hackert NS, Radtke FA, Exner T, Lorenz HM, Müller-Tidow C, Nigrovic PA, Wabnitz G, Grieshaber-Bouyer R. Human and mouse neutrophils share core transcriptional programs in both homeostatic and inflamed contexts. Nat Commun 2023; 14:8133. [PMID: 38065997 PMCID: PMC10709367 DOI: 10.1038/s41467-023-43573-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/14/2022] [Accepted: 11/14/2023] [Indexed: 12/18/2023] Open
Abstract
Neutrophils are frequently studied in mouse models, but the extent to which findings translate to humans remains poorly defined. In an integrative analysis of 11 mouse and 13 human datasets, we find a strong correlation of neutrophil gene expression across species. In inflammation, neutrophils display substantial transcriptional diversity but share a core inflammation program. This program includes genes encoding IL-1 family members, CD14, IL-4R, CD69, and PD-L1. Chromatin accessibility of core inflammation genes increases in blood compared to bone marrow and further in tissue. Transcription factor enrichment analysis implicates members of the NF-κB family and AP-1 complex as important drivers, and HoxB8 neutrophils with JunB knockout show a reduced expression of core inflammation genes in resting and activated cells. In independent single-cell validation data, neutrophil activation by type I or type II interferon, G-CSF, and E. coli leads to upregulation in core inflammation genes. In COVID-19 patients, higher expression of core inflammation genes in neutrophils is associated with more severe disease. In vitro treatment with GM-CSF, LPS, and type II interferon induces surface protein upregulation of core inflammation members. Together, we demonstrate transcriptional conservation in neutrophils in homeostasis and identify a core inflammation program shared across heterogeneous inflammatory conditions.
Affiliation(s)
- Nicolaj S Hackert
- Division of Rheumatology, Department of Medicine V, Heidelberg University Hospital, Heidelberg, Germany
- Institute for Immunology, Heidelberg University Hospital, Heidelberg, Germany
- Division of Immunology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Felix A Radtke
- Division of Rheumatology, Department of Medicine V, Heidelberg University Hospital, Heidelberg, Germany
- Institute for Immunology, Heidelberg University Hospital, Heidelberg, Germany
- Division of Rheumatology, Inflammation and Immunity, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
- Oxford Centre for Haematology Unit, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
| | - Tarik Exner
- Division of Rheumatology, Department of Medicine V, Heidelberg University Hospital, Heidelberg, Germany
- Institute for Immunology, Heidelberg University Hospital, Heidelberg, Germany
| | - Hanns-Martin Lorenz
- Division of Rheumatology, Department of Medicine V, Heidelberg University Hospital, Heidelberg, Germany
| | - Carsten Müller-Tidow
- Department of Medicine V, Hematology, Oncology and Rheumatology, Heidelberg University Hospital, Heidelberg, Germany
- Molecular Medicine Partnership Unit, European Molecular Biology Laboratory (EMBL), University of Heidelberg, Heidelberg, Germany
| | - Peter A Nigrovic
- Division of Immunology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation and Immunity, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Guido Wabnitz
- Institute for Immunology, Heidelberg University Hospital, Heidelberg, Germany
| | - Ricardo Grieshaber-Bouyer
- Division of Rheumatology, Department of Medicine V, Heidelberg University Hospital, Heidelberg, Germany.
- Institute for Immunology, Heidelberg University Hospital, Heidelberg, Germany.
- Molecular Medicine Partnership Unit, European Molecular Biology Laboratory (EMBL), University of Heidelberg, Heidelberg, Germany.
- Deutsches Zentrum für Immuntherapie (DZI), Friedrich Alexander Universität Erlangen-Nürnberg and Universitätsklinikum Erlangen, Erlangen, Germany.
- Department of Internal Medicine 3 - Rheumatology and Immunology, Friedrich Alexander Universität Erlangen-Nürnberg and Universitätsklinikum Erlangen, Erlangen, Germany.
| |
|
11
|
Chen WC, Choudhury A, Youngblood MW, Polley MYC, Lucas CHG, Mirchia K, Maas SLN, Suwala AK, Won M, Bayley JC, Harmanci AS, Harmanci AO, Klisch TJ, Nguyen MP, Vasudevan HN, McCortney K, Yu TJ, Bhave V, Lam TC, Pu JKS, Li LF, Leung GKK, Chan JW, Perlow HK, Palmer JD, Haberler C, Berghoff AS, Preusser M, Nicolaides TP, Mawrin C, Agnihotri S, Resnick A, Rood BR, Chew J, Young JS, Boreta L, Braunstein SE, Schulte J, Butowski N, Santagata S, Spetzler D, Bush NAO, Villanueva-Meyer JE, Chandler JP, Solomon DA, Rogers CL, Pugh SL, Mehta MP, Sneed PK, Berger MS, Horbinski CM, McDermott MW, Perry A, Bi WL, Patel AJ, Sahm F, Magill ST, Raleigh DR. Targeted gene expression profiling predicts meningioma outcomes and radiotherapy responses. Nat Med 2023; 29:3067-3076. [PMID: 37944590 PMCID: PMC11073469 DOI: 10.1038/s41591-023-02586-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 09/11/2023] [Indexed: 11/12/2023]
Abstract
Surgery is the mainstay of treatment for meningioma, the most common primary intracranial tumor, but improvements in meningioma risk stratification are needed and indications for postoperative radiotherapy are controversial. Here we develop a targeted gene expression biomarker that predicts meningioma outcomes and radiotherapy responses. Using a discovery cohort of 173 meningiomas, we developed a 34-gene expression risk score and performed clinical and analytical validation of this biomarker on independent meningiomas from 12 institutions across 3 continents (N = 1,856), including 103 meningiomas from a prospective clinical trial. The gene expression biomarker improved discrimination of outcomes compared with all other systems tested (N = 9) in the clinical validation cohort for local recurrence (5-year area under the curve (AUC) 0.81) and overall survival (5-year AUC 0.80). The increase in AUC compared with the standard of care, World Health Organization 2021 grade, was 0.11 for local recurrence (95% confidence interval 0.07 to 0.17, P < 0.001). The gene expression biomarker identified meningiomas benefiting from postoperative radiotherapy (hazard ratio 0.54, 95% confidence interval 0.37 to 0.78, P = 0.0001) and suggested postoperative management could be refined for 29.8% of patients. In sum, our results identify a targeted gene expression biomarker that improves discrimination of meningioma outcomes, including prediction of postoperative radiotherapy responses.
Affiliation(s)
- William C Chen
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA.
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA.
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA.
| | - Abrar Choudhury
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA
- Medical Scientist Training Program, University of California San Francisco, San Francisco, CA, USA
| | - Mark W Youngblood
- Department of Neurological Surgery, Northwestern University, Chicago, IL, USA
| | - Mei-Yin C Polley
- NRG Statistics and Data Management Center, NRG Oncology, Philadelphia, PA, USA
| | | | - Kanish Mirchia
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA
| | - Sybren L N Maas
- Department of Pathology, Leiden University Medical Center, Leiden, the Netherlands
- Department of Pathology, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
| | - Abigail K Suwala
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Neuropathology, University Hospital Heidelberg and CCU Neuropathology, German Consortium for Translational Cancer Research, German Cancer Research Center, Heidelberg, Germany
| | - Minhee Won
- NRG Statistics and Data Management Center, NRG Oncology, Philadelphia, PA, USA
| | - James C Bayley
- Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, USA
| | - Akdes S Harmanci
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
| | - Arif O Harmanci
- Center for Secure Artificial Intelligence for Healthcare, Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, USA
| | - Tiemo J Klisch
- Department of Molecular and Human Genetics, Baylor College of Medicine, and Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, USA
| | - Minh P Nguyen
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA
| | - Harish N Vasudevan
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Kathleen McCortney
- Department of Neurological Surgery, Northwestern University, Chicago, IL, USA
| | - Theresa J Yu
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA, USA
| | - Varun Bhave
- Department of Neurosurgery, Brigham and Women's Hospital, and Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tai-Chung Lam
- Department of Clinical Oncology, The University of Hong Kong, Pokfulam, China
| | - Jenny Kan-Suen Pu
- Division of Neurosurgery, Department of Surgery, The University of Hong Kong, Pokfulam, China
| | - Lai-Fung Li
- Division of Neurosurgery, Department of Surgery, The University of Hong Kong, Pokfulam, China
| | - Gilberto Ka-Kit Leung
- Division of Neurosurgery, Department of Surgery, The University of Hong Kong, Pokfulam, China
| | - Jason W Chan
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
| | - Haley K Perlow
- Department of Radiation Oncology, Ohio State University, Columbus, OH, USA
| | - Joshua D Palmer
- Department of Radiation Oncology, Ohio State University, Columbus, OH, USA
| | - Christine Haberler
- Division of Neuropathology and Neurochemistry, Department of Neurology, Medical University of Vienna, Vienna, Austria
| | - Anna S Berghoff
- Division of Oncology, Department of Medicine, Medical University of Vienna, Vienna, Austria
| | - Matthias Preusser
- Division of Oncology, Department of Medicine, Medical University of Vienna, Vienna, Austria
| | | | - Christian Mawrin
- Department of Neuropathology, University of Magdeburg, Magdeburg, Germany
| | - Sameer Agnihotri
- Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
| | - Adam Resnick
- Department of Neurological Surgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Brian R Rood
- Brain Tumor Institute, Children's National Hospital, Washington, DC, USA
| | - Jessica Chew
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
| | - Jacob S Young
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA
| | - Lauren Boreta
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
| | - Steve E Braunstein
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
| | - Jessica Schulte
- Neurosciences Department, University of California San Diego, La Jolla, CA, USA
| | - Nicholas Butowski
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Sandro Santagata
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | | | - Nancy Ann Oberheim Bush
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Javier E Villanueva-Meyer
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA, USA
| | - James P Chandler
- Department of Neurological Surgery, Northwestern University, Chicago, IL, USA
| | - David A Solomon
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA
| | - C Leland Rogers
- NRG Statistics and Data Management Center, NRG Oncology, Philadelphia, PA, USA
| | - Stephanie L Pugh
- NRG Statistics and Data Management Center, NRG Oncology, Philadelphia, PA, USA
| | - Minesh P Mehta
- NRG Statistics and Data Management Center, NRG Oncology, Philadelphia, PA, USA
- Miami Neuroscience Institute, Baptist Health, Miami, FL, USA
| | - Penny K Sneed
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
| | - Mitchel S Berger
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Craig M Horbinski
- Department of Neurological Surgery, Northwestern University, Chicago, IL, USA
- Department of Pathology, Northwestern University, Chicago, IL, USA
| | | | - Arie Perry
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA
| | - Wenya Linda Bi
- Department of Neurosurgery, Brigham and Women's Hospital, and Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Akash J Patel
- Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, USA
| | - Felix Sahm
- Department of Neuropathology, University Hospital Heidelberg and CCU Neuropathology, German Consortium for Translational Cancer Research, German Cancer Research Center, Heidelberg, Germany
| | - Stephen T Magill
- Department of Neurological Surgery, Northwestern University, Chicago, IL, USA.
| | - David R Raleigh
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA.
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA.
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA.
| |
|
12
|
Xie Z, Chen C, Ma’ayan A. Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis. PeerJ 2023; 11:e16351. [PMID: 37953774 PMCID: PMC10638921 DOI: 10.7717/peerj.16351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 10/04/2023] [Indexed: 11/14/2023] Open
Abstract
Many tools and algorithms are available for analyzing transcriptomics data. These include algorithms for performing sequence alignment, data normalization and imputation, clustering, identifying differentially expressed genes, and performing gene set enrichment analysis. To make the best choice about which tools to use, objective benchmarks can be developed to compare the quality of different algorithms to extract biological knowledge maximally and accurately from these data. The Dexamethasone Benchmark (Dex-Benchmark) resource aims to fill this need by providing the community with datasets and code templates for benchmarking different gene expression analysis tools and algorithms. The resource provides access to a collection of curated RNA-seq, L1000, and ChIP-seq data from dexamethasone treatment as well as genetic perturbations of its known targets. In addition, the website provides Jupyter Notebooks that use these pre-processed curated datasets to demonstrate how to benchmark the different steps in gene expression analysis. By comparing two independent data sources and data types with some expected concordance, we can assess which tools and algorithms best recover such associations. To demonstrate the usefulness of the resource for discovering novel drug targets, we applied it to optimize data processing strategies for the chemical perturbations and CRISPR single gene knockouts from the L1000 transcriptomics data from the Library of Integrated Network Cellular Signatures (LINCS) program, with a focus on understudied proteins from the Illuminating the Druggable Genome (IDG) program. Overall, the Dex-Benchmark resource can be utilized to assess the quality of transcriptomics and other related bioinformatics data analysis workflows. The resource is available from: https://maayanlab.github.io/dex-benchmark.
Affiliation(s)
- Zhuorui Xie
- Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Clara Chen
- Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Avi Ma’ayan
- Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
|
13
|
Sp S, Mitra RN, Zheng M, Chrispell JD, Wang K, Kwon YS, Weiss ER, Han Z. Gene augmentation for autosomal dominant retinitis pigmentosa using rhodopsin genomic loci nanoparticles in the P23H +/- knock-in murine model. Gene Ther 2023; 30:628-640. [PMID: 36935427 DOI: 10.1038/s41434-023-00394-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2022] [Revised: 02/13/2023] [Accepted: 02/28/2023] [Indexed: 03/21/2023]
Abstract
Gene therapy for autosomal dominant retinitis pigmentosa (adRP) is challenged by the dominant inheritance of the mutant genes, which would seemingly require a combination of mutant suppression and wild-type replacement of the appropriate gene. We explore the possibility that delivery of a nanoparticle (NP)-mediated full-length mouse genomic rhodopsin (gRho) or human genomic rhodopsin (gRHO) locus can overcome the dominant negative effects of the mutant rhodopsin in the clinically relevant P23H+/--knock-in heterozygous mouse model. Our results demonstrate that mice in both gRho and gRHO NP-treated groups exhibit significant structural and functional recovery of the rod photoreceptors, which lasted for 3 months post-injection, indicating a promising reduction in photoreceptor degeneration. We performed miRNA transcriptome analysis using next generation sequencing and detected differentially expressed miRNAs as a first step towards identifying miRNAs that could potentially be used as rhodopsin gene expression enhancers or suppressors for sustained photoreceptor rescue. Our results indicate that delivering an intact genomic locus as a transgene has a greater chance of success compared to the use of the cDNA for treatment of this model of adRP, emphasizing the importance of gene augmentation using a gDNA that includes regulatory elements.
Affiliation(s)
- Simna Sp
- Department of Ophthalmology, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Rajendra N Mitra
- Department of Ophthalmology, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Min Zheng
- Department of Ophthalmology, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Jared D Chrispell
- Department of Cell Biology and Physiology, the University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Kai Wang
- Department of Ophthalmology, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Yong-Su Kwon
- Department of Ophthalmology, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Ellen R Weiss
- Department of Cell Biology and Physiology, the University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Zongchao Han
- Department of Ophthalmology, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- Carolina Institute for NanoMedicine, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- Division of Pharmacoengineering & Molecular Pharmaceutics, Eshelman School of Pharmacy, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
| |
|
14
|
Singh PP, Benayoun BA. Considerations for reproducible omics in aging research. NATURE AGING 2023; 3:921-930. [PMID: 37386258 PMCID: PMC10527412 DOI: 10.1038/s43587-023-00448-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 06/01/2023] [Indexed: 07/01/2023]
Abstract
Technical advancements over the past two decades have enabled the measurement of the panoply of molecules of cells and tissues including transcriptomes, epigenomes, metabolomes and proteomes at unprecedented resolution. Unbiased profiling of these molecular landscapes in the context of aging can reveal important details about mechanisms underlying age-related functional decline and age-related diseases. However, the high-throughput nature of these experiments creates unique analytical and design demands for robustness and reproducibility. In addition, 'omic' experiments are generally onerous, making it crucial to effectively design them to eliminate as many spurious sources of variation as possible as well as account for any biological or technical parameter that may influence such measures. In this Perspective, we provide general guidelines on best practices in the design and analysis of omic experiments in aging research from experimental design to data analysis and considerations for long-term reproducibility and validation of such studies.
Affiliation(s)
- Param Priya Singh
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA.
- Bakar Aging Research Institute, University of California, San Francisco, San Francisco, CA, USA.
| | - Bérénice A Benayoun
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA.
- Molecular and Computational Biology Department, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, USA.
- Biochemistry and Molecular Medicine Department, USC Keck School of Medicine, Los Angeles, CA, USA.
- Epigenetics and Gene Regulation, USC Norris Comprehensive Cancer Center, Los Angeles, CA, USA.
- USC Stem Cell Initiative, Los Angeles, CA, USA.
| |
|
15
|
Brooks TG, Lahens NF, Mrčela A, Sarantopoulou D, Nayak S, Naik A, Sengupta S, Choi PS, Grant GR. BEERS2: RNA-Seq simulation through high fidelity in silico modeling. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.21.537847. [PMID: 37162982 PMCID: PMC10168222 DOI: 10.1101/2023.04.21.537847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Simulation of RNA-seq reads is critical in the assessment, comparison, benchmarking, and development of bioinformatics tools. Yet the field of RNA-seq simulators has progressed little in the last decade. To address this need we have developed BEERS2, which combines a flexible and highly configurable design with detailed simulation of the entire library preparation and sequencing pipeline. BEERS2 takes input transcripts (typically full-length mRNA transcripts with polyA tails) from either customizable input or from CAMPAREE simulated RNA samples. It produces realistic reads of these transcripts in FASTQ, SAM, or BAM format, with the SAM and BAM output containing the true alignment to the reference genome. It also produces true transcript-level quantification values. BEERS2 is designed to include the effects of polyA selection and RiboZero for ribosomal depletion, hexamer priming sequence biases, GC-content biases in PCR amplification, barcode read errors, and errors during PCR amplification. These characteristics combine to make BEERS2 the most complete simulation of RNA-seq to date. Finally, we demonstrate the use of BEERS2 by measuring the effect of several settings on the popular Salmon pseudoalignment algorithm.
Affiliation(s)
- Thomas G Brooks
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
| | - Nicholas F Lahens
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
| | - Antonijo Mrčela
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
| | - Dimitra Sarantopoulou
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Current address: National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Soumyashant Nayak
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Current address: Statistics and Mathematics Unit, Indian Statistical Institute, Bengaluru, Karnataka, India
| | - Amruta Naik
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Shaon Sengupta
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Peter S Choi
- Division of Cancer Pathobiology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology & Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Gregory R Grant
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, PA, USA
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
| |
|
16
|
Riquelme-Perez M, Perez-Sanz F, Deleuze JF, Escartin C, Bonnet E, Brohard S. DEVEA: an interactive shiny application for Differential Expression analysis, data Visualization and Enrichment Analysis of transcriptomics data. F1000Res 2023; 11:711. [PMID: 36999088 PMCID: PMC10043628.2 DOI: 10.12688/f1000research.122949.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/21/2023] [Indexed: 03/29/2023] Open
Abstract
We are at a time of considerable growth in transcriptomics studies and subsequent in silico analysis. RNA sequencing (RNA-Seq) is the most widely used approach to analyse the transcriptome and is integrated in many studies. The processing of transcriptomic data typically requires a noteworthy number of steps, statistical knowledge, and coding skills, which are not accessible to all scientists. Despite the development of a plethora of software applications over the past few years to address this concern, there is still room for improvement. Here we present DEVEA, an R shiny application tool developed to perform differential expression analysis, data visualization and enrichment pathway analysis mainly from transcriptomics data, but also from simpler gene lists with or without statistical values. The intuitive and easy-to-manipulate interface facilitates gene expression exploration through numerous interactive figures and tables, and statistical comparisons of expression profile levels between groups. Further meta-analysis such as enrichment analysis is also possible, without the need for prior bioinformatics expertise. DEVEA performs a comprehensive analysis from multiple and flexible data sources representing distinct analytical steps. Consequently, it produces dynamic graphs and tables, to explore the expression levels and statistical results from differential expression analysis. Moreover, it generates a comprehensive pathway analysis to extend biological insights. Finally, a complete and customizable HTML report can be extracted to enable the scientists to explore results beyond the application. DEVEA is freely accessible at https://shiny.imib.es/devea/ and the source code is available on our GitHub repository https://github.com/MiriamRiquelmeP/DEVEA.
Affiliation(s)
- Miriam Riquelme-Perez
- Université Paris-Saclay, CEA, CNRS, MIRCen, Laboratoire des Maladies Neurodégénératives, Fontenay-aux-Roses, 92265, France
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, 91000, Evry, France
| | - Fernando Perez-Sanz
- Biomedical Informatics & Bioinformatics Service, Institute for Biomedical Research of Murcia (IMIB), Murcia, 30120, Spain
| | - Jean-François Deleuze
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, 91000, Evry, France
| | - Carole Escartin
- Université Paris-Saclay, CEA, CNRS, MIRCen, Laboratoire des Maladies Neurodégénératives, Fontenay-aux-Roses, 92265, France
| | - Eric Bonnet
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, 91000, Evry, France
| | - Solène Brohard
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, 91000, Evry, France
| |
|
17
|
Päll T, Luidalepp H, Tenson T, Maiväli Ü. A field-wide assessment of differential expression profiling by high-throughput sequencing reveals widespread bias. PLoS Biol 2023; 21:e3002007. [PMID: 36862747 PMCID: PMC10013925 DOI: 10.1371/journal.pbio.3002007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Revised: 03/14/2023] [Accepted: 01/20/2023] [Indexed: 03/03/2023] Open
Abstract
We assess inferential quality in the field of differential expression profiling by high-throughput sequencing (HT-seq) based on analysis of datasets submitted from 2008 to 2020 to the NCBI GEO data repository. We take advantage of the parallel differential expression testing over thousands of genes, whereby each experiment leads to a large set of p-values, the distribution of which can indicate the validity of assumptions behind the test. From a well-behaved p-value set, π0 (the fraction of genes that are not differentially expressed) can be estimated. We found that only 25% of experiments resulted in theoretically expected p-value histogram shapes, although there is a marked improvement over time. Uniform p-value histogram shapes, indicative of <100 actual effects, were extremely few. Furthermore, although many HT-seq workflows assume that most genes are not differentially expressed, 37% of experiments have π0 values of less than 0.5, as if most genes changed their expression level. Most HT-seq experiments have very small sample sizes and are expected to be underpowered. Nevertheless, the estimated π0 values do not have the expected association with N, suggesting widespread problems with controlling the false discovery rate (FDR) in these experiments. Both the fractions of different p-value histogram types and the π0 values are strongly associated with the differential expression analysis program used by the original authors. While we could double the proportion of theoretically expected p-value distributions by removing low-count features from the analysis, this treatment did not remove the association with the analysis program. Taken together, our results indicate widespread bias in the differential expression profiling field and the unreliability of statistical methods used to analyze HT-seq data.
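The idea of reading π0 off a p-value distribution can be illustrated with a minimal sketch. The abstract does not say which estimator the authors used, so the code below assumes a simple Storey-type estimator with a fixed threshold `lam`; the function name `estimate_pi0`, the threshold value, and the simulated p-values are all hypothetical and only serve the illustration.

```python
import numpy as np

def estimate_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the fraction of true null hypotheses.

    Assumption: null p-values are uniform on [0, 1], while p-values of
    real effects concentrate near zero, so almost every p-value above
    `lam` comes from a null gene.
    """
    p = np.asarray(pvalues, dtype=float)
    if p.size == 0:
        raise ValueError("need at least one p-value")
    # About pi0 * (1 - lam) of all p-values are expected to exceed lam.
    pi0 = np.mean(p > lam) / (1.0 - lam)
    return float(min(pi0, 1.0))  # a proportion cannot exceed 1

# Synthetic example (not data from the study): 80% null genes, 20% effects.
rng = np.random.default_rng(0)
null_p = rng.uniform(size=8000)            # uniform p-values, no effect
effect_p = rng.beta(0.2, 5.0, size=2000)   # p-values piled up near zero
pi0_hat = estimate_pi0(np.concatenate([null_p, effect_p]))
print(round(pi0_hat, 2))
```

On a well-behaved p-value set like this one the estimate should land close to the simulated null fraction; on p-value histograms that are not flat in the upper range the uniformity assumption fails and the estimate is not interpretable.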
Affiliation(s)
- Taavi Päll
- Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia
| | | | - Tanel Tenson
- Institute of Technology, University of Tartu, Tartu, Estonia
| | - Ülo Maiväli
- Institute of Technology, University of Tartu, Tartu, Estonia
| |
|
18
|
Garg T, Weiss CR, Sheth RA. Techniques for Profiling the Cellular Immune Response and Their Implications for Interventional Oncology. Cancers (Basel) 2022; 14:3628. [PMID: 35892890 PMCID: PMC9332307 DOI: 10.3390/cancers14153628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 07/19/2022] [Accepted: 07/20/2022] [Indexed: 12/07/2022] Open
Abstract
In recent years there has been increased interest in using the immune contexture of the primary tumors to predict the patient's prognosis. The tumor microenvironment of patients with cancers consists of different types of lymphocytes, tumor-infiltrating leukocytes, dendritic cells, and others. Different technologies can be used for the evaluation of the tumor microenvironment, all of which require a tissue or cell sample. Image-guided tissue sampling is a cornerstone in the diagnosis, stratification, and longitudinal evaluation of therapeutic efficacy for cancer patients receiving immunotherapies. Therefore, interventional radiologists (IRs) play an essential role in the evaluation of patients treated with systemically administered immunotherapies. This review provides a detailed description of different technologies used for immune assessment and analysis of the data collected from the use of these technologies. The detailed approach provided herein is intended to provide the reader with the knowledge necessary to not only interpret studies containing such data but also design and apply these tools for clinical practice and future research studies.
Affiliation(s)
- Tushar Garg
- Division of Vascular and Interventional Radiology, Russell H. Morgan Department of Radiology and Radiological Science, The Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Clifford R. Weiss
- Division of Vascular and Interventional Radiology, Russell H. Morgan Department of Radiology and Radiological Science, The Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Rahul A. Sheth
- Department of Interventional Radiology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
|
19
|
Li Y, Ge X, Peng F, Li W, Li JJ. Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biol 2022; 23:79. [PMID: 35292087 PMCID: PMC8922736 DOI: 10.1186/s13059-022-02648-4] [Citation(s) in RCA: 94] [Impact Index Per Article: 47.0] [Received: 09/22/2021] [Accepted: 03/07/2022] [Indexed: 12/05/2022]
Abstract
When identifying differentially expressed genes between two conditions using human population RNA-seq samples, we found by permutation analysis that two popular bioinformatics methods, DESeq2 and edgeR, have unexpectedly high false discovery rates (FDRs). Expanding the analysis to limma-voom, NOISeq, dearseq, and the Wilcoxon rank-sum test, we found that FDR control often fails for every method except the Wilcoxon rank-sum test. In particular, the actual FDRs of DESeq2 and edgeR sometimes exceed 20% when the target FDR is 5%. Based on these results, we recommend the Wilcoxon rank-sum test for population-level RNA-seq studies with large sample sizes.
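For readers unfamiliar with the test recommended in this abstract, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test applied to one gene's expression values in two conditions. It is not the authors' code: the function name `rank_sum_test` is hypothetical, the p-value uses the large-sample normal approximation with a tie correction and no continuity correction, and any normalization of the counts is assumed to have been done beforehand.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Illustrative sketch only: uses the normal approximation with a tie
    correction and no continuity correction. Returns (U statistic of x, p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the values, remembering which group (0 = x, 1 = y) each came from.
    pooled = sorted((value, group) for group, values in enumerate((x, y)) for value in values)

    ranks = [0.0] * n
    tie_term = 0  # sum of t^3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a shift
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u, p_value


# Hypothetical normalized expression of one gene in two conditions.
condition_a = [1, 2, 3, 4, 5]
condition_b = [6, 7, 8, 9, 10]
u_stat, p = rank_sum_test(condition_a, condition_b)
```

In a real analysis the test would be run once per gene and the resulting p-values adjusted for multiple testing; the normal approximation is only reasonable for the large sample sizes the abstract refers to.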
Affiliation(s)
- Yumei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA
- Xinzhou Ge
- Department of Statistics, University of California, Los Angeles, CA, 90095, USA
- Fanglue Peng
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Wei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA
- Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, CA, 90095, USA
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA, 90095, USA
- Department of Human Genetics, University of California, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, University of California, Los Angeles, CA, 90095, USA
- Department of Biostatistics, University of California, Los Angeles, CA, 90095, USA
|
20
|
Zappia L, Theis FJ. Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape. Genome Biol 2021; 22:301. [PMID: 34715899 PMCID: PMC8555270 DOI: 10.1186/s13059-021-02519-4] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 08/24/2021] [Accepted: 10/14/2021] [Indexed: 11/16/2022]
Abstract
Recent years have seen a revolution in single-cell RNA-sequencing (scRNA-seq) technologies, datasets, and analysis methods. Since 2016, the scRNA-tools database has cataloged software tools for analyzing scRNA-seq data. With the number of tools in the database passing 1000, we provide an update on the state of the project and the field. This data shows the evolution of the field and a change of focus from ordering cells on continuous trajectories to integrating multiple samples and making use of reference datasets. We also find that open science practices reward developers with increased recognition and help accelerate the field.
Affiliation(s)
- Luke Zappia
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- Department of Mathematics, Technical University of Munich, 85748, Garching bei München, Germany
- Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- Department of Mathematics, Technical University of Munich, 85748, Garching bei München, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany
|
21
|
RNA Sequencing Data from Human Intracranial Aneurysm Tissue Reveals a Complex Inflammatory Environment Associated with Rupture. Mol Diagn Ther 2021; 25:775-790. [PMID: 34403136 DOI: 10.1007/s40291-021-00552-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/27/2021] [Indexed: 10/20/2022]
Abstract
BACKGROUND Intracranial aneurysm (IA) rupture leads to deadly subarachnoid hemorrhages. However, the mechanisms leading to rupture remain poorly understood. Altered gene expression within IA tissue is linked to the pathobiology of aneurysm development and progression. Here, we analyzed expression patterns of control tissue samples and compared them to those of unruptured and ruptured IA tissue samples using data from the Gene Expression Omnibus (GEO). METHODS FASTQ files for 21 ruptured IAs, 21 unruptured IAs, and 16 control tissue samples were accessed from the GEO database. DESeq2 was used for differential expression analysis in three comparisons: unruptured IA versus control, ruptured IA versus control, and ruptured versus unruptured IA. Genes that were differentially expressed in multiple comparisons were evaluated to find those progressively increasing/decreasing from control to unruptured to ruptured. Significance was tested by either analysis of variance/Gabriel or Brown-Forsythe/Games Howell (p < 0.05 was considered significant). We used additional RNA sequencing and proteomics datasets to evaluate if our differentially expressed genes (DEGs) were present in other studies. Bioinformatics analyses were performed with g:Profiler and Ingenuity Pathway Analysis. RESULTS In total, we identified 1768 DEGs, of which 318 were found in multiple comparisons. Unruptured versus control reflected vascular remodeling processes, while ruptured versus control reflected inflammatory responses and cell activation/signaling. When comparing ruptured to unruptured IAs, we found massive activation of inflammation, inflammatory responses, and leukocyte responses. Of the 318 genes in multiple comparisons, 127 were found to be significant in the multi-cohort correlation analysis. Those that progressively increased (70 genes) were associated with immune system processes, while those that progressively decreased (38 genes) did not return any gene ontology terms. Many of our DEGs were also found in the other IA tissue sequencing studies. CONCLUSIONS We found unruptured IAs relate more to remodeling processes, while ruptured IAs reflect more inflammatory and immune responses.
|
22
|
Lim YX, Lin H, Chu T, Lim YP. WBP2 promotes BTRC mRNA stability to drive migration and invasion in triple-negative breast cancer via NF-κB activation. Mol Oncol 2021; 16:422-446. [PMID: 34197030 PMCID: PMC8763649 DOI: 10.1002/1878-0261.13048] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/10/2021] [Revised: 06/04/2021] [Accepted: 06/28/2021] [Indexed: 01/23/2023]
Abstract
WW‐domain‐binding protein 2 (WBP2) is an oncogene that drives breast carcinogenesis through regulating Wnt, estrogen receptor (ER), and Hippo signaling. Recent studies have identified neoteric modes of action of WBP2 other than its widely recognized function as a transcriptional coactivator. Here, we identified a previously unexplored role of WBP2 in inflammatory signaling in breast cancer via an integrated proteogenomic analysis of The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA BRCA) dataset. WBP2 was shown to enhance the migration and invasion in triple‐negative breast cancer (TNBC) cells especially under tumor necrosis factor alpha (TNF‐α) stimulation. Molecularly, WBP2 potentiates TNF‐α‐induced nuclear factor kappa B (NF‐κB) transcriptional activity and nuclear localization through aggrandizing ubiquitin‐mediated proteasomal degradation of its upstream inhibitor, NF‐κB inhibitor alpha (NFKBIA; also known as IκBα). We further demonstrate that WBP2 induces mRNA stability of beta‐transducin repeat‐containing E3 ubiquitin protein ligase (BTRC), which targets IκBα for ubiquitination and degradation. Disruption of IκBα rescued the impaired migratory and invasive phenotypes in WBP2‐silenced cells, while loss of BTRC ameliorated WBP2‐driven migration and invasion. Clinically, the WBP2‐BTRC‐IκBα signaling axis correlates with poorer prognosis in breast cancer patients. Our findings reveal a pivotal mechanism of WBP2 in modulating BTRC‐IκBα‐NF‐κB pathway to promote TNBC aggressiveness.
Affiliation(s)
- Yvonne Xinyi Lim
- Integrative Sciences and Engineering Programme, National University of Singapore, Singapore; Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Hexian Lin
- Integrative Sciences and Engineering Programme, National University of Singapore, Singapore; Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Tinghine Chu
- Integrative Sciences and Engineering Programme, National University of Singapore, Singapore; Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Department of Biomedical Informatics, Yong Loo Lin School of Medicine, National University Health System, Singapore City, Singapore
- Yoon Pin Lim
- Integrative Sciences and Engineering Programme, National University of Singapore, Singapore; Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; National University Cancer Institute, Singapore City, Singapore
|
23
|
Yoon S, Baik B, Park T, Nam D. Powerful p-value combination methods to detect incomplete association. Sci Rep 2021; 11:6980. [PMID: 33772054 PMCID: PMC7997958 DOI: 10.1038/s41598-021-86465-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 07/06/2020] [Accepted: 03/08/2021] [Indexed: 12/13/2022]
Abstract
Meta-analyses increase statistical power by combining statistics from multiple studies. Meta-analysis methods have mostly been evaluated under the condition that all the data in each study have an association with the given phenotype. However, specific experimental conditions in each study or genetic heterogeneity can result in "unassociated statistics" that are derived from the null distribution. Here, we show that the power of conventional meta-analysis methods rapidly decreases as an increasing number of unassociated statistics are included, whereas the classical Fisher's method and its weighted variant (wFisher) exhibit relatively high power that is robust to the addition of unassociated statistics. We also propose another robust method based on the joint distribution of ordered p-values (ordmeta). Simulation analyses for t-test, RNA-seq, and microarray data demonstrated that wFisher and ordmeta, when only a small number of studies have an association, outperformed existing meta-analysis methods. We performed meta-analyses of nine microarray datasets (prostate cancer) and four association summary datasets (body mass index), where our methods exhibited high biological relevance and were able to detect genes that state-of-the-art methods missed. The metapro R package that implements the proposed methods is available from both CRAN and GitHub (http://github.com/unistbig/metapro).
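As a point of reference for the classical Fisher's method mentioned in this abstract, the sketch below combines independent p-values through the statistic X = -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. It is an illustration in Python, not the metapro package; the function names are hypothetical, and the weighted variant (wFisher) and ordmeta are not shown.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function for an even number of degrees of freedom.

    For df = 2k the tail has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(pvalues):
    """Classical (unweighted) Fisher combination of independent p-values.

    Returns (chi-square statistic, combined p-value).
    """
    if not pvalues or any(not (0 < p <= 1) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2 * sum(math.log(p) for p in pvalues)
    return statistic, chi2_sf_even_df(statistic, 2 * len(pvalues))


# Hypothetical per-study p-values for one gene; only the first study
# shows an association, the others behave like null results.
stat, combined_p = fisher_combine([0.001, 0.40, 0.75, 0.60])
```

Because the statistic sums log p-values, a single very small p-value can keep the combined result small even when the remaining studies look null, which is consistent with the robustness to unassociated statistics that the abstract attributes to Fisher-type methods.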
Affiliation(s)
- Sora Yoon
- Department of Biological Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
- Bukyung Baik
- Department of Biological Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
- Taesung Park
- Department of Statistics, Seoul National University, Seoul, 08826, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
- Dougu Nam
- Department of Biological Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
- Department of Mathematical Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
|
24
|
Gunnarsson S, Prabakaran S. In silico identification of novel open reading frames in Plasmodium falciparum oocyte and salivary gland sporozoites using proteogenomics framework. Malar J 2021; 20:71. [PMID: 33546698 PMCID: PMC7866754 DOI: 10.1186/s12936-021-03598-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2020] [Accepted: 01/16/2021] [Indexed: 11/25/2022]
Abstract
Background Plasmodium falciparum causes the deadliest form of malaria, which remains one of the most prevalent infectious diseases. Unfortunately, the only licensed vaccine showed limited protection and resistance to anti-malarial drugs is increasing, which can be largely attributed to the biological complexity of the parasite's life cycle. The progression from one developmental stage to another in P. falciparum involves drastic changes in gene expression, and its infectivity to human hosts varies greatly depending on the stage. Approaches to identify candidate genes that are responsible for the development of infectivity to human hosts typically involve differential gene expression analysis between stages. However, the detection may be limited to annotated proteins and open reading frames (ORFs) predicted using restrictive criteria. Methods The above problem is particularly relevant for P. falciparum, whose genome annotation is relatively incomplete given its clinical significance. In this work, a systems proteogenomics approach was used to address this challenge, as it allows computational detection of unannotated, novel open reading frames (nORFs), which are neglected by conventional analyses. Two transcriptome/proteome pairs were obtained from a previous study: one was collected in the mosquito-infectious oocyst sporozoite stage, and the other in the salivary gland sporozoite stage with human infectivity. They were then re-analysed using the proteogenomics framework to identify nORFs in each stage. Results Translational products of nORFs that map to antisense, intergenic, intronic, 3′ UTR and 5′ UTR regions, as well as alternative reading frames of canonical proteins, were detected. Some of these nORFs also showed differential expression between the two life cycle stages studied. Their regulatory roles were explored through further bioinformatics analyses including the expression regulation on the parent reference genes, in silico structure prediction, and gene ontology term enrichment analysis. Conclusion The identification of nORFs in P. falciparum sporozoites highlights the biological complexity of the parasite. Although the analyses are solely computational, these results provide a starting point for further experimental validation of the existence and functional roles of these nORFs.
Affiliation(s)
- Sophie Gunnarsson
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Sudhakaran Prabakaran
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
|