1
Lin KZ, Qiu Y, Roeder K. eSVD-DE: cohort-wide differential expression in single-cell RNA-seq data using exponential-family embeddings. BMC Bioinformatics 2024; 25:113. [PMID: 38486150 PMCID: PMC10941434 DOI: 10.1186/s12859-024-05724-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 02/28/2024] [Indexed: 03/17/2024] Open
Abstract
BACKGROUND Single-cell RNA-sequencing (scRNA) datasets are becoming increasingly popular in clinical and cohort studies, but there is a lack of methods to investigate differentially expressed (DE) genes among such datasets with numerous individuals. While numerous methods exist to find DE genes for scRNA data from limited individuals, differential-expression testing for large cohorts of case and control individuals using scRNA data poses unique challenges due to substantial effects of human variation, i.e., individual-level confounding covariates that are difficult to account for in the presence of sparsely-observed genes. RESULTS We develop the eSVD-DE, a matrix factorization that pools information across genes and removes confounding covariate effects, followed by a novel two-sample test in mean expression between case and control individuals. In general, differential testing after dimension reduction yields an inflation of Type-1 errors. However, we overcome this by testing for differences between the case and control individuals' posterior mean distributions via a hierarchical model. In previously published datasets of various biological systems, eSVD-DE has more accuracy and power compared to other DE methods typically repurposed for analyzing cohort-wide differential expression. CONCLUSIONS eSVD-DE proposes a novel and powerful way to test for DE genes among cohorts after performing a dimension reduction. Accurate identification of differential expression on the individual level, instead of the cell level, is important for linking scRNA-seq studies to our understanding of the human population.
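The distinction the abstract draws between testing at the individual level and at the cell level can be illustrated with a minimal sketch. This is not the eSVD-DE procedure (which uses a matrix factorization and a hierarchical model); it only assumes a simple per-individual averaging step followed by a Welch t-statistic, and the function names and toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def individual_means(cells):
    """Average one gene's expression over the cells of each individual.

    `cells` is an iterable of (individual_id, expression) pairs.
    """
    by_individual = defaultdict(list)
    for individual_id, expression in cells:
        by_individual[individual_id].append(expression)
    return {ind: mean(values) for ind, values in by_individual.items()}


def welch_t(case_means, control_means):
    """Welch two-sample t-statistic computed on individual-level means."""
    n_case, n_control = len(case_means), len(control_means)
    standard_error = sqrt(
        variance(case_means) / n_case + variance(control_means) / n_control
    )
    return (mean(case_means) - mean(control_means)) / standard_error


# Hypothetical toy data: two cells per individual, three individuals per group.
cells = [
    ("case1", 4.0), ("case1", 6.0),
    ("case2", 7.0), ("case2", 9.0),
    ("case3", 5.0), ("case3", 5.0),
    ("ctrl1", 1.0), ("ctrl1", 3.0),
    ("ctrl2", 2.0), ("ctrl2", 4.0),
    ("ctrl3", 3.0), ("ctrl3", 5.0),
]

means = individual_means(cells)
case = [m for ind, m in means.items() if ind.startswith("case")]
control = [m for ind, m in means.items() if ind.startswith("ctrl")]
t_stat = welch_t(case, control)
```

Here the sample size entering the test is the number of individuals (three per group), not the number of cells, which is the point of individual-level testing.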
Affiliation(s)
- Kevin Z Lin
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Yixuan Qiu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, People's Republic of China
- Kathryn Roeder
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
2
Lin KZ, Qiu Y, Roeder K. eSVD-DE: Cohort-wide differential expression in single-cell RNA-seq data using exponential-family embeddings. bioRxiv 2024:2023.11.22.568369. [PMID: 38045428 PMCID: PMC10690270 DOI: 10.1101/2023.11.22.568369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Affiliation(s)
- Kevin Z Lin
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- Yixuan Qiu
  - School of Statistics & Management, Shanghai University of Finance and Economics, Shanghai, People's Republic of China
- Kathryn Roeder
  - Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
3
Mallick H, Porwal A, Saha S, Basak P, Svetnik V, Paul E. An integrated Bayesian framework for multi-omics prediction and classification. Stat Med 2024; 43:983-1002. [PMID: 38146838 DOI: 10.1002/sim.9953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 10/06/2023] [Accepted: 10/24/2023] [Indexed: 12/27/2023]
Abstract
With the growing commonality of multi-omics datasets, there is now increasing evidence that integrated omics profiles lead to more efficient discovery of clinically actionable biomarkers that enable better disease outcome prediction and patient stratification. Several methods exist to perform host phenotype prediction from cross-sectional, single-omics data modalities but decentralized frameworks that jointly analyze multiple time-dependent omics data to highlight the integrative and dynamic impact of repeatedly measured biomarkers are currently limited. In this article, we propose a novel Bayesian ensemble method to consolidate prediction by combining information across several longitudinal and cross-sectional omics data layers. Unlike existing frequentist paradigms, our approach enables uncertainty quantification in prediction as well as interval estimation for a variety of quantities of interest based on posterior summaries. We apply our method to four published multi-omics datasets and demonstrate that it recapitulates known biology in addition to providing novel insights while also outperforming existing methods in estimation, prediction, and uncertainty quantification. Our open-source software is publicly available at https://github.com/himelmallick/IntegratedLearner.
Affiliation(s)
- Himel Mallick
  - Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, 10065, New York, USA
  - Department of Statistics and Data Science, Cornell University, Ithaca, New York, USA
- Anupreet Porwal
  - Department of Statistics, University of Washington, Seattle, Washington, USA
- Satabdi Saha
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Piyali Basak
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
- Vladimir Svetnik
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
- Erina Paul
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
4
Buendia P, Fernandez K, Raley C, Rahnavard A, Crandall KA, Castro JG. Hospital antimicrobial stewardship: profiling the oral microbiome after exposure to COVID-19 and antibiotics. Front Microbiol 2024; 15:1346762. [PMID: 38476940 PMCID: PMC10927822 DOI: 10.3389/fmicb.2024.1346762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 01/22/2024] [Indexed: 03/14/2024] Open
Abstract
Introduction During the COVID-19 Delta variant surge, the CLAIRE cross-sectional study sampled saliva from 120 hospitalized patients, 116 of whom had a positive COVID-19 PCR test. Patients received antibiotics upon admission due to possible secondary bacterial infections, with patients at risk of sepsis receiving broad-spectrum antibiotics (BSA). Methods The saliva samples were analyzed with shotgun DNA metagenomics and respiratory RNA virome sequencing. Medical records for the period of hospitalization were obtained for all patients. Once hospitalization outcomes were known, patients were classified based on their COVID-19 disease severity and the antibiotics they received. Results Our study reveals that BSA regimens differentially impacted the human salivary microbiome and disease progression. Twelve patients died, all of whom had received BSA. Significant associations were found between the composition of the COVID-19 saliva microbiome and BSA use, and between SARS-CoV-2 genome coverage and severity of disease. We also found significant associations between the non-bacterial microbiome and severity of disease, with Candida albicans detected most frequently in critical patients. For patients who did not receive BSA before saliva sampling, our study suggests Staphylococcus aureus as a potential risk factor for sepsis. Discussion Our results indicate that the course of the infection may be explained by both monitoring antibiotic treatment and profiling a patient's salivary microbiome, establishing a compelling link between the microbiome and the specific antibiotic type and timing of treatment. This approach can aid emergency room triage and inpatient management, but it also requires a better understanding of, and access to, narrow-spectrum agents that target pathogenic bacteria.
Affiliation(s)
- Castle Raley
  - The George Washington University Genomics Core, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
- Ali Rahnavard
  - Department of Biostatistics and Bioinformatics, Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
- Keith A. Crandall
  - The George Washington University Genomics Core, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
  - Department of Biostatistics and Bioinformatics, Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
- Jose Guillermo Castro
  - Division of Infectious Diseases, Leonard M. Miller School of Medicine, University of Miami, Miami, FL, United States
5
Reuter MA, Tucker M, Marfori Z, Shishani R, Bustamante JM, Moreno R, Goodson ML, Ehrlich A, Taha AY, Lein PJ, Joshi N, Brito I, Durbin-Johnson B, Nandakumar R, Cummings BP. Dietary resistant starch supplementation increases gut luminal deoxycholic acid abundance in mice. Gut Microbes 2024; 16:2315632. [PMID: 38375831 PMCID: PMC10880513 DOI: 10.1080/19490976.2024.2315632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Accepted: 02/02/2024] [Indexed: 02/21/2024] Open
Abstract
Bile acids (BA) are among the most abundant metabolites produced by the gut microbiome. Primary BAs produced in the liver are converted by gut bacterial 7-α-dehydroxylation into secondary BAs, which can differentially regulate host health via signaling based on their varying affinity for BA receptors. Despite the importance of secondary BAs in host health, the regulation of 7-α-dehydroxylation and the role of diet in modulating this process is incompletely defined. Understanding this process could lead to dietary guidelines that beneficially shift BA metabolism. Dietary fiber regulates gut microbial composition and metabolite production. We tested the hypothesis that feeding mice a diet rich in a fermentable dietary fiber, resistant starch (RS), would alter gut bacterial BA metabolism. Male and female wild-type mice were fed a diet supplemented with RS or an isocaloric control diet (IC). Metabolic parameters were similar between groups. RS supplementation increased gut luminal deoxycholic acid (DCA) abundance. However, gut luminal cholic acid (CA) abundance, the substrate for 7-α-dehydroxylation in DCA production, was unaltered by RS. Further, RS supplementation did not change the mRNA expression of hepatic BA producing enzymes or ileal BA transporters. Metagenomic assessment of gut bacterial composition revealed no change in the relative abundance of bacteria known to perform 7-α-dehydroxylation. P. ginsenosidimutans and P. multiformis were positively correlated with gut luminal DCA abundance and increased in response to RS supplementation. These data demonstrate that RS supplementation enriches gut luminal DCA abundance without increasing the relative abundance of bacteria known to perform 7-α-dehydroxylation.
Affiliation(s)
- Melanie A. Reuter
  - Department of Surgery, Center for Alimentary and Metabolic Sciences, School of Medicine, University of California – Davis, Sacramento, CA, USA
  - Department of Molecular Biosciences, School of Veterinary Medicine, University of California – Davis, Davis, CA, USA
- Madelynn Tucker
  - Department of Surgery, Center for Alimentary and Metabolic Sciences, School of Medicine, University of California – Davis, Sacramento, CA, USA
  - Department of Molecular Biosciences, School of Veterinary Medicine, University of California – Davis, Davis, CA, USA
- Zara Marfori
  - Department of Surgery, Center for Alimentary and Metabolic Sciences, School of Medicine, University of California – Davis, Sacramento, CA, USA
- Rahaf Shishani
  - Department of Surgery, Center for Alimentary and Metabolic Sciences, School of Medicine, University of California – Davis, Sacramento, CA, USA
  - Department of Molecular Biosciences, School of Veterinary Medicine, University of California – Davis, Davis, CA, USA
- Jessica Miranda Bustamante
  - Department of Surgery, Center for Alimentary and Metabolic Sciences, School of Medicine, University of California – Davis, Sacramento, CA, USA
  - Department of Molecular Biosciences, School of Veterinary Medicine, University of California – Davis, Davis, CA, USA
- Rosalinda Moreno
  - Department of Surgery, Center for Alimentary and Metabolic Sciences, School of Medicine, University of California – Davis, Sacramento, CA, USA
  - Department of Molecular Biosciences, School of Veterinary Medicine, University of California – Davis, Davis, CA, USA
- Michael L. Goodson
  - Department of Environmental Toxicology, College of Agricultural and Environmental Sciences, University of California – Davis, Davis, CA, USA
- Allison Ehrlich
  - Department of Environmental Toxicology, College of Agricultural and Environmental Sciences, University of California – Davis, Davis, CA, USA
- Ameer Y. Taha
  - Department of Food Science and Technology, University of California – Davis, Davis, CA, USA
- Pamela J. Lein
  - Department of Molecular Biosciences, School of Veterinary Medicine, University of California – Davis, Davis, CA, USA
- Nikhil Joshi
  - Bioinformatics Core, UC Davis Genome Center, University of California – Davis, Davis, CA, USA
- Ilana Brito
  - Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, USA
- Blythe Durbin-Johnson
  - Bioinformatics Core, UC Davis Genome Center, University of California – Davis, Davis, CA, USA
- Renu Nandakumar
  - Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, NY, USA
- Bethany P. Cummings
  - Department of Surgery, Center for Alimentary and Metabolic Sciences, School of Medicine, University of California – Davis, Sacramento, CA, USA
  - Department of Molecular Biosciences, School of Veterinary Medicine, University of California – Davis, Davis, CA, USA
6
Wu D, Gaskins JT, Sekula M, Datta S. Inferring Cell-Cell Communications from Spatially Resolved Transcriptomics Data Using a Bayesian Tweedie Model. Genes (Basel) 2023; 14:1368. [PMID: 37510272 PMCID: PMC10379215 DOI: 10.3390/genes14071368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 06/16/2023] [Accepted: 06/26/2023] [Indexed: 07/30/2023] Open
Abstract
Cellular communication through biochemical signaling is fundamental to every biological activity. Investigating cell signaling diffusions across cell types can further help understand biological mechanisms. In recent years, this has become an important research topic as single-cell sequencing technologies have matured. However, cell signaling activities are spatially constrained, and single-cell data cannot provide spatial information for each cell. This issue may cause a high false discovery rate, and using spatially resolved transcriptomics data is necessary. On the other hand, as far as we know, most existing methods focus on providing an ad hoc measurement to estimate intercellular communication instead of relying on a statistical model. It is undeniable that descriptive statistics are straightforward and accessible, but a suitable statistical model can provide more accurate and reliable inference. In this way, we propose a generalized linear regression model to infer cellular communications from spatially resolved transcriptomics data, especially spot-based data. Our BAyesian Tweedie modeling of COMmunications (BATCOM) method estimates the communication scores between cell types with the consideration of their corresponding distances. Due to the properties of the regression model, BATCOM naturally provides the direction of the communication between cell types and the interaction of ligands and receptors that other approaches cannot offer. We conduct simulation studies to assess the performance under different scenarios. We also employ BATCOM in a real-data application and compare it with other existing algorithms. In summary, our innovative model can fill gaps in the inference of cell-cell communication and provide a robust and straightforward result.
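For orientation, the general form of a Tweedie generalized linear model can be written as below. This is the textbook Tweedie family with a log link, given as an assumption about the kind of model the abstract refers to; it is not BATCOM's exact specification, and the covariate vector x (which in this setting might include a distance term) and coefficient vector β are illustrative symbols.

```latex
\begin{aligned}
\operatorname{E}[Y] &= \mu, \qquad \operatorname{Var}(Y) = \phi\,\mu^{p}, \qquad 1 < p < 2, \\
\log \mu &= \mathbf{x}^{\top}\boldsymbol{\beta}, \\
\Pr(Y = 0) &= \exp\!\left(-\frac{\mu^{2-p}}{\phi\,(2-p)}\right).
\end{aligned}
```

For 1 < p < 2 the Tweedie distribution is a compound Poisson-gamma distribution, so it places positive probability on exact zeros while remaining continuous on the positive reals, which is why it is a natural choice for sparse, non-negative expression-derived measurements.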
Affiliation(s)
- Dongyuan Wu
  - Department of Biostatistics, University of Florida, Gainesville, FL 32603, USA
- Jeremy T Gaskins
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY 40202, USA
- Michael Sekula
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY 40202, USA
- Susmita Datta
  - Department of Biostatistics, University of Florida, Gainesville, FL 32603, USA
7
Peng YL, Wang LX, Li MY, Liu LP, Li RS. Construction and validation of a prognostic signature based on necroptosis-related genes in hepatocellular carcinoma. PLoS One 2023; 18:e0279744. [PMID: 36795724 PMCID: PMC9934426 DOI: 10.1371/journal.pone.0279744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Accepted: 12/04/2022] [Indexed: 02/17/2023] Open
Abstract
BACKGROUND Necroptosis is a necrotic programmed cell death with potent immunogenicity. Due to the dual effects of necroptosis on tumor growth, metastasis and immunosuppression, we evaluated the prognostic value of necroptosis-related genes (NRGs) in hepatocellular carcinoma (HCC). METHODS We first analyzed RNA sequencing and clinical HCC patient data obtained to develop an NRG prognostic signature based on the TCGA dataset. Differentially expressed NRGs were further evaluated by GO and KEGG pathway analyses. Next, we conducted univariate and multivariate Cox regression analyses to build a prognostic model. We also used the dataset obtained from the International Cancer Genome Consortium (ICGC) database to verify the signature. The Tumor Immune Dysfunction and Exclusion (TIDE) algorithm was used to investigate the immunotherapy response. Furthermore, we investigated the relationship between the prediction signature and chemotherapy treatment response in HCC. RESULTS We first identified 36 differentially expressed genes out of 159 NRGs in hepatocellular carcinoma. Enrichment analysis showed that they were mainly enriched in the necroptosis pathway. Four NRGs were screened by Cox regression analysis to establish a prognostic model. The survival analysis revealed that the overall survival of patients with high-risk scores was significantly shorter than that of patients with low-risk scores. The nomogram demonstrated satisfactory discrimination and calibration. The calibration curves validated a fine concordance between the nomogram prediction and actual observation. The efficacy of the necroptosis-related signature was also validated by an independent dataset and immunohistochemistry experiments. TIDE analysis revealed that patients in the high-risk group were possibly more susceptible to immunotherapy. Furthermore, high-risk patients were found to be more sensitive to conventional chemotherapeutic medicines such as bleomycin, bortezomib, and imatinib. 
CONCLUSION We identified 4 necroptosis-related genes and established a prognostic risk model that could potentially predict prognosis and response to chemotherapy and immunotherapy in HCC patients in the future.
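A prognostic signature of this kind is commonly turned into a per-patient risk score by summing Cox regression coefficients multiplied by expression values, then splitting patients at a cutoff such as the median. The sketch below assumes that convention; the abstract does not state the cutoff rule, and the gene names, coefficients, and patient values are hypothetical placeholders, not the four genes or estimates from the paper.

```python
from statistics import median

# Hypothetical Cox regression coefficients for a four-gene signature.
coefficients = {"gene_a": 0.5, "gene_b": -0.25, "gene_c": 1.0, "gene_d": 0.75}


def risk_score(expression, coefs):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(patients, coefs):
    """Score every patient and label them high or low risk by a median split."""
    scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
    return scores, groups


# Hypothetical expression values for four patients.
patients = {
    "p1": {"gene_a": 2.0, "gene_b": 4.0, "gene_c": 1.0, "gene_d": 0.0},
    "p2": {"gene_a": 0.0, "gene_b": 0.0, "gene_c": 3.0, "gene_d": 2.0},
    "p3": {"gene_a": 4.0, "gene_b": 0.0, "gene_c": 0.0, "gene_d": 0.0},
    "p4": {"gene_a": 0.0, "gene_b": 4.0, "gene_c": 0.0, "gene_d": 8.0},
}

scores, groups = stratify(patients, coefficients)
```

The resulting groups would then be compared with a survival analysis (for example, Kaplan-Meier curves), which is outside the scope of this sketch.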
Affiliation(s)
- Yue-ling Peng
  - Department of Nephrology, Shanxi Provincial People’s Hospital (Fifth Hospital of Shanxi Medical University), Taiyuan, China
- Ling-xiao Wang
  - Department of Colorectal and Anal Surgery, Shanxi Provincial People’s Hospital (Fifth Hospital of Shanxi Medical University), Taiyuan, China
- Mu-ye Li
  - Department of Ocular Fundus Diseases, Shanxi Eye Hospital, Shanxi Medical University, Taiyuan, China
- Li-ping Liu
  - Department of Ultrasound, First Hospital of Shanxi Medical University, Taiyuan, China
- Rong-shan Li
  - Department of Nephrology, Shanxi Provincial People’s Hospital (Fifth Hospital of Shanxi Medical University), Taiyuan, China
  - Corresponding author
8
Bustamante JM, Dawson T, Loeffler C, Marfori Z, Marchesi JR, Mullish BH, Thompson CC, Crandall KA, Rahnavard A, Allegretti JR, Cummings BP. Impact of Fecal Microbiota Transplantation on Gut Bacterial Bile Acid Metabolism in Humans. Nutrients 2022; 14:5200. [PMID: 36558359 PMCID: PMC9785599 DOI: 10.3390/nu14245200] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/31/2022] [Revised: 11/29/2022] [Accepted: 12/02/2022] [Indexed: 12/12/2022] Open
Abstract
Fecal microbiota transplantation (FMT) is a promising therapeutic modality for the treatment and prevention of metabolic disease. We previously conducted a double-blind, randomized, placebo-controlled pilot trial of FMT in obese, metabolically healthy patients, in which we found that FMT enhanced gut bacterial bile acid metabolism and delayed the development of impaired glucose tolerance relative to the placebo control group. Therefore, we conducted a secondary analysis of fecal samples collected from these patients to identify gut microbial species that may contribute to the effects of FMT on metabolic health and gut bacterial bile acid metabolism. Fecal samples collected at baseline and after 4 weeks of FMT or placebo treatment underwent shotgun metagenomic analysis. Ultra-high-performance liquid chromatography-mass spectrometry was used to profile fecal bile acids. FMT-enriched bacteria that have been implicated in gut bile acid metabolism included Desulfovibrio fairfieldensis and Clostridium hylemonae. To identify candidate bacteria involved in gut microbial bile acid metabolism, we assessed correlations between bacterial species abundance and bile acid profile, with a focus on bile acid products of gut bacterial metabolism. Bacteroides ovatus and Phocaeicola dorei were positively correlated with unconjugated bile acids. Bifidobacterium adolescentis, Collinsella aerofaciens, and Faecalibacterium prausnitzii were positively correlated with secondary bile acids. Together, these data identify several candidate bacteria that may contribute to the metabolic benefits of FMT and to gut bacterial bile acid metabolism; these candidates require further functional validation.
Affiliation(s)
- Jessica-Miranda Bustamante
  - Department of Surgery, School of Medicine, Center for Alimentary and Metabolic Science, University of California, Sacramento, CA 95817, USA
- Tyson Dawson
  - Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Caitlin Loeffler
  - Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Zara Marfori
  - Department of Surgery, School of Medicine, Center for Alimentary and Metabolic Science, University of California, Sacramento, CA 95817, USA
- Julian R. Marchesi
  - Division of Digestive Diseases, Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, St. Mary’s Hospital Campus, Imperial College London, London W2 1NY, UK
- Benjamin H. Mullish
  - Division of Digestive Diseases, Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, St. Mary’s Hospital Campus, Imperial College London, London W2 1NY, UK
- Christopher C. Thompson
  - Division of Gastroenterology, Hepatology and Endoscopy, Brigham and Women’s Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA
- Keith A. Crandall
  - Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
  - Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Ali Rahnavard
  - Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
  - Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Jessica R. Allegretti
  - Division of Gastroenterology, Hepatology and Endoscopy, Brigham and Women’s Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115, USA
- Bethany P. Cummings
  - Department of Surgery, School of Medicine, Center for Alimentary and Metabolic Science, University of California, Sacramento, CA 95817, USA
9
Das S, Rai A, Rai SN. Differential Expression Analysis of Single-Cell RNA-Seq Data: Current Statistical Approaches and Outstanding Challenges. Entropy 2022; 24:e24070995. [PMID: 35885218 PMCID: PMC9315519 DOI: 10.3390/e24070995] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/20/2022] [Revised: 06/25/2022] [Accepted: 07/09/2022] [Indexed: 01/11/2023]
Abstract
With the advent of single-cell RNA-sequencing (scRNA-seq), it is possible to measure the expression dynamics of genes at the single-cell level. Through scRNA-seq, a huge amount of expression data for several thousand(s) of genes over million(s) of cells are generated in a single experiment. Differential expression analysis is the primary downstream analysis of such data to identify gene markers for cell type detection and also provide inputs to other secondary analyses. Many statistical approaches for differential expression analysis have been reported in the literature. Therefore, we critically discuss the underlying statistical principles of the approaches and distinctly divide them into six major classes, i.e., generalized linear, generalized additive, Hurdle, mixture models, two-class parametric, and non-parametric approaches. We also succinctly discuss the limitations that are specific to each class of approaches, and how they are addressed by other subsequent classes of approach. A number of challenges are identified in this study that must be addressed to develop the next class of innovative approaches. Furthermore, we also emphasize the methodological challenges involved in differential expression analysis of scRNA-seq data that researchers must address to draw maximum benefit from this recent single-cell technology. This study will serve as a guide to genome researchers and experimental biologists to objectively select options for their analysis.
Affiliation(s)
- Samarendra Das
  - ICAR-Directorate of Foot and Mouth Disease, Arugul, Bhubaneswar 752050, India
  - International Centre for Foot and Mouth Disease, Arugul, Bhubaneswar 752050, India
  - Corresponding author
- Anil Rai
  - ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Shesh N. Rai
  - School of Interdisciplinary and Graduate Studies, University of Louisville, Louisville, KY 40292, USA
  - Biostatistics and Bioinformatics Facility, Brown Cancer Center, University of Louisville, Louisville, KY 40202, USA
  - Biostatistics and Informatics Facility, Center for Integrative Environmental Health Sciences, University of Louisville, Louisville, KY 40202, USA
  - Data Analysis and Sample Management Facility, The University of Louisville Super Fund Center, University of Louisville, Louisville, KY 40202, USA
  - Hepatobiology and Toxicology Center, University of Louisville, Louisville, KY 40202, USA
  - Christina Lee Brown Envirome Institute, University of Louisville, Louisville, KY 40202, USA
  - Corresponding author
10
Metabolite, protein, and tissue dysfunction associated with COVID-19 disease severity. Sci Rep 2022; 12:12204. [PMID: 35842456 PMCID: PMC9288092 DOI: 10.1038/s41598-022-16396-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/01/2021] [Accepted: 07/08/2022] [Indexed: 01/09/2023] Open
Abstract
Proteins are direct products of the genome and metabolites are functional products of interactions between the host and other factors such as environment, disease state, clinical information, etc. Omics data, including proteins and metabolites, are useful in characterizing biological processes underlying COVID-19 along with patient data and clinical information, yet few methods are available to effectively analyze such diverse and unstructured data. Using an integrated approach that combines proteomics and metabolomics data, we investigated the changes in metabolites and proteins in relation to patient characteristics (e.g., age, gender, and health outcome) and clinical information (e.g., metabolic panel and complete blood count test results). We found significant enrichment of biological indicators of lung, liver, and gastrointestinal dysfunction associated with disease severity using publicly available metabolite and protein profiles. Our analyses specifically identified enriched proteins that play a critical role in responses to injury or infection within these anatomical sites, but may contribute to excessive systemic inflammation within the context of COVID-19. Furthermore, we have used this information in conjunction with machine learning algorithms to predict the health status of patients presenting symptoms of COVID-19. This work provides a roadmap for understanding the biochemical pathways and molecular mechanisms that drive disease severity, progression, and treatment of COVID-19.