Reference Citation Analysis

For: Jiang R, Sun T, Song D, Li JJ. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biol 2022;23:31. [PMID: 35063006 PMCID: PMC8783472 DOI: 10.1186/s13059-022-02601-5] [Citation(s) in RCA: 99] [Impact Index Per Article: 49.5] [Received: 09/27/2021] [Accepted: 01/04/2022] [Indexed: 12/13/2022] Open


Cited by Other Article(s)

Hu M, Chikina M. Heterogeneous pseudobulk simulation enables realistic benchmarking of cell-type deconvolution methods. Genome Biol 2024;25:169. [PMID: 38956606 PMCID: PMC11218230 DOI: 10.1186/s13059-024-03292-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 05/29/2024] [Indexed: 07/04/2024] Open

Abstract

BACKGROUND

Computational cell type deconvolution enables the estimation of cell type abundance from bulk tissues and is important for understanding the tissue microenvironment, especially in tumor tissues. With the rapid development of deconvolution methods, many benchmarking studies have been published that aim to evaluate these methods comprehensively. Benchmarking studies rely on cell-type-resolved single-cell RNA-seq data to create simulated pseudobulk datasets by adding individual cell types in controlled proportions.

RESULTS

In our work, we show that the standard application of this approach, which uses randomly selected single cells, regardless of the intrinsic difference between them, generates synthetic bulk expression values that lack appropriate biological variance. We demonstrate why and how the current bulk simulation pipeline with random cells is unrealistic and propose a heterogeneous simulation strategy as a solution. The heterogeneously simulated bulk samples match up with the variance observed in real bulk datasets and therefore provide concrete benefits for benchmarking in several ways. We demonstrate that conceptual classes of deconvolution methods differ dramatically in their robustness to heterogeneity with reference-free methods performing particularly poorly. For regression-based methods, the heterogeneous simulation provides an explicit framework to disentangle the contributions of reference construction and regression methods to performance. Finally, we perform an extensive benchmark of diverse methods across eight different datasets and find BayesPrism and a hybrid MuSiC/CIBERSORTx approach to be the top performers.

CONCLUSIONS

Our heterogeneous bulk simulation method and the entire benchmarking framework are implemented in a user-friendly package (https://github.com/humengying0907/deconvBenchmarking and https://doi.org/10.5281/zenodo.8206516), enabling further developments in deconvolution methods.
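As a rough illustration of the standard pipeline this abstract criticizes, the sketch below sums randomly drawn single cells into one pseudobulk profile at controlled cell-type proportions. It is a minimal sketch under stated assumptions, not the deconvBenchmarking implementation: the function name, its parameters, and the use of sampling with replacement are hypothetical choices made for illustration, and the heterogeneous strategy proposed by the authors is not reproduced here because the abstract does not specify it.

```python
import numpy as np

def simulate_pseudobulk(counts, cell_types, proportions, n_cells=500, seed=0):
    """Sum randomly drawn single cells into one pseudobulk profile.

    counts      : (cells x genes) array of single-cell counts
    cell_types  : one cell-type label per row of `counts`
    proportions : dict mapping cell type -> target fraction (should sum to 1)
    All names here are hypothetical, chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    cell_types = np.asarray(cell_types)
    bulk = np.zeros(counts.shape[1], dtype=float)
    for ctype, frac in proportions.items():
        pool = np.flatnonzero(cell_types == ctype)   # all cells of this type
        n_draw = int(round(frac * n_cells))
        # Random draw with replacement, ignoring which sample or state
        # each cell came from (the "random cells" baseline).
        picked = rng.choice(pool, size=n_draw, replace=True)
        bulk += counts[picked].sum(axis=0)
    return bulk
```

Because every cell of a given type is treated as interchangeable, repeated calls with different seeds tend to produce very similar profiles, which is the lack of biological variance the abstract describes.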

Bollon J, Shortreed MR, Jordan BT, Miller R, Jeffery E, Cavalli A, Smith LM, Dewey C, Sheynkman GM, Tiberi S. IsoBayes: a Bayesian approach for single-isoform proteomics inference. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.10.598223. [PMID: 38915658 PMCID: PMC11195044 DOI: 10.1101/2024.06.10.598223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]

Abstract

Studying protein isoforms is an essential step in biomedical research; at present, the main approach for analyzing proteins is bottom-up mass spectrometry proteomics, which returns peptide identifications that are indirectly used to infer the presence of protein isoforms. However, the detection and quantification processes are noisy; in particular, peptides may be erroneously detected, and most peptides, known as shared peptides, are associated with multiple protein isoforms. As a consequence, studying individual protein isoforms is challenging, and inferred protein results are often abstracted to the gene level or to groups of protein isoforms. Here, we introduce IsoBayes, a novel statistical method to perform inference at the isoform level. Our method enhances the information available by integrating mass spectrometry proteomics and transcriptomics data in a Bayesian probabilistic framework. To account for the uncertainty in the measurement process, we propose a two-layer latent variable approach: first, we sample whether a peptide has been correctly detected (or, alternatively, filter peptides); second, we allocate the abundance of the selected peptides across the protein(s) they are compatible with. This enables us, starting from peptide-level data, to recover protein-level data; in particular, we: i) infer the presence/absence of each protein isoform (via a posterior probability), ii) estimate its abundance (and credible interval), and iii) target isoforms where transcript and protein relative abundances significantly differ. We benchmarked our approach in simulations and in two multi-protease real datasets: our method displays good sensitivity and specificity when detecting protein isoforms, its estimated abundances highly correlate with the ground truth, and it can detect changes between protein and transcript relative abundances. IsoBayes is freely distributed as a Bioconductor R package and is accompanied by an example usage vignette.

Ornelas MY, Ouyang WO, Wu NC. A library-on-library screen reveals the breadth expansion landscape of a broadly neutralizing betacoronavirus antibody. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.06.597810. [PMID: 38915656 PMCID: PMC11195093 DOI: 10.1101/2024.06.06.597810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]

Salcedo-Tacuma D, Howells G, Mchose C, Gutierrez-Diaz A, Schupp J, Smith DM. ProEnd: A Comprehensive Database for Identifying HbYX Motif-Containing Proteins Across the Tree of Life. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.08.598080. [PMID: 38895466 PMCID: PMC11185799 DOI: 10.1101/2024.06.08.598080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]

Li X, Chen K, Shao M. Efficient Seeding for Error-Prone Sequences with SubseqHash2. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.30.596711. [PMID: 38895288 PMCID: PMC11185578 DOI: 10.1101/2024.05.30.596711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]

Duo H, Li Y, Lan Y, Tao J, Yang Q, Xiao Y, Sun J, Li L, Nie X, Zhang X, Liang G, Liu M, Hao Y, Li B. Systematic evaluation with practical guidelines for single-cell and spatially resolved transcriptomics data simulation under multiple scenarios. Genome Biol 2024;25:145. [PMID: 38831386 PMCID: PMC11149245 DOI: 10.1186/s13059-024-03290-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 05/28/2024] [Indexed: 06/05/2024] Open

Abstract

BACKGROUND

Single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) have led to groundbreaking advancements in life sciences. To develop bioinformatics tools for scRNA-seq and SRT data and perform unbiased benchmarks, data simulation has been widely adopted by providing explicit ground truth and generating customized datasets. However, the performance of simulation methods under multiple scenarios has not been comprehensively assessed, making it challenging to choose suitable methods without practical guidelines.

RESULTS

We systematically evaluated 49 simulation methods developed for scRNA-seq and/or SRT data in terms of accuracy, functionality, scalability, and usability using 152 reference datasets derived from 24 platforms. SRTsim, scDesign3, ZINB-WaVE, and scDesign2 have the best accuracy performance across various platforms. Unexpectedly, some methods tailored to scRNA-seq data have potential compatibility for simulating SRT data. Lun, SPARSim, and scDesign3-tree outperform other methods under corresponding simulation scenarios. Phenopath, Lun, Simple, and MFA yield high scalability scores but they cannot generate realistic simulated data. Users should consider the trade-offs between method accuracy and scalability (or functionality) when making decisions. Additionally, execution errors are mainly caused by failed parameter estimations and appearance of missing or infinite values in calculations. We provide practical guidelines for method selection, a standard pipeline Simpipe ( https://github.com/duohongrui/simpipe ; https://doi.org/10.5281/zenodo.11178409 ), and an online tool Simsite ( https://www.ciblab.net/software/simshiny/ ) for data simulation.

CONCLUSIONS

No method performs best on all criteria, thus a good-yet-not-the-best method is recommended if it solves problems effectively and reasonably. Our comprehensive work provides crucial insights for developers on modeling gene expression data and fosters the simulation process for users.

Affiliation(s)

Hongrui Duo College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
Yinghong Li Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, People's Republic of China
Yang Lan Institute of Pathology and Southwest Cancer Center, Southwest Hospital, Army Medical University, Chongqing, 400038, People's Republic of China
Jingxin Tao College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
Qingxia Yang Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310058, People's Republic of China
Yingxue Xiao College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
Jing Sun College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
Lei Li College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
Xiner Nie Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, People's Republic of China
Xiaoxi Zhang College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
Guizhao Liang Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, People's Republic of China
Mingwei Liu Key Laboratory of Clinical Laboratory Diagnostics, College of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, People's Republic of China
Youjin Hao College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
Bo Li College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.

Rivero-Garcia I, Torres M, Sánchez-Cabo F. Deep generative models in single-cell omics. Comput Biol Med 2024;176:108561. [PMID: 38749321 DOI: 10.1016/j.compbiomed.2024.108561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 04/30/2024] [Accepted: 05/05/2024] [Indexed: 05/31/2024]

Cottrell S, Hozumi Y, Wei GW. K-nearest-neighbors induced topological PCA for single cell RNA-sequence data analysis. Comput Biol Med 2024;175:108497. [PMID: 38678944 PMCID: PMC11090715 DOI: 10.1016/j.compbiomed.2024.108497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Revised: 04/08/2024] [Accepted: 04/21/2024] [Indexed: 05/01/2024]

Abstract

Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity in cells, which has given us insights into cell-cell communication, cell differentiation, and differential gene expression. However, analyzing scRNA-seq data is a challenge due to sparsity and the large number of genes involved. Therefore, dimensionality reduction and feature selection are important for removing spurious signals and enhancing downstream analysis. Traditional PCA, a main workhorse in dimensionality reduction, lacks the ability to capture geometrical structure information embedded in the data, and previous graph Laplacian regularizations are limited by the analysis of only a single scale. We propose a topological Principal Components Analysis (tPCA) method that combines the persistent Laplacian (PL) technique with L2,1 norm regularization to address multiscale and multiclass heterogeneity issues in data. We further introduce a k-Nearest-Neighbor (kNN) persistent Laplacian technique to improve the robustness of our persistent Laplacian method. The proposed kNN-PL is a new algebraic topology technique that addresses many of the limitations of traditional persistent homology. Rather than inducing filtration by varying a distance threshold, we introduce kNN-tPCA, where filtrations are achieved by varying the number of neighbors in a kNN network at each step, and find that this framework has significant implications for hyper-parameter tuning. We validate the efficacy of our proposed tPCA and kNN-tPCA methods on 11 diverse benchmark scRNA-seq datasets, and show that our methods outperform other unsupervised PCA enhancements from the literature, as well as the popular Uniform Manifold Approximation and Projection (UMAP), t-Distributed Stochastic Neighbor Embedding (tSNE), and Projection Non-Negative Matrix Factorization (NMF), by significant margins. For example, tPCA provides up to 628%, 78%, and 149% improvements over UMAP, tSNE, and NMF, respectively, on classification in the F1 metric, and kNN-tPCA offers 53%, 63%, and 32% improvements over UMAP, tSNE, and NMF, respectively, on clustering in the ARI metric.
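The idea of a filtration indexed by the number of neighbors, rather than by a distance threshold, can be sketched generically as follows. This is only an illustration of nested kNN graphs under assumed conventions (Euclidean distance, undirected edges); the function name is hypothetical, and the sketch does not compute persistent Laplacians or implement tPCA or kNN-tPCA.

```python
import numpy as np

def knn_edge_sets(points, ks):
    """Return one undirected kNN edge set per value of k.

    For increasing k the edge sets are nested, which is what makes the
    sequence usable as a filtration. Hypothetical helper for illustration.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all points.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)            # a point is not its own neighbor
    order = np.argsort(dists, axis=1, kind="stable")  # neighbors by distance
    filtration = []
    for k in ks:
        edges = set()
        for i in range(len(points)):
            for j in order[i, :k]:
                # Store each edge once, with the smaller index first.
                edges.add((min(i, int(j)), max(i, int(j))))
        filtration.append(edges)
    return filtration
```

Each step adds edges without removing any, so the only scale parameter is the integer k rather than a continuous distance cutoff.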

Bolteau M, Chebouba L, David L, Bourdon J, Guziolowski C. Boolean Network Models of Human Preimplantation Development. J Comput Biol 2024;31:513-523. [PMID: 38814745 DOI: 10.1089/cmb.2024.0517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024] Open

Zanfardino M, Franzese M, Geraci F. DeClUt: Decluttering differentially expressed genes through clustering of their expression profiles. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024;254:108258. [PMID: 38851122 DOI: 10.1016/j.cmpb.2024.108258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 04/26/2024] [Accepted: 05/29/2024] [Indexed: 06/10/2024]

Abstract

BACKGROUND AND OBJECTIVE

Differential expression analysis is one of the most popular activities in transcriptomic studies based on next-generation sequencing technologies. In fact, differentially expressed genes (DEGs) between two conditions represent ideal prognostic and diagnostic candidate biomarkers for many pathologies. As a result, several algorithms, such as DESeq2 and edgeR, have been developed to identify DEGs. Despite their widespread use, there is no consensus on which model performs best for different types of data, and many existing methods suffer from high False Discovery Rates (FDR).

METHODS

We present a new algorithm, DeClUt, based on the intuition that the expression profile of a differentially expressed gene should form two reasonably compact and well-separated clusters. This, in turn, implies that the bipartition induced by the two conditions being compared should overlap with the clustering. The clustering algorithm underlying DeClUt was designed to be robust to outliers typical of RNA-seq data. In particular, we used the average silhouette function to enforce membership assignment of samples to the most appropriate condition.

RESULTS

DeClUt was tested on real RNA-seq datasets and benchmarked against four of the most widely used methods (edgeR, DESeq2, NOISeq, and SAMseq). Experiments showed a higher self-consistency of results than the competitors as well as a significantly lower False Positive Rate (FPR). Moreover, when tested on a real prostate cancer RNA-seq dataset, DeClUt highlighted 8 DE genes, linked to neoplastic processes according to the DisGeNET database, that none of the other methods had identified.

CONCLUSIONS

Our work presents a novel algorithm that builds upon basic concepts of data clustering and exhibits greater consistency and a significantly lower False Positive Rate than state-of-the-art methods. Additionally, DeClUt is able to highlight relevant differentially expressed genes not identified by other tools, helping to improve the efficacy of differential expression analyses in various biological applications.
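The intuition in the METHODS paragraph above, that the two conditions should split a differentially expressed gene into two compact, well-separated groups, can be illustrated with the standard average silhouette score. The sketch below is not the DeClUt algorithm (its outlier-robust clustering is not described in enough detail here to reproduce); it only scores how well a given two-condition labelling separates one gene's expression values. The function name and the zero score for singleton groups are assumptions made for illustration.

```python
import numpy as np

def condition_silhouette(expr, labels):
    """Average silhouette of one gene's expression values split by condition.

    expr   : 1-D expression values, one per sample
    labels : condition label per sample (exactly two conditions assumed)
    Values near 1 mean the conditions form compact, well-separated groups.
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    dist = np.abs(expr[:, None] - expr[None, :])    # pairwise 1-D distances
    scores = []
    for i in range(len(expr)):
        same = labels == labels[i]
        same[i] = False                              # exclude the sample itself
        other = labels != labels[i]
        if not same.any() or not other.any():
            scores.append(0.0)                       # assumed convention for singletons
            continue
        a = dist[i, same].mean()                     # mean distance within own condition
        b = dist[i, other].mean()                    # mean distance to the other condition
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))
```

For example, `condition_silhouette([1, 1, 10, 10], [0, 0, 1, 1])` scores a cleanly separated gene, whereas interleaved values such as `[1, 10, 1, 10]` with the same labels give a negative score.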

Abegaz F, Abedini D, White F, Guerrieri A, Zancarini A, Dong L, Westerhuis JA, van Eeuwijk F, Bouwmeester H, Smilde AK. A strategy for differential abundance analysis of sparse microbiome data with group-wise structured zeros. Sci Rep 2024;14:12433. [PMID: 38816496 PMCID: PMC11139916 DOI: 10.1038/s41598-024-62437-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 05/16/2024] [Indexed: 06/01/2024] Open

Ferriera Neres D, Wright RC. Pleiotropy, a feature or a bug? Toward co-ordinating plant growth, development, and environmental responses through engineering plant hormone signaling. Curr Opin Biotechnol 2024;88:103151. [PMID: 38823314 DOI: 10.1016/j.copbio.2024.103151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 05/10/2024] [Accepted: 05/14/2024] [Indexed: 06/03/2024]

Zhao Y, Ansarullah, Kumar P, Mahoney JM, He H, Baker C, George J, Li S. Causal network perturbation analysis identifies known and novel type-2 diabetes driver genes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.22.595431. [PMID: 38826370 PMCID: PMC11142180 DOI: 10.1101/2024.05.22.595431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]

Algavi YM, Borenstein E. Relative dispersion ratios following fecal microbiota transplant elucidate principles governing microbial migration dynamics. Nat Commun 2024;15:4447. [PMID: 38789466 PMCID: PMC11126695 DOI: 10.1038/s41467-024-48717-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 05/08/2024] [Indexed: 05/26/2024] Open

Wang M, Fontaine S, Jiang H, Li G. ADAPT: Analysis of Microbiome Differential Abundance by Pooling Tobit Models. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.14.594186. [PMID: 38798558 PMCID: PMC11118451 DOI: 10.1101/2024.05.14.594186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]

Barão S, Xu Y, Llongueras JP, Vistein R, Goff L, Nielsen K, Bae BI, Smith RS, Walsh CA, Stein-O'Brien G, Müller U. BRN1/2 Function in Neocortical Size Determination and Microcephaly. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.11.02.565322. [PMID: 37961182 PMCID: PMC10635068 DOI: 10.1101/2023.11.02.565322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]

Cuevas-Diaz Duran R, Wei H, Wu J. Data normalization for addressing the challenges in the analysis of single-cell transcriptomic datasets. BMC Genomics 2024;25:444. [PMID: 38711017 PMCID: PMC11073985 DOI: 10.1186/s12864-024-10364-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Accepted: 04/29/2024] [Indexed: 05/08/2024] Open

Abstract

BACKGROUND

Normalization is a critical step in the analysis of single-cell RNA-sequencing (scRNA-seq) datasets. Its main goal is to make gene counts comparable within and between cells. To do so, normalization methods must account for technical and biological variability. Numerous normalization methods have been developed addressing different sources of dispersion and making specific assumptions about the count data.

MAIN BODY

The selection of a normalization method has a direct impact on downstream analysis, for example differential gene expression and cluster identification. Thus, the objective of this review is to guide the reader in making an informed decision on the most appropriate normalization method to use. To this aim, we first give an overview of the different single cell sequencing platforms and methods commonly used including isolation and library preparation protocols. Next, we discuss the inherent sources of variability of scRNA-seq datasets. We describe the categories of normalization methods and include examples of each. We also delineate imputation and batch-effect correction methods. Furthermore, we describe data-driven metrics commonly used to evaluate the performance of normalization methods. We also discuss common scRNA-seq methods and toolkits used for integrated data analysis.

CONCLUSIONS

According to the correction performed, normalization methods can be broadly classified as within-sample and between-sample algorithms. Moreover, with respect to the mathematical model used, normalization methods can further be classified into global scaling methods, generalized linear models, mixed methods, and machine learning-based methods. Each of these methods has pros and cons and makes different statistical assumptions. However, no single normalization method performs best. Instead, metrics such as silhouette width, the K-nearest neighbor batch-effect test, or Highly Variable Genes are recommended to assess the performance of normalization methods.
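Of the categories listed above, global scaling is the simplest to show concretely. The sketch below divides each cell's counts by its library size, rescales to a common total, and applies a log transform. It is a minimal sketch of the general technique, not a method taken from the review: the function name, the target total of 10,000 counts per cell, and the `log1p` transform are common conventions assumed here for illustration.

```python
import numpy as np

def global_scale_normalize(counts, target_sum=1e4):
    """Library-size (global scaling) normalization followed by log1p.

    counts     : (cells x genes) array of raw counts
    target_sum : total each cell is rescaled to (assumed convention)
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=1, keepdims=True)   # total counts per cell
    lib_size[lib_size == 0] = 1.0                  # leave empty cells as all zeros
    scaled = counts / lib_size * target_sum        # same total for every cell
    return np.log1p(scaled)                        # dampen the effect of large counts
```

After this step, two cells with the same relative expression but different sequencing depth receive identical normalized values, which is the comparability between cells that the abstract names as the main goal.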

Van Deynze K, Mumm C, Maltby CJ, Switzenberg JA, Todd PK, Boyle AP. Enhanced Detection and Genotyping of Disease-Associated Tandem Repeats Using HMMSTR and Targeted Long-Read Sequencing. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.05.01.24306681. [PMID: 38746091 PMCID: PMC11092683 DOI: 10.1101/2024.05.01.24306681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]

Brooks TG, Lahens NF, Mrčela A, Grant GR. Challenges and best practices in omics benchmarking. Nat Rev Genet 2024;25:326-339. [PMID: 38216661 DOI: 10.1038/s41576-023-00679-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/14/2023] [Indexed: 01/14/2024]

Kim H, Chang W, Chae SJ, Park JE, Seo M, Kim JK. scLENS: data-driven signal detection for unbiased scRNA-seq data analysis. Nat Commun 2024;15:3575. [PMID: 38678050 DOI: 10.1038/s41467-024-47884-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 04/14/2024] [Indexed: 04/29/2024] Open

Thulasiram MR, Yamamoto R, Olszewski RT, Gu S, Morell RJ, Hoa M, Dabdoub A. Molecular differences between neonatal and adult stria vascularis from organotypic explants and transcriptomics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.24.590986. [PMID: 38712156 PMCID: PMC11071502 DOI: 10.1101/2024.04.24.590986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]

Abstract

Summary

The stria vascularis (SV), part of the blood-labyrinth barrier, is an essential component of the inner ear that regulates the ionic environment required for hearing. SV degeneration disrupts cochlear homeostasis, leading to irreversible hearing loss, yet a comprehensive understanding of the SV, and consequently therapeutic availability for SV degeneration, is lacking. We developed a whole-tissue explant model from neonatal and adult mice to create a robust platform for SV research. We validated our model by demonstrating that the proliferative behaviour of the SV in vitro mimics SV in vivo, providing a representative model and advancing high-throughput SV research. We also provided evidence for pharmacological intervention in our system by investigating the role of Wnt/β-catenin signaling in SV proliferation. Finally, we performed single-cell RNA sequencing from in vivo neonatal and adult mouse SV and revealed key genes and pathways that may play a role in SV proliferation and maintenance. Together, our results contribute new insights into investigating biological solutions for SV-associated hearing loss.

Significance

Hearing loss impairs our ability to communicate with people and interact with our environment. This can lead to social isolation, depression, cognitive deficits, and dementia. Inner ear degeneration is a primary cause of hearing loss, and our study provides an in-depth look at one of the major sites of inner ear degeneration: the stria vascularis. The stria vascularis and associated blood-labyrinth barrier maintain the functional integrity of the auditory system, yet they are relatively understudied. By developing a new in vitro model for the young and adult stria vascularis and using single-cell RNA sequencing, our study provides a novel approach to studying this tissue, contributing new insights with widespread implications for auditory neuroscience and regenerative medicine.

Highlights

- We established an organotypic explant system of the neonatal and adult stria vascularis with an intact blood-labyrinth barrier.
- Proliferation of the stria vascularis decreases with age in vitro, modelling its proliferative behaviour in vivo.
- Pharmacological studies using our in vitro SV model open possibilities for testing injury paradigms and therapeutic interventions.
- Inhibition of Wnt signalling decreases proliferation in neonatal stria vascularis.
- We identified key genes and transcription factors unique to developing and mature SV cell types using single cell RNA sequencing.

Tian J, Bai X, Quek C. Single-Cell Informatics for Tumor Microenvironment and Immunotherapy. Int J Mol Sci 2024;25:4485. [PMID: 38674070 PMCID: PMC11050520 DOI: 10.3390/ijms25084485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 04/12/2024] [Accepted: 04/16/2024] [Indexed: 04/28/2024] Open

Barry T, Mason K, Roeder K, Katsevich E. Robust differential expression testing for single-cell CRISPR screens at low multiplicity of infection. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.05.15.540875. [PMID: 38659821 PMCID: PMC11042176 DOI: 10.1101/2023.05.15.540875] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 04/26/2024]

Baharav TZ, Tse D, Salzman J. OASIS: An interpretable, finite-sample valid alternative to Pearson's X² for scientific discovery. Proc Natl Acad Sci U S A 2024;121:e2304671121. [PMID: 38564640 PMCID: PMC11009617 DOI: 10.1073/pnas.2304671121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Accepted: 02/08/2024] [Indexed: 04/04/2024] Open

Huuki-Myers LA, Montgomery KD, Kwon SH, Cinquemani S, Eagles NJ, Gonzalez-Padilla D, Maden SK, Kleinman JE, Hyde TM, Hicks SC, Maynard KR, Collado-Torres L. Benchmark of cellular deconvolution methods using a multi-assay reference dataset from postmortem human prefrontal cortex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.09.579665. [PMID: 38405805 PMCID: PMC10888823 DOI: 10.1101/2024.02.09.579665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]

Abstract

Background

Cellular deconvolution of bulk RNA-sequencing (RNA-seq) data using single cell or nuclei RNA-seq (sc/snRNA-seq) reference data is an important strategy for estimating cell type composition in heterogeneous tissues, such as human brain. Computational methods for deconvolution have been developed and benchmarked against simulated data, pseudobulked sc/snRNA-seq data, or immunohistochemistry reference data. A major limitation in developing improved deconvolution algorithms has been the lack of integrated datasets with orthogonal measurements of gene expression and estimates of cell type proportions on the same tissue sample. Deconvolution algorithm performance has not yet been evaluated across different RNA extraction methods (cytosolic, nuclear, or whole cell RNA), different library preparation types (mRNA enrichment vs. ribosomal RNA depletion), or with matched single cell reference datasets.

Results

A rich multi-assay dataset was generated in postmortem human dorsolateral prefrontal cortex (DLPFC) from 22 tissue blocks. Assays included spatially-resolved transcriptomics, snRNA-seq, bulk RNA-seq (across six library/extraction RNA-seq combinations), and RNAScope/Immunofluorescence (RNAScope/IF) for six broad cell types. The Mean Ratio method, implemented in the DeconvoBuddies R package, was developed for selecting cell type marker genes. Six computational deconvolution algorithms were evaluated in DLPFC and predicted cell type proportions were compared to orthogonal RNAScope/IF measurements.

Conclusions

Bisque and hspe were the most accurate methods and were robust to differences in RNA library types and extractions. This multi-assay dataset showed that cell size differences, marker genes differentially quantified across RNA libraries, and cell composition variability in reference snRNA-seq impact the accuracy of current deconvolution methods.

Affiliation(s)

Louise A. Huuki-Myers Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
Kelsey D. Montgomery Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
Sang Ho Kwon Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
Sophia Cinquemani Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
Nicholas J. Eagles Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
Daianna Gonzalez-Padilla Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
Sean K. Maden Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
Joel E. Kleinman Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
Thomas M. Hyde Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
Stephanie C. Hicks Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21205, USA Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21205, USA Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, 21218, USA
Kristen R. Maynard Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
Leonardo Collado-Torres Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21205, USA

Marmarelis MG, Littman R, Battaglin F, Niedzwiecki D, Venook A, Ambite JL, Galstyan A, Lenz HJ, Ver Steeg G. q-Diffusion leverages the full dimensionality of gene coexpression in single-cell transcriptomics. Commun Biol 2024;7:400. [PMID: 38565955 DOI: 10.1038/s42003-024-06104-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 03/25/2024] [Indexed: 04/04/2024] Open

Zhu C, Liu LY, Yamaguchi TN, Zhu H, Hugh-White R, Livingstone J, Patel Y, Kislinger T, Boutros PC. moPepGen: Rapid and Comprehensive Proteoform Identification. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.28.587261. [PMID: 38585946 PMCID: PMC10996593 DOI: 10.1101/2024.03.28.587261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]

Affiliation(s)

Chenghao Zhu Department of Human Genetics, University of California, Los Angeles, CA, USA Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA Institute for Precision Health, University of California, Los Angeles, CA, USA Department of Urology, University of California, Los Angeles, CA, USA
Lydia Y. Liu Department of Human Genetics, University of California, Los Angeles, CA, USA Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA Department of Medical Biophysics, University of Toronto, Toronto, Canada Princess Margaret Cancer Centre, University Health Network, Toronto, Canada Vector Institute for Artificial Intelligence, Toronto, Canada
Takafumi N. Yamaguchi Department of Human Genetics, University of California, Los Angeles, CA, USA Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA Institute for Precision Health, University of California, Los Angeles, CA, USA
Helen Zhu Department of Medical Biophysics, University of Toronto, Toronto, Canada Princess Margaret Cancer Centre, University Health Network, Toronto, Canada Vector Institute for Artificial Intelligence, Toronto, Canada
Rupert Hugh-White Department of Human Genetics, University of California, Los Angeles, CA, USA Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA Institute for Precision Health, University of California, Los Angeles, CA, USA
Julie Livingstone Department of Human Genetics, University of California, Los Angeles, CA, USA Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA Institute for Precision Health, University of California, Los Angeles, CA, USA
Yash Patel Department of Human Genetics, University of California, Los Angeles, CA, USA Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA Institute for Precision Health, University of California, Los Angeles, CA, USA
Thomas Kislinger Department of Medical Biophysics, University of Toronto, Toronto, Canada Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
Paul C. Boutros Department of Human Genetics, University of California, Los Angeles, CA, USA Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA Institute for Precision Health, University of California, Los Angeles, CA, USA Department of Urology, University of California, Los Angeles, CA, USA Department of Medical Biophysics, University of Toronto, Toronto, Canada

Keskus A, Bryant A, Ahmad T, Yoo B, Aganezov S, Goretsky A, Donmez A, Lansdon LA, Rodriguez I, Park J, Liu Y, Cui X, Gardner J, McNulty B, Sacco S, Shetty J, Zhao Y, Tran B, Narzisi G, Helland A, Cook DE, Chang PC, Kolesnikov A, Carroll A, Molloy EK, Pushel I, Guest E, Pastinen T, Shafin K, Miga KH, Malikic S, Day CP, Robine N, Sahinalp C, Dean M, Farooqi MS, Paten B, Kolmogorov M. Severus: accurate detection and characterization of somatic structural variation in tumor genomes using long reads. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.03.22.24304756. [PMID: 38585974 PMCID: PMC10996739 DOI: 10.1101/2024.03.22.24304756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]

Affiliation(s)

Ayse Keskus Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Asher Bryant Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Tanveer Ahmad Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Byunggil Yoo Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Sergey Aganezov Oxford Nanopore Technologies, NY, USA
Anton Goretsky Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA Department of Computer Science, University of Maryland, College Park, MD, USA
Ataberk Donmez Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA Department of Computer Science, University of Maryland, College Park, MD, USA
Lisa A. Lansdon Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Isabel Rodriguez Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
Jimin Park UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Yuelin Liu Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA Department of Computer Science, University of Maryland, College Park, MD, USA
Xiwen Cui Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Joshua Gardner UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Brandy McNulty UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Samuel Sacco UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Jyoti Shetty Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
Yongmei Zhao Sequencing Facility Bioinformatics Group, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
Bao Tran Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
Giuseppe Narzisi New York Genome Center, NY, USA
Adrienne Helland New York Genome Center, NY, USA
Daniel E. Cook Google Inc, Mountain View, CA, USA
Pi-Chuan Chang Google Inc, Mountain View, CA, USA
Alexey Kolesnikov Google Inc, Mountain View, CA, USA
Andrew Carroll Google Inc, Mountain View, CA, USA
Erin K. Molloy Department of Computer Science, University of Maryland, College Park, MD, USA
Irina Pushel Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Erin Guest Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Tomi Pastinen Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Kishwar Shafin Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
Karen H. Miga UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Salem Malikic Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Chi-Ping Day Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Nicolas Robine New York Genome Center, NY, USA
Cenk Sahinalp Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
Michael Dean Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA
Midhat S. Farooqi Children’s Mercy Hospital, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
Benedict Paten UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
Mikhail Kolmogorov Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA

Jones EF, Haldar A, Oza VH, Lasseigne BN. Quantifying transcriptome diversity: a review. Brief Funct Genomics 2024;23:83-94. [PMID: 37225889 DOI: 10.1093/bfgp/elad019] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/18/2023] [Revised: 04/14/2023] [Accepted: 05/05/2023] [Indexed: 05/26/2023] Open

Weideman AMK, Wang R, Ibrahim JG, Jiang Y. Canopy2: tumor phylogeny inference by bulk DNA and single-cell RNA sequencing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.18.585595. [PMID: 38562795 PMCID: PMC10983938 DOI: 10.1101/2024.03.18.585595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]

Schraiber JG, Edge MD, Pennell M. Unifying approaches from statistical genetics and phylogenetics for mapping phenotypes in structured populations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.10.579721. [PMID: 38496530 PMCID: PMC10942266 DOI: 10.1101/2024.02.10.579721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]

Abstract

In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these two fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred. To help bridge this divide, we derive a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., Genome-Wide Association Studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur using analytical theory and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique (including both the genetic relatedness matrix (GRM) and its leading eigenvectors, which correspond to the principal components of the genotype matrix, in a regression model) can mitigate spurious correlations in phylogenetic analyses.
As a case study of this, we re-examine an analysis testing for co-evolution of expression levels between genes across a fungal phylogeny, and show that including covariance matrix eigenvectors as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.

Lin KZ, Qiu Y, Roeder K. eSVD-DE: Cohort-wide differential expression in single-cell RNA-seq data using exponential-family embeddings. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.11.22.568369. [PMID: 38045428 PMCID: PMC10690270 DOI: 10.1101/2023.11.22.568369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]

Öling S, Struck E, Noreen-Thorsen M, Zwahlen M, von Feilitzen K, Odeberg J, Pontén F, Lindskog C, Uhlén M, Dusart P, Butler LM. A human stomach cell type transcriptome atlas. BMC Biol 2024;22:36. [PMID: 38355543 PMCID: PMC10865703 DOI: 10.1186/s12915-024-01812-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 01/02/2024] [Indexed: 02/16/2024] Open

Affiliation(s)

S Öling Department of Clinical Medicine, Translational Vascular Research, The Arctic University of Norway, 9019, Tromsø, Norway
E Struck Department of Clinical Medicine, Translational Vascular Research, The Arctic University of Norway, 9019, Tromsø, Norway
M Noreen-Thorsen Department of Clinical Medicine, Translational Vascular Research, The Arctic University of Norway, 9019, Tromsø, Norway
M Zwahlen Science for Life Laboratory, Department of Protein Science, Royal Institute of Technology (KTH), 171 21, Stockholm, Sweden
K von Feilitzen Science for Life Laboratory, Department of Protein Science, Royal Institute of Technology (KTH), 171 21, Stockholm, Sweden
J Odeberg Department of Clinical Medicine, Translational Vascular Research, The Arctic University of Norway, 9019, Tromsø, Norway Science for Life Laboratory, Department of Protein Science, Royal Institute of Technology (KTH), 171 21, Stockholm, Sweden The University Hospital of North Norway (UNN), 9019, Tromsø, Norway Department of Haematology, Coagulation Unit, Karolinska University Hospital, 171 76, Stockholm, Sweden
F Pontén Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 752 37, Uppsala, Sweden
C Lindskog Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 752 37, Uppsala, Sweden
M Uhlén Science for Life Laboratory, Department of Protein Science, Royal Institute of Technology (KTH), 171 21, Stockholm, Sweden
P Dusart Science for Life Laboratory, Department of Protein Science, Royal Institute of Technology (KTH), 171 21, Stockholm, Sweden Clinical Chemistry and Blood Coagulation Research, Department of Molecular Medicine and Surgery, Karolinska Institute, 171 76, Stockholm, Sweden Clinical Chemistry, Karolinska University Laboratory, Karolinska University Hospital, 171 76, Stockholm, Sweden
L M Butler Department of Clinical Medicine, Translational Vascular Research, The Arctic University of Norway, 9019, Tromsø, Norway. Science for Life Laboratory, Department of Protein Science, Royal Institute of Technology (KTH), 171 21, Stockholm, Sweden. Clinical Chemistry and Blood Coagulation Research, Department of Molecular Medicine and Surgery, Karolinska Institute, 171 76, Stockholm, Sweden. Clinical Chemistry, Karolinska University Laboratory, Karolinska University Hospital, 171 76, Stockholm, Sweden.

Groh JS, Vik DC, Stevens KA, Brown PJ, Langley CH, Coop G. Distinct ancient structural polymorphisms control heterodichogamy in walnuts and hickories. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.12.23.573205. [PMID: 38187547 PMCID: PMC10769452 DOI: 10.1101/2023.12.23.573205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]

Abstract

The maintenance of stable mating type polymorphisms is a classic example of balancing selection, underlying the nearly ubiquitous 50/50 sex ratio in species with separate sexes. One lesser known but intriguing example of a balanced mating polymorphism in angiosperms is heterodichogamy - polymorphism for opposing directions of dichogamy (temporal separation of male and female function in hermaphrodites) within a flowering season. This mating system is common throughout Juglandaceae, the family that includes globally important and iconic nut and timber crops - walnuts (Juglans), as well as pecan and other hickories (Carya). In both genera, heterodichogamy is controlled by a single dominant allele. We fine-map the locus in each genus, and find two ancient (>50 Mya) structural variants involving different genes that both segregate as genus-wide trans-species polymorphisms. The Juglans locus maps to a ca. 20 kb structural variant adjacent to a probable trehalose phosphate phosphatase (TPPD-1), homologs of which regulate floral development in model systems. TPPD-1 is differentially expressed between morphs in developing male flowers, with increased allele-specific expression of the dominant haplotype copy. Across species, the dominant haplotype contains a tandem array of duplicated sequence motifs, part of which is an inverted copy of the TPPD-1 3' UTR. These repeats generate various distinct small RNAs matching sequences within the 3' UTR and further downstream. In contrast to the single-gene Juglans locus, the Carya heterodichogamy locus maps to a ca. 200-450 kb cluster of tightly linked polymorphisms across 20 genes, some of which have known roles in flowering and are differentially expressed between morphs in developing flowers. 
The dominant haplotype in pecan, which is nearly always heterozygous and appears to rarely recombine, shows markedly reduced genetic diversity and is over twice as long as its recessive counterpart due to accumulation of various types of transposable elements. We did not detect either genetic system in other heterodichogamous genera within Juglandaceae, suggesting that additional genetic systems for heterodichogamy may yet remain undiscovered.

Church SH, Mah JL, Wagner G, Dunn CW. Normalizing need not be the norm: count-based math for analyzing single-cell data. Theory Biosci 2024;143:45-62. [PMID: 37947999 DOI: 10.1007/s12064-023-00408-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 10/13/2023] [Indexed: 11/12/2023]

Kousnetsov R, Bourque J, Surnov A, Fallahee I, Hawiger D. Single-cell sequencing analysis within biologically relevant dimensions. Cell Syst 2024;15:83-103.e11. [PMID: 38198894 DOI: 10.1016/j.cels.2023.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 05/23/2023] [Accepted: 12/14/2023] [Indexed: 01/12/2024]

Maiorino E, De Marzio M, Xu Z, Yun JH, Chase RP, Hersh CP, Weiss ST, Silverman EK, Castaldi PJ, Glass K. Joint clinical and molecular subtyping of COPD with variational autoencoders. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.08.19.23294298. [PMID: 38260473 PMCID: PMC10802661 DOI: 10.1101/2023.08.19.23294298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]

Chrysinas P, Venkatesan S, Ang I, Ghosh V, Chen C, Neelamegham S, Gunawan R. Cell and tissue-specific glycosylation pathways informed by single-cell transcriptomics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.09.26.559616. [PMID: 38260527 PMCID: PMC10802235 DOI: 10.1101/2023.09.26.559616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]

Holcombe J, Weavers H. Functional-metabolic coupling in distinct renal cell types coordinates organ-wide physiology and delays premature ageing. Nat Commun 2023;14:8405. [PMID: 38110414 PMCID: PMC10728150 DOI: 10.1038/s41467-023-44098-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 11/30/2023] [Indexed: 12/20/2023] Open

Karakurt HU, Pir P. SUMA: a lightweight machine learning model-powered shared nearest neighbour-based clustering application interface for scRNA-Seq data. Turk J Biol 2023;47:413-422. [PMID: 38681777 PMCID: PMC11045205 DOI: 10.55730/1300-0152.2675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/28/2023] [Accepted: 12/18/2023] [Indexed: 05/01/2024] Open

Abstract

Background/aim

Single-cell transcriptomics (scRNA-Seq) explores cellular diversity at the gene expression level. Due to the inherent sparsity and noise in scRNA-Seq data and the uncertainty on the types of sequenced cells, effective clustering and cell type annotation are essential. The graph-based clustering of scRNA-Seq data is a simple yet powerful approach that presents data as a "shared nearest neighbour" graph and clusters the cells using graph clustering algorithms. These algorithms are dependent on several user-defined parameters. Here we present SUMA, a lightweight tool that uses a random forest model to predict the optimum number of neighbours to obtain the optimum clustering results. Moreover, we integrated our method with other commonly used methods in an RShiny application. SUMA can be used in a local environment (https://github.com/hkarakurt8742/SUMA) or as a browser tool (https://hkarakurt.shinyapps.io/suma/).

Materials and methods

Publicly available scRNA-Seq datasets and 3 different graph-based clustering algorithms were used to develop SUMA, and a large range for number of neighbours and variant genes was taken into consideration. The quality of clustering was assessed using the adjusted Rand index (ARI) and true labels of each dataset. The data were split into training and test datasets, and the model was built and optimised using Scikit-learn (Python) and randomForest (R) libraries.

Results

The accuracy of our machine learning model was 0.96, while the AUC of the ROC curve was 0.98. The model indicated that the number of cells in scRNA-Seq data is the most important feature when deciding the number of neighbours.

Conclusion

We developed and evaluated the SUMA model and implemented the method in the SUMAShiny app, which integrates SUMA with different clustering methods and enables nonbioinformatician users to cluster and visualise their scRNA data easily. The SUMAShiny app is available both for desktop and browser use.

Choi J, Ehrlich ME, Roussos P, Wang P, Yuan GC, Song X. QuadST: A Powerful and Robust Approach for Identifying Cell-Cell Interaction-Changed Genes on Spatially Resolved Transcriptomics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.04.570019. [PMID: 38106025 PMCID: PMC10723309 DOI: 10.1101/2023.12.04.570019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]

Wang Z, Xie X, Liu S, Ji Z. scFseCluster: a feature selection-enhanced clustering for single-cell RNA-seq data. Life Sci Alliance 2023;6:e202302103. [PMID: 37788907 PMCID: PMC10547911 DOI: 10.26508/lsa.202302103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 09/21/2023] [Accepted: 09/22/2023] [Indexed: 10/05/2023] Open

Zhang Y, Ben Nathan J, Moreno A, Merkel R, Kahng MW, Hayes MR, Reiner BC, Crist RC, Schmidt HD. Calcitonin receptor signaling in nucleus accumbens D1R- and D2R-expressing medium spiny neurons bidirectionally alters opioid taking in male rats. Neuropsychopharmacology 2023;48:1878-1888. [PMID: 37355732 PMCID: PMC10584857 DOI: 10.1038/s41386-023-01634-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 06/12/2023] [Accepted: 06/13/2023] [Indexed: 06/26/2023]

Affiliation(s)

Yafang Zhang Department of Biobehavioral Health Sciences, School of Nursing, University of Pennsylvania, Philadelphia, PA, 19104, USA Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Jennifer Ben Nathan Department of Biobehavioral Health Sciences, School of Nursing, University of Pennsylvania, Philadelphia, PA, 19104, USA Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Amanda Moreno Department of Biobehavioral Health Sciences, School of Nursing, University of Pennsylvania, Philadelphia, PA, 19104, USA Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Riley Merkel Department of Biobehavioral Health Sciences, School of Nursing, University of Pennsylvania, Philadelphia, PA, 19104, USA Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Michelle W Kahng Department of Biobehavioral Health Sciences, School of Nursing, University of Pennsylvania, Philadelphia, PA, 19104, USA Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Matthew R Hayes Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Benjamin C Reiner Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Richard C Crist Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
Heath D Schmidt Department of Biobehavioral Health Sciences, School of Nursing, University of Pennsylvania, Philadelphia, PA, 19104, USA. Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.

Tan KT, Slevin MK, Leibowitz ML, Garrity-Janger M, Li H, Meyerson M. Neotelomeres and Telomere-Spanning Chromosomal Arm Fusions in Cancer Genomes Revealed by Long-Read Sequencing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.30.569101. [PMID: 38077026 PMCID: PMC10705422 DOI: 10.1101/2023.11.30.569101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]

Zheng W, Min W, Wang S. TsImpute: an accurate two-step imputation method for single-cell RNA-seq data. Bioinformatics 2023;39:btad731. [PMID: 38039139 PMCID: PMC10724850 DOI: 10.1093/bioinformatics/btad731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 11/22/2023] [Accepted: 11/30/2023] [Indexed: 12/03/2023] Open

Wang H, Zhang C, Hong SH, Maye P, Rowe D, Shin DG. CGCom: a framework for inferring Cell-cell Communication based on Graph Neural Network. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.10.566642. [PMID: 38014057 PMCID: PMC10680670 DOI: 10.1101/2023.11.10.566642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]

Abstract

Cell-cell communication is crucial in maintaining cellular homeostasis, cell survival and various regulatory relationships among interacting cells. Thanks to recent advances of spatial transcriptomics technologies, we can now explore if and how cells' proximal information available from spatial transcriptomics datasets can be used to infer cell-cell communication. Here we present a cell-cell communication inference framework, called CGCom, which uses a graph neural network (GNN) to learn communication patterns among interacting cells by combining single-cell spatial transcriptomic datasets with publicly available ligand-receptor information and the molecular regulatory information down-stream of the ligand-receptor signaling. To evaluate the performance of CGCom, we applied it to mouse embryo seqFISH datasets. Our results demonstrate that CGCom can not only accurately infer cell communication between individual cell pairs but also generalize its learning to predict communication between different cell types. We compared the performance of CGCom with two existing methods, CellChat and CellPhoneDB, and our comparative study revealed both common and unique communication patterns from the three approaches. Commonly found communication patterns include three sets of ligand-receptor communication relationships, one between surface ectoderm cells and spinal cord cells, one between gut tube cells and endothelium, and one between neural crest and endothelium, all of which have already been reported in the literature thus offering credibility of all three methods. However, we hypothesize that CGCom is superior in reducing false positives thanks to its use of cell proximal information and its learning between specific cell pairs rather than between cell types. CGCom is a GNN-based solution that can take advantage of spatially resolved single-cell transcriptomic data in predicting cell-cell communication with a higher accuracy.

Liu Y, Sapoval N, Gallego-García P, Tomás L, Posada D, Treangen TJ, Stadler LB. Crykey: Rapid Identification of SARS-CoV-2 Cryptic Mutations in Wastewater. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.06.16.23291524. [PMID: 37986916 PMCID: PMC10659477 DOI: 10.1101/2023.06.16.23291524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]

Leduc A, Harens H, Slavov N. Modeling and interpretation of single-cell proteogenomic data. ARXIV 2023:arXiv:2308.07465v2. [PMID: 37645043 PMCID: PMC10462161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]

Baharav TZ, Tse D, Salzman J. OASIS: An interpretable, finite-sample valid alternative to Pearson's X2 for scientific discovery. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.16.533008. [PMID: 37961606 PMCID: PMC10634974 DOI: 10.1101/2023.03.16.533008] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/15/2023]

Kim D, Tran A, Kim HJ, Lin Y, Yang JYH, Yang P. Gene regulatory network reconstruction: harnessing the power of single-cell multi-omic data. NPJ Syst Biol Appl 2023;9:51. [PMID: 37857632 PMCID: PMC10587078 DOI: 10.1038/s41540-023-00312-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/02/2023] [Indexed: 10/21/2023] Open