1
Koca MB, Sevilgen FE. Integration of single-cell proteomic datasets through distinctive proteins in cell clusters. Proteomics 2024; 24:e2300282. [PMID: 38135888 DOI: 10.1002/pmic.202300282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 11/01/2023] [Accepted: 12/04/2023] [Indexed: 12/24/2023]
Abstract
The use of mass spectrometry and antibody-based sequencing technologies at the single-cell level has led to an increase in single-cell proteomic datasets. Integrating these datasets is crucial to eliminate the batch effect that often arises due to their limited sequencing molecules. Although methods for horizontally integrating high-dimensional single-cell transcriptomic datasets can also be applied to single-cell proteomic datasets, a specialized approach explicitly tailored for low-dimensional proteomic datasets may enhance the integration process. Here, we introduce SCPRO-HI, an algorithm for the horizontal integration of antibody-based single-cell proteomic datasets. It utilizes a hierarchical cell anchoring technique to match cells based on the similarity of distinctive proteins for constituting cell clusters. A novel variational auto-encoder model is employed for correcting batch effects on the protein abundances, eliminating the need for mapping them into a new domain. Moreover, we propose a technique for extending the algorithm to high-dimensional datasets. The performance of the SCPRO-HI algorithm is evaluated using simulated and real-world single-cell proteomic datasets. The findings demonstrate our algorithm outperforms state-of-the-art methods, achieving a 75% higher silhouette score while preserving HVPs 13% better. Furthermore, the algorithm shows competitive performance in transcriptomic datasets, suggesting potential for integrating high-dimensional mass-spectrometry-based proteomic datasets.
Affiliation(s)
- Mehmet Burak Koca
  - Computer Engineering Department, Gebze Technical University, Kocaeli, Türkiye
- Fatih Erdoğan Sevilgen
  - Institute for Data Science and Artificial Intelligence, Boğaziçi University, İstanbul, Türkiye
2
Peidli S, Green TD, Shen C, Gross T, Min J, Garda S, Yuan B, Schumacher LJ, Taylor-King JP, Marks DS, Luna A, Blüthgen N, Sander C. scPerturb: harmonized single-cell perturbation data. Nat Methods 2024; 21:531-540. [PMID: 38279009 DOI: 10.1038/s41592-023-02144-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Accepted: 12/04/2023] [Indexed: 01/28/2024]
Abstract
Analysis across a growing number of single-cell perturbation datasets is hampered by poor data interoperability. To facilitate development and benchmarking of computational methods, we collect a set of 44 publicly available single-cell perturbation-response datasets with molecular readouts, including transcriptomics, proteomics and epigenomics. We apply uniform quality control pipelines and harmonize feature annotations. The resulting information resource, scPerturb, enables development and testing of computational methods, and facilitates comparison and integration across datasets. We describe energy statistics (E-statistics) for quantification of perturbation effects and significance testing, and demonstrate E-distance as a general distance measure between sets of single-cell expression profiles. We illustrate the application of E-statistics for quantifying similarity and efficacy of perturbations. The perturbation-response datasets and E-statistics computation software are publicly available at scperturb.org. This work provides an information resource for researchers working with single-cell perturbation data and recommendations for experimental design, including optimal cell counts and read depth.
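The E-distance mentioned in this abstract can be illustrated with the standard energy-distance formula, E(X, Y) = 2·δ_XY − σ_X − σ_Y, where δ_XY is the mean distance between cells of the two sets and σ_X, σ_Y are the mean distances within each set. The following is a minimal sketch only: the function names, the plain Euclidean metric, and the simulated data are assumptions made for illustration, and the scPerturb software may use a different metric or estimator.

```python
import numpy as np


def mean_pairwise_distance(a, b):
    """Mean Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).mean()


def e_distance(x, y):
    """Energy distance between two sets of profiles (rows are cells)."""
    return (2.0 * mean_pairwise_distance(x, y)
            - mean_pairwise_distance(x, x)
            - mean_pairwise_distance(y, y))


# Simulated stand-ins for control and perturbed cells (not real data).
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(200, 10))
perturbed = rng.normal(1.0, 1.0, size=(200, 10))

same = e_distance(control, control)       # identical sets give zero
shifted = e_distance(control, perturbed)  # shifted sets give a positive value
print(same, shifted)
```

A larger value suggests a stronger separation between the two sets of profiles; a permutation test over cell labels is one common way to attach a significance level to it.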
Affiliation(s)
- Stefan Peidli
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität, Berlin, Germany.
  - Institute of Biology, Humboldt-Universität, Berlin, Germany.
- Tessa D Green
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Ciyue Shen
  - Departments of Cell Biology and Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Broad Institute, Cambridge, MA, USA
- Joseph Min
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Samuele Garda
  - Institute of Biology, Humboldt-Universität, Berlin, Germany
  - Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany
- Bo Yuan
  - Departments of Cell Biology and Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Broad Institute, Cambridge, MA, USA
- Linus J Schumacher
  - Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, UK
- Debora S Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Broad Institute, Cambridge, MA, USA
- Augustin Luna
  - Departments of Cell Biology and Systems Biology, Harvard Medical School, Boston, MA, USA.
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
  - Broad Institute, Cambridge, MA, USA.
  - Computational Biology Branch, National Library of Medicine and Developmental Therapeutics Branch, National Cancer Institute, Bethesda, MD, USA.
- Nils Blüthgen
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität, Berlin, Germany.
  - Institute of Biology, Humboldt-Universität, Berlin, Germany.
- Chris Sander
  - Departments of Cell Biology and Systems Biology, Harvard Medical School, Boston, MA, USA.
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
  - Broad Institute, Cambridge, MA, USA.
3
Ghazanfar S, Guibentif C, Marioni JC. Stabilized mosaic single-cell data integration using unshared features. Nat Biotechnol 2024; 42:284-292. [PMID: 37231260 PMCID: PMC10869270 DOI: 10.1038/s41587-023-01766-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 02/10/2022] [Accepted: 03/28/2023] [Indexed: 05/27/2023]
Abstract
Currently available single-cell omics technologies capture many unique features with different biological information content. Data integration aims to place cells, captured with different technologies, onto a common embedding to facilitate downstream analytical tasks. Current horizontal data integration techniques use a set of common features, thereby ignoring non-overlapping features and losing information. Here we introduce StabMap, a mosaic data integration technique that stabilizes mapping of single-cell data by exploiting the non-overlapping features. StabMap first infers a mosaic data topology based on shared features, then projects all cells onto supervised or unsupervised reference coordinates by traversing shortest paths along the topology. We show that StabMap performs well in various simulation contexts, facilitates 'multi-hop' mosaic data integration where some datasets do not share any features and enables the use of spatial gene expression features for mapping dissociated single-cell data onto a spatial transcriptomic reference.
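The idea of a mosaic data topology traversed along shortest paths can be sketched as a small graph problem: datasets are nodes, two datasets are connected when they share at least one feature, and a dataset with no features in common with the reference reaches it through intermediate datasets. The code below is an illustrative sketch only; the dataset names, feature names, and helper functions are hypothetical, and it does not reproduce StabMap's actual projection step.

```python
from collections import deque


def mosaic_topology(features_by_dataset):
    """Connect every pair of datasets that share at least one feature."""
    names = list(features_by_dataset)
    graph = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if features_by_dataset[a] & features_by_dataset[b]:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of datasets from start to goal."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # no chain of shared features links the two datasets


# Hypothetical datasets: "atac" and "rna" share no features directly.
datasets = {
    "rna": {"g1", "g2", "g3"},
    "multiome": {"g3", "p1"},
    "atac": {"p1", "p2"},
}
graph = mosaic_topology(datasets)
path = shortest_path(graph, "atac", "rna")
print(path)  # ['atac', 'multiome', 'rna']
```

Here the "multi-hop" case from the abstract corresponds to a path with more than one edge: the bridging dataset supplies the shared features at each hop.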
Affiliation(s)
- Shila Ghazanfar
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.
  - School of Mathematics and Statistics, The University of Sydney, Camperdown, New South Wales, Australia.
  - Charles Perkins Centre, The University of Sydney, Camperdown, New South Wales, Australia.
- Carolina Guibentif
  - Sahlgrenska Center for Cancer Research, Inst. Biomedicine, Dept. Microbiology and Immunology, University of Gothenburg, Gothenburg, Sweden
- John C Marioni
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.
  - Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.
4
Rodríguez-Bejarano OH, Roa L, Vargas-Hernández G, Botero-Espinosa L, Parra-López C, Patarroyo MA. Strategies for studying immune and non-immune human and canine mammary gland cancer tumour infiltrate. Biochim Biophys Acta Rev Cancer 2024; 1879:189064. [PMID: 38158026 DOI: 10.1016/j.bbcan.2023.189064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 12/11/2023] [Accepted: 12/20/2023] [Indexed: 01/03/2024]
Abstract
The tumour microenvironment (TME) is usually defined as a cell environment associated with tumours or cancerous stem cells where conditions are established affecting tumour development and progression through malignant cell interaction with non-malignant cells. The TME is made up of endothelial, immune and non-immune cells, extracellular matrix (ECM) components and signalling molecules acting specifically on tumour and non-tumour cells. Breast cancer (BC) is the commonest malignant neoplasm worldwide and the main cause of mortality in women globally; advances regarding BC study and understanding it are relevant for acquiring novel, personalised therapeutic tools. Studying canine mammary gland tumours (CMGT) is one of the most relevant options for understanding BC using animal models as they share common epidemiological, clinical, pathological, biological, environmental, genetic and molecular characteristics with human BC. In-depth, detailed investigation regarding knowledge of human BC-related TME and in its canine model is considered extremely relevant for understanding changes in TME composition during tumour development. This review addresses important aspects concerned with different methods used for studying BC- and CMGT-related TME that are important for developing new and more effective therapeutic strategies for attacking a tumour during specific evolutionary stages.
Affiliation(s)
- Oscar Hernán Rodríguez-Bejarano
  - Health Sciences Faculty, Universidad de Ciencias Aplicadas y Ambientales (U.D.C.A), Calle 222#55-37, Bogotá 111166, Colombia; Molecular Biology and Immunology Department, Fundacion Instituto de Inmunología de Colombia (FIDIC), Carrera 50#26-20, Bogotá 111321, Colombia; PhD Programme in Biotechnology, Faculty of Sciences, Universidad Nacional de Colombia, Carrera 45#26-85, Bogotá 111321, Colombia
- Leonardo Roa
  - Veterinary Clinic, Faculty of Agricultural Sciences, Universidad de La Salle, Carrera 7 #179-03, Bogotá 110141, Colombia
- Giovanni Vargas-Hernández
  - Animal Health Department, Faculty of Veterinary Medicine and Zootechnics, Universidad Nacional de Colombia, Carrera 45#26-85, Bogotá 111321, Colombia
- Lucía Botero-Espinosa
  - Animal Health Department, Faculty of Veterinary Medicine and Zootechnics, Universidad Nacional de Colombia, Carrera 45#26-85, Bogotá 111321, Colombia
- Carlos Parra-López
  - Microbiology Department, Faculty of Medicine, Universidad Nacional de Colombia, Carrera 45#26-85, Bogotá 111321, Colombia.
- Manuel Alfonso Patarroyo
  - Molecular Biology and Immunology Department, Fundacion Instituto de Inmunología de Colombia (FIDIC), Carrera 50#26-20, Bogotá 111321, Colombia; Microbiology Department, Faculty of Medicine, Universidad Nacional de Colombia, Carrera 45#26-85, Bogotá 111321, Colombia.
5
Lee Y, Song J, Jeong Y, Choi E, Ahn C, Jang W. Meta-analysis of single-cell RNA-sequencing data for depicting the transcriptomic landscape of chronic obstructive pulmonary disease. Comput Biol Med 2023; 167:107685. [PMID: 37976829 DOI: 10.1016/j.compbiomed.2023.107685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 10/17/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023]
Abstract
Chronic obstructive pulmonary disease (COPD) is a respiratory disease characterized by airflow limitation and chronic inflammation of the lungs that is a leading cause of death worldwide. Since the complete pathological mechanisms at the single-cell level are not fully understood yet, an integrative approach to characterizing the single-cell-resolution landscape of COPD is required. To identify the cell types and mechanisms associated with the development of COPD, we conducted a meta-analysis using three single-cell RNA-sequencing datasets of COPD. Among the 154,011 cells from 16 COPD patients and 18 healthy subjects, 17 distinct cell types were observed. Of the 17 cell types, monocytes, mast cells, and alveolar type 2 cells (AT2 cells) were found to be etiologically implicated in COPD based on genetic and transcriptomic features. The most transcriptomically diversified states of the three etiological cell types showed significant enrichment in immune/inflammatory responses (monocytes and mast cells) and/or mitochondrial dysfunction (monocytes and AT2 cells). We then identified three chemical candidates that may potentially induce COPD by modulating gene expression patterns in the three etiological cell types. Overall, our study suggests the single-cell level mechanisms underlying the pathogenesis of COPD and may provide information on toxic compounds that could be potential risk factors for COPD.
Affiliation(s)
- Yubin Lee
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
- Jaeseung Song
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
- Yeonbin Jeong
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
- Eunyoung Choi
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
- Chulwoo Ahn
  - Department of Internal Medicine, Yonsei University College of Medicine, Seoul, 03722, Republic of Korea.
- Wonhee Jang
  - Department of Life Sciences, Dongguk University, Seoul, 04620, Republic of Korea.
6
Nale V, Chiodi A, Di Nanni N, Cifola I, Moscatelli M, Cocola C, Gnocchi M, Piscitelli E, Sula A, Zucchi I, Reinbold R, Milanesi L, Mezzelani A, Pelucchi P, Mosca E. scMuffin: an R package to disentangle solid tumor heterogeneity by single-cell gene expression analysis. BMC Bioinformatics 2023; 24:445. [PMID: 38012590 PMCID: PMC10680269 DOI: 10.1186/s12859-023-05563-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 11/08/2023] [Indexed: 11/29/2023] Open
Abstract
INTRODUCTION: Single-cell (SC) gene expression analysis is crucial to dissect the complex cellular heterogeneity of solid tumors, which is one of the main obstacles for the development of effective cancer treatments. Such tumors typically contain a mixture of cells with aberrant genomic and transcriptomic profiles affecting specific sub-populations that might have a pivotal role in cancer progression, whose identification eludes bulk RNA-sequencing approaches. We present scMuffin, an R package that enables the characterization of cell identity in solid tumors on the basis of various complementary analyses on SC gene expression data.
RESULTS: scMuffin provides a series of functions to calculate qualitative and quantitative scores, such as: expression of marker sets for normal and tumor conditions, pathway activity, cell state trajectories, Copy Number Variations, transcriptional complexity and proliferation state. Thus, scMuffin facilitates the combination of various evidences that can be used to distinguish normal and tumoral cells, define cell identities, cluster cells in different ways, link genomic aberrations to phenotypes and identify subtle differences between cell subtypes or cell states. We analysed public SC expression datasets of human high-grade gliomas as a proof-of-concept to show the value of scMuffin and illustrate its user interface. Nevertheless, these analyses lead to interesting findings, which suggest that some chromosomal amplifications might underlie the invasive tumor phenotype and the presence of cells that possess tumor initiating cells characteristics.
CONCLUSIONS: The analyses offered by scMuffin and the results achieved in the case study show that our tool helps address the main challenges in the bioinformatics analysis of SC expression data from solid tumors.
Affiliation(s)
- Valentina Nale
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Alice Chiodi
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Noemi Di Nanni
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Ingrid Cifola
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Marco Moscatelli
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Cinzia Cocola
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Matteo Gnocchi
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Eleonora Piscitelli
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Ada Sula
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Ileana Zucchi
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Rolland Reinbold
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Luciano Milanesi
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Alessandra Mezzelani
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy
- Paride Pelucchi
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy.
- Ettore Mosca
  - Institute of Biomedical Technologies, National Research Council, Via Fratelli Cervi 93, 20054, Segrate, Milan, Italy.
7
Tian L, Xie Y, Xie Z, Tian J, Tian W. AtacAnnoR: a reference-based annotation tool for single cell ATAC-seq data. Brief Bioinform 2023; 24:bbad268. [PMID: 37497729 DOI: 10.1093/bib/bbad268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 06/14/2023] [Accepted: 07/04/2023] [Indexed: 07/28/2023] Open
Abstract
Here, we present AtacAnnoR, a two-round annotation method for scATAC-seq data using well-annotated scRNA-seq data as reference. We evaluate AtacAnnoR's performance against six competing methods on 11 benchmark datasets. Our results show that AtacAnnoR achieves the highest mean accuracy and the highest mean balanced accuracy and performs particularly well when unpaired scRNA-seq data are used as the reference. Furthermore, AtacAnnoR implements a 'Combine and Discard' strategy to further improve annotation accuracy when annotations of multiple references are available. AtacAnnoR has been implemented in an R package and can be directly integrated into currently popular scATAC-seq analysis pipelines.
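The two metrics named in this abstract can diverge sharply when cell types are unevenly represented. The sketch below assumes the common definitions (accuracy as the fraction of correctly labelled cells, balanced accuracy as the mean of per-class recall); the abstract does not state the exact formulas used, and the labels are invented for illustration.

```python
from collections import defaultdict


def accuracy(true_labels, predicted_labels):
    """Fraction of cells whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def balanced_accuracy(true_labels, predicted_labels):
    """Mean of per-class recall, so rare cell types weigh as much as common ones."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(true_labels, predicted_labels):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Invented example: an annotator that labels every cell as the majority type.
truth = ["T cell"] * 8 + ["B cell"] * 2
predicted = ["T cell"] * 10

acc = accuracy(truth, predicted)           # 0.8
bal = balanced_accuracy(truth, predicted)  # 0.5
print(acc, bal)
```

Reporting both values helps separate methods that genuinely annotate rare populations from those that mostly recover the dominant cell type.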
Affiliation(s)
- Lejin Tian
  - State Key Laboratory of Genetic Engineering, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai, China
- Yunxiao Xie
  - State Key Laboratory of Genetic Engineering, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai, China
- Zhaobin Xie
  - State Key Laboratory of Genetic Engineering, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai, China
- Weidong Tian
  - State Key Laboratory of Genetic Engineering, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai, China
  - Children's Hospital of Fudan University, Shanghai, China
  - Children's Hospital of Shandong University, Jinan, China
8
An integrated single cell and spatial transcriptomic map of human white adipose tissue. Nat Commun 2023; 14:1438. [PMID: 36922516 PMCID: PMC10017705 DOI: 10.1038/s41467-023-36983-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 12/02/2022] [Accepted: 02/27/2023] [Indexed: 03/18/2023] Open
Abstract
To date, single-cell studies of human white adipose tissue (WAT) have been based on small cohort sizes and no cellular consensus nomenclature exists. Herein, we performed a comprehensive meta-analysis of publicly available and newly generated single-cell, single-nucleus, and spatial transcriptomic results from human subcutaneous, omental, and perivascular WAT. Our high-resolution map is built on data from ten studies and allowed us to robustly identify >60 subpopulations of adipocytes, fibroblast and adipogenic progenitors, vascular, and immune cells. Using these results, we deconvolved spatial and bulk transcriptomic data from nine additional cohorts to provide spatial and clinical dimensions to the map. This identified cell-cell interactions as well as relationships between specific cell subtypes and insulin resistance, dyslipidemia, adipocyte volume, and lipolysis upon long-term weight changes. Altogether, our meta-map provides a rich resource defining the cellular and microarchitectural landscape of human WAT and describes the associations between specific cell types and metabolic states.
9
Juan H, Huang H. Quantitative analysis of high-throughput biological data. WIREs Computational Molecular Science 2023. [DOI: 10.1002/wcms.1658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Affiliation(s)
- Hsueh-Fen Juan
  - Department of Life Science, Institute of Biomedical Electronics and Bioinformatics, and Center for Systems Biology, National Taiwan University, Taipei, Taiwan
  - Taiwan AI Labs, Taipei, Taiwan
- Hsuan-Cheng Huang
  - Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
10
Gan D, Li J. SCIBER: a simple method for removing batch effects from single-cell RNA-sequencing data. Bioinformatics 2023; 39:6957084. [PMID: 36548380 PMCID: PMC9848058 DOI: 10.1093/bioinformatics/btac819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 11/27/2022] [Accepted: 12/21/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Integrative analysis of multiple single-cell RNA-sequencing datasets allows for more comprehensive characterizations of cell types, but systematic technical differences between datasets, known as 'batch effects', need to be removed before integration to avoid misleading interpretation of the data. Although many batch-effect-removal methods have been developed, there is still a large room for improvement: most existing methods only give dimension-reduced data instead of expression data of individual genes, are based on computationally demanding models and are black-box models and thus difficult to interpret or tune.
RESULTS: Here, we present a new batch-effect-removal method called SCIBER (Single-Cell Integrator and Batch Effect Remover) and study its performance on real datasets. SCIBER matches cell clusters across batches according to the overlap of their differentially expressed genes. As a simple algorithm that has better scalability to data with a large number of cells and is easy to tune, SCIBER shows comparable and sometimes better accuracy in removing batch effects on real datasets compared to the state-of-the-art methods, which are much more complicated. Moreover, SCIBER outputs expression data in the original space, that is, the expression of individual genes, which can be used directly for downstream analyses. Additionally, SCIBER is a reference-based method, which assigns one of the batches as the reference batch and keeps it untouched during the process, making it especially suitable for integrating user-generated datasets with standard reference data such as the Human Cell Atlas.
AVAILABILITY AND IMPLEMENTATION: SCIBER is publicly available as an R package on CRAN: https://cran.r-project.org/web/packages/SCIBER/. A vignette is included in the CRAN R package.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
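Matching clusters across batches by the overlap of their differentially expressed genes can be illustrated with a simple set-overlap score. The sketch below uses the Jaccard index as a stand-in overlap measure and invented marker sets; it is not the SCIBER R package's implementation, whose overlap statistic and matching rules are not described in this abstract.

```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_clusters(reference_markers, query_markers):
    """Assign each query cluster to the reference cluster with the largest overlap."""
    matches = {}
    for q_name, q_genes in query_markers.items():
        best = max(reference_markers,
                   key=lambda r: jaccard(reference_markers[r], q_genes))
        matches[q_name] = (best, jaccard(reference_markers[best], q_genes))
    return matches


# Invented differentially expressed gene sets for two batches.
reference_markers = {
    "ref_T": {"CD3D", "CD3E", "IL7R"},
    "ref_B": {"MS4A1", "CD79A", "CD79B"},
}
query_markers = {
    "q0": {"CD79A", "MS4A1", "CD74"},
    "q1": {"CD3D", "IL7R", "CCR7"},
}

matches = match_clusters(reference_markers, query_markers)
print(matches)  # {'q0': ('ref_B', 0.5), 'q1': ('ref_T', 0.5)}
```

Once clusters are paired, a reference-based method can adjust the query batch toward its matched reference cluster while leaving the reference batch unchanged, as the abstract describes.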
Affiliation(s)
- Dailin Gan
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN 46556, USA
- Jun Li
  - To whom correspondence should be addressed.
11
Abstract
Single-cell studies are enabling our understanding of the molecular processes of normal cell development and the onset of several pathologies. For instance, single-cell RNA sequencing (scRNA-Seq) measures the transcriptome-wide gene expression at a single-cell resolution, allowing for studying the heterogeneity among the cells of the same population and revealing complex and rare cell populations. On the other hand, single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-Seq) can be used to define transcriptional and epigenetic changes by analyzing the chromatin accessibility at the single-cell level. However, the integration of multi-omics data still remains one of the most difficult tasks in bioinformatics. In this chapter, we focus on the combination of scRNA-Seq and scATAC-Seq data to perform an integrative analysis of the single-cell transcriptome and chromatin accessibility of human fetal progenitors.
12
Stanojevic S, Li Y, Ristivojevic A, Garmire LX. Computational Methods for Single-cell Multi-omics Integration and Alignment. Genomics, Proteomics & Bioinformatics 2022; 20:836-849. [PMID: 36581065 PMCID: PMC10025765 DOI: 10.1016/j.gpb.2022.11.013] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/11/2022] [Revised: 08/09/2022] [Accepted: 11/04/2022] [Indexed: 12/27/2022]
Abstract
Recently developed technologies to generate single-cell genomic data have made a revolutionary impact in the field of biology. Multi-omics assays offer even greater opportunities to understand cellular states and biological processes. The problem of integrating different omics data with very different dimensionality and statistical properties remains, however, quite challenging. A growing body of computational tools is being developed for this task, leveraging ideas ranging from machine translation to the theory of networks, and represents another frontier on the interface of biology and data science. Our goal in this review is to provide a comprehensive, up-to-date survey of computational techniques for the integration of single-cell multi-omics data, while making the concepts behind each algorithm approachable to a non-expert audience.
Affiliation(s)
- Stefan Stanojevic
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Yijun Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Lana X Garmire
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
13
Xu Y, Begoli E, McCord RP. sciCAN: single-cell chromatin accessibility and gene expression data integration via cycle-consistent adversarial network. NPJ Syst Biol Appl 2022; 8:33. [PMID: 36089620 PMCID: PMC9464763 DOI: 10.1038/s41540-022-00245-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Accepted: 09/01/2022] [Indexed: 11/09/2022] Open
Abstract
The boom in single-cell technologies has brought a surge of high dimensional data that come from different sources and represent cellular systems from different views. With advances in these single-cell technologies, integrating single-cell data across modalities arises as a new computational challenge. Here, we present an adversarial approach, sciCAN, to integrate single-cell chromatin accessibility and gene expression data in an unsupervised manner. We benchmarked sciCAN with 5 existing methods in 5 scATAC-seq/scRNA-seq datasets, and we demonstrated that our method dealt with data integration with consistent performance across datasets and better balance of mutual transferring between modalities than the other 5 existing methods. We further applied sciCAN to 10X Multiome data and confirmed that the integrated representation preserves biological relationships within the hematopoietic hierarchy. Finally, we investigated CRISPR-perturbed single-cell K562 ATAC-seq and RNA-seq data to identify cells with related responses to different perturbations in these different modalities.
Affiliation(s)
- Yang Xu
  - UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN, USA
- Edmon Begoli
  - Oak Ridge National Laboratory, Oak Ridge, TN, USA
  - Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA
- Rachel Patton McCord
  - Biochemistry & Cellular and Molecular Biology Department, University of Tennessee, Knoxville, TN, USA.
14
Zuo F, Yu J, He X. Single-Cell Metabolomics in Hematopoiesis and Hematological Malignancies. Front Oncol 2022; 12:931393. [PMID: 35912231 PMCID: PMC9326066 DOI: 10.3389/fonc.2022.931393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022] Open
Abstract
Aberrant metabolism contributes to tumor initiation, progression, metastasis, and drug resistance. Metabolic dysregulation has emerged as a hallmark of several hematologic malignancies. Decoding the molecular mechanism underlying metabolic rewiring in hematological malignancies would provide promising avenues for novel therapeutic interventions. Single-cell metabolic analysis can directly offer a meaningful readout of the cellular phenotype, allowing us to comprehensively dissect cellular states and access biological information unobtainable from bulk analysis. In this review, we first highlight the unique metabolic properties of hematologic malignancies and underscore potential metabolic vulnerabilities. We then emphasize the emerging single-cell metabolomics techniques, aiming to provide a guide to interrogating metabolism at single-cell resolution. Furthermore, we summarize recent studies demonstrating the power of single-cell metabolomics to uncover the roles of metabolic rewiring in tumor biology, cellular heterogeneity, immunometabolism, and therapeutic resistance. Meanwhile, we describe a practical view of the potential applications of single-cell metabolomics in hematopoiesis and hematological malignancies. Finally, we present the challenges and perspectives of single-cell metabolomics development.
15
De Oliveira CS, Nixon B, Lord T. A scRNA-seq Approach to Identifying Changes in Spermatogonial Stem Cell Gene Expression Following in vitro Culture. Front Cell Dev Biol 2022; 10:782996. [PMID: 35433696 PMCID: PMC9010880 DOI: 10.3389/fcell.2022.782996] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/25/2021] [Accepted: 03/08/2022] [Indexed: 01/15/2023] Open
Abstract
Spermatogonial stem cell (SSC) function is essential for male fertility, and these cells hold potential therapeutic value spanning from human infertility treatments to wildlife conservation. As in vitro culture is likely to be an integral component of many therapeutic pipelines, we have elected to explore changes in gene expression occurring in undifferentiated spermatogonia in culture that may be intertwined with the temporal reduction in regenerative capacity that they experience. Single cell RNA-sequencing analysis was conducted, comparing undifferentiated spermatogonia retrieved from the adult mouse testis with those that had been subjected to 10 weeks of in vitro culture. Although the majority of SSC signature genes were conserved between the two populations, a suite of differentially expressed genes was also identified. Gene ontology analysis revealed upregulated expression of genes involved in oxidative phosphorylation in cultured spermatogonia, along with downregulation of integral processes such as DNA repair and ubiquitin-mediated proteolysis. Indeed, our follow-up analyses have provided the first depiction of a significant accumulation of ubiquitinated proteins in cultured spermatogonia, when compared to those residing in the testis. The data produced in this manuscript will provide a valuable platform for future studies looking to improve SSC culture approaches and assess their safety for utilisation in therapeutic pipelines.
Affiliation(s)
- Camila Salum De Oliveira
- Priority Research Centre for Reproductive Science, Discipline of Biological Sciences, The University of Newcastle, Callaghan, NSW, Australia
- Brett Nixon
- Priority Research Centre for Reproductive Science, Discipline of Biological Sciences, The University of Newcastle, Callaghan, NSW, Australia
- Infertility and Reproduction Program, Hunter Medical Research Institute, Newcastle, NSW, Australia
- Tessa Lord
- Priority Research Centre for Reproductive Science, Discipline of Biological Sciences, The University of Newcastle, Callaghan, NSW, Australia
- Infertility and Reproduction Program, Hunter Medical Research Institute, Newcastle, NSW, Australia
- *Correspondence: Tessa Lord,
16
Qian K, Fu S, Li H, Li WV. scINSIGHT for interpreting single-cell gene expression from biologically heterogeneous data. Genome Biol 2022; 23:82. [PMID: 35313930 PMCID: PMC8935111 DOI: 10.1186/s13059-022-02649-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/30/2021] [Accepted: 03/07/2022] [Indexed: 12/30/2022] Open
Abstract
The increasing volume of scRNA-seq data emphasizes the need for integrative analysis to interpret similarities and differences between single-cell samples. Although different batch effect removal methods have been developed, none are suitable for heterogeneous single-cell samples coming from multiple biological conditions. We propose a method, scINSIGHT, to learn coordinated gene expression patterns that are common among, or specific to, different biological conditions, and to identify cellular identities and processes across single-cell samples. We compare scINSIGHT with state-of-the-art methods using simulated and real data, which demonstrate its improved performance. Our results show the applicability of scINSIGHT in diverse biomedical and clinical problems.
Affiliation(s)
- Kun Qian
- School of Mathematics and Physics, China University of Geosciences, Wuhan, 430074, Hubei, China
- Shiwei Fu
- Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Rutgers, The State University of New Jersey, Piscataway, 08854, NJ, USA
- Hongwei Li
- School of Mathematics and Physics, China University of Geosciences, Wuhan, 430074, Hubei, China
- Wei Vivian Li
- Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Rutgers, The State University of New Jersey, Piscataway, 08854, NJ, USA.
17
Jackson CA, Vogel C. New horizons in the stormy sea of multimodal single-cell data integration. Mol Cell 2022; 82:248-259. [PMID: 35063095 PMCID: PMC8830781 DOI: 10.1016/j.molcel.2021.12.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/11/2021] [Revised: 12/08/2021] [Accepted: 12/13/2021] [Indexed: 01/22/2023]
Abstract
While measurements of RNA expression have dominated the world of single-cell analyses, new single-cell techniques increasingly allow collection of different data modalities, measuring different molecules, structural connections, and intermolecular interactions. Integrating the resulting multimodal single-cell datasets is a new bioinformatics challenge. Equally important, it is a new experimental design challenge for the bench scientist, who must not only choose from a myriad of techniques for each data modality but also face new challenges in experimental design. The ultimate goal is to design, execute, and analyze multimodal single-cell experiments that are more than just descriptive but enable the learning of new causal and mechanistic biology. This objective requires strict consideration of the goals behind the analysis, which might range from mapping the heterogeneity of a cellular population to assembling system-wide causal networks that can further our understanding of cellular functions and eventually lead to models of tissues and organs. We review steps and challenges toward this goal. Single-cell transcriptomics is now a mature technology, and methods to measure proteins, lipids, small-molecule metabolites, and other molecular phenotypes at the single-cell level are rapidly developing. Integrating these single-cell readouts so that each cell has measurements of multiple types of data, e.g., transcriptomes, proteomes, and metabolomes, is expected to allow identification of highly specific cellular subpopulations and to provide the basis for inferring causal biological mechanisms.
Affiliation(s)
- Christopher A Jackson
- New York University, Department of Biology, Center for Genomics and Systems Biology, New York, NY, USA
- Christine Vogel
- New York University, Department of Biology, Center for Genomics and Systems Biology, New York, NY, USA
18
Ascensión AM, Araúzo-Bravo MJ, Izeta A. Challenges and Opportunities for the Translation of Single-Cell RNA Sequencing Technologies to Dermatology. Life (Basel) 2022; 12:67. [PMID: 35054460 PMCID: PMC8781146 DOI: 10.3390/life12010067] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/30/2021] [Revised: 12/21/2021] [Accepted: 12/28/2021] [Indexed: 12/19/2022] Open
Abstract
Skin is a complex and heterogeneous organ at the cellular level. This complexity is beginning to be understood through the application of single-cell genomics and computational tools. A large number of datasets that shed light on how the different human skin cell types interact in homeostasis (and what ceases to work in diverse dermatological diseases) have been generated and are publicly available. However, translation of these novel aspects to the clinic is lacking. This review aims to summarize the state-of-the-art of skin biology using single-cell technologies, with a special focus on skin pathologies and the translation of mechanistic findings to the clinic. The main implications of this review are to summarize the benefits and limitations of single-cell analysis and thus help translate the emerging insights from these novel techniques to the bedside.
Affiliation(s)
- Alex M. Ascensión
- Tissue Engineering Group, Biodonostia Health Research Institute, 20014 Donostia-San Sebastián, Spain
- Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, 20014 Donostia-San Sebastián, Spain
- Marcos J. Araúzo-Bravo
- Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, 20014 Donostia-San Sebastián, Spain
- Max Planck Institute for Molecular Biomedicine, 48167 Muenster, Germany
- IKERBASQUE, Basque Foundation for Science, 48012 Bilbao, Spain
- Ander Izeta
- Tissue Engineering Group, Biodonostia Health Research Institute, 20014 Donostia-San Sebastián, Spain
- School of Engineering, Tecnun-University of Navarra, 20009 Donostia-San Sebastián, Spain
19
Xu Y, Das P, McCord RP. SMILE: mutual information learning for integration of single-cell omics data. Bioinformatics 2022; 38:476-486. [PMID: 34623402 DOI: 10.1093/bioinformatics/btab706] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/10/2021] [Revised: 09/15/2021] [Accepted: 10/06/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Deep learning approaches have empowered single-cell omics data analysis in many ways and generated new insights from complex cellular systems. As there is an increasing need for single-cell omics data to be integrated across sources, types and features of data, the challenges of integrating single-cell omics data are rising. Here, we present an unsupervised deep learning algorithm that learns discriminative representations for single-cell data via maximizing mutual information, SMILE (Single-cell Mutual Information Learning). RESULTS: Using a unique cell-pairing design, SMILE successfully integrates multisource single-cell transcriptome data, removing batch effects and projecting similar cell types, even from different tissues, into the shared space. SMILE can also integrate data from two or more modalities, such as joint-profiling technologies using single-cell ATAC-seq, RNA-seq, DNA methylation, Hi-C and ChIP data. When paired cells are known, SMILE can integrate data with unmatched features, such as genes for RNA-seq and genome-wide peaks for ATAC-seq. Integrated representations learned from joint-profiling technologies can then be used as a framework for comparing independent single source data. AVAILABILITY AND IMPLEMENTATION: The source code of SMILE including analyses of key results in the study can be found at: https://github.com/rpmccordlab/SMILE, implemented in Python. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yang Xu
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA
- Priyojit Das
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA
- Rachel Patton McCord
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
20
Abstract
Spatial transcriptomics is a rapidly growing field that promises to comprehensively characterize tissue organization and architecture at the single-cell or subcellular resolution. Such information provides a solid foundation for mechanistic understanding of many biological processes in both health and disease that cannot be obtained by using traditional technologies. The development of computational methods plays important roles in extracting biological signals from raw data. Various approaches have been developed to overcome technology-specific limitations such as spatial resolution, gene coverage, sensitivity, and technical biases. Downstream analysis tools formulate spatial organization and cell-cell communications as quantifiable properties, and provide algorithms to derive such properties. Integrative pipelines further assemble multiple tools in one package, allowing biologists to conveniently analyze data from beginning to end. In this review, we summarize the state of the art of spatial transcriptomic data analysis methods and pipelines, and discuss how they operate on different technological platforms.
Affiliation(s)
- Ruben Dries
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts 02118, USA
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts 02215, USA
- Section of Computational Biomedicine, Boston University School of Medicine, Boston, Massachusetts 02118, USA
- Jiaji Chen
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts 02118, USA
- Natalie Del Rossi
- Department of Genetics and Genomic Sciences, Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Mohammed Muzamil Khan
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts 02118, USA
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts 02215, USA
- Section of Computational Biomedicine, Boston University School of Medicine, Boston, Massachusetts 02118, USA
- Adriana Sistig
- Department of Genetics and Genomic Sciences, Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
21
Longo SK, Guo MG, Ji AL, Khavari PA. Integrating single-cell and spatial transcriptomics to elucidate intercellular tissue dynamics. Nat Rev Genet 2021; 22:627-644. [PMID: 34145435 PMCID: PMC9888017 DOI: 10.1038/s41576-021-00370-8] [Citation(s) in RCA: 356] [Impact Index Per Article: 118.7] [Accepted: 04/29/2021] [Indexed: 02/07/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) identifies cell subpopulations within tissue but does not capture their spatial distribution nor reveal local networks of intercellular communication acting in situ. A suite of recently developed techniques that localize RNA within tissue, including multiplexed in situ hybridization and in situ sequencing (here defined as high-plex RNA imaging) and spatial barcoding, can help address this issue. However, no method currently provides as complete a scope of the transcriptome as does scRNA-seq, underscoring the need for approaches to integrate single-cell and spatial data. Here, we review efforts to integrate scRNA-seq with spatial transcriptomics, including emerging integrative computational methods, and propose ways to effectively combine current methodologies.
Affiliation(s)
- Sophia K. Longo
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Margaret G. Guo
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Program in Biomedical Informatics, Stanford University, Stanford, CA, USA
- Andrew L. Ji
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Paul A. Khavari
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA
22
Park JH, de Lomana ALG, Marzese DM, Juarez T, Feroze A, Hothi P, Cobbs C, Patel AP, Kesari S, Huang S, Baliga NS. A Systems Approach to Brain Tumor Treatment. Cancers (Basel) 2021; 13:3152. [PMID: 34202449 PMCID: PMC8269017 DOI: 10.3390/cancers13133152] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/11/2021] [Revised: 06/11/2021] [Accepted: 06/17/2021] [Indexed: 12/12/2022] Open
Abstract
Brain tumors are among the most lethal tumors. Glioblastoma, the most frequent primary brain tumor in adults, has a median survival time of approximately 15 months after diagnosis or a five-year survival rate of 10%; the recurrence rate is nearly 90%. Unfortunately, this prognosis has not improved for several decades. The lack of progress in the treatment of brain tumors has been attributed to their high rate of primary therapy resistance. Challenges such as pronounced inter-patient variability, intratumoral heterogeneity, and drug delivery across the blood-brain barrier hinder progress. A comprehensive, multiscale understanding of the disease, from the molecular to the whole tumor level, is needed to address the intratumor heterogeneity resulting from the coexistence of a diversity of neoplastic and non-neoplastic cell types in the tumor tissue. By contrast, inter-patient variability must be addressed by subtyping brain tumors to stratify patients and identify the best-matched drug(s) and therapies for a particular patient or cohort of patients. Accomplishing these diverse tasks will require a new framework, one involving a systems perspective in assessing the immense complexity of brain tumors. This would in turn entail a shift in how clinical medicine interfaces with the rapidly advancing high-throughput (HTP) technologies that have enabled the omics-scale profiling of molecular features of brain tumors from the single-cell to the tissue level. However, several gaps must be closed before such a framework can fulfill the promise of precision and personalized medicine for brain tumors. Ultimately, the goal is to integrate seamlessly multiscale systems analyses of patient tumors and clinical medicine. Accomplishing this goal would facilitate the rational design of therapeutic strategies matched to the characteristics of patients and their tumors. 
Here, we discuss some of the technologies, methodologies, and computational tools that will help translate this vision into practice.
Affiliation(s)
- James H. Park
- Institute for Systems Biology, Seattle, WA 98109, USA; (J.H.P.); (S.H.)
- Diego M. Marzese
- Balearic Islands Health Research Institute (IdISBa), 07010 Palma, Spain
- Tiffany Juarez
- St. John’s Cancer Institute, Santa Monica, CA 90401, USA; (T.J.); (S.K.)
- Abdullah Feroze
- Department of Neurological Surgery, University of Washington, Seattle, WA 98195, USA; (A.F.); (A.P.P.)
- Parvinder Hothi
- Swedish Neuroscience Institute, Seattle, WA 98122, USA; (P.H.); (C.C.)
- Ben and Catherine Ivy Center for Advanced Brain Tumor Treatment, Seattle, WA 98122, USA
- Charles Cobbs
- Swedish Neuroscience Institute, Seattle, WA 98122, USA; (P.H.); (C.C.)
- Ben and Catherine Ivy Center for Advanced Brain Tumor Treatment, Seattle, WA 98122, USA
- Anoop P. Patel
- Department of Neurological Surgery, University of Washington, Seattle, WA 98195, USA; (A.F.); (A.P.P.)
- Human Biology Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Brotman-Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, USA
- Santosh Kesari
- St. John’s Cancer Institute, Santa Monica, CA 90401, USA; (T.J.); (S.K.)
- Sui Huang
- Institute for Systems Biology, Seattle, WA 98109, USA; (J.H.P.); (S.H.)
- Nitin S. Baliga
- Institute for Systems Biology, Seattle, WA 98109, USA; (J.H.P.); (S.H.)
- Departments of Microbiology, Biology, and Molecular Engineering Sciences, University of Washington, Seattle, WA 98105, USA
23
Tarazona S, Arzalluz-Luque A, Conesa A. Undisclosed, unmet and neglected challenges in multi-omics studies. Nat Comput Sci 2021; 1:395-402. [PMID: 38217236 DOI: 10.1038/s43588-021-00086-z] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 03/18/2021] [Accepted: 05/17/2021] [Indexed: 01/15/2024]
Abstract
Multi-omics approaches have become a reality in both large genomics projects and small laboratories. However, the multi-omics research community still faces a number of issues that have either not been sufficiently discussed or for which current solutions are still limited. In this Perspective, we elaborate on these limitations and suggest points of attention for future research. We finally discuss new opportunities and challenges brought to the field by the rapid development of single-cell high-throughput molecular technologies.
Affiliation(s)
- Sonia Tarazona
- Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Angeles Arzalluz-Luque
- Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Ana Conesa
- Microbiology and Cell Science Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA.
- Genetics Institute, University of Florida, Gainesville, FL, USA.
- Institute for Integrative Systems Biology, Spanish National Research Council, Valencia, Spain.
24
Computational principles and challenges in single-cell data integration. Nat Biotechnol 2021; 39:1202-1215. [PMID: 33941931 DOI: 10.1038/s41587-021-00895-7] [Citation(s) in RCA: 155] [Impact Index Per Article: 51.7] [Received: 11/06/2020] [Accepted: 03/16/2021] [Indexed: 02/07/2023]
Abstract
The development of single-cell multimodal assays provides a powerful tool for investigating multiple dimensions of cellular heterogeneity, enabling new insights into development, tissue homeostasis and disease. A key challenge in the analysis of single-cell multimodal data is to devise appropriate strategies for tying together data across different modalities. The term 'data integration' has been used to describe this task, encompassing a broad collection of approaches ranging from batch correction of individual omics datasets to association of chromatin accessibility and genetic variation with transcription. Although existing integration strategies exploit similar mathematical ideas, they typically have distinct goals and rely on different principles and assumptions. Consequently, new definitions and concepts are needed to contextualize existing methods and to enable development of new methods.
25
García-Sanz R, Jiménez C. Time to Move to the Single-Cell Level: Applications of Single-Cell Multi-Omics to Hematological Malignancies and Waldenström's Macroglobulinemia-A Particularly Heterogeneous Lymphoma. Cancers (Basel) 2021; 13:1541. [PMID: 33810569 PMCID: PMC8037673 DOI: 10.3390/cancers13071541] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/03/2021] [Revised: 03/19/2021] [Accepted: 03/24/2021] [Indexed: 02/07/2023] Open
Abstract
Single-cell sequencing techniques have become a powerful tool for characterizing intra-tumor heterogeneity, which has been reflected in the increasing number of studies carried out and reported. We have rigorously reviewed and compiled the information about these techniques insofar as they relate to hematology, to provide a practical view of their potential applications. Studies show how single-cell multi-omics can overcome the limitations of bulk sequencing and be applied at all stages of tumor development, giving insights into the origin and pathogenesis of the tumors, the clonal architecture and evolution, or the mechanisms of therapy resistance. Information at the single-cell level may help resolve questions related to intra-tumor heterogeneity that have not been previously explained by other techniques. With that in mind, we review the existing knowledge about a heterogeneous lymphoma called Waldenström's macroglobulinemia and discuss how single-cell studies may help elucidate the underlying causes of this heterogeneity.
Affiliation(s)
- Ramón García-Sanz
- Hematology Department, University Hospital of Salamanca (HUS/IBSAL), CIBERONC and Cancer Research Institute of Salamanca-IBMCC (USAL-CSIC), 37007 Salamanca, Spain
26
Vlachavas EI, Bohn J, Ückert F, Nürnberg S. A Detailed Catalogue of Multi-Omics Methodologies for Identification of Putative Biomarkers and Causal Molecular Networks in Translational Cancer Research. Int J Mol Sci 2021; 22:2822. [PMID: 33802234 PMCID: PMC8000236 DOI: 10.3390/ijms22062822] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/31/2021] [Revised: 03/05/2021] [Accepted: 03/05/2021] [Indexed: 02/06/2023] Open
Abstract
Recent advances in sequencing and biotechnological methodologies have led to the generation of large volumes of molecular data of different omics layers, such as genomics, transcriptomics, proteomics and metabolomics. Integration of these data with clinical information provides new opportunities to discover how perturbations in biological processes lead to disease. Using data-driven approaches for the integration and interpretation of multi-omics data could stably identify links between structural and functional information and propose causal molecular networks with potential impact on cancer pathophysiology. This knowledge can then be used to improve disease diagnosis, prognosis, prevention, and therapy. This review will summarize and categorize the most current computational methodologies and tools for integration of distinct molecular layers in the context of translational cancer research and personalized therapy. Additionally, the bioinformatics tools Multi-Omics Factor Analysis (MOFA) and netDX will be tested using omics data from public cancer resources, to assess their overall robustness, provide reproducible workflows for gaining biological knowledge from multi-omics data, and to comprehensively understand the significantly perturbed biological entities in distinct cancer types. We show that the performed supervised and unsupervised analyses result in meaningful and novel findings.
Affiliation(s)
- Efstathios Iason Vlachavas
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (J.B.); (F.Ü.)
- Jonas Bohn
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (J.B.); (F.Ü.)
- Frank Ückert
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (J.B.); (F.Ü.)
- Applied Medical Informatics, University Hospital Hamburg-Eppendorf, 20251 Hamburg, Germany
- Sylvia Nürnberg
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (J.B.); (F.Ü.)
- Applied Medical Informatics, University Hospital Hamburg-Eppendorf, 20251 Hamburg, Germany
27
Yuan X, Wang J, Huang Y, Shangguan D, Zhang P. Single-Cell Profiling to Explore Immunological Heterogeneity of Tumor Microenvironment in Breast Cancer. Front Immunol 2021; 12:643692. [PMID: 33717201 PMCID: PMC7947360 DOI: 10.3389/fimmu.2021.643692] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 12/18/2020] [Accepted: 02/05/2021] [Indexed: 01/23/2023] Open
Abstract
Immune infiltrates in the tumor microenvironment (TME) of breast cancer (BRCA) have been shown to play a critical role in tumorigenesis, progression, invasion, and therapy resistance, and thereby affect the clinical outcomes of BRCA patients. However, a wide range of intratumoral heterogeneity shaped by the tumor cells and immune cells in the surrounding microenvironment is a major obstacle in understanding and treating BRCA. Recent progress in single-cell technologies such as single-cell RNA sequencing (scRNA-seq), mass cytometry, and digital spatial profiling has enabled the detailed characterization of intratumoral immune cells and vastly improved our understanding of less-defined cell subsets in the tumor immune environment. By measuring transcriptomes or proteomes at the single-cell level, these technologies provide an unprecedented view of a cellular architecture consisting of the phenotypic and functional diversity of tumor-infiltrating immune cells. In this review, we focus on landmark studies of single-cell profiling of immunological heterogeneity in the TME, and discuss its clinical applications, translational outlook, and limitations in breast cancer studies.
Affiliation(s)
- Xiao Yuan
- Changsha KingMed Center for Clinical Laboratory Co., Ltd, Changsha, China
- Jinxi Wang
- First Affiliated Hospital of Hunan University of Traditional Chinese Medicine, Changsha, China
- Yixuan Huang
- Division of Immunotherapy, Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, United States
- Peng Zhang
- Division of Immunotherapy, Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD, United States
28
Peng Y, Qiao H. The Application of Single-Cell RNA Sequencing in Mammalian Meiosis Studies. Front Cell Dev Biol 2021; 9:673642. [PMID: 34485276 PMCID: PMC8416306 DOI: 10.3389/fcell.2021.673642] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/28/2021] [Accepted: 07/05/2021] [Indexed: 12/14/2022] Open
Abstract
Meiosis is a cellular division process that produces gametes for sexual reproduction. Disruption of complex events throughout meiosis, such as synapsis and homologous recombination, can lead to infertility and aneuploidy. To reveal the molecular mechanisms of these events, transcriptome studies of specific substages must be conducted. However, conventional methods, such as bulk RNA-seq and RT-qPCR, are not able to detect the transcriptional variations effectively and precisely, especially for identifying cell types and stages with subtle differences. In recent years, mammalian meiotic transcriptomes have been intensively studied at the single-cell level by using single-cell RNA-seq (scRNA-seq) approaches, especially through two widely used platforms, Smart-seq2 and Drop-seq. The scRNA-seq protocols along with their downstream analysis enable researchers to accurately identify cell heterogeneities and investigate meiotic transcriptomes at a higher resolution. In this review, we compared bulk RNA-seq and scRNA-seq to show the advantages of scRNA-seq in meiosis studies; meanwhile, we also pointed out the challenges and limitations of scRNA-seq. We listed recent findings from mammalian meiosis (male and female) studies in which scRNA-seq was applied. Next, we summarized the scRNA-seq analysis methods and the meiotic marker genes from spermatocytes and oocytes. Specifically, we emphasized the different features of the two scRNA-seq protocols (Smart-seq2 and Drop-seq) in the context of meiosis studies and discussed their strengths and weaknesses in terms of different research purposes. Finally, we discussed the future applications of scRNA-seq in the meiosis field.
Affiliation(s)
- Yiheng Peng
- Department of Comparative Biosciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Huanyu Qiao
- Department of Comparative Biosciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States