51
|
Fonseca MAS, Haro M, Wright KN, Lin X, Abbasi F, Sun J, Hernandez L, Orr NL, Hong J, Choi-Kuaea Y, Maluf HM, Balzer BL, Fishburn A, Hickey R, Cass I, Goodridge HS, Truong M, Wang Y, Pisarska MD, Dinh HQ, El-Naggar A, Huntsman DG, Anglesio MS, Goodman MT, Medeiros F, Siedhoff M, Lawrenson K. Single-cell transcriptomic analysis of endometriosis. Nat Genet 2023; 55:255-267. [PMID: 36624343 PMCID: PMC10950360 DOI: 10.1038/s41588-022-01254-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 06/23/2021] [Accepted: 10/28/2022] [Indexed: 01/11/2023]
Abstract
Endometriosis is a common condition in women that causes chronic pain and infertility and is associated with an elevated risk of ovarian cancer. We profiled transcriptomes of >370,000 individual cells from endometriomas (n = 8), endometriosis (n = 28), eutopic endometrium (n = 10), unaffected ovary (n = 4) and endometriosis-free peritoneum (n = 4), generating a cellular atlas of endometrial-type epithelial cells, stromal cells and microenvironmental cell populations across tissue sites. Cellular and molecular signatures of endometrial-type epithelium and stroma differed across tissue types, suggesting a role for cellular restructuring and transcriptional reprogramming in the disease. Epithelium, stroma and proximal mesothelial cells of endometriomas showed dysregulation of pro-inflammatory pathways and upregulation of complement proteins. Somatic ARID1A mutation in epithelial cells was associated with upregulation of pro-angiogenic and pro-lymphangiogenic factors and remodeling of the endothelial cell compartment, with enrichment of lymphatic endothelial cells. Finally, signatures of ciliated epithelial cells were enriched in ovarian cancers, reinforcing epidemiologic associations between these two diseases.
Affiliation(s)
- Marcos A S Fonseca
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Marcela Haro
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Kelly N Wright
- Division of Minimally Invasive Gynecologic Surgery, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Xianzhi Lin
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Forough Abbasi
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Jennifer Sun
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Lourdes Hernandez
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Natasha L Orr
- Department of Obstetrics and Gynecology, UBC, Vancouver, British Columbia, Canada
| | - Jooyoon Hong
- Department of Obstetrics and Gynecology, UBC, Vancouver, British Columbia, Canada
| | - Yunhee Choi-Kuaea
- Cancer Prevention and Control Program, Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Horacio M Maluf
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Bonnie L Balzer
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Aaron Fishburn
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Ryan Hickey
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Ilana Cass
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Obstetrics and Gynecology, Dartmouth-Hitchcock Medical Center, Lebanon, NH, USA
| | - Helen S Goodridge
- Board of Governors Regenerative Medicine Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Mireille Truong
- Division of Minimally Invasive Gynecologic Surgery, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Yemin Wang
- Department of Obstetrics and Gynecology, UBC, Vancouver, British Columbia, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, and Department of Molecular Oncology, British Columbia Cancer Research Institute, Vancouver, British Columbia, Canada
| | - Margareta D Pisarska
- Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Obstetrics and Gynecology, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
| | - Huy Q Dinh
- McArdle Laboratory for Cancer Research, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI, USA
| | - Amal El-Naggar
- Board of Governors Regenerative Medicine Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Pathology, Faculty of Medicine, Menoufia University, Menoufia Governorate, Egypt
| | - David G Huntsman
- Department of Obstetrics and Gynecology, UBC, Vancouver, British Columbia, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, and Department of Molecular Oncology, British Columbia Cancer Research Institute, Vancouver, British Columbia, Canada
| | - Michael S Anglesio
- Department of Obstetrics and Gynecology, UBC, Vancouver, British Columbia, Canada
- British Columbia's Gynecological Cancer Research (OVCARE) Program, University of British Columbia, Vancouver General Hospital, and BC Cancer, Vancouver, British Columbia, Canada
| | - Marc T Goodman
- Cancer Prevention and Control Program, Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Fabiola Medeiros
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Matthew Siedhoff
- Division of Minimally Invasive Gynecologic Surgery, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Kate Lawrenson
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Cancer Prevention and Control Program, Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Center for Bioinformatics and Functional Genomics, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
| |
|
52
|
Drake RS, Villanueva MA, Vilme M, Russo DD, Navia A, Love JC, Shalek AK. Profiling Transcriptional Heterogeneity with Seq-Well S3: A Low-Cost, Portable, High-Fidelity Platform for Massively Parallel Single-Cell RNA-Seq. Methods Mol Biol 2023; 2584:57-104. [PMID: 36495445 DOI: 10.1007/978-1-0716-2756-3_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Abstract
Seq-Well is a high-throughput, picowell-based single-cell RNA-seq technology that can be used to simultaneously profile the transcriptomes of thousands of cells (Gierahn et al. Nat Methods 14(4):395-398, 2017). Relative to its reverse-emulsion-droplet-based counterparts, Seq-Well addresses key cost, portability, and scalability limitations. Recently, we introduced an improved molecular biology for Seq-Well to enhance the information content that can be captured from individual cells using the platform. This update, which we call Seq-Well S3 (S3: Second-Strand Synthesis), incorporates a second-strand-synthesis step after reverse transcription to boost the detection of cellular transcripts normally missed when running the original Seq-Well protocol (Hughes et al. Immunity 53(4):878-894.e7, 2020). This chapter provides details and tips on how to perform Seq-Well S3, along with general pointers on how to subsequently analyze the resultant single-cell RNA-seq data.
Affiliation(s)
- Riley S Drake
- Institute for Medical Engineering and Science (IMES), Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Martin Arreola Villanueva
- Institute for Medical Engineering and Science (IMES), Massachusetts Institute of Technology, Cambridge, MA, USA.
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA.
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - Mike Vilme
- Institute for Medical Engineering and Science (IMES), Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Daniela D Russo
- Institute for Medical Engineering and Science (IMES), Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Andrew Navia
- Institute for Medical Engineering and Science (IMES), Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - J Christopher Love
- Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA.
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - Alex K Shalek
- Institute for Medical Engineering and Science (IMES), Massachusetts Institute of Technology, Cambridge, MA, USA.
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA.
- The Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| |
|
53
|
Schriever H, Kostka D. Vaeda computationally annotates doublets in single-cell RNA sequencing data. Bioinformatics 2023; 39:6808614. [PMID: 36342203 PMCID: PMC9805559 DOI: 10.1093/bioinformatics/btac720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 10/23/2022] [Accepted: 11/05/2022] [Indexed: 11/09/2022] Open
Abstract
MOTIVATION Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology's utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doublets/multiplets are a type of artifact that occurs when two or more cells are tagged by the same barcode, and therefore they appear as a single cell. Because this introduces non-existent transcriptional profiles, doublets can bias and mislead downstream analysis. To address this limitation, computational methods to annotate and remove doublets from scRNA-seq datasets are needed. RESULTS We introduce vaeda (Variational Auto-Encoder for Doublet Annotation), a new approach for computational annotation of doublets in scRNA-seq data. Vaeda integrates a variational auto-encoder and Positive-Unlabeled learning to produce doublet scores and binary doublet calls. We apply vaeda, along with seven existing doublet annotation methods, to 16 benchmark datasets and find that vaeda performs competitively in terms of doublet scores and doublet calls. Notably, vaeda outperforms other Python-based methods for doublet annotation. Altogether, vaeda is a robust and competitive method for scRNA-seq doublet annotation and may be of particular interest in the context of Python-based workflows. AVAILABILITY AND IMPLEMENTATION Vaeda is available at https://github.com/kostkalab/vaeda, and the version used for the results we present here is archived at Zenodo (https://doi.org/10.5281/zenodo.7199783). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hannah Schriever
- Department of Developmental Biology, University of Pittsburgh, Pittsburgh, PA 15201, USA
- Carnegie Mellon-University of Pittsburgh Joint PhD Program, University of Pittsburgh, Pittsburgh, PA 15201, USA
| | | |
|
54
|
Umu SU, Rapp Vander-Elst K, Karlsen VT, Chouliara M, Bækkevold ES, Jahnsen FL, Domanska D. Cellsnake: a user-friendly tool for single-cell RNA sequencing analysis. Gigascience 2023; 12:giad091. [PMID: 37889009 PMCID: PMC10603768 DOI: 10.1093/gigascience/giad091] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/01/2023] [Revised: 08/25/2023] [Accepted: 10/05/2023] [Indexed: 10/28/2023] Open
Abstract
BACKGROUND Single-cell RNA sequencing (scRNA-seq) provides high-resolution transcriptome data to understand the heterogeneity of cell populations at the single-cell level. The analysis of scRNA-seq data requires the utilization of numerous computational tools. However, nonexpert users usually experience installation issues, a lack of critical functionality or batch analysis modes, and the steep learning curves of existing pipelines. RESULTS We have developed cellsnake, a comprehensive, reproducible, and accessible single-cell data analysis workflow, to overcome these problems. Cellsnake offers advanced features for standard users and facilitates downstream analyses in both R and Python environments. It is also designed for easy integration into existing workflows, allowing for rapid analyses of multiple samples. CONCLUSION As an open-source tool, cellsnake is accessible through Bioconda, PyPI, Docker, and GitHub, making it a cost-effective and user-friendly option for researchers. By using cellsnake, researchers can streamline the analysis of scRNA-seq data and gain insights into the complex biology of single cells.
Affiliation(s)
- Sinan U Umu
- Department of Pathology, Institute of Clinical Medicine, University of Oslo, Oslo 0372, Norway
| | | | - Victoria T Karlsen
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
| | - Manto Chouliara
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
| | - Espen Sønderaal Bækkevold
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
- Institute of Oral Biology, University of Oslo, Oslo 0372, Norway
| | - Frode Lars Jahnsen
- Department of Pathology, Institute of Clinical Medicine, University of Oslo, Oslo 0372, Norway
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
| | - Diana Domanska
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
- Department of Microbiology, University of Oslo, Rikshospitalet, Oslo 0372, Norway
| |
|
55
|
Salcher S, Sturm G, Horvath L, Untergasser G, Kuempers C, Fotakis G, Panizzolo E, Martowicz A, Trebo M, Pall G, Gamerith G, Sykora M, Augustin F, Schmitz K, Finotello F, Rieder D, Perner S, Sopper S, Wolf D, Pircher A, Trajanoski Z. High-resolution single-cell atlas reveals diversity and plasticity of tissue-resident neutrophils in non-small cell lung cancer. Cancer Cell 2022; 40:1503-1520.e8. [PMID: 36368318 PMCID: PMC9767679 DOI: 10.1016/j.ccell.2022.10.008] [Citation(s) in RCA: 98] [Impact Index Per Article: 49.0] [Received: 05/09/2022] [Revised: 08/26/2022] [Accepted: 10/06/2022] [Indexed: 11/12/2022]
Abstract
Non-small cell lung cancer (NSCLC) is characterized by molecular heterogeneity with diverse immune cell infiltration patterns, which has been linked to therapy sensitivity and resistance. However, full understanding of how immune cell phenotypes vary across different patient subgroups is lacking. Here, we dissect the NSCLC tumor microenvironment at high resolution by integrating 1,283,972 single cells from 556 samples and 318 patients across 29 datasets, including our dataset capturing cells with low mRNA content. We stratify patients into immune-deserted, B cell, T cell, and myeloid cell subtypes. Using bulk samples with genomic and clinical information, we identify cellular components associated with tumor histology and genotypes. We then focus on the analysis of tissue-resident neutrophils (TRNs) and uncover distinct subpopulations that acquire new functional properties in the tissue microenvironment, providing evidence for the plasticity of TRNs. Finally, we show that a TRN-derived gene signature is associated with anti-programmed cell death ligand 1 (PD-L1) treatment failure.
Affiliation(s)
- Stefan Salcher
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Gregor Sturm
- Biocenter, Institute of Bioinformatics, Medical University of Innsbruck, Innsbruck, Austria
| | - Lena Horvath
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Gerold Untergasser
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Christiane Kuempers
- Institute of Pathology, University of Luebeck and University Hospital Schleswig-Holstein, Campus Luebeck, Luebeck, Germany
| | - Georgios Fotakis
- Biocenter, Institute of Bioinformatics, Medical University of Innsbruck, Innsbruck, Austria
| | - Elisa Panizzolo
- Biocenter, Institute of Bioinformatics, Medical University of Innsbruck, Innsbruck, Austria
| | - Agnieszka Martowicz
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria; Tyrolpath Obrist Brunhuber GmbH, Zams, Austria
| | - Manuel Trebo
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Georg Pall
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Gabriele Gamerith
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Martina Sykora
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Florian Augustin
- Department of Visceral, Transplant and Thoracic Surgery, Medical University Innsbruck, Innsbruck, Austria
| | - Katja Schmitz
- Tyrolpath Obrist Brunhuber GmbH, Zams, Austria; INNPATH GmbH, Institute of Pathology, Innsbruck, Austria
| | - Francesca Finotello
- Institute of Molecular Biology, University of Innsbruck, Innsbruck, Austria; Digital Science Center, University of Innsbruck, Innsbruck, Austria
| | - Dietmar Rieder
- Biocenter, Institute of Bioinformatics, Medical University of Innsbruck, Innsbruck, Austria
| | - Sven Perner
- Institute of Pathology, University of Luebeck and University Hospital Schleswig-Holstein, Campus Luebeck, Luebeck, Germany; Pathology, Research Center Borstel, Leibniz Lung Center, Borstel, Germany; German Center for Lung Research (DZL), Luebeck and Borstel, Germany
| | - Sieghart Sopper
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Dominik Wolf
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria
| | - Andreas Pircher
- Department of Internal Medicine V, Haematology & Oncology, Comprehensive Cancer Center Innsbruck (CCCI) and Tyrolean Cancer Research Institute (TKFI), Medical University of Innsbruck, Innsbruck, Austria.
| | - Zlatko Trajanoski
- Biocenter, Institute of Bioinformatics, Medical University of Innsbruck, Innsbruck, Austria.
| |
|
56
|
Su M, Pan T, Chen QZ, Zhou WW, Gong Y, Xu G, Yan HY, Li S, Shi QZ, Zhang Y, He X, Jiang CJ, Fan SC, Li X, Cairns MJ, Wang X, Li YS. Data analysis guidelines for single-cell RNA-seq in biomedical studies and clinical applications. Mil Med Res 2022; 9:68. [PMID: 36461064 PMCID: PMC9716519 DOI: 10.1186/s40779-022-00434-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022] Open
Abstract
The application of single-cell RNA sequencing (scRNA-seq) in biomedical research has advanced our understanding of the pathogenesis of disease and provided valuable insights into new diagnostic and therapeutic strategies. With the expansion of capacity for high-throughput scRNA-seq, including clinical samples, the analysis of these huge volumes of data has become a daunting prospect for researchers entering this field. Here, we review the workflow for typical scRNA-seq data analysis, covering raw data processing and quality control, basic data analysis applicable for almost all scRNA-seq data sets, and advanced data analysis that should be tailored to specific scientific questions. While summarizing the current methods for each analysis step, we also provide an online repository of software and wrapped-up scripts to support the implementation. Recommendations and caveats are pointed out for some specific analysis tasks and approaches. We hope this resource will be helpful to researchers engaging with scRNA-seq, in particular for emerging clinical applications.
Affiliation(s)
- Min Su
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
| | - Tao Pan
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
| | - Qiu-Zhen Chen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
| | - Wei-Wei Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Yi Gong
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China; Department of Immunology, Nanjing Medical University, Nanjing, 211166, China
| | - Gang Xu
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
| | - Huan-Yu Yan
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
| | - Si Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
| | - Qiao-Zhen Shi
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
| | - Ya Zhang
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
| | - Xiao He
- Department of Laboratory Medicine, Women and Children's Hospital of Chongqing Medical University, Chongqing, 401174, China
| | | | - Shi-Cai Fan
- Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110, Guangdong, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China.
| | - Murray J Cairns
- School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, the University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW, 2305, Australia.
| | - Xi Wang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China.
| | - Yong-Sheng Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China.
| |
|
57
|
Yuan M, Chen L, Deng M. Clustering single-cell multi-omics data with MoClust. Bioinformatics 2022; 39:6831092. [PMID: 36383167 PMCID: PMC9805570 DOI: 10.1093/bioinformatics/btac736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2022] [Revised: 11/09/2022] [Accepted: 11/14/2022] [Indexed: 11/17/2022] Open
Abstract
MOTIVATION Single-cell multi-omics sequencing techniques have rapidly developed in the past few years. Clustering analysis with single-cell multi-omics data may give us novel perspectives to dissect cellular heterogeneity. However, multi-omics data have the properties of inherited large dimension, high sparsity and existence of doublets. Moreover, representations of different omics from even the same cell follow diverse distributions. Without proper distribution alignment techniques, clustering methods will encounter less separable clusters easily affected by less informative omics data. RESULTS We developed MoClust, a novel joint clustering framework that can be applied to several types of single-cell multi-omics data. A selective automatic doublet detection module that can identify and filter out doublets is introduced in the pretraining stage to improve data quality. Omics-specific autoencoders are introduced to characterize the multi-omics data. A contrastive learning way of distribution alignment is adopted to adaptively fuse omics representations into an omics-invariant representation. This novel way of alignment boosts the compactness and separableness of clusters, while accurately weighting the contribution of each omics to the clustering object. Extensive experiments, over both simulated and real multi-omics datasets, demonstrated the powerful alignment, doublet detection and clustering ability features of MoClust. AVAILABILITY AND IMPLEMENTATION An implementation of MoClust is available from https://doi.org/10.5281/zenodo.7306504. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Musu Yuan
- Center for Quantitative Biology, Peking University, Beijing 100871, China
| | - Liang Chen
- To whom correspondence should be addressed.
| | | |
|
58
|
Zheng Z, He H, Tang XT, Zhang H, Gou F, Yang H, Cao J, Shi S, Yang Z, Sun G, Xie X, Zeng Y, Wen A, Lan Y, Zhou J, Liu B, Zhou BO, Cheng T, Cheng H. Uncovering the emergence of HSCs in the human fetal bone marrow by single-cell RNA-seq analysis. Cell Stem Cell 2022; 29:1562-1579.e7. [DOI: 10.1016/j.stem.2022.10.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2021] [Revised: 02/24/2022] [Accepted: 10/12/2022] [Indexed: 11/06/2022]
|
59
|
Couckuyt A, Seurinck R, Emmaneel A, Quintelier K, Novak D, Van Gassen S, Saeys Y. Challenges in translational machine learning. Hum Genet 2022; 141:1451-1466. [PMID: 35246744 PMCID: PMC8896412 DOI: 10.1007/s00439-022-02439-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/09/2021] [Accepted: 02/08/2022] [Indexed: 11/25/2022]
Abstract
Machine learning (ML) algorithms are increasingly being used to help implement clinical decision support systems. In this new field, which we define as "translational machine learning", joint efforts and strong communication between data scientists and clinicians help to span the gap between ML and its adoption in the clinic. These collaborations also improve interpretability and trust in translational ML methods and ultimately aim to result in generalizable and reproducible models. To help clinicians and bioinformaticians refine their translational ML pipelines, we review the steps from model building to the use of ML in the clinic. We discuss experimental setup, computational analysis, interpretability and reproducibility, and emphasize the challenges involved. We highly advise collaboration and data sharing between consortia and institutes to build multi-centric cohorts that facilitate ML methodologies that generalize across centers. In the end, we hope that this review provides a way to streamline translational ML and helps to tackle the challenges that come with it.
Affiliation(s)
- Artuur Couckuyt
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
  - Data Mining and Modeling for Biomedicine, VIB-UGent Center for Inflammation Research, Gent, Belgium
- Ruth Seurinck
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
  - Data Mining and Modeling for Biomedicine, VIB-UGent Center for Inflammation Research, Gent, Belgium
- Annelies Emmaneel
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
  - Data Mining and Modeling for Biomedicine, VIB-UGent Center for Inflammation Research, Gent, Belgium
- Katrien Quintelier
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
  - Data Mining and Modeling for Biomedicine, VIB-UGent Center for Inflammation Research, Gent, Belgium
  - Department of Pulmonary Diseases, Erasmus MC, Rotterdam, The Netherlands
- David Novak
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
  - Data Mining and Modeling for Biomedicine, VIB-UGent Center for Inflammation Research, Gent, Belgium
- Sofie Van Gassen
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
  - Data Mining and Modeling for Biomedicine, VIB-UGent Center for Inflammation Research, Gent, Belgium
- Yvan Saeys
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
  - Data Mining and Modeling for Biomedicine, VIB-UGent Center for Inflammation Research, Gent, Belgium
|
60
|
Panchy N, Watanabe K, Takahashi M, Willems A, Hong T. Comparative single-cell transcriptomes of dose and time dependent epithelial–mesenchymal spectrums. NAR Genom Bioinform 2022; 4:lqac072. [PMID: 36159174 PMCID: PMC9492285 DOI: 10.1093/nargab/lqac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2022] [Revised: 08/17/2022] [Accepted: 08/31/2022] [Indexed: 11/13/2022] Open
Abstract
Epithelial–mesenchymal transition (EMT) is a cellular process involved in development and disease progression. Intermediate EMT states were observed in tumors and fibrotic tissues, but previous in vitro studies focused on time-dependent responses with single doses of signals; it was unclear whether single-cell transcriptomes support stable intermediates observed in diseases. Here, we performed single-cell RNA-sequencing with human mammary epithelial cells treated with multiple doses of TGF-β. We found that dose-dependent EMT harbors multiple intermediate states at nearly steady state. Comparisons of dose- and time-dependent EMT transcriptomes revealed that the dose-dependent data enable higher sensitivity to detect genes associated with EMT. We identified cell clusters unique to time-dependent EMT, reflecting cells en route to stable states. Combining dose- and time-dependent cell clusters gave rise to accurate prognosis for cancer patients. Our transcriptomic data and analyses uncover a stable EMT continuum at the single-cell resolution, and complementary information of two types of single-cell experiments.
Affiliation(s)
- Nicholas Panchy
  - Department of Biochemistry & Cellular and Molecular Biology, The University of Tennessee, Knoxville, Knoxville, TN 37996, USA
- Kazuhide Watanabe
  - RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Masataka Takahashi
  - RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Andrew Willems
  - School of Genome Science and Technology, The University of Tennessee, Knoxville, Knoxville, TN 37916, USA
- Tian Hong
  - Department of Biochemistry & Cellular and Molecular Biology, The University of Tennessee, Knoxville, Knoxville, TN 37996, USA
  - National Institute for Mathematical and Biological Synthesis, Knoxville, TN 37996, USA
|
61
|
Xi NM, Wang L, Yang C. Improving the diagnosis of thyroid cancer by machine learning and clinical data. Sci Rep 2022; 12:11143. [PMID: 35778428 PMCID: PMC9249901 DOI: 10.1038/s41598-022-15342-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/27/2022] [Accepted: 06/22/2022] [Indexed: 12/13/2022] Open
Abstract
Thyroid cancer is a common endocrine carcinoma that occurs in the thyroid gland. Much effort has been invested in improving its diagnosis, and thyroidectomy remains the primary treatment method. A successful operation without unnecessary side injuries relies on an accurate preoperative diagnosis. Current human assessment of thyroid nodule malignancy is prone to errors and may not guarantee an accurate preoperative diagnosis. This study proposes a machine learning framework to predict thyroid nodule malignancy based on a novel clinical dataset that we collected. Ten-fold cross-validation, bootstrap analysis, and permutation predictor importance were applied to estimate and interpret the model performance under uncertainty. The comparison between model prediction and expert assessment shows the advantage of our framework over human judgment in predicting thyroid nodule malignancy. Our method is accurate, interpretable, and thus usable as additional evidence in the preoperative diagnosis of thyroid cancer.
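Permutation predictor importance, as named in the abstract above, can be illustrated with a minimal Python sketch. It is not the authors' framework: the toy data, the threshold rule standing in for a fitted classifier, and the function names `accuracy` and `permutation_importance` are all assumptions made for illustration.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j across rows, leaving the other columns intact.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [int(row[0] > 0.5) for row in rows]

def model(row):
    """Hypothetical stand-in for a fitted classifier; it only uses feature 0."""
    return int(row[0] > 0.5)

importances = permutation_importance(model, rows, labels, n_features=2)
```

A feature the model relies on shows a large accuracy drop when shuffled, while a feature it ignores shows none, which is what makes the measure usable for interpretation.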
Affiliation(s)
- Nan Miles Xi
  - Department of Mathematics and Statistics, Loyola University Chicago, Chicago, IL, 60660, USA
- Lin Wang
  - Department of Statistics, Purdue University, West Lafayette, IN, 47907, USA
- Chuanjia Yang
  - Department of General Surgery, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China
|
62
|
Ellis D, Wu D, Datta S. SAREV: A review on statistical analytics of single-cell RNA sequencing data. Wiley Interdiscip Rev Comput Stat 2022; 14:e1558. [PMID: 36034329 PMCID: PMC9400796 DOI: 10.1002/wics.1558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2020] [Accepted: 04/09/2021] [Indexed: 06/15/2023]
Abstract
Due to the development of next-generation RNA sequencing (NGS) technologies, there has been tremendous progress in research on the role of genomics, transcriptomics and epigenomics in complex biological systems. However, scientists have realized that data obtained using earlier technology, frequently called 'bulk RNA-seq' data, provide information averaged across all the cells present in a tissue. The more recently developed single-cell RNA-seq (scRNA-seq) technology provides transcriptomic information at single-cell resolution. Nevertheless, these high-resolution data are complex in their own ways and demand novel statistical analysis methods to yield effective and highly accurate results on complex biological systems. In this review, we cover many such recently developed statistical methods, both for researchers wanting to pursue scRNA-seq statistical and computational research and for scientists seeking information about existing methods and free software tools available for their generated data. This review is certainly not exhaustive due to page limitations. We have tried to cover the popular methods, starting from quality control, moving to the downstream analysis of finding differentially expressed genes, and concluding with a brief description of network analysis.
Affiliation(s)
- Dorothy Ellis
  - Department of Biostatistics, University of Florida, School of Public Health and Health Professions, Gainesville, FL
- Dongyuan Wu
  - Department of Biostatistics, University of Florida, School of Public Health and Health Professions, Gainesville, FL
- Susmita Datta
  - Department of Biostatistics, University of Florida, School of Public Health and Health Professions, Gainesville, FL
|
63
|
Xi NM, Hsu YY, Dang Q, Huang DP. Statistical learning in preclinical drug proarrhythmic assessment. J Biopharm Stat 2022; 32:450-473. [PMID: 35771997 DOI: 10.1080/10543406.2022.2065505] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/17/2022]
Abstract
Torsades de pointes (TdP) is an irregular heart rhythm characterized by faster beat rates that can potentially lead to sudden cardiac death. Much effort has been invested in understanding drug-induced TdP in preclinical studies. However, a comprehensive statistical learning framework that can accurately predict drug-induced TdP risk from preclinical data is still lacking. We proposed ordinal logistic regression and ordinal random forest models to classify drugs as low-, intermediate-, or high-risk based on datasets generated from two experimental protocols. Leave-one-drug-out cross-validation, stratified bootstrap, and permutation predictor importance were applied to estimate and interpret the model performance under uncertainty. The potential outlier drugs identified by our models are consistent with their descriptions in the literature. Our method is accurate, interpretable, and thus usable as supplemental evidence in drug safety assessment.
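The leave-one-drug-out cross-validation mentioned above holds out every record of one drug at a time, so that a model is never tested on a drug it saw during training. A minimal Python sketch of the splitting step follows; the drug labels and the function name `leave_one_group_out` are hypothetical and do not come from the paper.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test

# Hypothetical drug labels, one per experimental record.
drugs = ["drug_a", "drug_a", "drug_b", "drug_c", "drug_c", "drug_c"]
splits = list(leave_one_group_out(drugs))

for held_out, train, test in splits:
    print(held_out, "train:", train, "test:", test)
```

Splitting by drug rather than by record matters here because repeated measurements of the same drug are correlated, and a record-level split would likely overstate performance on unseen drugs.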
Affiliation(s)
- Nan Miles Xi
  - Department of Mathematics and Statistics, Loyola University Chicago, Chicago, Illinois, USA
- Yu-Yi Hsu
  - Office of Biostatistics, Office of Translational Science, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
- Qianyu Dang
  - Office of Biostatistics, Office of Translational Science, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
- Dalong Patrick Huang
  - Office of Biostatistics, Office of Translational Science, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
|
64
|
From single-omics to interactomics: How can ligand-induced perturbations modulate single-cell phenotypes? Adv Protein Chem Struct Biol 2022; 131:45-83. [PMID: 35871896 DOI: 10.1016/bs.apcsb.2022.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
Cells are subject to perturbations by different stimuli, which consequently give rise to individual alterations in their profile and function that may end up affecting the tissue as a whole. The same holds for the effect of a therapeutic agent on a biological system. As cells are exposed to external ligands, their profile can change at different single-omics levels. Detecting how these changes take place through different sequencing technologies is key to a better understanding of the effects of therapeutic agents. Single-cell RNA-sequencing stands out as one of the most common approaches for cell profiling and perturbation analysis. As a result, single-cell transcriptomics data can be integrated with other omics data sources, such as proteomics and epigenomics data, to clarify the perturbation effects and mechanism at the cell level. Appropriate computational tools are key to processing and integrating the available information. This chapter focuses on recent advances in ligand-induced perturbation and single-cell omics computational tools and algorithms, their current limitations, and how the deluge of data can be used to improve the current process of drug research and development.
|
65
|
Xiong KX, Zhou HL, Lin C, Yin JH, Kristiansen K, Yang HM, Li GB. Chord: an ensemble machine learning algorithm to identify doublets in single-cell RNA sequencing data. Commun Biol 2022; 5:510. [PMID: 35637301 PMCID: PMC9151659 DOI: 10.1038/s42003-022-03476-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/27/2021] [Accepted: 05/11/2022] [Indexed: 12/16/2022] Open
Abstract
High-throughput single-cell RNA sequencing (scRNA-seq) is a popular method, but it is accompanied by doublet rate problems that disturb the downstream analysis. Several computational approaches have been developed to detect doublets. However, most of these methods yield satisfactory performance on some datasets but lack stability on others; thus, it is difficult to regard any single method as the gold standard that can be applied to all types of scenarios, and choosing the most appropriate software is a difficult and time-consuming task for researchers. Here we propose Chord, which implements a machine learning algorithm that integrates multiple doublet detection methods to address these issues. Chord had higher accuracy and stability than the individual approaches on different datasets containing real and synthetic data. Moreover, Chord was designed with a modular architecture port, which has high flexibility and adaptability to the incorporation of any new tools. Chord is a general solution to the doublet detection problem. To address the unmet need to choose a suitable doublet detection method, an ensemble machine learning algorithm called Chord was developed, which integrates multiple methods and achieves higher accuracy and stability on different scRNA-seq datasets.
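The general idea of integrating several doublet detectors can be sketched in Python with a much simpler stand-in than the trained machine learning model the abstract describes: averaging rank-normalized scores across methods. This is not Chord's algorithm. The scores, the three unnamed methods, and the function names are hypothetical.

```python
def rank_normalize(scores):
    """Map scores to ranks scaled to [0, 1]; tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of cells tied with the cell at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    denom = max(len(scores) - 1, 1)
    return [r / denom for r in ranks]

def ensemble_scores(score_lists):
    """Average the rank-normalized scores of several detectors, cell by cell."""
    normalized = [rank_normalize(s) for s in score_lists]
    return [sum(col) / len(normalized) for col in zip(*normalized)]

# Hypothetical per-cell doublet scores from three detectors on different scales.
method_a = [0.1, 0.9, 0.2, 0.8]
method_b = [5.0, 40.0, 3.0, 35.0]
method_c = [0.3, 0.7, 0.4, 0.2]
combined = ensemble_scores([method_a, method_b, method_c])
```

Rank normalization is used here only so that detectors with different score scales contribute equally; a learned ensemble would instead weight each method according to training data.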
Affiliation(s)
- Ke-Xu Xiong
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - BGI-Shenzhen, Shenzhen, 518083, China
- Han-Lin Zhou
  - BGI-Shenzhen, Shenzhen, 518083, China
  - BGI College & Henan Institute of Medical and Pharmaceutical Science, Zhengzhou University, Zhengzhou, China
  - BGI-Henan, BGI-Shenzhen, Xinxiang, 453000, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen, 518083, China
  - Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Cong Lin
  - BGI-Shenzhen, Shenzhen, 518083, China
  - BGI-Henan, BGI-Shenzhen, Xinxiang, 453000, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen, 518083, China
  - Shenzhen Key Laboratory of Single-Cell Omics, BGI-Shenzhen, Shenzhen, 518083, China
- Jian-Hua Yin
  - BGI-Shenzhen, Shenzhen, 518083, China
  - BGI-Henan, BGI-Shenzhen, Xinxiang, 453000, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen, 518083, China
  - Shenzhen Key Laboratory of Single-Cell Omics, BGI-Shenzhen, Shenzhen, 518083, China
- Karsten Kristiansen
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Huan-Ming Yang
  - BGI-Shenzhen, Shenzhen, 518083, China
  - James D. Watson Institute of Genome Science, 310008, Hangzhou, China
- Gui-Bo Li
  - BGI-Shenzhen, Shenzhen, 518083, China
  - BGI College & Henan Institute of Medical and Pharmaceutical Science, Zhengzhou University, Zhengzhou, China
  - BGI-Henan, BGI-Shenzhen, Xinxiang, 453000, China
  - Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shenzhen, 518083, China
  - Shenzhen Key Laboratory of Single-Cell Omics, BGI-Shenzhen, Shenzhen, 518083, China
|
66
|
Germain PL, Lun A, Garcia Meixide C, Macnair W, Robinson MD. Doublet identification in single-cell sequencing data using scDblFinder. F1000Res 2022; 10:979. [PMID: 35814628 PMCID: PMC9204188 DOI: 10.12688/f1000research.73600.2] [Citation(s) in RCA: 90] [Impact Index Per Article: 45.0] [Accepted: 04/28/2022] [Indexed: 11/20/2022] Open
Abstract
Doublets are prevalent in single-cell sequencing data and can lead to artifactual findings. A number of strategies have therefore been proposed to detect them. Building on the strengths of existing approaches, we developed scDblFinder, a fast, flexible and accurate Bioconductor-based doublet detection method. Here we present the method, justify its design choices, demonstrate its performance on both single-cell RNA and accessibility (ATAC) sequencing data, and provide some observations on doublet formation, detection, and enrichment analysis. Even in complex datasets, scDblFinder can accurately identify most heterotypic doublets, and was already found by an independent benchmark to outcompete alternatives.
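To make the term "heterotypic doublet" concrete, the following Python sketch simulates artificial doublets by summing the count vectors of two randomly chosen cells and flags a doublet as heterotypic when the two cells carry different cluster labels. It illustrates the general idea behind simulation-based doublet detection only; it is not scDblFinder's implementation (an R/Bioconductor package), and the toy counts, cluster labels, and function name are assumptions.

```python
import random

def simulate_doublets(counts, clusters, n_doublets, seed=0):
    """Sum the count vectors of random cell pairs to create artificial doublets.

    Returns a list of (summed_counts, is_heterotypic) tuples, where
    is_heterotypic is True when the two source cells come from different clusters.
    """
    rng = random.Random(seed)
    doublets = []
    for _ in range(n_doublets):
        i, j = rng.sample(range(len(counts)), 2)  # two distinct cells
        summed = [a + b for a, b in zip(counts[i], counts[j])]
        doublets.append((summed, clusters[i] != clusters[j]))
    return doublets

# Hypothetical toy matrix: 4 cells x 3 genes, with made-up cluster labels.
counts = [[5, 0, 1], [4, 1, 0], [0, 7, 2], [1, 6, 3]]
clusters = ["T", "T", "B", "B"]
doublets = simulate_doublets(counts, clusters, n_doublets=10)
```

Heterotypic doublets combine two distinct expression profiles and so tend to look unlike any real cell type, which is a plausible reason they are easier to detect than doublets formed from two similar cells.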
Affiliation(s)
- Pierre-Luc Germain
  - DMLS Lab of Statistical Bioinformatics, University of Zürich, Zürich, 805, Switzerland
  - D-HEST Institute for Neuroscience, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
- Aaron Lun
  - Genentech Inc., South San Francisco, CA, USA
- Will Macnair
  - Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, F. Hoffmann-LaRoche Ltd, Basel, Switzerland
- Mark D. Robinson
  - DMLS Lab of Statistical Bioinformatics, University of Zürich, Zürich, 805, Switzerland
  - Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
|
67
|
Ghaddar B, De S. Reconstructing physical cell interaction networks from single-cell data using Neighbor-seq. Nucleic Acids Res 2022; 50:e82. [PMID: 35536255 PMCID: PMC9371920 DOI: 10.1093/nar/gkac333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Revised: 04/19/2022] [Accepted: 05/05/2022] [Indexed: 11/12/2022] Open
Abstract
Cell-cell interactions are the fundamental building blocks of tissue organization and multicellular life. We developed Neighbor-seq, a method to identify and annotate the architecture of direct cell–cell interactions and relevant ligand–receptor signaling from the undissociated cell fractions in massively parallel single cell sequencing data. Neighbor-seq accurately identifies microanatomical features of diverse tissue types such as the small intestinal epithelium, terminal respiratory tract, and splenic white pulp. It also captures the differing topologies of cancer-immune-stromal cell communications in pancreatic and skin tumors, which are consistent with the patterns observed in spatial transcriptomic data. Neighbor-seq is fast and scalable. It draws inferences from routine single-cell data and does not require prior knowledge about sample cell-types or multiplets. Neighbor-seq provides a framework to study the organ-level cellular interactome in health and disease, bridging the gap between single-cell and spatial transcriptomics.
Affiliation(s)
- Bassel Ghaddar
  - Rutgers Cancer Institute of New Jersey, Rutgers the State University of New Jersey, New Brunswick, NJ 08901, USA
- Subhajyoti De
  - Rutgers Cancer Institute of New Jersey, Rutgers the State University of New Jersey, New Brunswick, NJ 08901, USA
|
68
|
Zeng Z, Li Y, Li Y, Luo Y. Statistical and machine learning methods for spatially resolved transcriptomics data analysis. Genome Biol 2022; 23:83. [PMID: 35337374 PMCID: PMC8951701 DOI: 10.1186/s13059-022-02653-7] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 10/16/2021] [Accepted: 03/15/2022] [Indexed: 01/28/2023] Open
Abstract
The recent advancement in spatial transcriptomics technology has enabled multiplexed profiling of cellular transcriptomes and spatial locations. As the capacity and efficiency of the experimental technologies continue to improve, there is an emerging need for the development of analytical approaches. Furthermore, with the continuous evolution of sequencing protocols, the underlying assumptions of current analytical methods need to be re-evaluated and adjusted to harness the increasing data complexity. To motivate and aid future model development, we herein review the recent development of statistical and machine learning methods in spatial transcriptomics, summarize useful resources, and highlight the challenges and opportunities ahead.
Affiliation(s)
- Zexian Zeng
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100084, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100084, China
  - Department of Data Sciences, Dana Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- Yawei Li
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Yiming Li
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Yuan Luo
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
  - Northwestern University Clinical and Translational Sciences Institute, Chicago, IL, 60611, USA
  - Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Center for Health Information Partnerships, Northwestern University, Chicago, IL, 60611, USA
|
69
|
Saigusa R, Durant CP, Suryawanshi V, Ley K. Single-Cell Antibody Sequencing in Atherosclerosis Research. Methods Mol Biol 2022; 2419:765-778. [PMID: 35238000 PMCID: PMC10155217 DOI: 10.1007/978-1-0716-1924-7_46] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
Abstract
The transcriptomic information obtained by single cell RNA sequencing (scRNA-seq) can be supplemented by information on the cell surface phenotype by using oligonucleotide-tagged monoclonal antibodies (scAb-Seq). This is of particular importance in immune cells, where the correlation between mRNA and cell surface expression is very weak. scAb-Seq is facilitated by the availability of commercial antibodies and antibody mixes. Panels of up to 200 antibodies are now available for human and mouse cells. Proteins are detected by antibodies conjugated to a tripartite DNA sequence that contains a primer for amplification and sequencing, a unique oligonucleotide that acts as an antibody barcode, and a poly(dA) sequence. This design allows antibody-specific DNA sequences and cDNAs to be extended and detected simultaneously in the same poly(dT)-primed reaction. For each cell, surface protein expression is captured and sequenced along with the cell's transcriptome. Here, we list the steps needed to produce antibody sequencing data from tissue or blood cells.
Affiliation(s)
- Klaus Ley
  - La Jolla Institute for Immunology, La Jolla, CA, USA
|
70
|
Wang M, Song WM, Ming C, Wang Q, Zhou X, Xu P, Krek A, Yoon Y, Ho L, Orr ME, Yuan GC, Zhang B. Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022; 17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022] Open
Abstract
Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse scRNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. The review and the accompanying software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single cell level.
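The first analytic direction listed above, quality control and normalization, can be illustrated with a minimal Python sketch: cells with too few detected genes are dropped, and the remaining counts are scaled by library size and log-transformed. This reflects a common convention rather than the authors' shared scripts; the thresholds, the toy count matrix, and the function name `qc_and_normalize` are assumptions.

```python
import math

def qc_and_normalize(counts, min_genes=2, scale=10_000):
    """Drop cells with fewer than min_genes detected genes, then log-normalize.

    With min_genes >= 1, every kept cell has a non-zero total count,
    so the division below is safe.
    """
    kept = [cell for cell in counts if sum(1 for c in cell if c > 0) >= min_genes]
    normalized = []
    for cell in kept:
        total = sum(cell)
        # Scale each count to a common library size, then apply log(1 + x).
        normalized.append([math.log1p(c / total * scale) for c in cell])
    return normalized

# Hypothetical toy matrix: rows are cells (or nuclei), columns are genes.
counts = [[10, 0, 5], [0, 0, 3], [2, 2, 6]]
normalized = qc_and_normalize(counts)
```

In this toy input the second cell expresses a single gene and is filtered out, so two cells remain after quality control.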
Affiliation(s)
- Minghui Wang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Won-min Song
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Chen Ming
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Qian Wang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Xianxiao Zhou
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Peng Xu
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Azra Krek
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
- Yonejung Yoon
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Lap Ho
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Miranda E. Orr
  - Department of Internal Medicine, Section of Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
  - Sticht Center for Healthy Aging and Alzheimer’s Prevention, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Guo-Cheng Yuan
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
- Bin Zhang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
|
71
|
Chen J, Du L, Wang F, Shao X, Wang X, Yu W, Bi S, Chen D, Pan X, Zeng S, Huang L, Liang Y, Li Y, Chen R, Xue F, Li X, Wang S, Zhuang M, Liu M, Lin L, Yan H, He F, Yu L, Jiang Q, Xiong Z, Zhang L, Cao B, Wang YL, Chen D. Cellular and molecular atlas of the placenta from a COVID-19 pregnant woman infected at midgestation highlights the defective impacts on foetal health. Cell Prolif 2022; 55:e13204. [PMID: 35141964 PMCID: PMC9055894 DOI: 10.1111/cpr.13204] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/07/2021] [Revised: 01/10/2022] [Accepted: 01/26/2022] [Indexed: 12/15/2022] Open
Abstract
Objectives: The impacts of the current COVID-19 pandemic on maternal and foetal health are enormous and of serious concern. However, the influence of SARS-CoV-2 infection at early-to-mid gestation on maternal and foetal health remains unclear. Materials and methods: Here, we report a follow-up study of a pregnant woman through her whole infective course of SARS-CoV-2, from asymptomatic infection at gestational week 20 to mild and then severe illness, and finally recovery at Week 24. Following caesarean section due to incomplete uterine rupture at Week 28, histological examinations of the placenta and foetal tissues as well as single-cell RNA sequencing (scRNA-seq) of the placenta were performed. Results: Compared with gestational age-matched control placentas, the placenta from this COVID-19 case exhibited more syncytial knots and lowered expression of syncytiotrophoblast-related genes. The scRNA-seq analysis demonstrated impaired trophoblast differentiation, activation of antiviral and inflammatory CD8 T cells, as well as a tight association of increased inflammatory responses in the placenta with complement over-activation in macrophages. In addition, levels of several inflammatory factors increased in the placenta and foetal blood. Conclusion: These findings illustrate a systematic cellular and molecular signature of placental insufficiency and immune activation at the maternal–foetal interface that may be attributed to SARS-CoV-2 infection at the midgestation stage, which strongly suggests the need for extensive care for maternal and foetal outcomes in pregnant women suffering from COVID-19.
Affiliation(s)
- Jingsi Chen
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.,Guangdong Engineering and Technology Research Center of Maternal-Fetal Medicine, Guangzhou, PR China.,Guangdong-Hong Kong-Macao Greater Bay Area Higher Education Joint Laboratory of Maternal-Fetal Medicine, Guangzhou, China
| | - Lili Du
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.,Guangdong Engineering and Technology Research Center of Maternal-Fetal Medicine, Guangzhou, PR China.,Guangdong-Hong Kong-Macao Greater Bay Area Higher Education Joint Laboratory of Maternal-Fetal Medicine, Guangzhou, China
| | - Feiyang Wang
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, China.,University of Chinese Academy of Sciences, Beijing, China
| | - Xuan Shao
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, China.,University of Chinese Academy of Sciences, Beijing, China
| | - Xiaoyi Wang
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Wenzhe Yu
- Fujian Provincial Key Laboratory of Reproductive Health Research, School of Medicine, Xiamen University, Xiamen, China
| | - Shilei Bi
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Dexiong Chen
- Department of General Practice, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Xingfei Pan
- Department of Infectious Diseases, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Shanshan Zeng
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Lijun Huang
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Yingyu Liang
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Yulian Li
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Rufang Chen
- Department of Obstetrics, The First People's Hospital of Foshan, Foshan, Guangdong Province, China
| | - Fengwu Xue
- Department of Obstetrics, The First People's Hospital of Foshan, Foshan, Guangdong Province, China
| | - Xiuying Li
- Department of Obstetrics, The First People's Hospital of Foshan, Foshan, Guangdong Province, China
| | - Shouping Wang
- Department of Operating Room, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Manli Zhuang
- Department of Operating Room, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Mingxing Liu
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Lin Lin
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Hao Yan
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Fang He
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Lin Yu
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Qingping Jiang
- Department of Pathology, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Zhongtang Xiong
- Department of Pathology, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Lizi Zhang
- Department of Obstetrics and Gynecology, Nanfang Hospital, Southern Medical University, Guangzhou, PR China
| | - Bin Cao
- Fujian Provincial Key Laboratory of Reproductive Health Research, School of Medicine, Xiamen University, Xiamen, China
| | - Yan-Ling Wang
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, China.,University of Chinese Academy of Sciences, Beijing, China
| | - Dunjin Chen
- Department of Obstetrics and Gynecology, Key Laboratory for Major Obstetric Diseases of Guangdong Province, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.,Guangdong Engineering and Technology Research Center of Maternal-Fetal Medicine, Guangzhou, PR China.,Guangdong-Hong Kong-Macao Greater Bay Area Higher Education Joint Laboratory of Maternal-Fetal Medicine, Guangzhou, China
| |
|
72
|
Multi-Omics Profiling of the Tumor Microenvironment. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:283-326. [DOI: 10.1007/978-3-030-91836-1_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
73
|
Epithelial GPR35 protects from Citrobacter rodentium infection by preserving goblet cells and mucosal barrier integrity. Mucosal Immunol 2022; 15:443-458. [PMID: 35264769 PMCID: PMC9038528 DOI: 10.1038/s41385-022-00494-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/22/2021] [Revised: 01/26/2022] [Accepted: 01/26/2022] [Indexed: 02/04/2023]
Abstract
Goblet cells secrete mucin to create a protective mucus layer against invasive bacterial infection and are therefore essential for maintaining intestinal health. However, the molecular pathways that regulate goblet cell function remain largely unknown. Although GPR35 is highly expressed in colonic epithelial cells, its importance in promoting the epithelial barrier is unclear. In this study, we show that epithelial Gpr35 plays a critical role in goblet cell function. In mice, cell-type-specific deletion of Gpr35 in epithelial cells but not in macrophages results in goblet cell depletion and dysbiosis, rendering these animals more susceptible to Citrobacter rodentium infection. Mechanistically, scRNA-seq analysis indicates that signaling of epithelial Gpr35 is essential to maintain normal pyroptosis levels in goblet cells. Our work shows that the epithelial presence of Gpr35 is a critical element for the function of goblet cell-mediated symbiosis between host and microbiota.
|
74
|
Miao Z, Humphreys BD, McMahon AP, Kim J. Multi-omics integration in the age of million single-cell data. Nat Rev Nephrol 2021; 17:710-724. [PMID: 34417589 PMCID: PMC9191639 DOI: 10.1038/s41581-021-00463-x] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Accepted: 06/25/2021] [Indexed: 02/06/2023]
Abstract
An explosion in single-cell technologies has revealed a previously underappreciated heterogeneity of cell types and novel cell-state associations with sex, disease, development and other processes. Starting with transcriptome analyses, single-cell techniques have extended to multi-omics approaches and now enable the simultaneous measurement of data modalities and spatial cellular context. Data are now available for millions of cells, for whole-genome measurements and for multiple modalities. Although analyses of such multimodal datasets have the potential to provide new insights into biological processes that cannot be inferred with a single mode of assay, the integration of very large, complex, multimodal data into biological models and mechanisms represents a considerable challenge. An understanding of the principles of data integration and visualization methods is required to determine what methods are best applied to a particular single-cell dataset. Each class of method has advantages and pitfalls in terms of its ability to achieve various biological goals, including cell-type classification, regulatory network modelling and biological process inference. In choosing a data integration strategy, consideration must be given to whether the multi-omics data are matched (that is, measured on the same cell) or unmatched (that is, measured on different cells) and, more importantly, the overall modelling and visualization goals of the integrated analysis.
Affiliation(s)
- Zhen Miao
- Department of Biology, University of Pennsylvania, Philadelphia, PA, USA, Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Benjamin D. Humphreys
- Division of Nephrology, Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
| | - Andrew P. McMahon
- Department of Stem Cell Biology and Regenerative Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Junhyong Kim
- Department of Biology, University of Pennsylvania, Philadelphia, PA, USA, Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| |
|
75
|
Wu L, Xue Z, Jin S, Zhang J, Guo Y, Bai Y, Jin X, Wang C, Wang L, Liu Z, Wang JQ, Lu L, Liu W. huARdb: human Antigen Receptor database for interactive clonotype-transcriptome analysis at the single-cell level. Nucleic Acids Res 2021; 50:D1244-D1254. [PMID: 34606616 PMCID: PMC8728177 DOI: 10.1093/nar/gkab857] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/26/2021] [Revised: 08/31/2021] [Accepted: 09/14/2021] [Indexed: 12/15/2022]
Abstract
T-cell receptors (TCRs) and B-cell receptors (BCRs) are critical in recognizing antigens and activating the adaptive immune response. Stochastic V(D)J recombination generates massive TCR/BCR repertoire diversity. Single-cell immune profiling with transcriptome analysis allows the high-throughput study of individual TCR/BCR clonotypes and functions under both normal and pathological settings. However, a comprehensive database linking these data is not yet readily available. Here, we present the human Antigen Receptor database (huARdb), a large-scale human single-cell immune profiling database that contains 444 794 high confidence T or B cells (hcT/B cells) with full-length TCR/BCR sequence and transcriptomes from 215 datasets. All datasets were processed in a uniform workflow, including sequence alignment, cell subtype prediction, unsupervised cell clustering, and clonotype definition. We also developed a multi-functional and user-friendly web interface that provides interactive visualization modules for biologists to analyze the transcriptome and TCR/BCR features at the single-cell level. HuARdb is freely available at https://huarc.net/database with functions for data querying, browsing, downloading, and depositing. In conclusion, huARdb is a comprehensive and multi-perspective atlas for human antigen receptors.
Affiliation(s)
- Lize Wu
- Institute of Immunology and Department of Rheumatology at Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310058, China.,Liangzhu Laboratory, Zhejiang University Medical Center, 1369 West Wenyi Road, Hangzhou, Zhejiang 311121, China
| | - Ziwei Xue
- Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China
| | - Siqian Jin
- Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China
| | - Jinchun Zhang
- Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China
| | - Yixin Guo
- Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China
| | - Yadan Bai
- Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China
| | - Xuexiao Jin
- Institute of Immunology and Department of Rheumatology at Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310058, China
| | - Chaochen Wang
- Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China
| | - Lie Wang
- Department of Immunology, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310058, China
| | - Zuozhu Liu
- Zhejiang University-University of Illinois at Urbana-Champaign Institute (ZJU-UIUC Institute), International Campus, Zhejiang University, Haining, Zhejiang 314400, China
| | - James Q Wang
- Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China
| | - Linrong Lu
- Institute of Immunology and Department of Rheumatology at Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310058, China.,Liangzhu Laboratory, Zhejiang University Medical Center, 1369 West Wenyi Road, Hangzhou, Zhejiang 311121, China.,Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China.,Dr. Li Dak Sum & Yip Yio Chin Center for Stem Cell and Regenerative Medicine, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Wanlu Liu
- Liangzhu Laboratory, Zhejiang University Medical Center, 1369 West Wenyi Road, Hangzhou, Zhejiang 311121, China.,Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, International Campus, Zhejiang University, Haining, Zhejiang 314400, China.,Dr. Li Dak Sum & Yip Yio Chin Center for Stem Cell and Regenerative Medicine, Zhejiang University, Hangzhou, Zhejiang 310058, China.,Department of Orthopedic Surgery of the Second Affiliated Hospital of Zhejiang University School of Medicine, Zhejiang University, Hangzhou, Zhejiang 310003, China.,Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Zhejiang University, Hangzhou, Zhejiang 310058, China
| |
|
76
|
Wang X, Chen Y, Li Z, Huang B, Xu L, Lai J, Lu Y, Zha X, Liu B, Lan Y, Li Y. Single-Cell RNA-Seq of T Cells in B-ALL Patients Reveals an Exhausted Subset with Remarkable Heterogeneity. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2021; 8:e2101447. [PMID: 34365737 PMCID: PMC8498858 DOI: 10.1002/advs.202101447] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/08/2021] [Revised: 06/27/2021] [Indexed: 06/02/2023]
Abstract
Characterization of functional T cell clusters is key to developing strategies for immunotherapy and predicting clinical responses in leukemia. Here, single-cell RNA sequencing is performed with T cells sorted from the peripheral blood of healthy individuals and patients with B cell-acute lymphoblastic leukemia (B-ALL). Unbiased bioinformatics analysis enabled the authors to identify 13 T cell clusters in the patients based on their molecular properties. All 11 major T cell subsets in healthy individuals are found in the patients with B-ALL, with the counterparts in the patients universally showing more activated characteristics. Two exhausted T cell populations, characterized by up-regulation of TIGIT, PDCD1, HLADRA, LAG3, and CTLA4 are specifically discovered in B-ALL patients. Of note, these exhausted T cells possess remarkable heterogeneity, and ten sub-clusters are further identified, which are characterized by different cell cycle phases, naïve states, and GNLY (coding granulysin) expression. Coupled with single-cell T cell receptor repertoire profiling, diverse originations of the exhausted T cells in B-ALL are suggested, and clonally expanded exhausted T cells are likely to originate from CD8+ effector memory/terminal effector cells. Together, these data provide for the first-time valuable insights for understanding exhausted T cell populations in leukemia.
Affiliation(s)
- Xiaofang Wang
- Department of Hematology, First Affiliated Hospital, Jinan University, No. 601 West of Huangpu Avenue, Guangzhou 510632, China
- Key Laboratory for Regenerative Medicine of Ministry of Education, Institute of Hematology, School of Medicine, Jinan University, Guangzhou 510632, China
| | - Yanjuan Chen
- Key Laboratory for Regenerative Medicine of Ministry of Education, Institute of Hematology, School of Medicine, Jinan University, Guangzhou 510632, China
| | - Zongcheng Li
- State Key Laboratory of Experimental Hematology, Institute of Hematology, Fifth Medical Center of Chinese PLA General Hospital, Beijing 100071, China
| | - Bingyan Huang
- Key Laboratory for Regenerative Medicine of Ministry of Education, Institute of Hematology, School of Medicine, Jinan University, Guangzhou 510632, China
| | - Ling Xu
- Department of Hematology, First Affiliated Hospital, Jinan University, No. 601 West of Huangpu Avenue, Guangzhou 510632, China
- Key Laboratory for Regenerative Medicine of Ministry of Education, Institute of Hematology, School of Medicine, Jinan University, Guangzhou 510632, China
| | - Jing Lai
- Department of Hematology, First Affiliated Hospital, Jinan University, No. 601 West of Huangpu Avenue, Guangzhou 510632, China
| | - Yuhong Lu
- Department of Hematology, First Affiliated Hospital, Jinan University, No. 601 West of Huangpu Avenue, Guangzhou 510632, China
| | - Xianfeng Zha
- Department of Clinical Laboratory, First Affiliated Hospital, School of Medicine, Jinan University, No. 601 West of Huangpu Avenue, Guangzhou 510632, China
| | - Bing Liu
- State Key Laboratory of Experimental Hematology, Institute of Hematology, Fifth Medical Center of Chinese PLA General Hospital, Beijing 100071, China
| | - Yu Lan
- Key Laboratory for Regenerative Medicine of Ministry of Education, Institute of Hematology, School of Medicine, Jinan University, Guangzhou 510632, China
| | - Yangqiu Li
- Department of Hematology, First Affiliated Hospital, Jinan University, No. 601 West of Huangpu Avenue, Guangzhou 510632, China
- Key Laboratory for Regenerative Medicine of Ministry of Education, Institute of Hematology, School of Medicine, Jinan University, Guangzhou 510632, China
| |
|
77
|
Germain PL, Lun A, Garcia Meixide C, Macnair W, Robinson MD. Doublet identification in single-cell sequencing data using scDblFinder. F1000Res 2021; 10:979. [PMID: 35814628 PMCID: PMC9204188 DOI: 10.12688/f1000research.73600.1] [Citation(s) in RCA: 160] [Impact Index Per Article: 53.3] [Accepted: 04/28/2022] [Indexed: 07/27/2023]
Abstract
Doublets are prevalent in single-cell sequencing data and can lead to artifactual findings. A number of strategies have therefore been proposed to detect them. Building on the strengths of existing approaches, we developed scDblFinder, a fast, flexible and accurate Bioconductor-based doublet detection method. Here we present the method, justify its design choices, demonstrate its performance on both single-cell RNA and accessibility (ATAC) sequencing data, and provide some observations on doublet formation, detection, and enrichment analysis. Even in complex datasets, scDblFinder can accurately identify most heterotypic doublets, and was already found by an independent benchmark to outcompete alternatives.
Affiliation(s)
- Pierre-Luc Germain
- DMLS Lab of Statistical Bioinformatics, University of Zürich, Zürich, 805, Switzerland
- D-HEST Institute for Neuroscience, ETH Zürich, Zürich, Switzerland
- Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
| | - Aaron Lun
- Genentech Inc., South San Francisco, CA, USA
| | | | - Will Macnair
- Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, F. Hoffmann-LaRoche Ltd, Basel, Switzerland
| | - Mark D. Robinson
- DMLS Lab of Statistical Bioinformatics, University of Zürich, Zürich, 805, Switzerland
- Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
| |
|
78
|
Weber LM, Hippen AA, Hickey PF, Berrett KC, Gertz J, Doherty JA, Greene CS, Hicks SC. Genetic demultiplexing of pooled single-cell RNA-sequencing samples in cancer facilitates effective experimental design. Gigascience 2021; 10:giab062. [PMID: 34553212 PMCID: PMC8458035 DOI: 10.1093/gigascience/giab062] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/03/2021] [Revised: 07/19/2021] [Accepted: 08/26/2021] [Indexed: 11/23/2022]
Abstract
BACKGROUND: Pooling cells from multiple biological samples prior to library preparation within the same single-cell RNA sequencing experiment provides several advantages, including lower library preparation costs and reduced unwanted technological variation, such as batch effects. Computational demultiplexing tools based on natural genetic variation between individuals provide a simple approach to demultiplex samples, which does not require complex additional experimental procedures. However, to our knowledge these tools have not been evaluated in cancer, where somatic variants, which could differ between cells from the same sample, may obscure the signal in natural genetic variation. RESULTS: Here, we performed in silico benchmark evaluations by combining raw sequencing reads from multiple single-cell samples in high-grade serous ovarian cancer, which has a high copy number burden, and lung adenocarcinoma, which has a high tumor mutational burden. Our results confirm that genetic demultiplexing tools can be effectively deployed on cancer tissue using a pooled experimental design, although high proportions of ambient RNA from cell debris reduce performance. CONCLUSIONS: This strategy provides significant cost savings through pooled library preparation. To facilitate similar analyses at the experimental design phase, we provide freely accessible code and a reproducible Snakemake workflow built around the best-performing tools found in our in silico benchmark evaluations, available at https://github.com/lmweber/snp-dmx-cancer.
Affiliation(s)
- Lukas M Weber
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
| | - Ariel A Hippen
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Peter F Hickey
- Advanced Technology & Biology Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
| | - Kristofer C Berrett
- Huntsman Cancer Institute and Department of Population Health Sciences, University of Utah, Salt Lake City, UT 84108, USA
| | - Jason Gertz
- Huntsman Cancer Institute and Department of Population Health Sciences, University of Utah, Salt Lake City, UT 84108, USA
| | - Jennifer Anne Doherty
- Huntsman Cancer Institute and Department of Population Health Sciences, University of Utah, Salt Lake City, UT 84108, USA
| | - Casey S Greene
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
| | - Stephanie C Hicks
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
| |
|
79
|
Thibodeau A, Eroglu A, McGinnis CS, Lawlor N, Nehar-Belaid D, Kursawe R, Marches R, Conrad DN, Kuchel GA, Gartner ZJ, Banchereau J, Stitzel ML, Cicek AE, Ucar D. AMULET: a novel read count-based method for effective multiplet detection from single nucleus ATAC-seq data. Genome Biol 2021; 22:252. [PMID: 34465366 PMCID: PMC8408950 DOI: 10.1186/s13059-021-02469-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 03/09/2021] [Accepted: 08/17/2021] [Indexed: 12/13/2022]
Abstract
Detecting multiplets in single nucleus (sn)ATAC-seq data is challenging due to data sparsity and limited dynamic range. AMULET (ATAC-seq MULtiplet Estimation Tool) enumerates regions with greater than two uniquely aligned reads across the genome to effectively detect multiplets. We evaluate the method by generating snATAC-seq data in the human blood and pancreatic islet samples. AMULET has high precision, estimated via donor-based multiplexing, and high recall, estimated via simulated multiplets, compared to alternatives and identifies multiplets most effectively when a certain read depth of 25K median valid reads per nucleus is achieved.
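The counting step described in this abstract (regions with more than two uniquely aligned reads) can be illustrated with a minimal Python sketch. It is an illustration only, not the AMULET implementation: the function name `count_overlap_regions`, the `(chrom, start, end)` tuple layout, and the half-open interval convention are assumptions made here, and the statistical step that turns such counts into multiplet calls is not described in the abstract and is not reproduced.

```python
from collections import defaultdict
from itertools import groupby


def count_overlap_regions(fragments, max_copies=2):
    """Count maximal regions covered by more than `max_copies` fragments.

    `fragments` is an iterable of (chrom, start, end) tuples for ONE cell
    barcode, with half-open coordinates [start, end). This layout is a
    hypothetical input format chosen for the sketch.
    """
    # Turn every fragment into a +1 event at its start and a -1 event at its end.
    events = defaultdict(list)
    for chrom, start, end in fragments:
        events[chrom].append((start, 1))
        events[chrom].append((end, -1))

    regions = 0
    for chrom_events in events.values():
        chrom_events.sort(key=lambda e: e[0])
        depth = 0
        in_region = False
        # Apply all events at the same position together, so a fragment that
        # ends exactly where another starts does not split a region in two.
        for _, group in groupby(chrom_events, key=lambda e: e[0]):
            depth += sum(delta for _, delta in group)
            if depth > max_copies and not in_region:
                regions += 1
                in_region = True
            elif depth <= max_copies and in_region:
                in_region = False
    return regions


# Three fragments overlap on chr1 between positions 80 and 100.
example = [
    ("chr1", 0, 100),
    ("chr1", 50, 150),
    ("chr1", 80, 120),
    ("chr1", 500, 600),
    ("chr2", 0, 10),
]
print(count_overlap_regions(example))
```

The default `max_copies=2` mirrors the "greater than two" threshold in the abstract; a barcode with many such regions would be a candidate multiplet under whatever decision rule is applied downstream.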
Affiliation(s)
- Asa Thibodeau
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Alper Eroglu
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Christopher S McGinnis
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, 94158, USA
| | - Nathan Lawlor
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | | | - Romy Kursawe
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Radu Marches
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Daniel N Conrad
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, 94158, USA
| | - George A Kuchel
- University of Connecticut Center on Aging, UConn Health Center, Farmington, CT, 06030, USA
| | - Zev J Gartner
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, 94158, USA
- Chan-Zuckerberg Biohub, San Francisco, CA, 94158, USA
- NSF Center for Cellular Construction, San Francisco, CA, 94158, USA
| | | | - Michael L Stitzel
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA
- Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT, 06030, USA
| | - A Ercument Cicek
- Computer Engineering Department, Bilkent University, 06800, Ankara, Turkey
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
| | - Duygu Ucar
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA.
- Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT, 06030, USA.
| |
|
80
|
Vallejo J, Cochain C, Zernecke A, Ley K. Heterogeneity of immune cells in human atherosclerosis revealed by scRNA-Seq. Cardiovasc Res 2021; 117:2537-2543. [PMID: 34343272 DOI: 10.1093/cvr/cvab260] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 05/04/2021] [Revised: 07/02/2021] [Accepted: 07/30/2021] [Indexed: 12/14/2022]
Abstract
Immune cells in atherosclerosis include T, B, natural killer (NK) and NKT cells, macrophages, monocytes, dendritic cells (DCs), neutrophils and mast cells. Advances in single cell RNA sequencing (scRNA-Seq) have refined our understanding of immune cell subsets. Four recent studies have used scRNA-Seq of immune cells in human atherosclerotic lesions and peripheral blood mononuclear cells (PBMCs), some including cell surface phenotypes revealed by oligonucleotide-tagged antibodies, which confirmed known and identified new immune cell subsets and identified genes significantly upregulated in PBMCs from HIV+ subjects with atherosclerosis compared to PBMCs from matched HIV+ subjects without atherosclerosis. The ability of scRNA-Seq to identify cell types is greatly augmented by adding cell surface phenotype using antibody sequencing. In this review we summarize the latest data obtained by scRNA-Seq on plaques and human PBMCs in human subjects with atherosclerosis.
Affiliation(s)
- Jenifer Vallejo
- Division of Inflammation Biology, La Jolla Institute for Immunology, California, USA
| | - Clément Cochain
- Institute of Experimental Biomedicine, University Hospital Würzburg, Germany.,Comprehensive Heart Failure Center, University Hospital Würzburg, Germany
| | - Alma Zernecke
- Institute of Experimental Biomedicine, University Hospital Würzburg, Germany
| | - Klaus Ley
- Division of Inflammation Biology, La Jolla Institute for Immunology, California, USA.,Department of Bioengineering, University of California San Diego, California, USA
| |
|
81
|
Xi NM, Li JJ. Protocol for executing and benchmarking eight computational doublet-detection methods in single-cell RNA sequencing data analysis. STAR Protoc 2021; 2:100699. [PMID: 34382023 PMCID: PMC8339294 DOI: 10.1016/j.xpro.2021.100699] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
The existence of doublets is a key confounder in single-cell RNA sequencing (scRNA-seq) data analysis. Computational techniques have been developed for detecting doublets from scRNA-seq data. We developed an R package DoubletCollection to integrate the installation and execution of eight doublet detection methods. DoubletCollection provides a unified interface to perform and visualize downstream analysis after doublet detection. Here, we present a protocol of using DoubletCollection to benchmark doublet-detection methods. This protocol can accommodate new doublet-detection methods in the fast-growing scRNA-seq field. For details on the use and execution of this protocol, please refer to Xi and Li (2020). Highlights: Integrate the installation and execution of eight doublet-detection methods; an interface to perform and visualize downstream analysis after doublet detection; a collection of real and synthetic scRNA-seq datasets with doublet annotations; a user-friendly R package to perform doublet detection on scRNA-seq count matrices.
Affiliation(s)
- Nan Miles Xi
- Department of Mathematics and Statistics, Loyola University Chicago, Chicago, IL 60660, USA.,Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095-1554, USA
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095-1554, USA.,Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095-7088, USA.,Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095-1766, USA.,Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA 90095-1772, USA
| |
|
82
|
Weber LL, Sashittal P, El-Kebir M. doubletD: detecting doublets in single-cell DNA sequencing data. Bioinformatics 2021; 37:i214-i221. [PMID: 34252961 PMCID: PMC8275324 DOI: 10.1093/bioinformatics/btab266] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open
Abstract
Motivation: While single-cell DNA sequencing (scDNA-seq) has enabled the study of intratumor heterogeneity at an unprecedented resolution, current technologies are error-prone and often result in doublets where two or more cells are mistaken for a single cell. Not only do doublets confound downstream analyses, but the increase in doublet rate is also a major bottleneck preventing higher throughput with current single-cell technologies. Although doublet detection and removal are standard practice in scRNA-seq data analysis, options for scDNA-seq data are limited. Current methods attempt to detect doublets while also performing complex downstream analysis tasks, leading to decreased efficiency and/or performance. Results: We present doubletD, the first standalone method for detecting doublets in scDNA-seq data. Underlying our method is a simple maximum likelihood approach with a closed-form solution. We demonstrate the performance of doubletD on simulated data as well as real datasets, outperforming current methods for downstream analysis of scDNA-seq data that jointly infer doublets as well as standalone approaches for doublet detection in scRNA-seq data. Incorporating doubletD in scDNA-seq analysis pipelines will reduce complexity and lead to more accurate results. Availability and implementation: https://github.com/elkebir-group/doubletD. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Leah L Weber
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Palash Sashittal
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.,Department of Aerospace Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Mohammed El-Kebir
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| |
|
83
|
Urbonaite G, Lee JTH, Liu P, Parada GE, Hemberg M, Acar M. A yeast-optimized single-cell transcriptomics platform elucidates how mycophenolic acid and guanine alter global mRNA levels. Commun Biol 2021; 4:822. [PMID: 34193958 PMCID: PMC8245502 DOI: 10.1038/s42003-021-02320-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/19/2020] [Accepted: 06/03/2021] [Indexed: 11/09/2022] Open
Abstract
Stochastic gene expression leads to inherent variability in expression outcomes even in isogenic single-celled organisms grown in the same environment. The Drop-Seq technology facilitates transcriptomic studies of individual mammalian cells, and it has had transformative effects on the characterization of cell identity and function based on single-cell transcript counts. However, application of this technology to organisms with different cell size and morphology characteristics has been challenging. Here we present yeastDrop-Seq, a yeast-optimized platform for quantifying the number of distinct mRNA molecules in a cell-specific manner in individual yeast cells. Using yeastDrop-Seq, we measured the transcriptomic impact of the lifespan-extending compound mycophenolic acid and its epistatic agent guanine. Each treatment condition had a distinct transcriptomic footprint on isogenic yeast cells as indicated by distinct clustering with clear separations among the different groups. The yeastDrop-Seq platform facilitates transcriptomic profiling of yeast cells for basic science and biotechnology applications.
Affiliation(s)
- Guste Urbonaite
- Systems Biology Institute, Yale University, West Haven, CT, USA.,Department of Molecular Cellular and Developmental Biology, Yale University, New Haven, CT, USA
| | | | - Ping Liu
- Systems Biology Institute, Yale University, West Haven, CT, USA.,Department of Molecular Cellular and Developmental Biology, Yale University, New Haven, CT, USA
| | | | - Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.,Evergrande Center for Immunologic Disease, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA.
| | - Murat Acar
- Systems Biology Institute, Yale University, West Haven, CT, USA.,Department of Molecular Cellular and Developmental Biology, Yale University, New Haven, CT, USA.,Department of Physics, Yale University, New Haven, CT, USA.
| |
|
84
|
Sun W, Modica S, Dong H, Wolfrum C. Plasticity and heterogeneity of thermogenic adipose tissue. Nat Metab 2021; 3:751-761. [PMID: 34158657 DOI: 10.1038/s42255-021-00417-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/10/2021] [Accepted: 05/19/2021] [Indexed: 12/13/2022]
Abstract
The perception of adipose tissue, both in the scientific community and in the general population, has changed dramatically in the past 20 years. While adipose tissue was thought for a long time to be a rather simple lipid storage entity, it is now recognized as a highly heterogeneous organ and a critical regulator of systemic metabolism, composed of many different subtypes of cells, with important endocrine functions. Additionally, adipose tissue is nowadays recognized to contribute to energy turnover, due to the presence of specialized thermogenic adipocytes, which can be found in many adipose depots. This review discusses the unprecedented insights that we have gained into the heterogeneity of thermogenic adipocytes and their respective precursors due to the technical developments in single-cell and nucleus technologies. These methodological advances have increased our understanding of how adipose tissue catabolic function is influenced by developmental and intercellular communication events.
Affiliation(s)
- Wenfei Sun
- Institute of Food, Nutrition and Health, ETH Zurich, Schwerzenbach, Switzerland
| | - Salvatore Modica
- Institute of Food, Nutrition and Health, ETH Zurich, Schwerzenbach, Switzerland
| | - Hua Dong
- Institute of Food, Nutrition and Health, ETH Zurich, Schwerzenbach, Switzerland
| | - Christian Wolfrum
- Institute of Food, Nutrition and Health, ETH Zurich, Schwerzenbach, Switzerland.
| |
|
85
|
Sun T, Song D, Li WV, Li JJ. scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured. Genome Biol 2021; 22:163. [PMID: 34034771 PMCID: PMC8147071 DOI: 10.1186/s13059-021-02367-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/06/2020] [Accepted: 04/27/2021] [Indexed: 12/13/2022] Open
Abstract
A pressing challenge in single-cell transcriptomics is to benchmark experimental protocols and computational methods. A solution is to use computational simulators, but existing simulators cannot simultaneously achieve three goals: preserving genes, capturing gene correlations, and generating any number of cells with varying sequencing depths. To fill this gap, we propose scDesign2, a transparent simulator that achieves all three goals and generates high-fidelity synthetic data for multiple single-cell gene expression count-based technologies. In particular, scDesign2 is advantageous in its transparent use of probabilistic models and its ability to capture gene correlations via copulas.
Affiliation(s)
- Tianyi Sun
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA
| | - Dongyuan Song
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, 90095-7246, CA, USA
| | - Wei Vivian Li
- Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Piscataway, 08854, NJ, USA.
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA.,Department of Human Genetics, University of California, Los Angeles, 90095-7088, CA, USA.,Department of Computational Medicine, University of California, Los Angeles, 90095-1766, CA, USA.,Department of Biostatistics, University of California, Los Angeles, 90095-1772, CA, USA.
| |
|
86
|
Davies P, Jones M, Liu J, Hebenstreit D. Anti-bias training for (sc)RNA-seq: experimental and computational approaches to improve precision. Brief Bioinform 2021; 22:6265204. [PMID: 33959753 PMCID: PMC8574610 DOI: 10.1093/bib/bbab148] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/08/2021] [Revised: 03/10/2021] [Accepted: 03/26/2021] [Indexed: 12/29/2022] Open
Abstract
RNA-seq, including single-cell RNA-seq (scRNA-seq), is plagued by insufficient sensitivity and lack of precision. As a result, the full potential of (sc)RNA-seq is limited. A major factor in this respect is the presence of global bias in most datasets, which affects detection and quantitation of RNA in a length-dependent fashion. In particular, scRNA-seq is affected by technical noise and a high rate of dropouts, where the vast majority of original transcripts are not converted into sequencing reads. We discuss the origins and implications of these biases, bioinformatics approaches to correct for them, and how biases can be exploited to infer characteristics of the sample preparation process, which in turn can be used to improve library preparation.
Affiliation(s)
- Philip Davies
- Daniel Hebenstreit's Research Group, University of Warwick, CV4 7AL Coventry, UK
| | - Matt Jones
- Daniel Hebenstreit's Research Group, University of Warwick, CV4 7AL Coventry, UK
| | - Juntai Liu
- Physics Department, University of Warwick, CV4 7AL Coventry, UK
| | | |
|