1
Gupta P, O’Neill H, Wolvetang E, Chatterjee A, Gupta I. Advances in single-cell long-read sequencing technologies. NAR Genom Bioinform 2024; 6:lqae047. [PMID: 38774511 PMCID: PMC11106032 DOI: 10.1093/nargab/lqae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/18/2024] [Accepted: 04/29/2024] [Indexed: 05/24/2024] Open
Abstract
With increasing accuracy and throughput, long-read sequencing technologies are rapidly being assimilated into single-cell sequencing pipelines. For transcriptome sequencing, these techniques provide RNA isoform-level information in addition to gene expression profiles. Long-read sequencing technologies not only help in uncovering complex patterns of cell-type specific splicing, but also offer unprecedented insights into the origin of cellular complexity and thus potentially new avenues for drug development. Additionally, single-cell long-read DNA sequencing enables high-quality assemblies, structural variant detection, haplotype phasing, resolution of high-complexity regions, and characterization of epigenetic modifications. Given that significant progress has primarily occurred in single-cell RNA isoform sequencing (scRiso-seq), this review will delve into these advancements in depth and highlight the practical considerations and operational challenges, particularly pertaining to downstream analysis. We also aim to offer a concise introduction to complementary technologies for single-cell sequencing of the genome, epigenome and epitranscriptome. We conclude by identifying certain key areas of innovation that may drive these technologies further and foster more widespread application in biomedical science.
Affiliation(s)
- Pallavi Gupta
  - University of Queensland – IIT Delhi Research Academy, Hauz Khas, New Delhi 110016, India
  - Australian Institute of Bioengineering and Nanotechnology (AIBN), The University of Queensland, St Lucia, QLD 4072, Australia
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Hannah O’Neill
  - Department of Pathology, Dunedin School of Medicine, University of Otago, 58 Hanover Street, Dunedin 9054, New Zealand
- Ernst J Wolvetang
  - Australian Institute of Bioengineering and Nanotechnology (AIBN), The University of Queensland, St Lucia, QLD 4072, Australia
- Aniruddha Chatterjee
  - Department of Pathology, Dunedin School of Medicine, University of Otago, 58 Hanover Street, Dunedin 9054, New Zealand
- Ishaan Gupta
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
2
Kumari P, Kaur M, Dindhoria K, Ashford B, Amarasinghe SL, Thind AS. Advances in long-read single-cell transcriptomics. Hum Genet 2024:10.1007/s00439-024-02678-x. [PMID: 38787419 DOI: 10.1007/s00439-024-02678-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 05/07/2024] [Indexed: 05/25/2024]
Abstract
Long-read single-cell transcriptomics (scRNA-Seq) is revolutionizing the way we profile heterogeneity in disease. Traditional short-read scRNA-Seq methods are limited in their ability to provide complete transcript coverage, resolve isoforms, and identify novel transcripts. The scRNA-Seq protocols developed for long-read sequencing platforms overcome these limitations by enabling the characterization of full-length transcripts. Long-read scRNA-Seq techniques initially suffered from poor accuracy compared to short-read scRNA-Seq. However, with improvements in accuracy, accessibility, and cost efficiency, long reads are gaining popularity in the field of scRNA-Seq. This review details the advances in long-read scRNA-Seq, with an emphasis on library preparation protocols and downstream bioinformatics analysis tools.
Affiliation(s)
- Pallawi Kumari
  - Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
- Manmeet Kaur
  - Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
- Kiran Dindhoria
  - Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
- Bruce Ashford
  - Illawarra Shoalhaven Local Health District (ISLHD), NSW Health, Wollongong, NSW, Australia
- Shanika L Amarasinghe
  - Monash Biomedical Discovery Institute, Monash University, Clayton, VIC, 3800, Australia
  - Walter and Eliza Hall Institute of Medical Research, 1G, Royal Parade, Parkville, VIC, 3025, Australia
- Amarinder Singh Thind
  - Illawarra Shoalhaven Local Health District (ISLHD), NSW Health, Wollongong, NSW, Australia
  - The School of Chemistry and Molecular Bioscience (SCMB), University of Wollongong, Loftus St, Wollongong, NSW, 2500, Australia
3
Booeshaghi AS, Chen X, Pachter L. A machine-readable specification for genomics assays. Bioinformatics 2024; 40:btae168. [PMID: 38579259 PMCID: PMC11009023 DOI: 10.1093/bioinformatics/btae168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 10/04/2023] [Accepted: 04/04/2024] [Indexed: 04/07/2024] Open
Abstract
Motivation: Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries.
Results: We present seqspec, a machine-readable specification for libraries produced by genomics assays that facilitates standardization of preprocessing and enables tracking and comparison of genomics assays.
Availability and implementation: The specification and associated seqspec command line tool are available at https://www.doi.org/10.5281/zenodo.10213865.
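To illustrate the general idea of describing read structure as data rather than in per-assay scripts, here is a minimal Python sketch. It does not use the seqspec file format or command line tool; the `Element` class, the `split_read` helper, and the 16 bp barcode / 12 bp UMI layout are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One named, fixed-length sequence element within a read (hypothetical)."""
    name: str
    length: int


def split_read(read: str, layout: list[Element]) -> dict[str, str]:
    """Slice a read into its named elements; the remainder is returned as 'insert'."""
    parts = {}
    pos = 0
    for element in layout:
        if pos + element.length > len(read):
            raise ValueError(f"read too short for element {element.name!r}")
        parts[element.name] = read[pos:pos + element.length]
        pos += element.length
    parts["insert"] = read[pos:]
    return parts


# Assumed layout for illustration only: 16 bp cell barcode followed by a 12 bp UMI.
LAYOUT = [Element("barcode", 16), Element("umi", 12)]

example = split_read("A" * 16 + "C" * 12 + "GATTACA", LAYOUT)
```

Because the layout is plain data, supporting a different assay in this sketch would mean changing `LAYOUT` rather than rewriting the parsing code.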
Affiliation(s)
- Ali Sina Booeshaghi
  - Department of Bioengineering, University of California, Berkeley, CA, 94720, United States
- Xi Chen
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, 518055, China
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, United States
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125, United States
4
Booeshaghi AS, Chen X, Pachter L. A machine-readable specification for genomics assays. bioRxiv 2023:2023.03.17.533215. [PMID: 36993635 PMCID: PMC10055303 DOI: 10.1101/2023.03.17.533215] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 03/31/2023]
Abstract
Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries. We present seqspec, a machine-readable specification for libraries produced by genomics assays that facilitates standardization of preprocessing and enables tracking and comparison of genomics assays. The specification and associated seqspec command line tool are available at https://github.com/IGVF/seqspec.
Affiliation(s)
- A. Sina Booeshaghi
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Xi Chen
  - School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California
5
Dudchenko O, Ordovas-Montanes J, Bingle CD. Respiratory epithelial cell types, states and fates in the era of single-cell RNA-sequencing. Biochem J 2023; 480:921-939. [PMID: 37410389 PMCID: PMC10422933 DOI: 10.1042/bcj20220572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Revised: 06/19/2023] [Accepted: 06/20/2023] [Indexed: 07/07/2023]
Abstract
Standalone and consortia-led single-cell atlases of healthy and diseased human airways generated with single-cell RNA-sequencing (scRNA-seq) have ushered in a new era in respiratory research. Numerous discoveries, including the pulmonary ionocyte, potentially novel cell fates, and a diversity of cell states among common and rare epithelial cell types have highlighted the extent of cellular heterogeneity and plasticity in the respiratory tract. scRNA-seq has also played a pivotal role in our understanding of host-virus interactions in coronavirus disease 2019 (COVID-19). However, as our ability to generate large quantities of scRNA-seq data increases, along with a growing number of scRNA-seq protocols and data analysis methods, new challenges related to the contextualisation and downstream applications of insights are arising. Here, we review the fundamental concept of cellular identity from the perspective of single-cell transcriptomics in the respiratory context, drawing attention to the need to generate reference annotations and to standardise the terminology used in literature. Findings about airway epithelial cell types, states and fates obtained from scRNA-seq experiments are compared and contrasted with information accumulated through the use of conventional methods. This review attempts to discuss major opportunities and to outline some of the key limitations of the modern-day scRNA-seq that need to be addressed to enable efficient and meaningful integration of scRNA-seq data from different platforms and studies, with each other as well as with data from other high-throughput sequencing-based genomic, transcriptomic and epigenetic analyses.
Affiliation(s)
- Oleksandr Dudchenko
  - Department of Infection, Immunity and Cardiovascular Disease, The Medical School, University of Sheffield, Sheffield, South Yorkshire, U.K.
- Jose Ordovas-Montanes
  - Division of Gastroenterology, Hepatology and Nutrition, Boston Children's Hospital, Boston, MA, U.S.A.
  - Programme in Immunology, Harvard Medical School, Boston, MA, U.S.A.
- Colin D. Bingle
  - Department of Infection, Immunity and Cardiovascular Disease, The Medical School, University of Sheffield, Sheffield, South Yorkshire, U.K.
6
Darolti I, Mank JE. Sex-biased gene expression at single-cell resolution: cause and consequence of sexual dimorphism. Evol Lett 2023; 7:148-156. [PMID: 37251587 PMCID: PMC10210449 DOI: 10.1093/evlett/qrad013] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/17/2022] [Revised: 03/08/2023] [Accepted: 04/06/2023] [Indexed: 05/31/2023] Open
Abstract
Gene expression differences between males and females are thought to be key for the evolution of sexual dimorphism, and sex-biased genes are often used to study the molecular footprint of sex-specific selection. However, gene expression is often measured from complex aggregations of diverse cell types, making it difficult to distinguish between sex differences in expression that are due to regulatory rewiring within similar cell types and those that are simply a consequence of developmental differences in cell-type abundance. To determine the role of regulatory versus developmental differences underlying sex-biased gene expression, we use single-cell transcriptomic data from multiple somatic and reproductive tissues of male and female guppies, a species that exhibits extensive phenotypic sexual dimorphism. Our analysis of gene expression at single-cell resolution demonstrates that nonisometric scaling between the cell populations within each tissue and heterogeneity in cell-type abundance between the sexes can influence inferred patterns of sex-biased gene expression by increasing both the false-positive and false-negative rates. Moreover, we show that, at the bulk level, the subset of sex-biased genes that are the product of sex differences in cell-type abundance can significantly confound patterns of coding-sequence evolution. Taken together, our results offer a unique insight into the effects of allometry and cellular heterogeneity on perceived patterns of sex-biased gene expression and highlight the power of single-cell RNA-sequencing in distinguishing between sex-biased genes that are the result of regulatory change and those that stem from sex differences in cell-type abundance, and hence are a consequence rather than a cause of sexual dimorphism.
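The confounding effect described above can be seen in a small worked example. The sketch below uses made-up numbers (two cell types, one gene) and a hypothetical helper `bulk_expression`; it is not the authors' analysis, only an illustration of how a bulk measurement mixes per-cell-type expression with cell-type abundance.

```python
import math


def bulk_expression(per_cell_type_expr, proportions):
    """Bulk expression as the abundance-weighted mean of per-cell-type expression."""
    if not math.isclose(sum(proportions.values()), 1.0):
        raise ValueError("cell-type proportions must sum to 1")
    return sum(per_cell_type_expr[ct] * p for ct, p in proportions.items())


# Hypothetical gene: identical expression within each cell type in both sexes.
expr = {"type_A": 10.0, "type_B": 2.0}

# Hypothetical tissue composition that differs between the sexes.
female_props = {"type_A": 0.8, "type_B": 0.2}
male_props = {"type_A": 0.4, "type_B": 0.6}

female_bulk = bulk_expression(expr, female_props)  # 0.8 * 10 + 0.2 * 2
male_bulk = bulk_expression(expr, male_props)      # 0.4 * 10 + 0.6 * 2
log2_fold_change = math.log2(female_bulk / male_bulk)
```

In this toy setting the gene would appear female-biased in bulk data even though no cell type expresses it differently between the sexes; the apparent bias comes entirely from the difference in cell-type abundance.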
Affiliation(s)
- Iulia Darolti
  - Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Judith E Mank
  - Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
7
Sant P, Rippe K, Mallm JP. Approaches for single-cell RNA sequencing across tissues and cell types. Transcription 2023; 14:127-145. [PMID: 37062951 PMCID: PMC10807473 DOI: 10.1080/21541264.2023.2200721] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/31/2022] [Accepted: 03/30/2023] [Indexed: 04/18/2023] Open
Abstract
Single-cell sequencing of RNA (scRNA-seq) has advanced our understanding of cellular heterogeneity and signaling in developmental biology and disease. A large number of complementary assays have been developed to profile transcriptomes of individual cells, also in combination with other readouts, such as chromatin accessibility or antibody-based analysis of protein surface markers. As scRNA-seq technologies are advancing fast, it is challenging to establish robust workflows and up-to-date protocols that are best suited to address the large range of research questions. Here, we review scRNA-seq techniques from mRNA end-counting to total RNA in relation to their specific features and outline the necessary sample preparation steps and quality control measures. Based on our experience in dealing with the continuously growing portfolio from the perspective of a central single-cell facility, we aim to provide guidance on how workflows can be best automatized and share our experience in coping with the continuous expansion of scRNA-seq techniques.
Affiliation(s)
- Pooja Sant
  - Single-cell Open Lab, German Cancer Research Center (DKFZ) and Bioquant, Heidelberg, Germany
- Karsten Rippe
  - Division Chromatin Networks, German Cancer Research Center (DKFZ) and Bioquant, Heidelberg, Germany
- Jan-Philipp Mallm
  - Single-cell Open Lab, German Cancer Research Center (DKFZ) and Bioquant, Heidelberg, Germany
8
Fuess LE, Bolnick DI. Single-Cell RNA Sequencing Reveals Microevolution of the Stickleback Immune System. Genome Biol Evol 2023; 15:evad053. [PMID: 37039516 PMCID: PMC10116603 DOI: 10.1093/gbe/evad053] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/05/2022] [Revised: 03/16/2023] [Accepted: 03/23/2023] [Indexed: 04/12/2023] Open
Abstract
The risk and severity of pathogen infections in humans, livestock, or wild organisms depend on host immune function, which can vary between closely related host populations or even among individuals. This immune variation can entail between-population differences in immune gene coding sequences, copy number, or expression. In recent years, many studies have focused on population divergence in immunity using whole-tissue transcriptomics. However, whole-tissue transcriptomics cannot distinguish evolved differences in gene regulation within cells from changes in cell composition within the focal tissue. Here, we leverage single-cell transcriptomic approaches to document signatures of microevolution of immune system structure in a natural system, the three-spined stickleback (Gasterosteus aculeatus). We sampled nine adult fish from three populations with variability in resistance to a cestode parasite, Schistocephalus solidus, to create the first comprehensive immune cell atlas for G. aculeatus. Eight broad immune cell types, corresponding to major vertebrate immune cells, were identified. We were also able to document significant variation in both abundance and expression profiles of the individual immune cell types among the three populations of fish. Furthermore, we demonstrate that identified cell type markers can be used to reinterpret traditional transcriptomic data: we reevaluate previously published whole-tissue transcriptome data from a quantitative genetic experimental infection study to gain better resolution relating infection outcomes to inferred cell type variation. Our combined study demonstrates the power of single-cell sequencing to not only document evolutionary phenomena (i.e., microevolution of immune cells) but also increase the power of traditional transcriptomic data sets.
Affiliation(s)
- Lauren E Fuess
  - Department of Biology, Texas State University
  - Department of Ecology and Evolutionary Biology, University of Connecticut
- Daniel I Bolnick
  - Department of Ecology and Evolutionary Biology, University of Connecticut
9
Jiao L, Wang G, Dai H, Li X, Wang S, Song T. scTransSort: Transformers for Intelligent Annotation of Cell Types by Gene Embeddings. Biomolecules 2023; 13:biom13040611. [PMID: 37189359 DOI: 10.3390/biom13040611] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/29/2022] [Revised: 03/05/2023] [Accepted: 03/10/2023] [Indexed: 03/31/2023] Open
Abstract
Single-cell transcriptomics is rapidly advancing our understanding of the composition of complex tissues and biological cells, and single-cell RNA sequencing (scRNA-seq) holds great potential for identifying and characterizing the cell composition of complex tissues. Cell type identification by analyzing scRNA-seq data is mostly limited by time-consuming and irreproducible manual annotation. As scRNA-seq technology scales to thousands of cells per experiment, the exponential increase in the number of cell samples makes manual annotation more difficult. In addition, the sparsity of gene transcriptome data remains a major challenge. This paper applies the idea of the transformer to single-cell classification tasks based on scRNA-seq data. We propose scTransSort, a cell-type annotation method pretrained with single-cell transcriptomics data. scTransSort represents genes as gene expression embedding blocks to reduce the sparsity of the data used for cell type identification and to reduce computational complexity. A distinguishing feature of scTransSort is its intelligent information extraction for unordered data: it automatically extracts valid features of cell types without the need for manually labeled features or additional references. In experiments on cells from 35 human and 26 mouse tissues, scTransSort achieved high accuracy and performance for cell type identification and demonstrated high robustness and generalization ability.
10
Hazzard B, Sá JM, Ellis AC, Pascini TV, Amin S, Wellems TE, Serre D. Long read single cell RNA sequencing reveals the isoform diversity of Plasmodium vivax transcripts. PLoS Negl Trop Dis 2022; 16:e0010991. [PMID: 36525464 PMCID: PMC9803293 DOI: 10.1371/journal.pntd.0010991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Revised: 12/30/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022] Open
Abstract
Plasmodium vivax infections often consist of heterogenous populations of parasites at different developmental stages and with distinct transcriptional profiles, which complicates gene expression analyses. The advent of single cell RNA sequencing (scRNA-seq) enabled disentangling this complexity and has provided robust and stage-specific characterization of Plasmodium gene expression. However, scRNA-seq information is typically derived from the end of each mRNA molecule (usually the 3'-end) and therefore fails to capture the diversity in transcript isoforms documented in bulk RNA-seq data. Here, we describe the sequencing of scRNA-seq libraries using Pacific Biosciences (PacBio) chemistry to characterize full-length Plasmodium vivax transcripts from single cell parasites. Our results show that many P. vivax genes are transcribed into multiple isoforms, primarily through variations in untranslated region (UTR) length or splicing, and that the expression of many isoforms is developmentally regulated. Our findings demonstrate that long read sequencing can be used to characterize mRNA molecules at the single cell level and provides an additional resource to better understand the regulation of gene expression throughout the Plasmodium life cycle.
Affiliation(s)
- Brittany Hazzard
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
- Juliana M. Sá
  - Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Angela C. Ellis
  - Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Tales V. Pascini
  - Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Shuchi Amin
  - Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Thomas E. Wellems
  - Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- David Serre
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America