1
Rich JM, Moses L, Einarsson PH, Jackson K, Luebbert L, Booeshaghi AS, Antonsson S, Sullivan DK, Bray N, Melsted P, Pachter L. The impact of package selection and versioning on single-cell RNA-seq analysis. bioRxiv 2024:2024.04.04.588111. [PMID: 38617255 PMCID: PMC11014608 DOI: 10.1101/2024.04.04.588111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Standard single-cell RNA-sequencing analysis (scRNA-seq) workflows consist of converting raw read data into cell-gene count matrices through sequence alignment, followed by analyses including filtering, highly variable gene selection, dimensionality reduction, clustering, and differential expression analysis. Seurat and Scanpy are the most widely-used packages implementing such workflows, and are generally thought to implement individual steps similarly. We investigate in detail the algorithms and methods underlying Seurat and Scanpy and find that there are, in fact, considerable differences in the outputs of Seurat and Scanpy. The extent of differences between the programs is approximately equivalent to the variability that would be introduced in benchmarking scRNA-seq datasets by sequencing less than 5% of the reads or analyzing less than 20% of the cell population. Additionally, distinct versions of Seurat and Scanpy can produce very different results, especially during parts of differential expression analysis. Our analysis highlights the need for users of scRNA-seq to carefully assess the tools on which they rely, and the importance of developers of scientific software to prioritize transparency, consistency, and reproducibility for their tools.
Affiliation(s)
- Joseph M Rich
  - Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - USC-Caltech MD/PhD Program, Keck School of Medicine, Los Angeles, CA, 90033, USA
- Lambda Moses
  - Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Pétur Helgi Einarsson
  - Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, Reykjavík, Iceland
- Kayla Jackson
  - Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - USC-Caltech MD/PhD Program, Keck School of Medicine, Los Angeles, CA, 90033, USA
- Laura Luebbert
  - Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- A. Sina Booeshaghi
  - Department of Bioengineering, University of California Berkeley, Berkeley, CA, USA
- Sindri Antonsson
  - Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, Reykjavík, Iceland
- Delaney K. Sullivan
  - Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Páll Melsted
  - Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, Reykjavík, Iceland
- Lior Pachter
  - Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125, USA
  - Lead Contact
2
Sullivan DK, Min KHJ, Hjörleifsson KE, Luebbert L, Holley G, Moses L, Gustafsson J, Bray NL, Pimentel H, Booeshaghi AS, Melsted P, Pachter L. kallisto, bustools, and kb-python for quantifying bulk, single-cell, and single-nucleus RNA-seq. bioRxiv 2024:2023.11.21.568164. [PMID: 38045414 PMCID: PMC10690192 DOI: 10.1101/2023.11.21.568164] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/05/2023]
Abstract
The term "RNA-seq" refers to a collection of assays based on sequencing experiments that involve quantifying RNA species from bulk tissue, from single cells, or from single nuclei. The kallisto, bustools, and kb-python programs are free, open-source software tools for performing this analysis that together can produce gene expression quantification from raw sequencing reads. The quantifications can be individualized for multiple cells, multiple samples, or both. Additionally, these tools allow gene expression values to be classified as originating from nascent RNA species or mature RNA species, making this workflow amenable to both cell-based and nucleus-based assays. This protocol describes in detail how to use kallisto and bustools in conjunction with a wrapper, kb-python, to preprocess RNA-seq data.
Affiliation(s)
- Delaney K Sullivan
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Laura Luebbert
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Lambda Moses
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Nicolas L Bray
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Harold Pimentel
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- A Sina Booeshaghi
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Páll Melsted
  - deCODE Genetics/Amgen Inc., Reykjavik, Iceland
  - School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125, USA
3
Booeshaghi AS, Min KH(J), Gehring J, Pachter L. Quantifying orthogonal barcodes for sequence census assays. Bioinform Adv 2023; 4:vbad181. [PMID: 38213823 PMCID: PMC10783946 DOI: 10.1093/bioadv/vbad181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 10/02/2023] [Accepted: 12/19/2023] [Indexed: 01/13/2024]
Abstract
Summary: Barcode-based sequence census assays utilize custom or random oligonucleotide sequences to label various biological features, such as cell-surface proteins or CRISPR perturbations. These assays all rely on barcode quantification, a task that is complicated by barcode design and technical noise. We introduce a modular approach to quantifying barcodes that achieves speed and memory improvements over existing tools. We also introduce a set of quality control metrics, and an accompanying tool, for validating barcode designs. Availability and implementation: https://github.com/pachterlab/kb_python, https://github.com/pachterlab/qcbc.
Affiliation(s)
- A Sina Booeshaghi
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States
- Kyung Hoi (Joseph) Min
  - Department of Computer Science and Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
- Jase Gehring
  - Arcadia Science, Berkeley, CA 94702, United States
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, United States
4
Abstract
We describe a workflow for preprocessing a wide variety of single-cell genomics data types. The approach is based on parsing of machine-readable seqspec assay specifications to customize inputs for kb-python, which uses kallisto and bustools to catalog reads, error correct barcodes, and count reads. The universal preprocessing method is implemented in the Python package cellatlas that is available for download at: https://github.com/cellatlas/cellatlas/.
Affiliation(s)
- A. Sina Booeshaghi
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Delaney K. Sullivan
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - Department of Computing & Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
5
Booeshaghi AS, Beltrame EDV, Bannon D, Gehring J, Pachter L. Author Correction: Principles of open source bioinstrumentation applied to the poseidon syringe pump system. Sci Rep 2023; 13:14834. [PMID: 37684312 PMCID: PMC10491597 DOI: 10.1038/s41598-023-42035-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2023]
Affiliation(s)
- A Sina Booeshaghi
  - Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Eduardo da Veiga Beltrame
  - Department of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Dylan Bannon
  - Department of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Jase Gehring
  - Department of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Lior Pachter
  - Department of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - Department of Computing & Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125, USA
6
Moses L, Einarsson PH, Jackson K, Luebbert L, Booeshaghi AS, Antonsson S, Bray N, Melsted P, Pachter L. Voyager: exploratory single-cell genomics data analysis with geospatial statistics. bioRxiv 2023:2023.07.20.549945. [PMID: 37645732 PMCID: PMC10461913 DOI: 10.1101/2023.07.20.549945] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/31/2023]
Abstract
Exploratory spatial data analysis (ESDA) can be a powerful approach to understanding single-cell genomics datasets, but it is not yet part of standard data analysis workflows. In particular, geospatial analyses, which have been developed and refined for decades, have yet to be fully adapted and applied to spatial single-cell analysis. We introduce the Voyager platform, which systematically brings the geospatial ESDA tradition to (spatial) -omics, with local, bivariate, and multivariate spatial methods not yet commonly applied to spatial -omics, united by a uniform user interface. Using Voyager, we showcase biological insights that can be derived with its methods, such as biologically relevant negative spatial autocorrelation. Underlying Voyager is the SpatialFeatureExperiment data structure, which combines Simple Feature with SingleCellExperiment and AnnData to represent and operate on geometries bundled with gene expression data. Voyager has comprehensive tutorials demonstrating ESDA built on GitHub Actions to ensure reproducibility and scalability, using data from popular commercial technologies. Voyager is implemented in both R/Bioconductor and Python/PyPI, and features compatibility tests to ensure that both implementations return consistent results.
Affiliation(s)
- Lambda Moses
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Pétur Helgi Einarsson
  - Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, Reykjavík, Iceland
- Kayla Jackson
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Laura Luebbert
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- A. Sina Booeshaghi
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Sindri Antonsson
  - Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, Reykjavík, Iceland
- Páll Melsted
  - Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, Reykjavík, Iceland
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
7
Abstract
Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries. We present seqspec, a machine-readable specification for libraries produced by genomics assays that facilitates standardization of preprocessing and enables tracking and comparison of genomics assays. The specification and associated seqspec command line tool are available at https://github.com/IGVF/seqspec.
Affiliation(s)
- A. Sina Booeshaghi
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Xi Chen
  - School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California
8
Gálvez-Merchán Á, Min KH(J), Pachter L, Booeshaghi AS. Metadata retrieval from sequence databases with ffq. Bioinformatics 2023; 39:6971839. [PMID: 36610997 PMCID: PMC9883619 DOI: 10.1093/bioinformatics/btac667] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/26/2022] [Revised: 08/15/2022] [Accepted: 10/07/2022] [Indexed: 01/09/2023]
Abstract
Motivation: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. Results: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper's DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq's modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access. Availability and implementation: ffq is free and open source, and the code can be found here: https://github.com/pachterlab/ffq.
Affiliation(s)
- Ángel Gálvez-Merchán
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Kyung Hoi (Joseph) Min
  - Department of Computer Science and Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA 91125, USA
9
Yao Z, Liu H, Xie F, Fischer S, Adkins RS, Aldridge AI, Ament SA, Bartlett A, Behrens MM, Van den Berge K, Bertagnolli D, de Bézieux HR, Biancalani T, Booeshaghi AS, Bravo HC, Casper T, Colantuoni C, Crabtree J, Creasy H, Crichton K, Crow M, Dee N, Dougherty EL, Doyle WI, Dudoit S, Fang R, Felix V, Fong O, Giglio M, Goldy J, Hawrylycz M, Herb BR, Hertzano R, Hou X, Hu Q, Kancherla J, Kroll M, Lathia K, Li YE, Lucero JD, Luo C, Mahurkar A, McMillen D, Nadaf NM, Nery JR, Nguyen TN, Niu SY, Ntranos V, Orvis J, Osteen JK, Pham T, Pinto-Duarte A, Poirion O, Preissl S, Purdom E, Rimorin C, Risso D, Rivkin AC, Smith K, Street K, Sulc J, Svensson V, Tieu M, Torkelson A, Tung H, Vaishnav ED, Vanderburg CR, van Velthoven C, Wang X, White OR, Huang ZJ, Kharchenko PV, Pachter L, Ngai J, Regev A, Tasic B, Welch JD, Gillis J, Macosko EZ, Ren B, Ecker JR, Zeng H, Mukamel EA. A transcriptomic and epigenomic cell atlas of the mouse primary motor cortex. Nature 2021; 598:103-110. [PMID: 34616066 PMCID: PMC8494649 DOI: 10.1038/s41586-021-03500-8] [Citation(s) in RCA: 113] [Impact Index Per Article: 37.7] [Received: 03/05/2020] [Accepted: 03/26/2021] [Indexed: 12/30/2022]
Abstract
Single-cell transcriptomics can provide quantitative molecular signatures for large, unbiased samples of the diverse cell types in the brain [1-3]. With the proliferation of multi-omics datasets, a major challenge is to validate and integrate results into a biological understanding of cell-type organization. Here we generated transcriptomes and epigenomes from more than 500,000 individual cells in the mouse primary motor cortex, a structure that has an evolutionarily conserved role in locomotion. We developed computational and statistical methods to integrate multimodal data and quantitatively validate cell-type reproducibility. The resulting reference atlas, containing over 56 neuronal cell types that are highly replicable across analysis methods, sequencing technologies and modalities, is a comprehensive molecular and genomic account of the diverse neuronal and non-neuronal cell types in the mouse primary motor cortex. The atlas includes a population of excitatory neurons that resemble pyramidal cells in layer 4 in other cortical regions [4]. We further discovered thousands of concordant marker genes and gene regulatory elements for these cell types. Our results highlight the complex molecular regulation of cell types in the brain and will directly enable the design of reagents to target specific cell types in the mouse primary motor cortex for functional analysis.
Affiliation(s)
- Zizhen Yao
  - Allen Institute for Brain Science, Seattle, WA, USA
- Hanqing Liu
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Fangming Xie
  - Department of Physics, University of California, San Diego, La Jolla, CA, USA
- Stephan Fischer
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Ricky S Adkins
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Andrew I Aldridge
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Seth A Ament
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Anna Bartlett
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- M Margarita Behrens
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Koen Van den Berge
  - Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
- Hector Roux de Bézieux
  - Division of Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, CA, USA
- Héctor Corrada Bravo
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, MD, USA
- Carlo Colantuoni
  - Johns Hopkins School of Medicine, Department of Neurology, Baltimore, MD, USA
  - Johns Hopkins School of Medicine, Department of Neuroscience, Baltimore, MD, USA
  - University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Jonathan Crabtree
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Heather Creasy
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Megan Crow
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Nick Dee
  - Allen Institute for Brain Science, Seattle, WA, USA
- Wayne I Doyle
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Sandrine Dudoit
  - Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
- Rongxin Fang
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, San Diego, CA, USA
- Victor Felix
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Olivia Fong
  - Allen Institute for Brain Science, Seattle, WA, USA
- Michelle Giglio
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Jeff Goldy
  - Allen Institute for Brain Science, Seattle, WA, USA
- Brian R Herb
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Ronna Hertzano
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Otorhinolaryngology, Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA
- Xiaomeng Hou
  - Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Qiwen Hu
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Jayaram Kancherla
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, MD, USA
- Kanan Lathia
  - Allen Institute for Brain Science, Seattle, WA, USA
- Yang Eric Li
  - Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Jacinta D Lucero
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Chongyuan Luo
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Anup Mahurkar
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Naeem M Nadaf
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Joseph R Nery
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Sheng-Yong Niu
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Vasilis Ntranos
  - University of California, San Francisco, San Francisco, CA, USA
- Joshua Orvis
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Julia K Osteen
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Thanh Pham
  - Allen Institute for Brain Science, Seattle, WA, USA
- Antonio Pinto-Duarte
  - Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Olivier Poirion
  - Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Sebastian Preissl
  - Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Elizabeth Purdom
  - Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
- Davide Risso
  - Department of Statistical Sciences, University of Padova, Padova, Italy
- Angeline C Rivkin
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Kelly Street
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Josef Sulc
  - Allen Institute for Brain Science, Seattle, WA, USA
- Michael Tieu
  - Allen Institute for Brain Science, Seattle, WA, USA
- Herman Tung
  - Allen Institute for Brain Science, Seattle, WA, USA
- Xinxin Wang
  - Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Owen R White
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Z Josh Huang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Peter V Kharchenko
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Lior Pachter
  - California Institute of Technology, Pasadena, CA, USA
- John Ngai
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Aviv Regev
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Department of Biology, MIT, Cambridge, MA, USA
- Joshua D Welch
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Jesse Gillis
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Bing Ren
  - Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
  - Ludwig Institute for Cancer Research, La Jolla, CA, USA
- Joseph R Ecker
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Hongkui Zeng
  - Allen Institute for Brain Science, Seattle, WA, USA
- Eran A Mukamel
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
10
Callaway EM, Dong HW, Ecker JR, Hawrylycz MJ, Huang ZJ, Lein ES, Ngai J, Osten P, Ren B, Tolias AS, White O, Zeng H, Zhuang X, Ascoli GA, Behrens MM, Chun J, Feng G, Gee JC, Ghosh SS, Halchenko YO, Hertzano R, Lim BK, Martone ME, Ng L, Pachter L, Ropelewski AJ, Tickle TL, Yang XW, Zhang K, Bakken TE, Berens P, Daigle TL, Harris JA, Jorstad NL, Kalmbach BE, Kobak D, Li YE, Liu H, Matho KS, Mukamel EA, Naeemi M, Scala F, Tan P, Ting JT, Xie F, Zhang M, Zhang Z, Zhou J, Zingg B, Armand E, Yao Z, Bertagnolli D, Casper T, Crichton K, Dee N, Diep D, Ding SL, Dong W, Dougherty EL, Fong O, Goldman M, Goldy J, Hodge RD, Hu L, Keene CD, Krienen FM, Kroll M, Lake BB, Lathia K, Linnarsson S, Liu CS, Macosko EZ, McCarroll SA, McMillen D, Nadaf NM, Nguyen TN, Palmer CR, Pham T, Plongthongkum N, Reed NM, Regev A, Rimorin C, Romanow WJ, Savoia S, Siletti K, Smith K, Sulc J, Tasic B, Tieu M, Torkelson A, Tung H, van Velthoven CTJ, Vanderburg CR, Yanny AM, Fang R, Hou X, Lucero JD, Osteen JK, Pinto-Duarte A, Poirion O, Preissl S, Wang X, Aldridge AI, Bartlett A, Boggeman L, O’Connor C, Castanon RG, Chen H, Fitzpatrick C, Luo C, Nery JR, Nunn M, Rivkin AC, Tian W, Dominguez B, Ito-Cole T, Jacobs M, Jin X, Lee CT, Lee KF, Miyazaki PA, Pang Y, Rashid M, Smith JB, Vu M, Williams E, Biancalani T, Booeshaghi AS, Crow M, Dudoit S, Fischer S, Gillis J, Hu Q, Kharchenko PV, Niu SY, Ntranos V, Purdom E, Risso D, de Bézieux HR, Somasundaram S, Street K, Svensson V, Vaishnav ED, Van den Berge K, Welch JD, An X, Bateup HS, Bowman I, Chance RK, Foster NN, Galbavy W, Gong H, Gou L, Hatfield JT, Hintiryan H, Hirokawa KE, Kim G, Kramer DJ, Li A, Li X, Luo Q, Muñoz-Castañeda R, Stafford DA, Feng Z, Jia X, Jiang S, Jiang T, Kuang X, Larsen R, Lesnar P, Li Y, Li Y, Liu L, Peng H, Qu L, Ren M, Ruan Z, Shen E, Song Y, Wakeman W, Wang P, Wang Y, Wang Y, Yin L, Yuan J, Zhao S, Zhao X, Narasimhan A, Palaniswamy R, Banerjee S, Ding L, Huilgol D, Huo B, Kuo HC, Laturnus S, Li X, Mitra PP, Mizrachi J, Wang Q, 
Xie P, Xiong F, Yu Y, Eichhorn SW, Berg J, Bernabucci M, Bernaerts Y, Cadwell CR, Castro JR, Dalley R, Hartmanis L, Horwitz GD, Jiang X, Ko AL, Miranda E, Mulherkar S, Nicovich PR, Owen SF, Sandberg R, Sorensen SA, Tan ZH, Allen S, Hockemeyer D, Lee AY, Veldman MB, Adkins RS, Ament SA, Bravo HC, Carter R, Chatterjee A, Colantuoni C, Crabtree J, Creasy H, Felix V, Giglio M, Herb BR, Kancherla J, Mahurkar A, McCracken C, Nickel L, Olley D, Orvis J, Schor M, Hood G, Dichter B, Grauer M, Helba B, Bandrowski A, Barkas N, Carlin B, D’Orazi FD, Degatano K, Gillespie TH, Khajouei F, Konwar K, Thompson C, Kelly K, Mok S, Sunkin S. A multimodal cell census and atlas of the mammalian primary motor cortex. Nature 2021; 598:86-102. [PMID: 34616075 PMCID: PMC8494634 DOI: 10.1038/s41586-021-03950-0] [Citation(s) in RCA: 205] [Impact Index Per Article: 68.3] [Received: 10/04/2020] [Accepted: 08/25/2021] [Indexed: 12/14/2022]
Abstract
Here we report the generation of a multimodal cell census and atlas of the mammalian primary motor cortex as the initial product of the BRAIN Initiative Cell Census Network (BICCN). This was achieved by coordinated large-scale analyses of single-cell transcriptomes, chromatin accessibility, DNA methylomes, spatially resolved single-cell transcriptomes, morphological and electrophysiological properties and cellular resolution input-output mapping, integrated through cross-modal computational analysis. Our results advance the collective knowledge and understanding of brain cell-type organization1-5. First, our study reveals a unified molecular genetic landscape of cortical cell types that integrates their transcriptome, open chromatin and DNA methylation maps. Second, cross-species analysis achieves a consensus taxonomy of transcriptomic types and their hierarchical organization that is conserved from mouse to marmoset and human. Third, in situ single-cell transcriptomics provides a spatially resolved cell-type atlas of the motor cortex. Fourth, cross-modal analysis provides compelling evidence for the transcriptomic, epigenomic and gene regulatory basis of neuronal phenotypes such as their physiological and anatomical properties, demonstrating the biological validity and genomic underpinning of neuron types. We further present an extensive genetic toolset for targeting glutamatergic neuron types towards linking their molecular and developmental identity to their circuit function. Together, our results establish a unifying and mechanistic framework of neuronal cell-type organization that integrates multi-layered molecular genetic and spatial information with multi-faceted phenotypic properties.
|
11
|
Booeshaghi AS, Kil Y(A), Min KH(J), Gehring J, Pachter L. Low-cost, scalable, and automated fluid sampling for fluidics applications. HardwareX 2021; 10:e00201. [PMID: 35607693 PMCID: PMC9123361 DOI: 10.1016/j.ohx.2021.e00201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/24/2021] [Revised: 05/03/2021] [Accepted: 05/08/2021] [Indexed: 06/14/2023]
Abstract
We present colosseum, a low-cost, modular, and automated fluid sampling device for scalable fluidic applications. The colosseum fraction collector uses a single motor, can be built for less than $100 using off-the-shelf and 3D-printed components, and can be assembled in less than an hour. Build instructions and source files are available at https://doi.org/10.5281/zenodo.4677604.
Affiliation(s)
- A. Sina Booeshaghi
  - Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA, USA
- Yeokyoung (Anne) Kil
  - Department of Medical Engineering, California Institute of Technology, Pasadena, CA, USA
- Kyung Hoi (Joseph) Min
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jase Gehring
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
|
12
|
Booeshaghi AS, Yao Z, van Velthoven C, Smith K, Tasic B, Zeng H, Pachter L. Isoform cell-type specificity in the mouse primary motor cortex. Nature 2021; 598:195-199. [PMID: 34616073 PMCID: PMC8494650 DOI: 10.1038/s41586-021-03969-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 03/09/2020] [Accepted: 08/27/2021] [Indexed: 12/17/2022]
Abstract
Full-length SMART-seq (ref. 1) single-cell RNA sequencing can be used to measure gene expression at isoform resolution, making possible the identification of specific isoform markers for different cell types. Used in conjunction with spatial RNA capture and gene-tagging methods, this enables the inference of spatially resolved isoform expression for different cell types. Here, in a comprehensive analysis of 6,160 mouse primary motor cortex cells assayed with SMART-seq, 280,327 cells assayed with MERFISH (ref. 2) and 94,162 cells assayed with 10x Genomics sequencing (ref. 3), we find examples of isoform specificity in cell types, including isoform shifts between cell types that are masked in gene-level analysis, as well as examples of transcriptional regulation. Additionally, we show that isoform specificity helps to refine cell types, and that a multi-platform analysis of single-cell transcriptomic data leveraging multiple measurements provides a comprehensive atlas of transcription in the mouse primary motor cortex that improves on the possibilities offered by any single technology.
Affiliation(s)
- A Sina Booeshaghi
  - Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA, USA
- Zizhen Yao
  - Allen Institute for Brain Science, Seattle, WA, USA
- Hongkui Zeng
  - Allen Institute for Brain Science, Seattle, WA, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
|
13
|
Booeshaghi AS, Pachter L. Normalization of single-cell RNA-seq counts by log(x + 1)† or log(1 + x)†. Bioinformatics 2021; 37:2223-2224. [PMID: 33676365 PMCID: PMC7989636 DOI: 10.1093/bioinformatics/btab085] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 10/14/2020] [Revised: 12/23/2020] [Accepted: 03/01/2021] [Indexed: 01/25/2023] Open
Affiliation(s)
- A Sina Booeshaghi
  - Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA
|
14
|
Bloom JS, Sathe L, Munugala C, Jones EM, Gasperini M, Lubock NB, Yarza F, Thompson EM, Kovary KM, Park J, Marquette D, Kay S, Lucas M, Love T, Booeshaghi AS, Brandenberg OF, Guo L, Boocock J, Hochman M, Simpkins SW, Lin I, LaPierre N, Hong D, Zhang Y, Oland G, Choe BJ, Chandrasekaran S, Hilt EE, Butte MJ, Damoiseaux R, Kravit C, Cooper AR, Yin Y, Pachter L, Garner OB, Flint J, Eskin E, Luo C, Kosuri S, Kruglyak L, Arboleda VA. Massively scaled-up testing for SARS-CoV-2 RNA via next-generation sequencing of pooled and barcoded nasal and saliva samples. Nat Biomed Eng 2021; 5:657-665. [PMID: 34211145 PMCID: PMC10810734 DOI: 10.1038/s41551-021-00754-5] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 11/20/2020] [Accepted: 05/20/2021] [Indexed: 02/02/2023]
Abstract
Frequent and widespread testing of members of the population who are asymptomatic for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is essential for the mitigation of the transmission of the virus. Despite the recent increases in testing capacity, tests based on quantitative polymerase chain reaction (qPCR) assays cannot be easily deployed at the scale required for population-wide screening. Here, we show that next-generation sequencing of pooled samples tagged with sample-specific molecular barcodes enables the testing of thousands of nasal or saliva samples for SARS-CoV-2 RNA in a single run without the need for RNA extraction. The assay, which we named SwabSeq, incorporates a synthetic RNA standard that facilitates end-point quantification and the calling of true negatives, and that reduces the requirements for automation, purification and sample-to-sample normalization. We used SwabSeq to perform 80,000 tests, with an analytical sensitivity and specificity comparable to or better than traditional qPCR tests, in less than two months with turnaround times of less than 24 h. SwabSeq could be rapidly adapted for the detection of other pathogens.
Affiliation(s)
- Joshua S Bloom
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
  - Octant Inc., Emeryville, CA, USA
- Laila Sathe
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Chetan Munugala
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Dawn Marquette
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Stephania Kay
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Mark Lucas
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- TreQuan Love
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Oliver F Brandenberg
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
  - Department of Biological Chemistry, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Longhua Guo
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
  - Department of Biological Chemistry, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- James Boocock
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
  - Department of Biological Chemistry, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Isabella Lin
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Nathan LaPierre
  - Department of Computer Science, Samueli School of Engineering, UCLA, Los Angeles, CA, USA
- Duke Hong
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Yi Zhang
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Gabriel Oland
  - Department of Surgery, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Bianca Judy Choe
  - Department of Emergency Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Sukantha Chandrasekaran
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Evann E Hilt
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Manish J Butte
  - Department of Pediatrics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Microbiology, Immunology & Molecular Genetics, UCLA, Los Angeles, CA, USA
- Robert Damoiseaux
  - California NanoSystems Institute, UCLA, Los Angeles, CA, USA
  - Department of Bioengineering, Samueli School of Engineering, UCLA, Los Angeles, CA, USA
  - Department of Medical and Molecular Pharmacology, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Clifford Kravit
  - Department of Digital Technology, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Yi Yin
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Lior Pachter
  - Division of Biology and Bioengineering, Department of Computing and Mathematical Sciences, Caltech, Pasadena, CA, USA
- Omai B Garner
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Jonathan Flint
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Eleazar Eskin
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Computer Science, Samueli School of Engineering, UCLA, Los Angeles, CA, USA
- Chongyuan Luo
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Sriram Kosuri
  - Octant Inc., Emeryville, CA, USA
  - Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, USA
- Leonid Kruglyak
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
  - Department of Biological Chemistry, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Valerie A Arboleda
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
|
15
|
Bloom JS, Sathe L, Munugala C, Jones EM, Gasperini M, Lubock NB, Yarza F, Thompson EM, Kovary KM, Park J, Marquette D, Kay S, Lucas M, Love T, Booeshaghi AS, Brandenberg OF, Guo L, Boocock J, Hochman M, Simpkins SW, Lin I, LaPierre N, Hong D, Zhang Y, Oland G, Choe BJ, Chandrasekaran S, Hilt EE, Butte MJ, Damoiseaux R, Kravit C, Cooper AR, Yin Y, Pachter L, Garner OB, Flint J, Eskin E, Luo C, Kosuri S, Kruglyak L, Arboleda VA. Swab-Seq: A high-throughput platform for massively scaled up SARS-CoV-2 testing. medRxiv 2021. [PMID: 32909008 PMCID: PMC7480060 DOI: 10.1101/2020.08.04.20167874] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Indexed: 12/23/2022]
Abstract
The rapid spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is due to the high rates of transmission by individuals who are asymptomatic at the time of transmission (refs. 1, 2). Frequent, widespread testing of the asymptomatic population for SARS-CoV-2 is essential to suppress viral transmission. Despite increases in testing capacity, multiple challenges remain in deploying traditional reverse transcription and quantitative PCR (RT-qPCR) tests at the scale required for population screening of asymptomatic individuals. We have developed SwabSeq, a high-throughput testing platform for SARS-CoV-2 that uses next-generation sequencing as a readout. SwabSeq employs sample-specific molecular barcodes to enable thousands of samples to be combined and simultaneously analyzed for the presence or absence of SARS-CoV-2 in a single run. Importantly, SwabSeq incorporates an in vitro RNA standard that mimics the viral amplicon, but can be distinguished by sequencing. This standard allows for end-point rather than quantitative PCR, improves quantitation, reduces requirements for automation and sample-to-sample normalization, enables purification-free detection, and gives better ability to call true negatives. After setting up SwabSeq in a high-complexity CLIA laboratory, we performed more than 80,000 tests for COVID-19 in less than two months, confirming in a real-world setting that SwabSeq inexpensively delivers highly sensitive and specific results at scale, with a turn-around of less than 24 hours. Our clinical laboratory uses SwabSeq to test both nasal and saliva samples without RNA extraction, while maintaining analytical sensitivity comparable to or better than traditional RT-qPCR tests. Moving forward, SwabSeq can rapidly scale up testing to mitigate devastating spread of novel pathogens.
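The barcode-plus-standard idea described in this abstract can be illustrated with a minimal sketch. It is not the SwabSeq pipeline: the function name `call_samples`, the `(barcode, target)` read representation, and both thresholds are hypothetical choices made only for illustration. Reads are grouped by sample barcode, and the viral amplicon count is compared with the count of the synthetic standard, so that a sample with enough standard reads and few viral reads can be called negative rather than failed.

```python
from collections import defaultdict


def call_samples(reads, min_standard=10, ratio_threshold=0.1):
    """Call each pooled sample from (sample_barcode, target) read pairs.

    `target` is "virus" or "standard"; any other value is ignored.
    Both thresholds are hypothetical illustration values, not taken
    from the SwabSeq papers.
    """
    counts = defaultdict(lambda: {"virus": 0, "standard": 0})
    for barcode, target in reads:
        if target in ("virus", "standard"):
            counts[barcode][target] += 1

    calls = {}
    for barcode, c in counts.items():
        if c["standard"] < min_standard:
            # Too few standard reads: the reaction may have failed,
            # so a low viral count cannot be trusted as a negative.
            calls[barcode] = "inconclusive"
        else:
            ratio = c["virus"] / c["standard"]
            calls[barcode] = "positive" if ratio >= ratio_threshold else "negative"
    return calls
```

For example, a sample with 50 viral and 100 standard reads would be called positive under these assumed thresholds, while a sample with only 2 standard reads would be reported as inconclusive regardless of its viral count.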
Affiliation(s)
- Joshua S Bloom
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Howard Hughes Medical Institute, HHMI
  - Octant, Inc
- Laila Sathe
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA
- Chetan Munugala
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Howard Hughes Medical Institute, HHMI
- Dawn Marquette
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA
- Stephania Kay
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA
- Mark Lucas
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA
- TreQuan Love
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA
- Oliver F Brandenberg
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Howard Hughes Medical Institute, HHMI
  - Department of Biological Chemistry, David Geffen School of Medicine, UCLA
- Longhua Guo
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Howard Hughes Medical Institute, HHMI
  - Department of Biological Chemistry, David Geffen School of Medicine, UCLA
- James Boocock
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Howard Hughes Medical Institute, HHMI
  - Department of Biological Chemistry, David Geffen School of Medicine, UCLA
- Isabella Lin
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA
- Nathan LaPierre
  - Department of Computer Science, Samueli School of Engineering, UCLA
- Duke Hong
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA
- Yi Zhang
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
- Gabriel Oland
  - Department of Surgery, David Geffen School of Medicine, UCLA
- Bianca Judy Choe
  - Department of Emergency Medicine, David Geffen School of Medicine, UCLA
- Evann E Hilt
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA
- Manish J Butte
  - Department of Pediatrics, David Geffen School of Medicine, UCLA
  - Department of Microbiology, Immunology & Molecular Genetics, David Geffen School of Medicine, UCLA
- Robert Damoiseaux
  - California NanoSystems Institute, UCLA
  - Department of Bioengineering, Samueli School of Engineering, UCLA
  - David Geffen School of Medicine, Research Information Technology
- Clifford Kravit
  - David Geffen School of Medicine, Research Information Technology
- Yi Yin
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
- Lior Pachter
  - Division of Biology and Bioengineering & Department of Computing and Mathematical Sciences, Caltech
- Omai B Garner
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA
- Jonathan Flint
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, UCLA
- Eleazar Eskin
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Department of Computer Science, Samueli School of Engineering, UCLA
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA
- Chongyuan Luo
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
- Sriram Kosuri
  - Octant, Inc
  - Department of Chemistry and Biochemistry, UCLA
- Leonid Kruglyak
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Howard Hughes Medical Institute, HHMI
  - Department of Biological Chemistry, David Geffen School of Medicine, UCLA
- Valerie A Arboleda
  - Department of Human Genetics, David Geffen School of Medicine, UCLA
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA
|
16
|
Booeshaghi AS, Lubock NB, Cooper AR, Simpkins SW, Bloom JS, Gehring J, Luebbert L, Kosuri S, Pachter L. Reliable and accurate diagnostics from highly multiplexed sequencing assays. Sci Rep 2020; 10:21759. [PMID: 33303831 PMCID: PMC7730459 DOI: 10.1038/s41598-020-78942-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/02/2020] [Accepted: 11/24/2020] [Indexed: 11/09/2022] Open
Abstract
Scalable, inexpensive, and secure testing for SARS-CoV-2 infection is crucial for control of the novel coronavirus pandemic. Recently developed highly multiplexed sequencing assays (HMSAs) that rely on high-throughput sequencing can, in principle, meet these demands, and present promising alternatives to currently used RT-qPCR-based tests. However, reliable analysis, interpretation, and clinical use of HMSAs require overcoming several computational, statistical and engineering challenges. Using recently acquired experimental data, we present and validate a computational workflow based on kallisto and bustools that uses robust statistical methods and fast, memory-efficient algorithms to process high-throughput sequencing data quickly, accurately and reliably. We show that our workflow is effective at processing data from all recently proposed SARS-CoV-2 sequencing-based diagnostic tests, and is generally applicable to any diagnostic HMSA.
Affiliation(s)
- A Sina Booeshaghi
  - Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA, USA
- Joshua S Bloom
  - Octant Inc., Emeryville, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, USA
- Jase Gehring
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Laura Luebbert
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
|
17
|
Booeshaghi AS, Beltrame EDV, Bannon D, Gehring J, Pachter L. Principles of open source bioinstrumentation applied to the poseidon syringe pump system. Sci Rep 2019; 9:12385. [PMID: 31455877 PMCID: PMC6711986 DOI: 10.1038/s41598-019-48815-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 02/07/2019] [Accepted: 08/08/2019] [Indexed: 12/18/2022] Open
Abstract
The poseidon syringe pump and microscope system is an open source alternative to commercial systems. It costs less than $400 and can be assembled in under an hour using the instructions and source files available at https://pachterlab.github.io/poseidon. We describe the poseidon system and use it to illustrate design principles that can facilitate the adoption and development of open source bioinstruments. The principles are functionality, robustness, safety, simplicity, modularity, benchmarking, and documentation.
Affiliation(s)
- A Sina Booeshaghi
  - Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Eduardo da Veiga Beltrame
  - Department of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Dylan Bannon
  - Department of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Jase Gehring
  - Department of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Lior Pachter
  - Department of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
  - Department of Computing & Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125, USA
|