51
Abstract
High-throughput single-cell technologies provide an unprecedented view into cellular heterogeneity, yet they pose new challenges in data analysis and interpretation. In this protocol, we describe the use of Spanning-tree Progression Analysis of Density-normalized Events (SPADE), a density-based algorithm for visualizing single-cell data and enabling cellular hierarchy inference among subpopulations of similar cells. It was initially developed for flow and mass cytometry single-cell data. We describe SPADE's implementation and application using an open-source R package that runs on Mac OS X, Linux and Windows systems. A typical SPADE analysis on a 2.27-GHz processor laptop takes ∼5 min. We demonstrate the applicability of SPADE to single-cell RNA-seq data. We compare SPADE with recently developed single-cell visualization approaches based on the t-distributed stochastic neighbor embedding (t-SNE) algorithm. We contrast the implementation and outputs of these methods for normal and malignant hematopoietic cells analyzed by mass cytometry and provide recommendations for appropriate use. Finally, we provide an integrative strategy that combines the strengths of t-SNE and SPADE to infer cellular hierarchy from high-dimensional single-cell data.

52
Paquette ST, Gilels F, White PM. Noise exposure modulates cochlear inner hair cell ribbon volumes, correlating with changes in auditory measures in the FVB/nJ mouse. Sci Rep 2016; 6:25056. [PMID: 27162161 PMCID: PMC4861931 DOI: 10.1038/srep25056] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 01/14/2016] [Accepted: 04/08/2016] [Indexed: 12/25/2022]
Abstract
Cochlear neuropathy resulting from unsafe noise exposure is a life-altering condition that affects many people. This hearing dysfunction follows a conserved mechanism in which inner hair cell synapses are lost, termed cochlear synaptopathy. Here we investigate cochlear synaptopathy in the FVB/nJ mouse strain as a prelude to the investigation of candidate genetic mutations for noise damage susceptibility. We used measurements of auditory brainstem response (ABR) and distortion product otoacoustic emissions (DPOAE) to assess hearing recovery in FVB/nJ mice exposed to two different noise levels. We also utilized confocal fluorescence microscopy in mapped whole mount cochlear tissue, in conjunction with deconvolution and three-dimensional modeling, to analyze numbers, volumes and positions of paired synaptic components. We find evidence for significant synapse reorganization in response to both synaptopathic and sub-synaptopathic noise exposures in FVB/nJ. Specifically, we find that the modulation in volume of very small synaptic ribbons correlates with the presence of reduced ABR peak one amplitudes at both levels of noise exposure. These experiments define the use of FVB/nJ mice for further genetic investigations into the mechanisms of noise damage. They further suggest that in the cochlea, neuronal-inner hair cell connections may dynamically reshape as part of the noise response.
Affiliation(s)
- Stephen T Paquette
  - Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Box 603, 601 Elmwood Avenue, Rochester, NY, 14642, USA
- Felicia Gilels
  - Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Box 603, 601 Elmwood Avenue, Rochester, NY, 14642, USA
- Patricia M White
  - Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Box 603, 601 Elmwood Avenue, Rochester, NY, 14642, USA

53
Roychoudhury P, De Silva Feelixge HS, Pietz HL, Stone D, Jerome KR, Schiffer JT. Pharmacodynamics of anti-HIV gene therapy using viral vectors and targeted endonucleases. J Antimicrob Chemother 2016; 71:2089-99. [PMID: 27090632 DOI: 10.1093/jac/dkw104] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/26/2015] [Accepted: 02/29/2016] [Indexed: 12/14/2022]
Abstract
OBJECTIVES: A promising curative approach for HIV is to use designer endonucleases that bind and cleave specific target sequences within latent genomes, resulting in mutations that render the virus replication incompetent. We developed a mathematical model to describe the expression and activity of endonucleases delivered to HIV-infected cells using engineered viral vectors in order to guide dose selection and predict therapeutic outcomes. METHODS: We developed a mechanistic model that predicts the number of transgene copies expressed at a given dose in individual target cells from fluorescence of a reporter gene. We fitted the model to flow cytometry datasets to determine the optimal vector serotype, promoter and dose required to achieve maximum expression. RESULTS: We showed that our model provides a more accurate measure of transduction efficiency compared with gating-based methods, which underestimate the percentage of cells expressing reporter genes. We identified that gene expression follows a sigmoid dose-response relationship and that the level of gene expression saturation depends on vector serotype and promoter. We also demonstrated that significant bottlenecks exist at the level of viral uptake and gene expression: only ∼1 in 220 added vectors enter a cell and, of these, depending on the dose and promoter used, between 1 in 15 and 1 in 1500 express transgene. CONCLUSIONS: Our model provides a quantitative method of dose selection and optimization that can be readily applied to a wide range of other gene therapy applications. Reducing bottlenecks in delivery will be key to reducing the number of doses required for a functional cure.
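The two bottlenecks quoted in the abstract can be combined into a rough end-to-end figure. The sketch below is only illustrative arithmetic: it assumes the uptake and expression steps simply multiply, treats the approximate "∼1 in 220" as exact, and uses variable names that are not from the paper.

```python
from fractions import Fraction

# Figures quoted in the abstract (the uptake figure is approximate there).
uptake = Fraction(1, 220)            # added vectors that enter a cell
express_high = Fraction(1, 15)       # best case: entered vectors that express transgene
express_low = Fraction(1, 1500)      # worst case, depending on dose and promoter

# Assumption: the two steps act as independent sequential filters,
# so the overall fraction is the product of the two.
overall_high = uptake * express_high
overall_low = uptake * express_low

# Vectors that must be added, on average, per expressing vector.
vectors_needed_best = 1 / overall_high
vectors_needed_worst = 1 / overall_low

print(f"best case:  1 in {vectors_needed_best}")
print(f"worst case: 1 in {vectors_needed_worst}")
```

Under these assumptions, roughly one added vector in 3,300 to one in 330,000 ends up expressing transgene, which gives a sense of scale for the delivery bottleneck the authors describe.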
Affiliation(s)
- Pavitra Roychoudhury
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Harlan L Pietz
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Microbiology, University of Washington, Seattle, WA, USA; Department of Biochemistry, University of Washington, Seattle, WA, USA
- Daniel Stone
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Keith R Jerome
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Microbiology, University of Washington, Seattle, WA, USA; Department of Laboratory Medicine, University of Washington, Seattle, WA, USA
- Joshua T Schiffer
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA

54
Zimmerman KCK, Levitis DA, Pringle A. Beyond animals and plants: dynamic maternal effects in the fungus Neurospora crassa. J Evol Biol 2016; 29:1379-93. [PMID: 27062053 DOI: 10.1111/jeb.12878] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/01/2015] [Accepted: 04/05/2016] [Indexed: 11/28/2022]
Abstract
Maternal effects are widely documented in animals and plants, but not in fungi or other eukaryotes. A principal cause of maternal effects is asymmetrical parental investment in a zygote, creating greater maternal vs. paternal influence on offspring phenotypes. Asymmetrical investments are not limited to animals and plants, but are also prevalent in fungi and groups including apicomplexans, dinoflagellates and red algae. Evidence suggesting maternal effects among fungi is sparse and anecdotal. In an experiment designed to test for maternal effects across sexual reproduction in the model fungus Neurospora crassa, we measured offspring phenotypes from crosses of all possible pairs of 22 individuals. Crosses encompassed reciprocals of 11 mating-type 'A' and 11 mating-type 'a' wild strains. After controlling for the genetic and geographic distances between strains in any individual cross, we found strong evidence for maternal control of perithecia (sporocarp) production, as well as maternal effects on spore numbers and spore germination. However, both parents exert equal influence on the percentage of spores that are pigmented and size of pigmented spores. We propose a model linking the stage-specific presence or absence of maternal effects to cellular developmental processes: effects appear to be mediated primarily through the maternal cytoplasm, and, after spore cell walls form, maternal influence on spore development is limited. Maternal effects in fungi, thus far largely ignored, are likely to shape species' evolution and ecologies. Moreover, the association of anisogamy and maternal effects in a fungus suggests maternal effects may also influence the biology of other anisogamous eukaryotes.
Affiliation(s)
- K C K Zimmerman
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- D A Levitis
  - Department of Biology, Bates College, Lewiston, ME, USA
- A Pringle
  - Departments of Botany and Bacteriology, University of Wisconsin-Madison, Madison, WI, USA

55
Gondois-Rey F, Granjeaud S, Rouillier P, Rioualen C, Bidaut G, Olive D. Multi-parametric cytometry from a complex cellular sample: Improvements and limits of manual versus computational-based interactive analyses. Cytometry A 2016; 89:480-90. [PMID: 27059253 DOI: 10.1002/cyto.a.22850] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/05/2015] [Revised: 02/18/2016] [Accepted: 03/08/2016] [Indexed: 01/07/2023]
Abstract
The wide possibilities opened by developments in multi-parametric cytometry are limited by how poorly classical methods of analysis suit the multi-dimensional characteristics of the data. While new computational tools seemed ideally adapted and were applied successfully, their adoption is still low among flow cytometrists. To integrate unsupervised computational tools for the management of multi-stained samples, we investigated their advantages and limits by comparison to manual gating on a typical sample analyzed in routine immunomonitoring. A single tube of PBMC, containing 11 populations characterized by different sizes and stained with 9 fluorescent markers, was used. We investigated the impact of the strategy choice on manual gating variability, an undocumented pitfall of the analysis process, and we identified rules to optimize it. While assessing automatic gating as an alternative, we introduced the Multi-Experiment Viewer software (MeV) and validated it for merging clusters and interactively annotating populations. This procedure allowed the finding of both targeted and unexpected populations. However, careful examination of computed clusters in standard dot plots revealed some heterogeneity, often below 10%, that was overcome by increasing the number of clusters to be computed. MeV facilitated the identification of populations by displaying both the MFI and the marker signature of the dataset simultaneously. The procedure described here appears well suited to handling large numbers of multi-stained samples consistently and improves multi-parametric analyses in a way close to the classic approach. © 2016 International Society for Advancement of Cytometry.
Affiliation(s)
- F Gondois-Rey
  - Team Immunity and Cancer, Inserm, U1068, CRCM, Marseille, F-13009, France; Institut Paoli-Calmettes, Marseille, F-13009, France; Aix-Marseille Univ, UM 105, Marseille, F-13284, France; CNRS, UMR7258, CRCM, Marseille, F-13009, France
- S Granjeaud
  - Institut Paoli-Calmettes, Marseille, F-13009, France; Aix-Marseille Univ, UM 105, Marseille, F-13284, France; CNRS, UMR7258, CRCM, Marseille, F-13009, France; CiBi Platform, Inserm, U1068, CRCM, Marseille, F-13009, France
- P Rouillier
  - Institut Paoli-Calmettes, Marseille, F-13009, France; Aix-Marseille Univ, UM 105, Marseille, F-13284, France; CNRS, UMR7258, CRCM, Marseille, F-13009, France; CiBi Platform, Inserm, U1068, CRCM, Marseille, F-13009, France
- C Rioualen
  - Institut Paoli-Calmettes, Marseille, F-13009, France; Aix-Marseille Univ, UM 105, Marseille, F-13284, France; CNRS, UMR7258, CRCM, Marseille, F-13009, France; CiBi Platform, Inserm, U1068, CRCM, Marseille, F-13009, France
- G Bidaut
  - Institut Paoli-Calmettes, Marseille, F-13009, France; Aix-Marseille Univ, UM 105, Marseille, F-13284, France; CNRS, UMR7258, CRCM, Marseille, F-13009, France; CiBi Platform, Inserm, U1068, CRCM, Marseille, F-13009, France
- D Olive
  - Team Immunity and Cancer, Inserm, U1068, CRCM, Marseille, F-13009, France; Institut Paoli-Calmettes, Marseille, F-13009, France; Aix-Marseille Univ, UM 105, Marseille, F-13284, France; CNRS, UMR7258, CRCM, Marseille, F-13009, France

56
Metzger BPH, Duveau F, Yuan DC, Tryban S, Yang B, Wittkopp PJ. Contrasting Frequencies and Effects of cis- and trans-Regulatory Mutations Affecting Gene Expression. Mol Biol Evol 2016; 33:1131-46. [PMID: 26782996 DOI: 10.1093/molbev/msw011] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Indexed: 01/02/2023]
Abstract
Heritable differences in gene expression are caused by mutations in DNA sequences encoding cis-regulatory elements and trans-regulatory factors. These two classes of regulatory change differ in their relative contributions to expression differences in natural populations because of the combined effects of mutation and natural selection. Here, we investigate how new mutations create the regulatory variation upon which natural selection acts by quantifying the frequencies and effects of hundreds of new cis- and trans-acting mutations altering activity of the TDH3 promoter in the yeast Saccharomyces cerevisiae in the absence of natural selection. We find that cis-regulatory mutations have larger effects on expression than trans-regulatory mutations and that while trans-regulatory mutations are more common overall, cis- and trans-regulatory changes in expression are equally abundant when only the largest changes in expression are considered. In addition, we find that cis-regulatory mutations are skewed toward decreased expression while trans-regulatory mutations are skewed toward increased expression. We also measure the effects of cis- and trans-regulatory mutations on the variability in gene expression among genetically identical cells, a property of gene expression known as expression noise, finding that trans-regulatory mutations are much more likely to decrease expression noise than cis-regulatory mutations. Because new mutations are the raw material upon which natural selection acts, these differences in the frequencies and effects of cis- and trans-regulatory mutations should be considered in models of regulatory evolution.
Affiliation(s)
- Brian P H Metzger
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
- Fabien Duveau
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
- David C Yuan
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor; Department of Biology, Stanford University
- Stephen Tryban
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
- Bing Yang
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor
- Patricia J Wittkopp
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor; Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor

57
Stepwise discriminant function analysis for rapid identification of acute promyelocytic leukemia from acute myeloid leukemia with multiparameter flow cytometry. Int J Hematol 2016; 103:306-15. [DOI: 10.1007/s12185-015-1923-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 08/28/2015] [Revised: 12/08/2015] [Accepted: 12/11/2015] [Indexed: 01/27/2023]

58
Abstract
Multi-color flow cytometry has become a valuable and highly informative tool for diagnosis and therapeutic monitoring of patients with immune deficiencies or inflammatory disorders. However, the method complexity and error-prone conventional manual data analysis often result in a high variability between different analysts and research laboratories. Here, we provide strategies and guidelines aiming at a more standardized multi-color flow cytometric staining and unsupervised data analysis for whole blood patient samples.

59
Chretien AS, Granjeaud S, Gondois-Rey F, Harbi S, Orlanducci F, Blaise D, Vey N, Arnoulet C, Fauriat C, Olive D. Increased NK Cell Maturation in Patients with Acute Myeloid Leukemia. Front Immunol 2015; 6:564. [PMID: 26594214 PMCID: PMC4635854 DOI: 10.3389/fimmu.2015.00564] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 07/16/2015] [Accepted: 10/23/2015] [Indexed: 01/23/2023]
Abstract
Understanding immune alterations in cancer patients is a major challenge and requires precise phenotypic study of immune subsets. Improved knowledge of the biology of natural killer (NK) cells, together with technical advances, has led to the generation of high-dimensional datasets. High-dimensional flow cytometry requires tools adapted to complex dataset analyses. This study presents an example of NK cell maturation analysis in Healthy Volunteers (HV) and patients with Acute Myeloid Leukemia (AML) with an automated procedure using the FLOCK algorithm. This procedure enabled automatic identification of NK cell subsets according to maturation profiles, with 2D mapping of a four-dimensional dataset. Differences were highlighted in AML patients compared to HV, with an overall increase of NK maturation. Among patients, a strong heterogeneity in NK cell maturation defined three distinct profiles. Overall, automatic gating with the FLOCK algorithm is a recent procedure that enables fast and reliable identification of cell populations from high-dimensional cytometry data. Such tools are necessary for immune subset characterization and standardization of data analyses. This tool is suited to the discovery of new immune cell subsets, and may lead to a better knowledge of NK cell defects in cancer patients. Overall, 2D mapping of NK maturation profiles enabled fast and reliable identification of NK cell subsets.
Affiliation(s)
- Anne-Sophie Chretien
  - Centre de Cancérologie de Marseille, Team Immunity and Cancer, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France
- Samuel Granjeaud
  - Centre de Cancérologie de Marseille, Systems Biology Platform, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France
- Françoise Gondois-Rey
  - Centre de Cancérologie de Marseille, Team Immunity and Cancer, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France; Centre de Cancérologie de Marseille, Plateforme d'Immunomonitoring en Cancérologie, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France
- Samia Harbi
  - Hematology and Transplant and Cellular Therapy Department, Institut Paoli-Calmettes, Marseille, France
- Florence Orlanducci
  - Centre de Cancérologie de Marseille, Systems Biology Platform, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France
- Didier Blaise
  - Centre de Cancérologie de Marseille, Team Immunity and Cancer, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France; Hematology and Transplant and Cellular Therapy Department, Institut Paoli-Calmettes, Marseille, France
- Norbert Vey
  - Centre de Cancérologie de Marseille, Team Immunity and Cancer, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France; Hematology Department, Institut Paoli-Calmettes, Marseille, France
- Christine Arnoulet
  - Centre de Cancérologie de Marseille, Team Immunity and Cancer, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France; Biopathology Department, Institut Paoli Calmettes, Marseille, France
- Cyril Fauriat
  - Centre de Cancérologie de Marseille, Team Immunity and Cancer, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France
- Daniel Olive
  - Centre de Cancérologie de Marseille, Team Immunity and Cancer, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France; Centre de Cancérologie de Marseille, Systems Biology Platform, INSERM, U1068, Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, CNRS, UMR7258, Marseille, France

60
Hyrkas J, Clayton S, Ribalet F, Halperin D, Armbrust EV, Howe B. Scalable clustering algorithms for continuous environmental flow cytometry. Bioinformatics 2015; 32:417-23. [PMID: 26476780 DOI: 10.1093/bioinformatics/btv594] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 05/19/2015] [Accepted: 10/12/2015] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Recent technological innovations in flow cytometry now allow oceanographers to collect high-frequency flow cytometry data from particles in aquatic environments on a scale far surpassing conventional flow cytometers. The SeaFlow cytometer continuously profiles microbial phytoplankton populations across thousands of kilometers of the surface ocean. The data streams produced by instruments such as SeaFlow challenge the traditional sample-by-sample approach in cytometric analysis and highlight the need for scalable clustering algorithms to extract population information from these large-scale, high-frequency flow cytometers. RESULTS: We explore how available algorithms commonly used for medical applications perform at classification of such large-scale environmental flow cytometry data. We apply large-scale Gaussian mixture models to massive datasets using Hadoop. This approach outperforms current state-of-the-art cytometry classification algorithms in accuracy and can be coupled with manual or automatic partitioning of data into homogeneous sections for further classification gains. We propose the Gaussian mixture model with partitioning approach for classification of large-scale, high-frequency flow cytometry data. AVAILABILITY AND IMPLEMENTATION: Source code available for download at https://github.com/jhyrkas/seaflow_cluster, implemented in Java for use with Hadoop. CONTACT: hyrkas@cs.washington.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
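For readers unfamiliar with Gaussian mixture models, the clustering idea named in the abstract can be illustrated on a toy scale. The sketch below is not the authors' Java/Hadoop implementation; it is a minimal one-dimensional expectation-maximization fit in plain Python, and the function name `fit_gmm_1d`, the starting means, and the sample values are all hypothetical.

```python
import math

def fit_gmm_1d(xs, means, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM; `means` holds the initial component means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k          # assumed starting variances
    weights = [1.0 / k] * k        # assumed equal starting weights
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances

# Toy "fluorescence" values drawn around two made-up population centres.
xs = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
weights, means, variances = fit_gmm_1d(xs, means=[0.0, 6.0])
```

Each observation is then assigned to the component with the highest responsibility. Real cytometry data are multivariate and far larger, which is the scaling problem the paper addresses with Hadoop.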
Affiliation(s)
- Daniel Halperin
  - Department of Computer Science and Engineering, eScience Institute, University of Washington, Seattle, WA 98195, USA
- E Virginia Armbrust
  - School of Oceanography and eScience Institute, University of Washington, Seattle, WA 98195, USA
- Bill Howe
  - Department of Computer Science and Engineering, eScience Institute, University of Washington, Seattle, WA 98195, USA

61
Rebhahn JA, Roumanes DR, Qi Y, Khan A, Thakar J, Rosenberg A, Lee FEH, Quataert SA, Sharma G, Mosmann TR. Competitive SWIFT cluster templates enhance detection of aging changes. Cytometry A 2015; 89:59-70. [PMID: 26441030 PMCID: PMC4737406 DOI: 10.1002/cyto.a.22740] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/03/2015] [Revised: 04/21/2015] [Accepted: 08/05/2015] [Indexed: 12/17/2022]
Abstract
Clustering‐based algorithms for automated analysis of flow cytometry datasets have achieved more efficient and objective analysis than manual processing. Clustering organizes flow cytometry data into subpopulations with substantially homogenous characteristics but does not directly address the important problem of identifying the salient differences in subpopulations between subjects and groups. Here, we address this problem by augmenting SWIFT—a mixture model based clustering algorithm reported previously. First, we show that SWIFT clustering using a “template” mixture model, in which all subpopulations are represented, identifies small differences in cell numbers per subpopulation between samples. Second, we demonstrate that resolution of inter‐sample differences is increased by “competition” wherein a joint model is formed by combining the mixture model templates obtained from different groups. In the joint model, clusters from individual groups compete for the assignment of cells, sharpening differences between samples, particularly differences representing subpopulation shifts that are masked under clustering with a single template model. The benefit of competition was demonstrated first with a semisynthetic dataset obtained by deliberately shifting a known subpopulation within an actual flow cytometry sample. Single templates correctly identified changes in the number of cells in the subpopulation, but only the competition method detected small changes in median fluorescence. In further validation studies, competition identified a larger number of significantly altered subpopulations between young and elderly subjects. This enrichment was specific, because competition between templates from consensus male and female samples did not improve the detection of age‐related differences. 
Several changes between the young and elderly identified by SWIFT template competition were consistent with known alterations in the elderly, and additional altered subpopulations were also identified. Alternative algorithms detected far fewer significantly altered clusters. Thus SWIFT template competition is a powerful approach to sharpen comparisons between selected groups in flow cytometry datasets. © 2015 The Authors. Published by Wiley Periodicals Inc.
Affiliation(s)
- Jonathan A Rebhahn
  - David H. Smith Center for Vaccine Biology and Immunology, University of Rochester Medical Center, Rochester, New York
- David R Roumanes
  - David H. Smith Center for Vaccine Biology and Immunology, University of Rochester Medical Center, Rochester, New York
- Yilin Qi
  - David H. Smith Center for Vaccine Biology and Immunology, University of Rochester Medical Center, Rochester, New York
- Atif Khan
  - Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York
- Juilee Thakar
  - Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York; Department of Microbiology and Immunology, University of Rochester
- F Eun-Hyung Lee
  - Department of Medicine, Emory University School of Medicine, Atlanta, Georgia
- Sally A Quataert
  - David H. Smith Center for Vaccine Biology and Immunology, University of Rochester Medical Center, Rochester, New York
- Gaurav Sharma
  - Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York; Department of Electrical and Computer Engineering, University of Rochester
- Tim R Mosmann
  - David H. Smith Center for Vaccine Biology and Immunology, University of Rochester Medical Center, Rochester, New York; Department of Microbiology and Immunology, University of Rochester

62
Amariei C, Machné R, Sasidharan K, Gottstein W, Tomita M, Soga T, Lloyd D, Murray DB. The dynamics of cellular energetics during continuous yeast culture. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2015; 2013:2708-11. [PMID: 24110286 DOI: 10.1109/embc.2013.6610099] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/09/2022]
Abstract
A plethora of data is accumulating from high throughput methods on metabolites, coenzymes, proteins, and nucleic acids and their interactions as well as the signalling and regulatory functions and pathways of the cellular network. The frozen moment viewed in a single discrete time sample requires frequent repetition and updating before any appreciation of the dynamics of component interaction becomes possible. Even then, in a sample derived from a cell population, time-averaging of processes and events that occur in out-of-phase individuals blurs the detailed complexity of single cell organization. Continuously-grown cultures of yeast can become spontaneously self-synchronized, thereby enabling resolution of far more detailed temporal structure. Continuous on-line monitoring by rapidly responding sensors (O2 electrode and membrane-inlet mass spectrometry for O2, CO2 and H2S; direct fluorimetry for NAD(P)H and flavins) gives dynamic information on time-scales from minutes to hours. When this is supplemented with capillary electrophoresis, gas chromatography mass spectrometry and transcriptomics, the predominantly oscillatory behaviour of network components becomes evident, with a 40 min cycle between a phase of increased respiration (oxidative phase) and decreased respiration (reductive phase). Highly pervasive, this ultradian clock provides a coordinating function that links mitochondrial energetics and redox balance to transcriptional regulation, mitochondrial structure and organelle remodelling, DNA duplication and cell division events. Ultimately, this leads to a global partitioning of anabolism and catabolism and the enzymes involved, mediated by a relatively simple ATP feedback loop on chromatin architecture.

63
Hsiao C, Liu M, Stanton R, McGee M, Qian Y, Scheuermann RH. Mapping cell populations in flow cytometry data for cross-sample comparison using the Friedman-Rafsky test statistic as a distance measure. Cytometry A 2015; 89:71-88. [PMID: 26274018 PMCID: PMC5014134 DOI: 10.1002/cyto.a.22735] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/29/2014] [Revised: 04/26/2015] [Accepted: 07/22/2015] [Indexed: 12/05/2022]
Abstract
Flow cytometry (FCM) is a fluorescence‐based single‐cell experimental technology that is routinely applied in biomedical research for identifying cellular biomarkers of normal physiological responses and abnormal disease states. While many computational methods have been developed that focus on identifying cell populations in individual FCM samples, very few have addressed how the identified cell populations can be matched across samples for comparative analysis. This article presents FlowMap‐FR, a novel method for cell population mapping across FCM samples. FlowMap‐FR is based on the Friedman–Rafsky nonparametric test statistic (FR statistic), which quantifies the equivalence of multivariate distributions. As applied to FCM data by FlowMap‐FR, the FR statistic objectively quantifies the similarity between cell populations based on the shapes, sizes, and positions of fluorescence data distributions in the multidimensional feature space. To test and evaluate the performance of FlowMap‐FR, we simulated the kinds of biological and technical sample variations that are commonly observed in FCM data. The results show that FlowMap‐FR is able to effectively identify equivalent cell populations between samples under scenarios of proportion differences and modest position shifts. As a statistical test, FlowMap‐FR can be used to determine whether the expression of a cellular marker is statistically different between two cell populations, suggesting candidates for new cellular phenotypes by providing an objective statistical measure. In addition, FlowMap‐FR can indicate situations in which inappropriate splitting or merging of cell populations has occurred during gating procedures. We compared the FR statistic with the symmetric version of Kullback–Leibler divergence measure used in a previous population matching method with both simulated and real data. 
The FR statistic outperforms the symmetric version of KL‐distance in distinguishing equivalent from nonequivalent cell populations. FlowMap‐FR was also employed as a distance metric to match cell populations delineated by manual gating across 30 FCM samples from a benchmark FlowCAP data set. An F‐measure of 0.88 was obtained, indicating high precision and recall of the FR‐based population matching results. FlowMap‐FR has been implemented as a standalone R/Bioconductor package so that it can be easily incorporated into current FCM data analytical workflows. © 2015 International Society for Advancement of Cytometry
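As a rough illustration of the Friedman–Rafsky statistic the abstract refers to, the sketch below pools two samples, builds a Euclidean minimum spanning tree, counts runs (cross-sample edges plus one), and standardizes the count with the permutation mean and variance from the original Friedman–Rafsky test. This is a minimal sketch and not the FlowMap‐FR package code; the function names `mst_edges` and `friedman_rafsky` are hypothetical, and any downsampling or thresholding that FlowMap‐FR may apply is not modeled.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns a list of (i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_from = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def friedman_rafsky(sample_x, sample_y):
    """Return (runs, standardized statistic) for two samples of equal-dimension points.

    Assumes at least four pooled points so the variance formula is defined.
    """
    m, n = len(sample_x), len(sample_y)
    total = m + n
    pooled = list(sample_x) + list(sample_y)
    labels = [0] * m + [1] * n
    edges = mst_edges(pooled)
    # Runs: MST edges joining points from different samples, plus one.
    runs = 1 + sum(labels[i] != labels[j] for i, j in edges)
    # Number of edge pairs that share a node, needed for the variance.
    degree = [0] * total
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    shared = sum(d * (d - 1) // 2 for d in degree)
    mean = 2 * m * n / total + 1
    var = (2 * m * n / (total * (total - 1))) * (
        (2 * m * n - total) / total
        + (shared - total + 2) / ((total - 2) * (total - 3))
        * (total * (total - 1) - 4 * m * n + 2)
    )
    return runs, (runs - mean) / math.sqrt(var)
```

Strongly negative standardized values indicate few cross-sample edges, that is, populations that occupy different regions of marker space; values near or above zero indicate well-mixed, plausibly equivalent populations.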
Affiliation(s)
- Chiaowen Hsiao
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland; Applied Mathematics, Applied Statistics, and Scientific Computing, University of Maryland, College Park, Maryland
| | - Mengya Liu
- Department of Statistical Science, Southern Methodist University, Dallas, Texas
| | - Rick Stanton
- Department of Informatics, J. Craig Venter Institute, La Jolla, California
| | - Monnie McGee
- Department of Statistical Science, Southern Methodist University, Dallas, Texas
| | - Yu Qian
- Department of Informatics, J. Craig Venter Institute, La Jolla, California
| | - Richard H Scheuermann
- Department of Informatics, J. Craig Venter Institute, La Jolla, California; Department of Pathology, University of California, San Diego, California
| |
|
64
|
Lin L, Frelinger J, Jiang W, Finak G, Seshadri C, Bart PA, Pantaleo G, McElrath J, DeRosa S, Gottardo R. Identification and visualization of multidimensional antigen-specific T-cell populations in polychromatic cytometry data. Cytometry A 2015; 87:675-82. [PMID: 25908275 PMCID: PMC4482785 DOI: 10.1002/cyto.a.22623] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 09/02/2014] [Revised: 10/24/2014] [Accepted: 12/10/2014] [Indexed: 11/08/2022]
Abstract
An important aspect of immune monitoring for vaccine development, clinical trials, and research is the detection, measurement, and comparison of antigen-specific T-cells from subject samples under different conditions. Antigen-specific T-cells compose a very small fraction of total T-cells. Developments in cytometry technology over the past five years have enabled the measurement of single cells in a multivariate and high-throughput manner. This growth in both dimensionality and quantity of data continues to pose a challenge for effective identification and visualization of rare cell subsets, such as antigen-specific T-cells. Dimension reduction and feature extraction play a pivotal role in both identifying and visualizing cell populations of interest in large, multi-dimensional cytometry datasets. However, the automated identification and visualization of rare, high-dimensional cell subsets remains challenging. Here we demonstrate how a systematic and integrated approach combining targeted feature extraction with dimension reduction can be used to identify and visualize biological differences in rare, antigen-specific cell populations. By using OpenCyto to perform semi-automated gating and feature extraction of flow cytometry data, followed by dimensionality reduction with t-SNE, we are able to identify polyfunctional subpopulations of antigen-specific T-cells and visualize treatment-specific differences between them.
Affiliation(s)
- Lin Lin
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Jacob Frelinger
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Wenxin Jiang
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Greg Finak
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Chetan Seshadri
- Division of Allergy and Infectious Diseases, University of Washington, Seattle, Washington
| | | | | | - Julie McElrath
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Steve DeRosa
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Raphael Gottardo
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| |
|
65
|
Diggins KE, Ferrell PB, Irish JM. Methods for discovery and characterization of cell subsets in high dimensional mass cytometry data. Methods 2015; 82:55-63. [PMID: 25979346 PMCID: PMC4468028 DOI: 10.1016/j.ymeth.2015.05.008] [Citation(s) in RCA: 109] [Impact Index Per Article: 12.1] [Received: 01/16/2015] [Revised: 04/24/2015] [Accepted: 05/06/2015] [Indexed: 02/03/2023] Open
Abstract
The flood of high-dimensional data resulting from mass cytometry experiments that measure more than 40 features of individual cells has stimulated creation of new single cell computational biology tools. These tools draw on advances in the field of machine learning to capture multi-parametric relationships and reveal cells that are easily overlooked in traditional analysis. Here, we introduce a workflow for high dimensional mass cytometry data that emphasizes unsupervised approaches and visualizes data in both single cell and population level views. This workflow includes three central components that are common across mass cytometry analysis approaches: (1) distinguishing initial populations, (2) revealing cell subsets, and (3) characterizing subset features. In the implementation described here, viSNE, SPADE, and heatmaps were used sequentially to comprehensively characterize and compare healthy and malignant human tissue samples. The use of multiple methods helps provide a comprehensive view of results, and the largely unsupervised workflow facilitates automation and helps researchers avoid missing cell populations with unusual or unexpected phenotypes. Together, these methods develop a framework for future machine learning of cell identity.
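The third workflow component, characterizing subset features, is commonly summarized as a table of per-cluster marker statistics that a heatmap then displays. The sketch below computes such a table, assuming the median is the summary statistic and that cluster assignments already exist from an earlier step; the function name `cluster_medians` and the marker names in the usage note are hypothetical and are not taken from the article.

```python
from collections import defaultdict
from statistics import median


def cluster_medians(events, cluster_ids, markers):
    """Median intensity of each marker within each cluster.

    events: list of dicts mapping marker name -> intensity (one dict per cell)
    cluster_ids: list of cluster labels, parallel to events
    markers: marker names to summarize
    Returns {cluster_id: {marker: median intensity}}, i.e. the rows of a heatmap.
    """
    grouped = defaultdict(list)
    for event, cid in zip(events, cluster_ids):
        grouped[cid].append(event)
    return {
        cid: {marker: median(cell[marker] for cell in members) for marker in markers}
        for cid, members in grouped.items()
    }
```

For example, calling `cluster_medians(events, ids, ["CD3", "CD19"])` on cells labeled "T" and "B" yields one row per label, which can be passed to any plotting library as heatmap input.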
Affiliation(s)
- Kirsten E Diggins
- Cancer Biology, Vanderbilt University School of Medicine, United States
| | - P Brent Ferrell
- Medicine/Division of Hematology-Oncology, Vanderbilt University School of Medicine, United States
| | - Jonathan M Irish
- Cancer Biology, Vanderbilt University School of Medicine, United States; Pathology, Microbiology and Immunology, Vanderbilt University School of Medicine, United States.
| |
|
66
|
Selection on noise constrains variation in a eukaryotic promoter. Nature 2015; 521:344-7. [PMID: 25778704 PMCID: PMC4455047 DOI: 10.1038/nature14244] [Citation(s) in RCA: 99] [Impact Index Per Article: 11.0] [Received: 05/01/2014] [Accepted: 01/19/2015] [Indexed: 01/19/2023]
Abstract
Genetic variation segregating within a species reflects the combined activities of mutation, selection, and genetic drift. In the absence of selection, polymorphisms are expected to be a random subset of new mutations; thus, comparing the effects of polymorphisms and new mutations provides a test for selection [1–4]. When evidence of selection exists, such comparisons can identify properties of mutations that are most likely to persist in natural populations [2]. Here, we investigate how mutation and selection have shaped variation in a cis-regulatory sequence controlling gene expression by empirically determining the effects of polymorphisms segregating in the TDH3 promoter among 85 strains of Saccharomyces cerevisiae and comparing their effects to a distribution of mutational effects defined by 236 point mutations in the same promoter. Surprisingly, we find that selection on expression noise (i.e., variability in expression among genetically identical cells [5]) appears to have had a greater impact on sequence variation in the TDH3 promoter than selection on mean expression level. This is not necessarily because variation in expression noise impacts fitness more than variation in mean expression level, but rather because of differences in the distributions of mutational effects for these two phenotypes. This study shows how systematically examining the effects of new mutations can enrich our understanding of evolutionary mechanisms and provides rare empirical evidence of selection acting on expression noise.
|
67
|
Van Gassen S, Callebaut B, Van Helden MJ, Lambrecht BN, Demeester P, Dhaene T, Saeys Y. FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data. Cytometry A 2015; 87:636-45. [PMID: 25573116 DOI: 10.1002/cyto.a.22625] [Citation(s) in RCA: 1138] [Impact Index Per Article: 126.4] [Indexed: 12/12/2022]
Abstract
The number of markers measured in both flow and mass cytometry keeps increasing steadily. Although this provides a wealth of information, it becomes infeasible to analyze these datasets manually. When using 2D scatter plots, the number of possible plots increases exponentially with the number of markers, and therefore relevant information that is present in the data might be missed. In this article, we introduce a new visualization technique, called FlowSOM, which analyzes flow or mass cytometry data using a Self-Organizing Map. Using a two-level clustering and star charts, our algorithm helps to obtain a clear overview of how all markers are behaving on all cells, and to detect subsets that might be missed otherwise. R code is available at https://github.com/SofieVG/FlowSOM and will be made available at Bioconductor.
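To make the self-organizing map idea concrete, the following is a minimal sketch of SOM training in plain Python. It is not the FlowSOM package API and it omits FlowSOM's second-level meta-clustering and star charts; the function names, the exponential decay schedule, and the data-based initialization are assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))


def train_som(data, grid, epochs=40, lr0=0.5, sigma0=0.5, seed=0):
    """Train a small self-organizing map.

    data: list of equal-length numeric tuples (one per cell)
    grid: list of (row, col) coordinates, one per map node
    Returns one weight vector (list of floats) per node.
    """
    rng = random.Random(seed)
    # Assumed initialization: copy evenly spaced data points into the nodes.
    weights = [list(data[(j * len(data)) // len(grid)]) for j in range(len(grid))]
    order = list(range(len(data)))
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for idx in order:
            # Learning rate and neighborhood width both shrink over time.
            decay = math.exp(-3.0 * step / total_steps)
            lr, sigma = lr0 * decay, sigma0 * decay
            x = data[idx]
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                grid_d2 = (grid[j][0] - grid[bmu][0]) ** 2 + (grid[j][1] - grid[bmu][1]) ** 2
                # Gaussian neighborhood: nodes near the winner on the grid move more.
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

After training, each cell is assigned to its best matching node with `best_matching_unit`, and the nodes themselves can then be grouped by a second clustering step, which is the two-level structure the abstract describes.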
Affiliation(s)
- Sofie Van Gassen
- Department of Information Technology, Ghent University, iMinds, Ghent, Belgium; Inflammation Research Center, VIB, Ghent, Belgium; Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| | - Britt Callebaut
- Department of Information Technology, Ghent University, iMinds, Ghent, Belgium
| | - Mary J Van Helden
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| | - Bart N Lambrecht
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| | - Piet Demeester
- Department of Information Technology, Ghent University, iMinds, Ghent, Belgium
| | - Tom Dhaene
- Department of Information Technology, Ghent University, iMinds, Ghent, Belgium
| | - Yvan Saeys
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| |
|
68
|
Gasol JM, Morán XAG. Flow Cytometric Determination of Microbial Abundances and Its Use to Obtain Indices of Community Structure and Relative Activity. Springer Protocols Handbooks 2015. [DOI: 10.1007/8623_2015_139] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Indexed: 12/03/2022]
|
69
|
Feher K, Kirsch J, Radbruch A, Chang HD, Kaiser T. Cell population identification using fluorescence-minus-one controls with a one-class classifying algorithm. Bioinformatics 2014; 30:3372-8. [PMID: 25170025 DOI: 10.1093/bioinformatics/btu575] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/21/2022]
Abstract
MOTIVATION: The tried and true approach of flow cytometry data analysis is to manually gate on each biomarker separately, which is feasible for a small number of biomarkers, e.g. fewer than five. However, this rapidly becomes confusing as the number of biomarkers increases. Furthermore, multivariate structure is not taken into account. Recently, automated gating algorithms have been implemented, all of which rely on unsupervised learning methodology. However, all unsupervised learning outputs suffer the same difficulties in validation in the absence of external knowledge, regardless of application domain. RESULTS: We present a new semi-automated algorithm for population discovery that is based on comparison to fluorescence-minus-one controls, thus transferring the problem into that of one-class classification, as opposed to being an unsupervised learning problem. The novel one-class classification algorithm is based on common principal components and can accommodate complex mixtures of multivariate densities. Computational time is short, and the simple nature of the calculations means the algorithm can easily be adapted to process large numbers of cells (10^6). Furthermore, we are able to find rare cell populations as well as populations with low biomarker concentration, both of which are inherently hard to do in an unsupervised learning context without prior knowledge of the samples' composition. AVAILABILITY AND IMPLEMENTATION: R scripts are available via https://fccf.mpiib-berlin.mpg.de/daten/drfz/bioinformatics/ with {username,password}={bioinformatics,Sar=Gac4}.
Affiliation(s)
- Kristen Feher
- Deutsches Rheuma-Forschungszentrum, Berlin 10117, Germany
| | - Jenny Kirsch
- Deutsches Rheuma-Forschungszentrum, Berlin 10117, Germany
| | | | | | - Toralf Kaiser
- Deutsches Rheuma-Forschungszentrum, Berlin 10117, Germany
| |
|
70
|
Di Palma S, Bodenmiller B. Unraveling cell populations in tumors by single-cell mass cytometry. Curr Opin Biotechnol 2014; 31:122-9. [PMID: 25123841 DOI: 10.1016/j.copbio.2014.07.004] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Received: 06/25/2014] [Accepted: 07/22/2014] [Indexed: 11/26/2022]
Abstract
The development of new biotechnologies for the analysis of individual cells in heterogeneous populations is an important direction of life science research. This review provides a critical overview of relevant and recent advances in the field of single-cell mass cytometry, focusing on the latest applications in the study of cell heterogeneity. New approaches for multiparameter single-cell imaging, alongside advanced computational tools for deep mining of high-dimensional mass cytometric data, are facilitating the visualization of specific cell types and their interactions in complex cellular assemblies, such as tumors, potentially revealing new insights into cancer biology.
Affiliation(s)
- Serena Di Palma
- Institute of Molecular Life Sciences, University of Zürich, Zürich, Switzerland
| | - Bernd Bodenmiller
- Institute of Molecular Life Sciences, University of Zürich, Zürich, Switzerland.
| |
|
71
|
Abstract
A periodic bias in nucleotide frequency with a period of about 11 bp is characteristic for bacterial genomes. This signal is commonly interpreted to relate to the helical pitch of negatively supercoiled DNA. Functions in supercoiling-dependent RNA transcription or as a 'structural code' for DNA packaging have been suggested. Cyanobacterial genomes showed especially strong periodic signals and, on the other hand, DNA supercoiling and supercoiling-dependent transcription are highly dynamic and underlie circadian rhythms of these phototrophic bacteria. Focusing on this phylum and dinucleotides, we find that a minimal motif of AT-tracts (AT2) yields the strongest signal. Strong genome-wide periodicity is ancestral to a clade of unicellular and polyploid species but lost upon morphological transitions into two baeocyte-forming and a symbiotic species. The signal is intermediate in heterocystous species and weak in monoploid picocyanobacteria. A pronounced 'structural code' may support efficient nucleoid condensation and segregation in polyploid cells. The major source of the AT2 signal is protein-coding regions, where it is encoded preferentially in the first and third codon positions. The signal shows only few relations to supercoiling-dependent and diurnal RNA transcription in Synechocystis sp. PCC 6803. Strong and specific signals in two distinct transposons suggest roles in transposase transcription and transpososome formation.
Affiliation(s)
- Robert Lehmann
- Institute for Theoretical Biology, Humboldt University, Berlin, Invalidenstraße 43, D-10115, Berlin, Germany
| | - Rainer Machné
- Institute for Theoretical Biology, Humboldt University, Berlin, Invalidenstraße 43, D-10115, Berlin, Germany; Institute for Theoretical Chemistry, University of Vienna, Währinger Straße 17, A-1090, Vienna, Austria
| | - Hanspeter Herzel
- Institute for Theoretical Biology, Humboldt University, Berlin, Invalidenstraße 43, D-10115, Berlin, Germany
| |
|
72
|
Finak G, Frelinger J, Jiang W, Newell EW, Ramey J, Davis MM, Kalams SA, De Rosa SC, Gottardo R. OpenCyto: an open source infrastructure for scalable, robust, reproducible, and automated, end-to-end flow cytometry data analysis. PLoS Comput Biol 2014; 10:e1003806. [PMID: 25167361 PMCID: PMC4148203 DOI: 10.1371/journal.pcbi.1003806] [Citation(s) in RCA: 142] [Impact Index Per Article: 14.2] [Received: 03/31/2014] [Accepted: 07/10/2014] [Indexed: 12/13/2022] Open
Abstract
Flow cytometry is used increasingly in clinical research for cancer, immunology and vaccines. Technological advances in cytometry instrumentation are increasing the size and dimensionality of data sets, posing a challenge for traditional data management and analysis. Automated analysis methods, despite a general consensus of their importance to the future of the field, have been slow to gain widespread adoption. Here we present OpenCyto, a new BioConductor infrastructure and data analysis framework designed to lower the barrier of entry to automated flow data analysis algorithms by addressing key areas that we believe have held back wider adoption of automated approaches. OpenCyto supports end-to-end data analysis that is robust and reproducible while generating results that are easy to interpret. We have improved the existing, widely used core BioConductor flow cytometry infrastructure by allowing analysis to scale in a memory efficient manner to the large flow data sets that arise in clinical trials, and integrating domain-specific knowledge as part of the pipeline through the hierarchical relationships among cell populations. Pipelines are defined through a text-based csv file, limiting the need to write data-specific code, and are data agnostic to simplify repetitive analysis for core facilities. We demonstrate how to analyze two large cytometry data sets: an intracellular cytokine staining (ICS) data set from a published HIV vaccine trial focused on detecting rare, antigen-specific T-cell populations, where we identify a new subset of CD8 T-cells with a vaccine-regimen specific response that could not be identified through manual analysis, and a CyTOF T-cell phenotyping data set where a large staining panel and many cell populations are a challenge for traditional analysis. The substantial improvements to the core BioConductor flow cytometry packages give OpenCyto the potential for wide adoption. 
It can rapidly leverage new developments in computational cytometry and facilitate reproducible analysis in a unified environment.
Affiliation(s)
- Greg Finak
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Jacob Frelinger
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Wenxin Jiang
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Evan W. Newell
- Agency for Science Technology and Research, Singapore Immunology Network, Singapore
| | - John Ramey
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Mark M. Davis
- Department of Microbiology and Immunology, Stanford University, Stanford, California, United States of America
- Institute for Immunity, Transplantation and Infection, Stanford University, Stanford, California, United States of America
- The Howard Hughes Medical Institute, Stanford University, Stanford, California, United States of America
| | - Spyros A. Kalams
- Infectious Diseases Division, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
| | - Stephen C. De Rosa
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Department of Laboratory Medicine, University of Washington, Seattle, Washington, United States of America
| | - Raphael Gottardo
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Department of Statistics, University of Washington, Seattle, Washington, United States of America
| |
|
73
|
Anchang B, Do MT, Zhao X, Plevritis SK. CCAST: a model-based gating strategy to isolate homogeneous subpopulations in a heterogeneous population of single cells. PLoS Comput Biol 2014; 10:e1003664. [PMID: 25078380 PMCID: PMC4117418 DOI: 10.1371/journal.pcbi.1003664] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 08/26/2013] [Accepted: 04/25/2014] [Indexed: 12/12/2022] Open
Abstract
A model-based gating strategy is developed for sorting cells and analyzing populations of single cells. The strategy, named CCAST, for Clustering, Classification and Sorting Tree, identifies a gating strategy for isolating homogeneous subpopulations from a heterogeneous population of single cells using a data-derived decision tree representation that can be applied to cell sorting. Because CCAST does not rely on expert knowledge, it removes human bias and variability when determining the gating strategy. It combines any clustering algorithm with silhouette measures to identify underlying homogeneous subpopulations, then applies recursive partitioning techniques to generate a decision tree that defines the gating strategy. CCAST produces an optimal strategy for cell sorting by automating the selection of gating markers, the corresponding gating thresholds and gating sequence; all of these parameters are typically manually defined. Even though CCAST is optimized for cell sorting, it can be applied for the identification and analysis of homogeneous subpopulations among heterogeneous single cell data. We apply CCAST on single cell data from both breast cancer cell lines and normal human bone marrow. On the SUM159 breast cancer cell line data, CCAST indicates at least five distinct cell states based on two surface markers (CD24 and EPCAM) and provides a gating sorting strategy that produces more homogeneous subpopulations than previously reported. When applied to normal bone marrow data, CCAST reveals an efficient strategy for gating T-cells without prior knowledge of the major T-cell subtypes and the markers that best define them. On the normal bone marrow data, CCAST also reveals two major mature B-cell subtypes, namely CD123+ and CD123- cells, which were not revealed by manual gating but show distinct intracellular signaling responses. 
More generally, the CCAST framework could be used on other biological and non-biological high dimensional data types that are mixtures of unknown homogeneous subpopulations.
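The abstract states that CCAST pairs a clustering algorithm with silhouette measures to judge how homogeneous the resulting subpopulations are. The sketch below computes the standard silhouette coefficient for a given labeling; it illustrates only that one ingredient and is not the CCAST implementation. The function name `silhouette_scores` and the convention of scoring singleton clusters as 0 are assumptions.

```python
import math


def silhouette_scores(points, labels):
    """Silhouette coefficient s = (b - a) / max(a, b) for every point.

    a: mean distance to the other members of the point's own cluster
    b: smallest mean distance to the members of any other cluster
    Points in singleton clusters, or data with a single cluster, score 0.0 here.
    """
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}  # cluster label -> distances from p to that cluster's points
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if own is None or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores
```

A mean score close to 1 suggests compact, well-separated subpopulations, while values near or below 0 suggest that the labeling splits or merges populations inappropriately; comparing the mean across candidate cluster counts is one common way to choose among them.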
Affiliation(s)
- Benedict Anchang
- Department of Radiology, Center for Cancer Systems Biology, Stanford University, Stanford, California, United States of America
| | - Mary T. Do
- Department of Radiology, Center for Cancer Systems Biology, Stanford University, Stanford, California, United States of America
| | - Xi Zhao
- Department of Radiology, Center for Cancer Systems Biology, Stanford University, Stanford, California, United States of America
| | - Sylvia K. Plevritis
- Department of Radiology, Center for Cancer Systems Biology, Stanford University, Stanford, California, United States of America
| |
|
74
|
Zare H, Wang J, Hu A, Weber K, Smith J, Nickerson D, Song C, Witten D, Blau CA, Noble WS. Inferring clonal composition from multiple sections of a breast cancer. PLoS Comput Biol 2014; 10:e1003703. [PMID: 25010360 PMCID: PMC4091710 DOI: 10.1371/journal.pcbi.1003703] [Citation(s) in RCA: 90] [Impact Index Per Article: 9.0] [Received: 11/08/2013] [Accepted: 05/20/2014] [Indexed: 12/13/2022] Open
Abstract
Cancers arise from successive rounds of mutation and selection, generating clonal populations that vary in size, mutational content and drug responsiveness. Ascertaining the clonal composition of a tumor is therefore important both for prognosis and therapy. Mutation counts and frequencies resulting from next-generation sequencing (NGS) potentially reflect a tumor's clonal composition; however, deconvolving NGS data to infer a tumor's clonal structure presents a major challenge. We propose a generative model for NGS data derived from multiple subsections of a single tumor, and we describe an expectation-maximization procedure for estimating the clonal genotypes and relative frequencies using this model. We demonstrate, via simulation, the validity of the approach, and then use our algorithm to assess the clonal composition of a primary breast cancer and associated metastatic lymph node. After dividing the tumor into subsections, we perform exome sequencing for each subsection to assess mutational content, followed by deep sequencing to precisely count normal and variant alleles within each subsection. By quantifying the frequencies of 17 somatic variants, we demonstrate that our algorithm predicts clonal relationships that are both phylogenetically and spatially plausible. Applying this method to larger numbers of tumors should cast light on the clonal evolution of cancers in space and time. Cancers arise from a series of mutations that occur over time. As a result, as a tumor grows each cell inherits a distinctive genotype, defined by the set of all somatic mutations that distinguish the tumor cell from normal cells. Ascertaining these genotype patterns, and identifying which ones are associated with the growth of the cancer and its ability to metastasize, can potentially give clinicians insights into how to treat the cancer. In this work, we describe a method for inferring the predominant genotypes within a single tumor. 
The method requires that a tumor be sectioned and that each section be subjected to a high-throughput sequencing procedure. The resulting mutations and their associated frequencies within each tumor section are then used as input to a probabilistic model that infers the underlying genotypes and their relative frequencies within the tumor. We use simulated data to demonstrate the validity of the approach, and then we apply our algorithm to data from a primary breast cancer and associated metastatic lymph node. We demonstrate that our algorithm predicts genotypes that are consistent with an evolutionary model and with the physical topology of the tumor itself. Applying this method to larger numbers of tumors should cast light on the evolution of cancers in space and time.
Affiliation(s)
- Habil Zare
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Junfeng Wang
- Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington, United States of America
| | - Alex Hu
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Kris Weber
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Josh Smith
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Debbie Nickerson
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - ChaoZhong Song
- Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington, United States of America
| | - Daniela Witten
- Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
| | - C. Anthony Blau
- Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington, United States of America
| | - William Stafford Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America
| |
|
75
|
Pyne S, Lee SX, Wang K, Irish J, Tamayo P, Nazaire MD, Duong T, Ng SK, Hafler D, Levy R, Nolan GP, Mesirov J, McLachlan GJ. Joint modeling and registration of cell populations in cohorts of high-dimensional flow cytometric data. PLoS One 2014; 9:e100334. [PMID: 24983991 PMCID: PMC4077578 DOI: 10.1371/journal.pone.0100334] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 12/30/2013] [Accepted: 05/23/2014] [Indexed: 01/20/2023] Open
Abstract
In biomedical applications, an experimenter encounters different potential sources of variation in data such as individual samples, multiple experimental conditions, and multivariate responses of a panel of markers such as from a signaling network. In multiparametric cytometry, which is often used for analyzing patient samples, such issues are critical. While computational methods can identify cell populations in individual samples, without the ability to automatically match them across samples, it is difficult to compare and characterize the populations in typical experiments, such as those responding to various stimulations or distinctive of particular patients or time-points, especially when there are many samples. Joint Clustering and Matching (JCM) is a multi-level framework for simultaneous modeling and registration of populations across a cohort. JCM models every population with a robust multivariate probability distribution. Simultaneously, JCM fits a random-effects model to construct an overall batch template – used for registering populations across samples, and classifying new samples. By tackling systems-level variation, JCM supports practical biomedical applications involving large cohorts. Software for fitting the JCM models has been implemented in an R package EMMIX-JCM, available from http://www.maths.uq.edu.au/~gjm/mix_soft/EMMIX-JCM/.
Affiliation(s)
- Saumyadipta Pyne
- CR Rao Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad, Andhra Pradesh, India
| | - Sharon X. Lee
- Department of Mathematics, University of Queensland, St. Lucia, Queensland, Australia
| | - Kui Wang
- Department of Mathematics, University of Queensland, St. Lucia, Queensland, Australia
| | - Jonathan Irish
- Division of Oncology, Stanford Medical School, Stanford, California, United States of America
- Baxter Laboratory for Stem Cell Biology, Department of Microbiology and Immunology, Stanford School of Medicine, Stanford, California, United States of America
- Department of Cancer Biology, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Pablo Tamayo
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts, United States of America
| | - Marc-Danie Nazaire
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts, United States of America
| | - Tarn Duong
- Molecular Mechanisms of Intracellular Transport, Unit Mixte de Recherche 144 Centre National de la Recherche Scientifique/Institut Curie, Paris, France
| | - Shu-Kay Ng
- School of Medicine, Griffith University, Meadowbrook, Queensland, Australia
| | - David Hafler
- Department of Neurology, Yale School of Medicine, New Haven, Connecticut, United States of America
| | - Ronald Levy
- Division of Oncology, Stanford Medical School, Stanford, California, United States of America
| | - Garry P. Nolan
- Baxter Laboratory for Stem Cell Biology, Department of Microbiology and Immunology, Stanford School of Medicine, Stanford, California, United States of America
| | - Jill Mesirov
- Broad Institute of MIT and Harvard University, Cambridge, Massachusetts, United States of America
| | - Geoffrey J. McLachlan
- Department of Mathematics, University of Queensland, St. Lucia, Queensland, Australia
- * E-mail:
|
76
|
Daily expression pattern of protein-encoding genes and small noncoding RNAs in Synechocystis sp. strain PCC 6803. Appl Environ Microbiol 2014; 80:5195-206. [PMID: 24928881 DOI: 10.1128/aem.01086-14] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Abstract
Many organisms harbor circadian clocks with periods close to 24 h. These cellular clocks allow organisms to anticipate the environmental cycles of day and night by synchronizing circadian rhythms with the rising and setting of the sun. These rhythms originate from the oscillator components of circadian clocks and control global gene expression and various cellular processes. The oscillator of photosynthetic cyanobacteria is composed of three proteins, KaiA, KaiB, and KaiC, linked to a complex regulatory network. Synechocystis sp. strain PCC 6803 possesses the standard cyanobacterial kaiABC gene cluster plus multiple kaiB and kaiC gene copies and antisense RNAs for almost every kai transcript. However, there is no clear evidence of circadian rhythms in Synechocystis sp. PCC 6803 under various experimental conditions. It is also still unknown if and to what extent the multiple kai gene copies and kai antisense RNAs affect circadian timing. Moreover, a large number of small noncoding RNAs whose accumulation dynamics over time have not yet been monitored are known for Synechocystis sp. PCC 6803. Here we performed a 48-h time series transcriptome analysis of Synechocystis sp. PCC 6803, taking into account periodic light-dark phases, continuous light, and continuous darkness. We found that expression of functionally related genes occurred in different phases of day and night. Moreover, we found day-peaking and night-peaking transcripts among the small RNAs; in particular, the amounts of kai antisense RNAs correlated or anticorrelated with those of their respective kai target mRNAs, pointing toward the regulatory relevance of these antisense RNAs. Surprisingly, we observed that the amounts of 16S and 23S rRNAs in this cyanobacterium fluctuated in light-dark periods, showing maximum accumulation in the dark phase. 
Importantly, the amounts of all transcripts, including small noncoding RNAs, did not show any rhythm under continuous light or darkness, indicating the absence of circadian rhythms in Synechocystis.
|
77
|
Richards AJ, Staats J, Enzor J, McKinnon K, Frelinger J, Denny TN, Weinhold KJ, Chan C. Setting objective thresholds for rare event detection in flow cytometry. J Immunol Methods 2014; 409:54-61. [PMID: 24727143 DOI: 10.1016/j.jim.2014.04.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2013] [Revised: 03/05/2014] [Accepted: 04/01/2014] [Indexed: 12/11/2022]
Abstract
The accurate identification of rare antigen-specific cytokine positive cells from peripheral blood mononuclear cells (PBMC) after antigenic stimulation in an intracellular staining (ICS) flow cytometry assay is challenging, as cytokine positive events may be fairly diffusely distributed and lack an obvious separation from the negative population. Traditionally, the approach by flow operators has been to manually set a positivity threshold to partition events into cytokine-positive and cytokine-negative. This approach suffers from subjectivity and inconsistency across different flow operators. The use of statistical clustering methods does not remove the need to find an objective threshold between positive and negative events since consistent identification of rare event subsets is highly challenging for automated algorithms, especially when there is distributional overlap between the positive and negative events ("smear"). We present a new approach, based on the Fβ measure, that is similar to manual thresholding in providing a hard cutoff, but has the advantage of being determined objectively. The performance of this algorithm is compared with results obtained by expert visual gating. Several ICS data sets from the External Quality Assurance Program Oversight Laboratory (EQAPOL) proficiency program were used to make the comparisons. We first show that visually determined thresholds are difficult to reproduce and pose a problem when comparing results across operators or laboratories, as well as problems that occur with the use of commonly employed clustering algorithms. In contrast, a single parameterization for the Fβ method performs consistently across different centers, samples, and instruments because it optimizes the precision/recall tradeoff by using both negative and positive controls.
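As an illustration of the idea in this abstract, the following is a minimal sketch of choosing a hard cutoff by maximizing the Fβ measure. It is not the authors' implementation. It assumes a simplified setting in which every event from a positive control counts as a true positive and every event from a negative control counts as a true negative; the function names `f_beta` and `best_threshold` are hypothetical.

```python
import numpy as np

def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score from counts: (1+b^2)TP / ((1+b^2)TP + b^2 FN + FP)."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return 0.0 if denom == 0 else (1 + b2) * tp / denom

def best_threshold(neg, pos, beta=1.0):
    """Scan observed intensities and keep the cutoff with the highest F-beta.

    Assumption (for illustration only): all `pos` events are truly positive
    and all `neg` events are truly negative.
    """
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    candidates = np.unique(np.concatenate([neg, pos]))  # sorted, distinct
    best_t, best_score = None, -1.0
    for t in candidates:
        tp = int(np.sum(pos >= t))   # positive-control events called positive
        fp = int(np.sum(neg >= t))   # negative-control events called positive
        fn = int(np.sum(pos < t))    # positive-control events missed
        score = f_beta(tp, fp, fn, beta)
        if score > best_score:       # ties keep the lowest cutoff
            best_t, best_score = float(t), score
    return best_t, best_score
```

A larger `beta` weights recall more heavily and so tends to lower the cutoff, while a smaller `beta` favors precision.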
Affiliation(s)
- Adam J Richards
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA; Duke Center for AIDS Research, Duke University, Durham, NC, USA; Duke External Quality Assurance Program Oversight Laboratory, Duke University, Durham, NC, USA.
| | - Janet Staats
- Duke Center for AIDS Research, Duke University, Durham, NC, USA; Duke External Quality Assurance Program Oversight Laboratory, Duke University, Durham, NC, USA; Department of Surgery, Duke University Medical Center, Durham, NC, USA
| | - Jennifer Enzor
- Duke Center for AIDS Research, Duke University, Durham, NC, USA; Duke External Quality Assurance Program Oversight Laboratory, Duke University, Durham, NC, USA; Department of Surgery, Duke University Medical Center, Durham, NC, USA
| | - Jacob Frelinger
- Institute for Genome Sciences and Policy, Duke University, NC, USA
| | - Thomas N Denny
- Duke External Quality Assurance Program Oversight Laboratory, Duke University, Durham, NC, USA; Duke Human Vaccine Institute, Duke University, Durham, NC, USA
| | - Kent J Weinhold
- Duke Center for AIDS Research, Duke University, Durham, NC, USA; Duke External Quality Assurance Program Oversight Laboratory, Duke University, Durham, NC, USA; Department of Surgery, Duke University Medical Center, Durham, NC, USA
| | - Cliburn Chan
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA; Duke Center for AIDS Research, Duke University, Durham, NC, USA; Duke External Quality Assurance Program Oversight Laboratory, Duke University, Durham, NC, USA
|
78
|
Tangri S, Vall H, Kaplan D, Hoffman B, Purvis N, Porwit A, Hunsberger B, Shankey TV. Validation of cell-based fluorescence assays: practice guidelines from the ICSH and ICCS - part III - analytical issues. CYTOMETRY PART B-CLINICAL CYTOMETRY 2014; 84:291-308. [PMID: 24022852 DOI: 10.1002/cyto.b.21106] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/06/2012] [Revised: 05/20/2013] [Accepted: 06/14/2013] [Indexed: 11/07/2022]
Abstract
Clinical diagnostic assays may be classified as quantitative, quasi-quantitative or qualitative. The assay's description should state what the assay needs to accomplish (intended use or purpose) and what it is not intended to achieve. The type(s) of samples (whole blood, peripheral blood mononuclear cells (PBMC), bone marrow, bone marrow mononuclear cells (BMMC), tissue, fine needle aspirate, fluid, etc.), instrument platform for use and anticoagulant restrictions should be fully validated for stability requirements and specified. When applicable, assay sensitivity and specificity should be fully validated and reported; these performance criteria will dictate the number and complexity of specimen samples required for validation. Assay processing and staining conditions (lyse/wash/fix/perm, stain pre or post, time and temperature, sample stability, etc.) should be described in detail and fully validated.
|
79
|
Abraham Y, Zhang X, Parker CN. Multiparametric Analysis of Screening Data: Growing Beyond the Single Dimension to Infinity and Beyond. J Biomol Screen 2014; 19:628-39. [PMID: 24598104 DOI: 10.1177/1087057114524987] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2013] [Accepted: 01/14/2014] [Indexed: 11/16/2022]
Abstract
Advances in instrumentation now allow the development of screening assays that are capable of monitoring multiple readouts such as transcript or protein levels, or even multiple parameters derived from images. Such advances in assay technologies highlight the complex nature of biology and disease. Harnessing this complexity requires integration of all the different parameters that can be measured rather than just monitoring a single dimension as is commonly used. Although some of the methods used to combine multiple measurements, such as principal component analysis, are commonly used for microarray analysis, biologists are not yet using many of the tools that have been developed in other fields to address such issues. Visualization of multiparametric data sets is one of the major challenges in this field, and a depiction of the results in a manner that can be readily interpreted is essential. This article describes a number of assay systems being used to generate such data sets en masse, and the methods being applied to their visualization and analysis. We also discuss some of the challenges of applying methods developed in other fields to biology.
Affiliation(s)
- Yann Abraham
- Novartis Institute for Biomedical Research, Basel, Switzerland
| | - Xian Zhang
- Novartis Institute for Biomedical Research, Basel, Switzerland
|
80
|
Naim I, Datta S, Rebhahn J, Cavenaugh JS, Mosmann TR, Sharma G. SWIFT-scalable clustering for automated identification of rare cell populations in large, high-dimensional flow cytometry datasets, part 1: algorithm design. Cytometry A 2014; 85:408-21. [PMID: 24677621 PMCID: PMC4238829 DOI: 10.1002/cyto.a.22446] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2013] [Revised: 11/08/2013] [Accepted: 01/02/2013] [Indexed: 01/05/2023]
Abstract
We present a model-based clustering method, SWIFT (Scalable Weighted Iterative Flow-clustering Technique), for digesting high-dimensional large-sized datasets obtained via modern flow cytometry into more compact representations that are well-suited for further automated or manual analysis. Key attributes of the method include the following: (a) the analysis is conducted in the multidimensional space retaining the semantics of the data, (b) an iterative weighted sampling procedure is utilized to maintain modest computational complexity and to retain discrimination of extremely small subpopulations (hundreds of cells from datasets containing tens of millions), and (c) a splitting and merging procedure is incorporated in the algorithm to preserve distinguishability between biologically distinct populations, while still providing a significant compaction relative to the original data. This article presents a detailed algorithmic description of SWIFT, outlining the application-driven motivations for the different design choices, a discussion of computational complexity of the different steps, and results obtained with SWIFT for synthetic data and relatively simple experimental data that allow validation of the desirable attributes. A companion paper (Part 2) highlights the use of SWIFT, in combination with additional computational tools, for more challenging biological problems.
Affiliation(s)
- Iftekhar Naim
- Department of Computer Science, University of Rochester, Rochester, New York
|
81
|
Mosmann TR, Naim I, Rebhahn J, Datta S, Cavenaugh JS, Weaver JM, Sharma G. SWIFT-scalable clustering for automated identification of rare cell populations in large, high-dimensional flow cytometry datasets, part 2: biological evaluation. Cytometry A 2014; 85:422-33. [PMID: 24532172 PMCID: PMC4238823 DOI: 10.1002/cyto.a.22445] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2013] [Revised: 11/15/2013] [Accepted: 01/02/2014] [Indexed: 01/27/2023]
Abstract
A multistage clustering and data processing method, SWIFT (detailed in a companion manuscript), has been developed to detect rare subpopulations in large, high-dimensional flow cytometry datasets. An iterative sampling procedure initially fits the data to multidimensional Gaussian distributions, then splitting and merging stages use a criterion of unimodality to optimize the detection of rare subpopulations, to converge on a consistent cluster number, and to describe non-Gaussian distributions. Probabilistic assignment of cells to clusters, visualization, and manipulation of clusters by their cluster medians, facilitate application of expert knowledge using standard flow cytometry programs. The dual problems of rigorously comparing similar complex samples, and enumerating absent or very rare cell subpopulations in negative controls, were solved by assigning cells in multiple samples to a cluster template derived from a single or combined sample. Comparison of antigen-stimulated and control human peripheral blood cell samples demonstrated that SWIFT could identify biologically significant subpopulations, such as rare cytokine-producing influenza-specific T cells. A sensitivity of better than one part per million was attained in very large samples. Results were highly consistent on biological replicates, yet the analysis was sensitive enough to show that multiple samples from the same subject were more similar than samples from different subjects. A companion manuscript (Part 1) details the algorithmic development of SWIFT. © 2014 The Authors. Published by Wiley Periodicals Inc.
Affiliation(s)
- Tim R Mosmann
- David H. Smith Center for Vaccine Biology and Immunology, University of Rochester Medical Center, University of Rochester, Rochester, New York
|
82
|
Finak G, Jiang W, Krouse K, Wei C, Sanz I, Phippard D, Asare A, De Rosa SC, Self S, Gottardo R. High-throughput flow cytometry data normalization for clinical trials. Cytometry A 2013; 85:277-86. [PMID: 24382714 DOI: 10.1002/cyto.a.22433] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2013] [Revised: 11/18/2013] [Accepted: 12/13/2013] [Indexed: 01/08/2023]
Abstract
Flow cytometry in clinical trials generates very large datasets that are usually highly standardized, focusing on endpoints that are well defined a priori. Staining variability of individual markers is not uncommon and complicates manual gating, requiring the analyst to adapt gates for each sample, which is unwieldy for large datasets. It can lead to unreliable measurements, especially if a template-gating approach is used without further correction to the gates. In this article, a computational framework is presented for normalizing the fluorescence intensity of multiple markers in specific cell populations across samples that is suitable for high-throughput processing of large clinical trial datasets. Previous approaches to normalization have been global and applied to all cells or data with debris removed. They provided no mechanism to handle specific cell subsets. This approach integrates tightly with the gating process so that normalization is performed during gating and is local to the specific cell subsets exhibiting variability. This improves peak alignment and the performance of the algorithm. The performance of this algorithm is demonstrated on two clinical trial datasets from the HIV Vaccine Trials Network (HVTN) and the Immune Tolerance Network (ITN). In the ITN dataset we show that local normalization combined with template gating can account for sample-to-sample variability as effectively as manual gating. In the HVTN dataset, it is shown that local normalization mitigates false-positive vaccine response calls in an intracellular cytokine staining assay. In both datasets, local normalization performs better than global normalization. The normalization framework allows the use of template gates even in the presence of sample-to-sample staining variability, mitigates the subjectivity and bias of manual gating, and decreases the time necessary to analyze large datasets.
Affiliation(s)
- Greg Finak
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109
|
83
|
Abstract
Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating), to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods. 
Open standards, data, and software are also key parts of flow cytometry bioinformatics. Data standards include the widely adopted Flow Cytometry Standard (FCS) defining how data from cytometers should be stored, but also several new standards under development by the International Society for Advancement of Cytometry (ISAC) to aid in storing more detailed information about experimental design and analytical steps. Open data is slowly growing with the opening of the CytoBank database in 2010 and FlowRepository in 2012, both of which allow users to freely distribute their data, and the latter of which has been recommended as the preferred repository for MIFlowCyt-compliant data by ISAC. Open software is most widely available in the form of a suite of Bioconductor packages, but is also available for web execution on the GenePattern platform.
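The abstract above describes probability binning as a density-guided binary space partitioning of the data. A minimal sketch of that idea follows, assuming one common variant: split recursively at the median of the highest-variance dimension until a bin would fall below a minimum event count. The function name `probability_bins` and the `min_events` parameter are hypothetical and not taken from any cited package.

```python
import numpy as np

def probability_bins(events, min_events=4):
    """Recursively partition events (rows) into roughly equal-count bins.

    Illustrative assumption: each split is at the median of the dimension
    with the largest variance, so dense regions end up with smaller bins.
    """
    events = np.asarray(events, dtype=float)
    if len(events) < 2 * min_events:
        return [events]                       # too few events to split again
    dim = int(np.argmax(events.var(axis=0)))  # most spread-out dimension
    cut = np.median(events[:, dim])
    left = events[events[:, dim] <= cut]
    right = events[events[:, dim] > cut]
    if len(left) == 0 or len(right) == 0:     # all values tied: stop here
        return [events]
    return probability_bins(left, min_events) + probability_bins(right, min_events)
```

In typical use, bins are built on a control sample, and the number of events from a test sample falling in each bin is then compared with the control counts.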
Affiliation(s)
- Kieran O'Neill
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Bioinformatics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
| | - Nima Aghaeepour
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Bioinformatics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
| | - Josef Špidlen
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
| | - Ryan Brinkman
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
|
84
|
Gilels F, Paquette ST, Zhang J, Rahman I, White PM. Mutation of Foxo3 causes adult onset auditory neuropathy and alters cochlear synapse architecture in mice. J Neurosci 2013; 33:18409-24. [PMID: 24259566 PMCID: PMC6618809 DOI: 10.1523/jneurosci.2529-13.2013] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2013] [Revised: 09/26/2013] [Accepted: 10/12/2013] [Indexed: 11/21/2022] Open
Abstract
Auditory neuropathy is a form of hearing loss in which cochlear inner hair cells fail to correctly encode or transmit acoustic information to the brain. Few genes have been implicated in the adult-onset form of this disease. Here we show that mice lacking the transcription factor Foxo3 have adult onset hearing loss with the hallmark characteristics of auditory neuropathy, namely, elevated auditory thresholds combined with normal outer hair cell function. Using histological techniques, we demonstrate that Foxo3-dependent hearing loss is not due to a loss of cochlear hair cells or spiral ganglion neurons, both of which normally express Foxo3. Moreover, Foxo3-knock-out (KO) inner hair cells do not display reductions in numbers of synapses. Instead, we find that there are subtle structural changes in and surrounding inner hair cells. Confocal microscopy in conjunction with 3D modeling and quantitative analysis show that synaptic localization is altered in Foxo3-KO mice and Myo7a immunoreactivity is reduced. TEM demonstrates apparent afferent degeneration. Strikingly, acoustic stimulation promotes Foxo3 nuclear localization in vivo, implying a connection between cochlear activity and synaptic function maintenance. Together, these findings support a new role for the canonical damage response factor Foxo3 in contributing to the maintenance of auditory synaptic transmission.
MESH Headings
- Acoustic Stimulation
- Age Factors
- Alcohol Oxidoreductases
- Animals
- Animals, Newborn
- Calcium-Binding Proteins/metabolism
- Co-Repressor Proteins
- Cochlea/growth & development
- Cochlea/metabolism
- Cochlea/pathology
- DNA-Binding Proteins/metabolism
- Disease Models, Animal
- Evoked Potentials, Auditory, Brain Stem/genetics
- Forkhead Box Protein O3
- Forkhead Transcription Factors/genetics
- Forkhead Transcription Factors/metabolism
- Gene Expression Regulation, Developmental/genetics
- Hair Cells, Auditory, Inner/metabolism
- Hair Cells, Auditory, Inner/pathology
- Hair Cells, Auditory, Inner/ultrastructure
- Hearing Loss, Central/genetics
- Hearing Loss, Central/pathology
- Hearing Loss, Central/physiopathology
- Imaging, Three-Dimensional
- Mice
- Mice, Transgenic
- Microscopy, Electron, Transmission
- Mutation/genetics
- Myosin VIIa
- Myosins/metabolism
- Phosphoproteins/metabolism
- Receptors, AMPA/metabolism
- Synapses/genetics
- Synapses/pathology
- Synapses/ultrastructure
Affiliation(s)
- Irfan Rahman
- Department of Environmental Medicine, University of Rochester School of Medicine and Dentistry, Rochester, New York 14642
|
85
|
|
86
|
Cron A, Gouttefangeas C, Frelinger J, Lin L, Singh SK, Britten CM, Welters MJP, van der Burg SH, West M, Chan C. Hierarchical modeling for rare event detection and cell subset alignment across flow cytometry samples. PLoS Comput Biol 2013; 9:e1003130. [PMID: 23874174 PMCID: PMC3708855 DOI: 10.1371/journal.pcbi.1003130] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2012] [Accepted: 05/17/2013] [Indexed: 11/21/2022] Open
Abstract
Flow cytometry is the prototypical assay for multi-parameter single cell analysis, and is essential in vaccine and biomarker research for the enumeration of antigen-specific lymphocytes that are often found in extremely low frequencies (0.1% or less). Standard analysis of flow cytometry data relies on visual identification of cell subsets by experts, a process that is subjective and often difficult to reproduce. An alternative and more objective approach is the use of statistical models to identify cell subsets of interest in an automated fashion. Two specific challenges for automated analysis are to detect extremely low frequency event subsets without biasing the estimate by pre-processing enrichment, and the ability to align cell subsets across multiple data samples for comparative analysis. In this manuscript, we develop hierarchical modeling extensions to the Dirichlet Process Gaussian Mixture Model (DPGMM) approach we have previously described for cell subset identification, and show that the hierarchical DPGMM (HDPGMM) naturally generates an aligned data model that captures both commonalities and variations across multiple samples. HDPGMM also increases the sensitivity to extremely low frequency events by sharing information across multiple samples analyzed simultaneously. We validate the accuracy and reproducibility of HDPGMM estimates of antigen-specific T cells on clinically relevant reference peripheral blood mononuclear cell (PBMC) samples with known frequencies of antigen-specific T cells. These cell samples take advantage of retrovirally TCR-transduced T cells spiked into autologous PBMC samples to give a defined number of antigen-specific T cells detectable by HLA-peptide multimer binding. We provide open source software that can take advantage of both multiple processors and GPU-acceleration to perform the numerically-demanding computations. 
We show that hierarchical modeling is a useful probabilistic approach that can provide a consistent labeling of cell subsets and increase the sensitivity of rare event detection in the context of quantifying antigen-specific immune responses.
Affiliation(s)
- Andrew Cron
- Department of Statistical Science, Duke University, Durham, North Carolina, United States of America
| | - Cécile Gouttefangeas
- Interfaculty Institute for Cell Biology, Department of Immunology, Eberhard Karls University, Tuebingen, Germany
| | - Jacob Frelinger
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina, United States of America
| | - Lin Lin
- Population Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Satwinder K. Singh
- Department of Clinical Oncology, Leiden University Medical Center, Leiden, The Netherlands
| | - Cedrik M. Britten
- Translational Oncology at the University Medical Center of the Johannes Gutenberg-University Mainz gGmbH, Mainz, Germany
| | - Marij J. P. Welters
- Department of Clinical Oncology, Leiden University Medical Center, Leiden, The Netherlands
| | - Sjoerd H. van der Burg
- Department of Clinical Oncology, Leiden University Medical Center, Leiden, The Netherlands
| | - Mike West
- Department of Statistical Science, Duke University, Durham, North Carolina, United States of America
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina, United States of America
| | - Cliburn Chan
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina, United States of America
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina, United States of America
|
87
|
Spidlen J, Barsky A, Breuer K, Carr P, Nazaire MD, Hill BA, Qian Y, Liefeld T, Reich M, Mesirov JP, Wilkinson P, Scheuermann RH, Sekaly RP, Brinkman RR. GenePattern flow cytometry suite. SOURCE CODE FOR BIOLOGY AND MEDICINE 2013; 8:14. [PMID: 23822732 PMCID: PMC3717030 DOI: 10.1186/1751-0473-8-14] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/10/2013] [Accepted: 06/21/2013] [Indexed: 01/08/2023]
Abstract
BACKGROUND Traditional flow cytometry data analysis is largely based on interactive and time-consuming analysis of a series of two-dimensional representations of up to 20-dimensional data. Recent technological advances have increased the amount of data generated by the technology and outpaced the development of data analysis approaches. While there are advanced tools available, including many R/BioConductor packages, these are only accessible programmatically and therefore out of reach for most experimentalists. GenePattern is a powerful genomic analysis platform with over 200 tools for analysis of gene expression, proteomics, and other data. A web-based interface provides easy access to these tools and allows the creation of automated analysis pipelines enabling reproducible research. RESULTS In order to bring advanced flow cytometry data analysis tools to experimentalists without programmatic skills, we developed the GenePattern Flow Cytometry Suite. It contains 34 open source GenePattern flow cytometry modules covering methods from basic processing of flow cytometry standard (i.e., FCS) files to advanced algorithms for automated identification of cell populations, normalization and quality assessment. Internally, these modules leverage functionality developed in R/BioConductor. Using the GenePattern web-based interface, they can be connected to build analytical pipelines. CONCLUSIONS GenePattern Flow Cytometry Suite brings advanced flow cytometry data analysis capabilities to users with minimal computer skills. Functionality previously available only to skilled bioinformaticians is now easily accessible from a web browser.
Affiliation(s)
- Josef Spidlen
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Aaron Barsky
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Karin Breuer
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Peter Carr
- Computational Biology and Bioinformatics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Marc-Danie Nazaire
- Computational Biology and Bioinformatics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Barbara Allen Hill
- Computational Biology and Bioinformatics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Yu Qian
- Vaccine and Gene Therapy Institute of Florida, Port Saint Lucie, FL, USA
- Ted Liefeld
- Computational Biology and Bioinformatics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Michael Reich
- Computational Biology and Bioinformatics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jill P Mesirov
- Computational Biology and Bioinformatics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ryan R Brinkman
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
|
88
|
Lehmann R, Machné R, Georg J, Benary M, Axmann I, Steuer R. How cyanobacteria pose new problems to old methods: challenges in microarray time series analysis. BMC Bioinformatics 2013; 14:133. [PMID: 23601192 PMCID: PMC3679775 DOI: 10.1186/1471-2105-14-133] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 06/13/2012] [Accepted: 03/18/2013] [Indexed: 11/24/2022]
Abstract
Background The transcriptomes of several cyanobacterial strains have been shown to exhibit diurnal oscillation patterns, reflecting the diurnal phototrophic lifestyle of the organisms. The analysis of such genome-wide transcriptional oscillations is often facilitated by the use of clustering algorithms in conjunction with a number of pre-processing steps. Biological interpretation is usually focussed on the time and phase of expression of the resulting groups of genes. However, the use of microarray technology in such studies requires the normalization of pre-processing data, with unclear impact on the qualitative and quantitative features of the derived information on the number of oscillating transcripts and their respective phases. Results A microarray-based evaluation of diurnal expression in the cyanobacterium Synechocystis sp. PCC 6803 is presented. As expected, the temporal expression patterns reveal strong oscillations in transcript abundance. We compare the Fourier transformation-based expression phase before and after the application of quantile normalization, median polishing, cyclic LOESS, and least oscillating set (LOS) normalization. Whereas LOS normalization mostly preserves the phases of the raw data, the remaining methods introduce systematic biases. In particular, quantile normalization is found to introduce a phase shift of 180°, effectively changing night-expressed genes into day-expressed ones. Comparison of a large number of clustering results of differently normalized data shows that the normalization method determines the result. Subsequent steps, such as the choice of data transformation, similarity measure, and clustering algorithm, only play minor roles. We find that the standardization and the DFT transformation are favorable for the clustering of time series in contrast to the 12 m transformation. We use the cluster-wise functional enrichment of a clustering derived by LOS normalization, clustering using flowClust, and DFT transformation to derive the diurnal biological program of Synechocystis sp. Conclusion Application of quantile normalization, median polishing, and also cyclic LOESS normalization to the presented cyanobacterial dataset leads to increased numbers of oscillating genes and a systematic shift of the expression phase. The LOS normalization minimizes the observed detrimental effects. As previous analyses employed a variety of different normalization methods, a direct comparison of results must be treated with caution.
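As an illustration of the quantities this abstract compares, the following Python sketch computes a Fourier-based expression phase for one transcript and applies a textbook quantile normalization to an expression matrix. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `peak_fraction` and `quantile_normalize`, the use of NumPy, and the default of one full cycle across the sampled window are all choices made here for illustration.

```python
import numpy as np


def peak_fraction(series, n_cycles=1):
    """Return the peak position of an oscillation as a fraction of one cycle.

    Assumption (hypothetical helper): `series` is evenly sampled and spans
    exactly `n_cycles` full periods.
    """
    coeffs = np.fft.fft(np.asarray(series, dtype=float))
    # For x[n] = cos(2*pi*n_cycles*n/N - phi), the DFT coefficient at index
    # n_cycles has angle -phi, so the phase is recovered by negating it.
    phi = -np.angle(coeffs[n_cycles])
    return (phi % (2 * np.pi)) / (2 * np.pi)


def quantile_normalize(matrix):
    """Give every column (one array or time point) the same distribution.

    Rows are transcripts, columns are arrays. Each value is replaced by the
    mean of the values that share its rank across columns. Ties within a
    column are broken arbitrarily in this sketch.
    """
    matrix = np.asarray(matrix, dtype=float)
    ranks = np.argsort(np.argsort(matrix, axis=0), axis=0)
    mean_sorted = np.sort(matrix, axis=0).mean(axis=1)
    return mean_sorted[ranks]


# Example: a transcript peaking a quarter of the way through the cycle.
t = np.arange(12) / 12
transcript = 3 + np.cos(2 * np.pi * (t - 0.25))
print(peak_fraction(transcript))
```

Calling `peak_fraction` on the same transcript before and after a normalization step is one simple way to check whether that step shifts the estimated phase.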
Affiliation(s)
- Robert Lehmann
- Institute for Theoretical Biology, Humboldt University Berlin, Invalidenstraße 43, D-10115 Berlin, Germany.
|
89
|
Cambray G, Guimaraes JC, Mutalik VK, Lam C, Mai QA, Thimmaiah T, Carothers JM, Arkin AP, Endy D. Measurement and modeling of intrinsic transcription terminators. Nucleic Acids Res 2013; 41:5139-48. [PMID: 23511967 PMCID: PMC3643576 DOI: 10.1093/nar/gkt163] [Citation(s) in RCA: 128] [Impact Index Per Article: 11.6] [Indexed: 01/25/2023]
Abstract
The reliable forward engineering of genetic systems remains limited by the ad hoc reuse of many types of basic genetic elements. Although a few intrinsic prokaryotic transcription terminators are used routinely, termination efficiencies have not been studied systematically. Here, we developed and validated a genetic architecture that enables reliable measurement of termination efficiencies. We then assembled a collection of 61 natural and synthetic terminators that collectively encode termination efficiencies across an ∼800-fold dynamic range within Escherichia coli. We simulated co-transcriptional RNA folding dynamics to identify competing secondary structures that might interfere with terminator folding kinetics or impact termination activity. We found that structures extending beyond the core terminator stem are likely to increase terminator activity. By excluding terminators encoding such context-confounding elements, we were able to develop a linear sequence-function model that can be used to estimate termination efficiencies (r = 0.9, n = 31) better than models trained on all terminators (r = 0.67, n = 54). The resulting systematically measured collection of terminators should improve the engineering of synthetic genetic systems and also advance quantitative modeling of transcription termination.
Affiliation(s)
- Guillaume Cambray
- BIOFAB International Open Facility Advancing Biotechnology (BIOFAB), 5885 Hollis Street, Emeryville, CA 94608, USA
|
90
|
Quantitative estimation of activity and quality for collections of functional genetic elements. Nat Methods 2013; 10:347-53. [DOI: 10.1038/nmeth.2403] [Citation(s) in RCA: 161] [Impact Index Per Article: 14.6] [Received: 10/24/2012] [Accepted: 02/13/2013] [Indexed: 01/20/2023]
|
91
|
Barteneva NS, Ketman K, Fasler-Kan E, Potashnikova D, Vorobjev IA. Cell sorting in cancer research--diminishing degree of cell heterogeneity. Biochim Biophys Acta Rev Cancer 2013; 1836:105-22. [PMID: 23481260 DOI: 10.1016/j.bbcan.2013.02.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 01/08/2013] [Revised: 02/06/2013] [Accepted: 02/08/2013] [Indexed: 12/18/2022]
Abstract
Increasing evidence of intratumor heterogeneity and its augmentation due to selective pressure of microenvironment and recent achievements in cancer therapeutics lead to the need to investigate and track the tumor subclonal structure. Cell sorting of heterogeneous subpopulations of tumor and tumor-associated cells has been a long established strategy in cancer research. Advancement in lasers, computer technology and optics has led to a new generation of flow cytometers and cell sorters capable of high-speed processing of single cell suspensions. Over the last several years cell sorting was used in combination with molecular biological methods, imaging and proteomics to characterize primary and metastatic cancer cell populations, minimal residual disease and single tumor cells. It was the principal method for identification and characterization of cancer stem cells. Analysis of single cancer cells may improve early detection of tumors, monitoring of circulating tumor cells, evaluation of intratumor heterogeneity and chemotherapeutic treatments. The aim of this review is to provide an overview of major cell sorting applications and approaches with new prospective developments such as microfluidics and microchip technologies.
Affiliation(s)
- Natasha S Barteneva
- Program in Cellular and Molecular Medicine, Children's Hospital Boston, Harvard Medical School, Boston, MA, USA.
|
92
|
Computational methods for evaluation of cell-based data assessment—Bioconductor. Curr Opin Biotechnol 2013; 24:105-11. [DOI: 10.1016/j.copbio.2012.09.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 07/02/2012] [Revised: 09/04/2012] [Accepted: 09/05/2012] [Indexed: 12/14/2022]
|
93
|
Robinson JP, Rajwa B, Patsekin V, Davisson VJ. Computational analysis of high-throughput flow cytometry data. Expert Opin Drug Discov 2012; 7:679-93. [PMID: 22708834 PMCID: PMC4389283 DOI: 10.1517/17460441.2012.693475] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 02/06/2023]
Abstract
INTRODUCTION Flow cytometry has been around for over 40 years, but only recently has the opportunity arisen to move into the high-throughput domain. The technology is now available and is highly competitive with imaging tools under the right conditions. Flow cytometry has, however, been a technology that has focused on its unique ability to study single cells, and appropriate analytical tools are readily available to handle this traditional role of the technology. AREAS COVERED Expansion of flow cytometry to a high-throughput (HT) and high-content technology requires advances in both hardware and analytical tools. The historical perspective of flow cytometry operation, how the field has changed, and what the key changes have been are discussed. The authors provide a background and compelling arguments for moving toward HT flow, where there are many innovative opportunities. With alternative approaches now available for flow cytometry, there will be a considerable number of new applications. These opportunities show strong capability for drug screening and functional studies with cells in suspension. EXPERT OPINION There is no doubt that HT flow is a rich technology awaiting acceptance by the pharmaceutical community. It can provide a powerful phenotypic analytical toolset that has the capacity to change many current approaches to HT screening. The previous restrictions on the technology, based on its reduced capacity for sample throughput, are no longer a major issue. Overcoming this barrier has transformed a mature technology into one that can focus on systems biology questions not previously considered possible.
Affiliation(s)
- J Paul Robinson
- Purdue University Cytometry Laboratories, Purdue University, West Lafayette, IN 47907, USA.
|
94
|
Linderman MD, Bjornson Z, Simonds EF, Qiu P, Bruggner RV, Sheode K, Meng TH, Plevritis SK, Nolan GP. CytoSPADE: high-performance analysis and visualization of high-dimensional cytometry data. Bioinformatics 2012; 28:2400-1. [PMID: 22782546 DOI: 10.1093/bioinformatics/bts425] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022]
Abstract
MOTIVATION Recent advances in flow cytometry enable simultaneous single-cell measurement of 30+ surface and intracellular proteins. CytoSPADE is a high-performance implementation of an interface for the Spanning-tree Progression Analysis of Density-normalized Events algorithm for tree-based analysis and visualization of this high-dimensional cytometry data. AVAILABILITY Source code and binaries are freely available at http://cytospade.org and via Bioconductor version 2.10 onwards for Linux, OSX and Windows. CytoSPADE is implemented in R, C++ and Java. CONTACT michael.linderman@mssm.edu SUPPLEMENTARY INFORMATION Additional documentation available at http://cytospade.org.
Affiliation(s)
- Michael D Linderman
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA.
|
95
|
Sasidharan K, Amariei C, Tomita M, Murray DB. Rapid DNA, RNA and protein extraction protocols optimized for slow continuously growing yeast cultures. Yeast 2012; 29:311-22. [PMID: 22763810 DOI: 10.1002/yea.2911] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 04/26/2012] [Revised: 06/07/2012] [Accepted: 06/07/2012] [Indexed: 11/09/2022]
Abstract
Conventional extraction protocols for yeast have been developed for relatively rapid-growing low cell density cultures of laboratory strains and often do not have the integrity for frequent sampling of cultures. Therefore, these protocols are usually inefficient for cultures under slow growth conditions or of non-laboratory strains. We have developed a combined mechanical and chemical disruption procedure using vigorous bead-beating that can consistently disrupt yeast cells (> 95%), irrespective of cell cycle and metabolic state. Using this disruption technique coupled with quenching, we have developed DNA, RNA and protein extraction protocols that are optimized for a large number of samples from slow-growing high-density industrial yeast cultures. Additionally, sample volume, the use of expensive reagents/enzymes, handling times and incubations were minimized. We have tested the reproducibility of our methods using triplicate/time-series extractions and compared these with commonly used protocols or commercially available kits. Moreover, we utilized a simple flow-cytometric approach to estimate the mitochondrial DNA copy number. Based on the results, our methods have shown higher reproducibility, yield and quality.
Affiliation(s)
- Kalesh Sasidharan
- Institute for Advanced Biosciences, Keio University, Yamagata, Japan
|
96
|
Machné R, Murray DB. The yin and yang of yeast transcription: elements of a global feedback system between metabolism and chromatin. PLoS One 2012; 7:e37906. [PMID: 22685547 PMCID: PMC3369881 DOI: 10.1371/journal.pone.0037906] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 01/24/2012] [Accepted: 04/30/2012] [Indexed: 11/19/2022]
Abstract
When grown in continuous culture, budding yeast cells tend to synchronize their respiratory activity to form a stable oscillation that percolates throughout cellular physiology and involves the majority of the protein-coding transcriptome. Oscillations in batch culture and at single cell level support the idea that these dynamics constitute a general growth principle. The precise molecular mechanisms and biological functions of the oscillation remain elusive. Fourier analysis of transcriptome time series datasets from two different oscillation periods (0.7 h and 5 h) reveals seven distinct co-expression clusters common to both systems (34% of all yeast ORF), which consolidate into two superclusters when correlated with a compilation of 1,327 unrelated transcriptome datasets. These superclusters encode for cell growth and anabolism during the phase of high, and mitochondrial growth, catabolism and stress response during the phase of low oxygen uptake. The promoters of each cluster are characterized by different nucleotide contents, promoter nucleosome configurations, and dependence on ATP-dependent nucleosome remodeling complexes. We show that the ATP:ADP ratio oscillates, compatible with alternating metabolic activity of the two superclusters and differential feedback on their transcription via activating (RSC) and repressive (Isw2) types of promoter structure remodeling. We propose a novel feedback mechanism, where the energetic state of the cell, reflected in the ATP:ADP ratio, gates the transcription of large, but functionally coherent groups of genes via differential effects of ATP-dependent nucleosome remodeling machineries. Besides providing a mechanistic hypothesis for the delayed negative feedback that results in the oscillatory phenotype, this mechanism may underpin the continuous adaptation of growth to environmental conditions.
Affiliation(s)
- Rainer Machné
- Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria.
|
97
|
Ge Y, Sealfon SC. flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding. Bioinformatics 2012; 28:2052-8. [PMID: 22595209 DOI: 10.1093/bioinformatics/bts300] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.2] [Indexed: 12/17/2022]
Abstract
MOTIVATION For flow cytometry data, there are two common approaches to the unsupervised clustering problem: one is based on the finite mixture model and the other on spatial exploration of the histograms. The former is computationally slow and has difficulty identifying clusters of irregular shapes. The latter approach cannot be applied directly to high-dimensional data, as the computational time and memory become unmanageable and the estimated histogram is unreliable. An algorithm without these two problems would be very useful. RESULTS In this article, we combine ideas from the finite mixture model and histogram spatial exploration. This new algorithm, which we call flowPeaks, can be applied directly to high-dimensional data and can identify irregularly shaped clusters. The algorithm first uses the K-means algorithm with a large K to partition the cell population into many small clusters. These partitioned data allow the generation of a smoothed density function using the finite mixture model. All local peaks are exhaustively searched by exploring the density function, and the cells are clustered by the associated local peak. The flowPeaks algorithm is automatic, fast, reliable, and robust to cluster shape and outliers. This algorithm has been applied to flow cytometry data and has been compared with state-of-the-art algorithms, including Misty Mountain, FLOCK, flowMeans, flowMerge and FLAME. AVAILABILITY The R package flowPeaks is available at https://github.com/yongchao/flowPeaks. CONTACT yongchao.ge@mssm.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
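The three steps named in this abstract (K-means with a large K, a smoothed mixture density, and grouping by local density peak) can be sketched in a few lines of Python. This is a heavily simplified illustration, not the flowPeaks R package: the function names, the isotropic Gaussian kernels with a fixed `bandwidth`, the `radius` used for the peak search, the evenly spaced K-means seeding, and the restriction of the peak search to the K-means centers are all assumptions made here to keep the example short.

```python
import numpy as np


def kmeans(points, k, n_iter=25):
    """Plain Lloyd's K-means with a deterministic, evenly spaced seeding.

    The seeding is a placeholder chosen for reproducibility (an assumption).
    """
    seeds = np.linspace(0, len(points) - 1, k).astype(int)
    centers = points[seeds].copy()
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)


def peak_clusters(points, k=10, bandwidth=1.0, radius=3.0):
    """Label each point by the density peak its small K-means cluster reaches."""
    points = np.asarray(points, dtype=float)
    centers, small = kmeans(points, k)
    # Mixture weights: fraction of points in each small cluster.
    weights = np.bincount(small, minlength=k) / len(points)
    # Smoothed density evaluated at each center (isotropic Gaussian kernels).
    d2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    density = (weights[None, :] * np.exp(-d2 / (2 * bandwidth ** 2))).sum(axis=1)
    # Each center points to the densest center within `radius` (possibly itself).
    reachable = np.where(d2 <= radius ** 2, density[None, :], -np.inf)
    parent = reachable.argmax(axis=1)
    # Follow the pointers uphill until every center sits on a local peak.
    peak = parent.copy()
    for _ in range(k):
        peak = parent[peak]
    return peak[small]


# Example: two well-separated synthetic populations in two dimensions.
rng = np.random.default_rng(0)
cells = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(200, 2)),
    rng.normal(loc=10.0, scale=0.3, size=(200, 2)),
])
labels = peak_clusters(cells)
print(len(set(labels.tolist())))
```

The point of starting with many small clusters is that merging them by shared density peak lets the final populations take irregular shapes, which a single K-means run with a small K cannot represent.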
Affiliation(s)
- Yongchao Ge
- Department of Neurology and Center of Translational System Biology, Mount Sinai School of Medicine, New York, NY 10029, USA.
|
98
|
Hoffmann MH, Klausen TW, Boegsted M, Larsen SF, Schmitz A, Leinoe EB, Schmiegelow K, Hasle H, Bergmann OJ, Sorensen S, Nyegaard M, Dybkaer K, Johnsen HE. Clinical impact of leukemic blast heterogeneity at diagnosis in cytogenetic intermediate-risk acute myeloid leukemia. Cytometry Part B: Clinical Cytometry 2012; 82B:123-131. [DOI: 10.1002/cyto.b.20633] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 08/30/2023]
|
99
|
Ray S, Pyne S. A computational framework to emulate the human perspective in flow cytometric data analysis. PLoS One 2012; 7:e35693. [PMID: 22563466 PMCID: PMC3341382 DOI: 10.1371/journal.pone.0035693] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 10/22/2011] [Accepted: 03/22/2012] [Indexed: 01/15/2023]
Abstract
BACKGROUND In recent years, intense research efforts have focused on developing methods for automated flow cytometric data analysis. However, while designing such applications, little or no attention has been paid to the human perspective that is absolutely central to the manual gating process of identifying and characterizing cell populations. In particular, the assumption of many common techniques that cell populations could be modeled reliably with pre-specified distributions may not hold true in real-life samples, which can have populations of arbitrary shapes and considerable inter-sample variation. RESULTS To address this, we developed a new framework, flowScape, for emulating certain key aspects of the human perspective in analyzing flow data, which we implemented in multiple steps. First, flowScape begins with creating a mathematically rigorous map of the high-dimensional flow data landscape based on dense and sparse regions defined by relative concentrations of events around modes. In the second step, these modal clusters are connected with a global hierarchical structure. This representation allows flowScape to perform ridgeline analysis for both traversing the landscape and isolating cell populations at different levels of resolution. Finally, we extended manual gating with a new capacity for constructing templates that can identify target populations in terms of their relative parameters, as opposed to the more commonly used absolute or physical parameters. This allows flowScape to apply such templates in batch mode for detecting the corresponding populations in a flexible, sample-specific manner. We also demonstrated different applications of our framework to flow data analysis and showed its superiority over other analytical methods. CONCLUSIONS The human perspective, built on top of intuition and experience, is a very important component of flow cytometric data analysis. By emulating some of its approaches and extending these with automation and rigor, flowScape provides a flexible and robust framework for computational cytomics.
Affiliation(s)
- Surajit Ray
- Department of Mathematics and Statistics, Boston University, Boston, Massachusetts, United States of America
- Saumyadipta Pyne
- Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts, United States of America
- Broad Institute, Massachusetts Institute of Technology and Harvard University, Cambridge, Massachusetts, United States of America
|
100
|
Rugg-Gunn PJ, Cox BJ, Lanner F, Sharma P, Ignatchenko V, McDonald ACH, Garner J, Gramolini AO, Rossant J, Kislinger T. Cell-surface proteomics identifies lineage-specific markers of embryo-derived stem cells. Dev Cell 2012; 22:887-901. [PMID: 22424930 PMCID: PMC3405530 DOI: 10.1016/j.devcel.2012.01.005] [Citation(s) in RCA: 117] [Impact Index Per Article: 9.8] [Received: 01/25/2011] [Revised: 10/20/2011] [Accepted: 01/11/2012] [Indexed: 11/30/2022]
Abstract
The advent of reprogramming and its impact on stem cell biology has renewed interest in lineage restriction in mammalian embryos, the source of embryonic (ES), epiblast (EpiSC), trophoblast (TS), and extraembryonic endoderm (XEN) stem cell lineages. Isolation of specific cell types during stem cell differentiation and reprogramming, and also directly from embryos, is a major technical challenge because few cell-surface proteins are known that can distinguish each cell type. We provide a large-scale proteomic resource of cell-surface proteins for the four embryo-derived stem cell lines. We validated 27 antibodies against lineage-specific cell-surface markers, which enabled investigation of specific cell populations during ES-EpiSC reprogramming and ES-to-XEN differentiation. Identified markers also allowed prospective isolation and characterization of viable lineage progenitors from blastocysts by flow cytometry. These results provide a comprehensive stem cell proteomic resource and enable new approaches to interrogate the mechanisms that regulate cell fate specification.
Affiliation(s)
- Peter J Rugg-Gunn
- Program in Developmental and Stem Cell Biology, Hospital for Sick Children Research Institute, 555 University Avenue, Toronto, ON M5G 1X8, Canada
|