1. Sun J, Choy D, Sompairac N, Jamshidi S, Mishto M, Kordasti S. ImmCellTyper facilitates systematic mass cytometry data analysis for deep immune profiling. eLife 2024; 13:RP95494. [PMID: 39240985 PMCID: PMC11379455 DOI: 10.7554/elife.95494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2024] Open
Abstract
Mass cytometry is a cutting-edge high-dimensional technology for profiling marker expression at the single-cell level, advancing clinical research in immune monitoring. Nevertheless, the vast data generated by cytometry by time-of-flight (CyTOF) poses a significant analytical challenge. To address this, we describe ImmCellTyper (https://github.com/JingAnyaSun/ImmCellTyper), a novel toolkit for CyTOF data analysis. This framework incorporates BinaryClust, an in-house developed semi-supervised clustering tool that automatically identifies main cell types. BinaryClust outperforms existing clustering tools in accuracy and speed, as shown in benchmarks with two datasets of approximately 4 million cells, matching the precision of manual gating by human experts. Furthermore, ImmCellTyper offers various visualisation and analytical tools, spanning from quality control to differential analysis, tailored to users' specific needs for a comprehensive CyTOF data analysis solution. The workflow includes five key steps: (1) batch effect evaluation and correction, (2) data quality control and pre-processing, (3) main cell lineage characterisation and quantification, (4) in-depth investigation of specific cell types, and (5) differential analysis of cell abundance and functional marker expression across study groups. Overall, ImmCellTyper combines expert biological knowledge in a semi-supervised approach to accurately deconvolute well-defined main cell lineages, while maintaining the potential of unsupervised methods to discover novel cell subsets, thus facilitating high-dimensional immune profiling.
Affiliation(s)
- Jing Sun
  - Centre for Inflammation Biology and Cancer Immunology & Peter Gorer Department of Immunobiology, King's College London, London, United Kingdom
- Desmond Choy
  - School of Cancer and Pharmaceutical Sciences, King's College London, London, United Kingdom
- Nicolas Sompairac
  - School of Cancer and Pharmaceutical Sciences, King's College London, London, United Kingdom
- Shirin Jamshidi
  - School of Cancer and Pharmaceutical Sciences, King's College London, London, United Kingdom
- Michele Mishto
  - Centre for Inflammation Biology and Cancer Immunology & Peter Gorer Department of Immunobiology, King's College London, London, United Kingdom
  - Research Group of Molecular Immunology, Francis Crick Institute, London, United Kingdom
- Shahram Kordasti
  - School of Cancer and Pharmaceutical Sciences, King's College London, London, United Kingdom
  - Haematology Department, Guy's Hospital, London, United Kingdom
  - Department of Clinical and Molecular Sciences, Università Politecnica delle Marche, Ancona, Italy
2. Szabó E, Faragó A, Bodor G, Gémes N, Puskás LG, Kovács L, Szebeni GJ. Identification of immune subsets with distinct lectin binding signatures using multi-parameter flow cytometry: correlations with disease activity in systemic lupus erythematosus. Front Immunol 2024; 15:1380481. [PMID: 38774868 PMCID: PMC11106380 DOI: 10.3389/fimmu.2024.1380481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/22/2024] [Indexed: 05/24/2024] Open
Abstract
Objectives: Cell surface glycosylation can influence protein-protein interactions with particular relevance to changes in core fucosylation and terminal sialylation. Glycans are ligands for immune regulatory lectin families like galectins (Gals) or sialic acid immunoglobulin-like lectins (Siglecs). This study examines the glycan alterations within immune subsets of systemic lupus erythematosus (SLE).
Methods: Evaluation of binding affinities of Galectin-1, Galectin-3, Siglec-1, Aleuria aurantia lectin (AAL, recognizing core fucosylation), and Sambucus nigra agglutinin (SNA, specific for α-2,6-sialylation) was conducted on various immune subsets in peripheral blood mononuclear cells (PBMCs) from control and SLE subjects. Lectin binding was measured by multi-parameter flow cytometry in 18 manually gated subsets of T-cells, NK-cells, NKT-cells, B-cells, and monocytes in the unstimulated resting state and also after 3-day activation. Stimulated pre-gated populations were subsequently clustered by the FlowSOM algorithm based on lectin binding and the activation markers CD25 or HLA-DR.
Results: Elevated AAL, SNA and CD25+/CD25- SNA binding ratio in certain stimulated SLE T-cell subsets correlated with SLE Disease Activity Index 2000 (SLEDAI-2K) scores. The significantly increased frequencies of activated AAL^low Siglec-1^low NK metaclusters in SLE also correlated with SLEDAI-2K indices. In SLE, activated double negative NKTs displayed significantly lower core fucosylation and CD25+/CD25- Siglec-1 binding ratio, negatively correlating with disease activity. The significantly enhanced AAL binding in resting SLE plasmablasts positively correlated with SLEDAI-2K scores.
Conclusion: Alterations in the glycosylation of immune cells in SLE correlate with disease severity, which may have implications for the pathogenesis of SLE.
Affiliation(s)
- Enikő Szabó
  - Institute of Genetics, Laboratory of Functional Genomics, HUN-REN Biological Research Center, Szeged, Hungary
  - Core Facility, HUN-REN Biological Research Centre, Szeged, Hungary
- Anna Faragó
  - Astridbio Technologies Ltd, Szeged, Hungary
  - Doctoral School of Multidisciplinary Medical Sciences, Albert Szent-Györgyi Medical School, University of Szeged, Szeged, Hungary
- Gergely Bodor
  - Department of Rheumatology and Immunology, Albert Szent-Gyorgyi Medical School and Health Center, University of Szeged, Szeged, Hungary
- Nikolett Gémes
  - Institute of Genetics, Laboratory of Functional Genomics, HUN-REN Biological Research Center, Szeged, Hungary
  - Core Facility, HUN-REN Biological Research Centre, Szeged, Hungary
- László G. Puskás
  - Institute of Genetics, Laboratory of Functional Genomics, HUN-REN Biological Research Center, Szeged, Hungary
  - Core Facility, HUN-REN Biological Research Centre, Szeged, Hungary
- László Kovács
  - Department of Rheumatology and Immunology, Albert Szent-Gyorgyi Medical School and Health Center, University of Szeged, Szeged, Hungary
- Gábor J. Szebeni
  - Institute of Genetics, Laboratory of Functional Genomics, HUN-REN Biological Research Center, Szeged, Hungary
  - Core Facility, HUN-REN Biological Research Centre, Szeged, Hungary
  - Astridbio Technologies Ltd, Szeged, Hungary
  - Department of Internal Medicine, Hematology Center, Faculty of Medicine, University of Szeged, Szeged, Hungary
3. Montante S, Chen Y, Brinkman RR. flowSim: Near duplicate detection for flow cytometry data. Cytometry A 2023; 103:889-901. [PMID: 37530476 PMCID: PMC10834853 DOI: 10.1002/cyto.a.24776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 06/22/2023] [Accepted: 07/11/2023] [Indexed: 08/03/2023]
Abstract
The analysis of large amounts of data is important for the development of machine learning (ML) models. flowSim is the first algorithm designed to visualize, detect and remove highly redundant information in flow cytometry (FCM) training sets to decrease the computational time for training and increase the performance of ML algorithms by reducing overfitting. flowSim performs near duplicate image detection by combining community detection algorithms with the density analysis of the marker expression values. flowSim clustering compared to consensus manual clustering on a dataset composed of 160 images of bivariate FCM data had a mean Adjusted Rand Index of 0.90, demonstrating its efficiency in identifying similar patterns. flowSim selectively discarded near duplicate files in datasets constructed with known redundancy, and removed 92.6% of FCM images in a dataset of over 500,000 drawn from public repositories.
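The abstract above reports agreement with manual clustering as a mean Adjusted Rand Index (ARI). As a rough illustration of what that metric measures, here is a minimal standard-library sketch of the usual pair-counting ARI formula. It is not taken from flowSim; the function name and the toy labels in the usage note are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two flat clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    # Number of item pairs placed together in both clusterings (contingency table).
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Number of item pairs placed together in each clustering on its own.
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    total_pairs = comb(n, 2)
    # Agreement expected by chance, and the maximum attainable agreement.
    expected = pairs_a * pairs_b / total_pairs if total_pairs else 0.0
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both clusterings put every item in one cluster.
        return 1.0
    return (pairs_both - expected) / (max_index - expected)
```

For example, `adjusted_rand_index(["a", "a", "b", "b"], ["x", "x", "y", "y"])` compares two groupings that differ only in their label names. The index is invariant to relabelling, equals 1 for identical partitions, and is close to 0 for chance-level agreement.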
Affiliation(s)
- Sebastiano Montante
  - Terry Fox Laboratory, BC Cancer Research, Vancouver, British Columbia, Canada
- Yixuan Chen
  - Terry Fox Laboratory, BC Cancer Research, Vancouver, British Columbia, Canada
- Ryan R. Brinkman
  - Terry Fox Laboratory, BC Cancer Research, Vancouver, British Columbia, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada, 675 West 10th Avenue
4. Saihi H, Bessant C, Alazawi W. Automated and reproducible cell identification in mass cytometry using neural networks. Brief Bioinform 2023; 24:bbad392. [PMID: 37930029 PMCID: PMC10630086 DOI: 10.1093/bib/bbad392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 10/04/2023] [Accepted: 10/08/2023] [Indexed: 11/07/2023] Open
Abstract
The principal use of mass cytometry is to identify distinct cell types and changes in their composition, phenotype and function in different samples and conditions. Combining data from different studies has the potential to increase the power of these discoveries in diverse fields such as immunology, oncology and infection. However, current tools are lacking in scalable, reproducible and automated methods to integrate and study data sets from mass cytometry that often use heterogeneous approaches to study similar samples. To address these limitations, we present two novel developments: (1) a pre-trained cell identification model named Immunopred that allows automated identification of immune cells without user-defined prior knowledge of expected cell types and (2) a fully automated cytometry meta-analysis pipeline built around Immunopred. We evaluated this pipeline on six COVID-19 study data sets comprising 270 unique samples and uncovered novel significant phenotypic changes in the wider immune landscape of COVID-19 that were not identified when each study was analyzed individually. Applied widely, our approach will support the discovery of novel findings in research areas where cytometry data sets are available for integration.
Affiliation(s)
- Hajar Saihi
  - Centre for Immunobiology, Blizard Institute, School of Medicine and Dentistry, Barts and the London, UK
- Conrad Bessant
  - Digital Environment Research Institute, Queen Mary University of London, London, UK
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
  - Alan Turing Institute, British Library, 96 Euston Rd., London NW1 2DB
- William Alazawi
  - Centre for Immunobiology, Blizard Institute, School of Medicine and Dentistry, Barts and the London, UK
5. Blampey Q, Bercovici N, Dutertre CA, Pic I, Ribeiro JM, André F, Cournède PH. A biology-driven deep generative model for cell-type annotation in cytometry. Brief Bioinform 2023; 24:bbad260. [PMID: 37497716 DOI: 10.1093/bib/bbad260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 06/20/2023] [Accepted: 06/27/2023] [Indexed: 07/28/2023] Open
Abstract
Cytometry enables precise single-cell phenotyping within heterogeneous populations. These cell types are traditionally annotated via manual gating, but this method lacks reproducibility and is sensitive to batch effects. Also, the most recent cytometers (spectral flow or mass cytometers) create rich and high-dimensional data whose analysis via manual gating becomes challenging and time-consuming. To tackle these limitations, we introduce Scyan (https://github.com/MICS-Lab/scyan), a Single-cell Cytometry Annotation Network that automatically annotates cell types using only prior expert knowledge about the cytometry panel. For this, it uses a normalizing flow (a type of deep generative model) that maps protein expressions into a biologically relevant latent space. We demonstrate that Scyan significantly outperforms the related state-of-the-art models on multiple public datasets while being faster and interpretable. In addition, Scyan handles several complementary tasks, such as batch-effect correction, debarcoding and population discovery. Overall, this model accelerates and eases cell population characterization, quantification and discovery in cytometry.
Affiliation(s)
- Quentin Blampey
  - Université Paris-Saclay, CentraleSupélec, Laboratory of Mathematics and Computer Science (MICS), 3 rue Joliot Curie, 91190, Gif-sur-Yvette, France
- Nadège Bercovici
  - Université Paris-Saclay, Gustave Roussy, Inserm U981, 114 Rue Edouard Vaillant, 94805, Villejuif, France
  - Université Paris Cité, Institut Cochin, CNRS, Inserm, 22 Rue Méchain, 75014, Paris, France
- Charles-Antoine Dutertre
  - Université Paris-Saclay, Gustave Roussy, Inserm U1015, 114 Rue Edouard Vaillant, 94805, Villejuif, France
- Isabelle Pic
  - Université Paris-Saclay, Gustave Roussy, Inserm U981, 114 Rue Edouard Vaillant, 94805, Villejuif, France
- Joana Mourato Ribeiro
  - Université Paris-Saclay, Gustave Roussy, Inserm U981, 114 Rue Edouard Vaillant, 94805, Villejuif, France
  - Gustave Roussy, Département de Médecine Oncologique, 114 Rue Edouard Vaillant, 94805, Villejuif, France
- Fabrice André
  - Université Paris-Saclay, Gustave Roussy, Inserm U981, 114 Rue Edouard Vaillant, 94805, Villejuif, France
  - Gustave Roussy, Département de Médecine Oncologique, 114 Rue Edouard Vaillant, 94805, Villejuif, France
- Paul-Henry Cournède
  - Université Paris-Saclay, CentraleSupélec, Laboratory of Mathematics and Computer Science (MICS), 3 rue Joliot Curie, 91190, Gif-sur-Yvette, France
6. Nguyen PC, Nguyen V, Baldwin K, Kankanige Y, Blombery P, Came N, Westerman DA. Computational flow cytometry provides accurate assessment of measurable residual disease in chronic lymphocytic leukaemia. Br J Haematol 2023; 202:760-770. [PMID: 37052611 DOI: 10.1111/bjh.18802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Revised: 03/27/2023] [Accepted: 03/28/2023] [Indexed: 04/14/2023]
Abstract
Undetectable measurable residual disease (MRD) is associated with favourable clinical outcomes in chronic lymphocytic leukaemia (CLL). While assessment is commonly performed using multiparameter flow cytometry (MFC), this approach is associated with limitations including user bias and expertise that may not be widely available. Implementation of unsupervised clustering algorithms in the laboratory can address these limitations and has not been previously reported in a systematic quantitative manner. We developed a computational pipeline to assess CLL MRD using FlowSOM. In the training step, a self-organising map was generated with nodes representing the full breadth of normal immature and mature B cells along with disease immunophenotypes. This map was used to detect MRD in multiple validation cohorts containing a total of 456 samples. This included an evaluation of atypical CLL cases and samples collected from two different laboratories. Computational MRD showed high correlation with expert analysis (Pearson's r > 0.99 for typical CLL). Binary classification of typical CLL samples as either MRD positive or negative demonstrated high concordance (>98%). Interestingly, computational MRD detected disease in a small number of atypical CLL cases in which MRD was not detected by expert analysis. These results demonstrate the feasibility and value of automated MFC analysis in a diagnostic laboratory.
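The two agreement measures quoted in this abstract, Pearson's r between computational and expert MRD values and the concordance of the positive/negative call, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the positivity threshold is a caller-supplied assumption because the abstract does not state the cut-off used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def binary_concordance(xs, ys, threshold):
    """Fraction of paired samples given the same positive/negative call.

    `threshold` is a hypothetical positivity cut-off chosen by the caller;
    a value at or above it counts as positive.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty paired samples of equal length")
    agree = sum((x >= threshold) == (y >= threshold) for x, y in zip(xs, ys))
    return agree / len(xs)
```

A high correlation and a high concordance answer different questions: the first describes agreement on the quantitative MRD level, the second on the clinical positive/negative call, so reporting both (as the abstract does) is informative.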
Affiliation(s)
- Phillip C Nguyen
  - Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Vuong Nguyen
  - Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Kylie Baldwin
  - Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Yamuna Kankanige
  - Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria, Australia
- Piers Blombery
  - Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria, Australia
  - Department of Clinical Haematology, Peter MacCallum Cancer Centre and Royal Melbourne Hospital, Melbourne, Victoria, Australia
- Neil Came
  - Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria, Australia
- David A Westerman
  - Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria, Australia
  - Department of Clinical Haematology, Peter MacCallum Cancer Centre and Royal Melbourne Hospital, Melbourne, Victoria, Australia
7. Robinson JP, Ostafe R, Iyengar SN, Rajwa B, Fischer R. Flow Cytometry: The Next Revolution. Cells 2023; 12:1875. [PMID: 37508539 PMCID: PMC10378642 DOI: 10.3390/cells12141875] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 06/21/2023] [Revised: 07/06/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023] Open
Abstract
Unmasking the subtleties of the immune system requires both a comprehensive knowledge base and the ability to interrogate that system with intimate sensitivity. That task, to a considerable extent, has been handled by an iterative expansion in flow cytometry methods, both in technological capability and also in accompanying advances in informatics. As the field of fluorescence-based cytomics matured, it reached a technological barrier at around 30 parameter analyses, which stalled the field until spectral flow cytometry created a fundamental transformation that will likely lead to the potential of 100 simultaneous parameter analyses within a few years. The simultaneous advance in informatics has now become a watershed moment for the field as it competes with mature systematic approaches such as genomics and proteomics, allowing cytomics to take a seat at the multi-omics table. In addition, recent technological advances try to combine the speed of flow systems with other detection methods, in addition to fluorescence alone, which will make flow-based instruments even more indispensable in any biological laboratory. This paper outlines current approaches in cell analysis and detection methods, discusses traditional and microfluidic sorting approaches as well as next-generation instruments, and provides an early look at future opportunities that are likely to arise.
Affiliation(s)
- J Paul Robinson
  - Department of Basic Medical Sciences, Purdue University, West Lafayette, IN 47907, USA
  - Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
- Raluca Ostafe
  - Molecular Evolution, Protein Engineering and Production Facility (PI4D), Purdue University, West Lafayette, IN 47907, USA
- Bartek Rajwa
  - Bindley Bioscience Center, Purdue University, West Lafayette, IN 47907, USA
- Rainer Fischer
  - Department of Comparative Pathobiology, College of Veterinary Medicine, Purdue University, West Lafayette, IN 47907, USA
  - Purdue Institute of Inflammation, Immunology and Infectious Diseases, Purdue University, West Lafayette, IN 47907, USA
8. Li Y, Nguyen J, Anastasiu DC, Arriaga EA. CosTaL: an accurate and scalable graph-based clustering algorithm for high-dimensional single-cell data analysis. Brief Bioinform 2023; 24:bbad157. [PMID: 37150778 PMCID: PMC10199777 DOI: 10.1093/bib/bbad157] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/10/2022] [Revised: 03/28/2023] [Accepted: 04/02/2023] [Indexed: 05/09/2023] Open
Abstract
With the aim of analyzing large multidimensional single-cell datasets, we describe a method for Cosine-based Tanimoto similarity-refined graph for community detection using Leiden's algorithm (CosTaL). As a graph-based clustering method, CosTaL transforms the cells with high-dimensional features into a weighted k-nearest-neighbor (kNN) graph. The cells are represented by the vertices of the graph, while an edge between two vertices in the graph represents the close relatedness between the two cells. Specifically, CosTaL builds an exact kNN graph using cosine similarity and uses the Tanimoto coefficient as the refining strategy to re-weight the edges in order to improve the effectiveness of clustering. We demonstrate that CosTaL generally achieves equivalent or higher effectiveness scores on seven benchmark cytometry datasets and six single-cell RNA-sequencing datasets using six different evaluation metrics, compared with other state-of-the-art graph-based clustering methods, including PhenoGraph, Scanpy and PARC. As indicated by the combined evaluation metrics, CosTaL has high efficiency with small datasets and acceptable scalability for large datasets, which is beneficial for large-scale analysis.
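The graph-construction idea in this abstract (an exact kNN graph under cosine similarity, with edges re-weighted by a Tanimoto coefficient) can be sketched in a few lines of standard-library Python. This is not the CosTaL implementation. In particular, it assumes the Tanimoto coefficient is taken over the two cells' neighbour sets (each including the cell itself), which the abstract does not specify, and it stops before community detection, since Leiden's algorithm needs a dedicated library. All function names are hypothetical.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_by_cosine(cells, k):
    """Exact (brute-force) k nearest neighbours of each cell, as index sets."""
    neighbours = []
    for i, u in enumerate(cells):
        scored = [(cosine_similarity(u, v), j) for j, v in enumerate(cells) if j != i]
        # Highest similarity first; ties are broken by the lower index.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        neighbours.append({j for _, j in scored[:k]})
    return neighbours


def tanimoto(set_a, set_b):
    """Tanimoto (Jaccard) coefficient of two sets."""
    union = len(set_a | set_b)
    return len(set_a & set_b) / union if union else 0.0


def tanimoto_refined_edges(cells, k):
    """Undirected kNN edges, weighted by overlap of the endpoints' neighbourhoods.

    Assumption: each neighbourhood includes the cell itself, so two cells
    that are mutual nearest neighbours always share at least two members.
    """
    neighbours = knn_by_cosine(cells, k)
    edges = {}
    for i, nbrs in enumerate(neighbours):
        for j in nbrs:
            edge = (min(i, j), max(i, j))
            edges[edge] = tanimoto(neighbours[i] | {i}, neighbours[j] | {j})
    return edges
```

The resulting dictionary maps each undirected edge `(i, j)` to a weight in [0, 1]; edges between cells whose neighbourhoods barely overlap receive low weights, which is the general purpose of a refinement step before community detection. The brute-force neighbour search is quadratic in the number of cells and is only meant for small illustrative inputs.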
Affiliation(s)
- Yijia Li
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, 420 Washington Ave. S.E., Minneapolis, 55455, Minnesota, USA
- Jonathan Nguyen
  - Department of Computer Science and Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, 95053, California, USA
- David C Anastasiu
  - Department of Computer Science and Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, 95053, California, USA
- Edgar A Arriaga
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, 420 Washington Ave. S.E., Minneapolis, 55455, Minnesota, USA
  - Department of Chemistry, University of Minnesota, Smith Hall, 139 Smith Hall, Pleasant St SE, Minneapolis, 55455, Minnesota, USA
9. Bechi Genzano C, Bezzecchi E, Carnovale D, Mandelli A, Morotti E, Castorani V, Favalli V, Stabilini A, Insalaco V, Ragogna F, Codazzi V, Scotti GM, Del Rosso S, Mazzi BA, De Pellegrin M, Giustina A, Piemonti L, Bosi E, Battaglia M, Morelli MJ, Bonfanti R, Petrelli A. Combined unsupervised and semi-automated supervised analysis of flow cytometry data reveals cellular fingerprint associated with newly diagnosed pediatric type 1 diabetes. Front Immunol 2022; 13:1026416. [PMID: 36389771 PMCID: PMC9647173 DOI: 10.3389/fimmu.2022.1026416] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/23/2022] [Accepted: 10/11/2022] [Indexed: 11/03/2023] Open
Abstract
An unbiased and replicable profile of the type 1 diabetes (T1D)-specific circulating immunome at disease onset has yet to be identified due to experimental and patient selection limitations. Multicolor flow cytometry was performed on whole blood from a pediatric cohort of 107 patients with new-onset T1D, 85 relatives of T1D patients with 0-1 islet autoantibodies (pre-T1D_LR), 58 patients with celiac disease or autoimmune thyroiditis (CD_THY) and 76 healthy controls (HC). Unsupervised clustering of flow cytometry data, validated by a semi-automated gating strategy, confirmed previous findings showing a selective increase of naïve CD4 T cells and plasmacytoid DCs, and revealed a decrease in CD56^bright NK cells in T1D. Furthermore, a non-selective decrease of CD3+CD56+ regulatory T cells was observed in T1D. The frequency of naïve CD4 T cells at disease onset was associated with partial remission, while it was found unaltered in the pre-symptomatic stages of the disease. Thanks to a broad cohort of pediatric individuals and the implementation of unbiased approaches for the analysis of flow cytometry data, here we determined the circulating immune fingerprint of newly diagnosed pediatric T1D and provide a reference dataset to be exploited for validation or discovery purposes to unravel the pathogenesis of T1D.
Affiliation(s)
- Eugenia Bezzecchi
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
  - Center for Omics Sciences, IRCCS Ospedale San Raffaele, Milan, Italy
- Debora Carnovale
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
- Elisa Morotti
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
  - Department of Pediatrics, IRCCS Ospedale San Raffaele, Milan, Italy
- Valeria Castorani
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
- Valeria Favalli
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
  - Department of Pediatrics, IRCCS Ospedale San Raffaele, Milan, Italy
- Angela Stabilini
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
- Vittoria Insalaco
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
- Francesca Ragogna
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
- Valentina Codazzi
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
- Stefania Del Rosso
  - Laboratory Medicine, Autoimmunity Section, IRCCS Ospedale San Raffaele, Milan, Italy
- Benedetta Allegra Mazzi
  - Immuno-Hematology and Transfusion Medicine (ITMS), IRCCS Ospedale San Raffaele, Milan, Italy
- Maurizio De Pellegrin
  - Pediatric Orthopedic and Traumatology Unit, IRCCS Ospedale San Raffaele, Milan, Italy
- Andrea Giustina
  - Institute of Endocrine and Metabolic Sciences, IRCCS Ospedale San Raffaele, Milan, Italy
  - Università Vita-Salute San Raffaele, Milan, Italy
- Lorenzo Piemonti
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
  - Università Vita-Salute San Raffaele, Milan, Italy
- Emanuele Bosi
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
  - Department of General Medicine, Diabetes and Endocrinology, IRCCS Ospedale San Raffaele, Milan, Italy
  - Università Vita-Salute San Raffaele, Milan, Italy
- Manuela Battaglia
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
- Marco J. Morelli
  - Center for Omics Sciences, IRCCS Ospedale San Raffaele, Milan, Italy
- Riccardo Bonfanti
  - Diabetes Research Institute, IRCCS Ospedale San Raffaele, Milan, Italy
  - Department of Pediatrics, IRCCS Ospedale San Raffaele, Milan, Italy
  - Università Vita-Salute San Raffaele, Milan, Italy
10. Seal S, Wrobel J, Johnson AM, Nemenoff RA, Schenk EL, Bitler BG, Jordan KR, Ghosh D. On clustering for cell-phenotyping in multiplex immunohistochemistry (mIHC) and multiplexed ion beam imaging (MIBI) data. BMC Res Notes 2022; 15:215. [PMID: 35725622 PMCID: PMC9208090 DOI: 10.1186/s13104-022-06097-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/12/2022] [Accepted: 06/07/2022] [Indexed: 12/04/2022] Open
Abstract
OBJECTIVE: Multiplex immunohistochemistry (mIHC) and multiplexed ion beam imaging (MIBI) images are usually phenotyped using a manual thresholding process. The thresholding is prone to biases, especially when examining multiple images with high cellularity.
RESULTS: Unsupervised cell-phenotyping methods including PhenoGraph, flowMeans, and SamSPECTRAL, primarily used in flow cytometry data, often perform poorly or need elaborate tuning to perform well in the context of mIHC and MIBI data. We show that, instead, semi-supervised cell clustering using Random Forests and linear and quadratic discriminant analysis is superior. We test the performance of the methods on two mIHC datasets from the University of Colorado School of Medicine and a publicly available MIBI dataset. Each dataset contains a set of highly complex images.
Affiliation(s)
- Souvik Seal
  - Department of Biostatistics and Informatics, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado, USA
- Julia Wrobel
  - Department of Biostatistics and Informatics, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado, USA
- Amber M Johnson
  - Department of Medicine, School of Medicine, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado, USA
- Raphael A Nemenoff
  - Department of Medicine, School of Medicine, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado, USA
- Erin L Schenk
  - Division of Medical Oncology, School of Medicine, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado, USA
- Benjamin G Bitler
  - Department of Obstetrics and Gynecology, School of Medicine, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado, USA
- Kimberly R Jordan
  - Department of Immunology and Microbiology, School of Medicine, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado, USA
- Debashis Ghosh
  - Department of Biostatistics and Informatics, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado, USA
11. Monaghan SA, Li JL, Liu YC, Ko MY, Boyiadzis M, Chang TY, Wang YF, Lee CC, Swerdlow SH, Ko BS. A Machine Learning Approach to the Classification of Acute Leukemias and Distinction From Nonneoplastic Cytopenias Using Flow Cytometry Data. Am J Clin Pathol 2022; 157:546-553. [PMID: 34643210 DOI: 10.1093/ajcp/aqab148] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/29/2021] [Accepted: 08/01/2021] [Indexed: 11/14/2022] Open
Abstract
OBJECTIVES: Flow cytometry (FC) is critical for the diagnosis and monitoring of hematologic malignancies. Machine learning (ML) methods rapidly classify multidimensional data and should dramatically improve the efficiency of FC data analysis. We aimed to build a model to classify acute leukemias, including acute promyelocytic leukemia (APL), and distinguish them from nonneoplastic cytopenias. We also sought to illustrate a method to identify key FC parameters that contribute to the model's performance.
METHODS: Using data from 531 patients who underwent evaluation for cytopenias and/or acute leukemia, we developed an ML model to rapidly distinguish among APL, acute myeloid leukemia/not APL, acute lymphoblastic leukemia, and nonneoplastic cytopenias. Unsupervised learning using Gaussian mixture model and Fisher kernel methods was applied to FC listmode data, followed by supervised support vector machine classification.
RESULTS: High accuracy (ACC, 94.2%; area under the curve [AUC], 99.5%) was achieved based on the 37-parameter FC panel. Using only 3 parameters, however, yielded similar performance (ACC, 91.7%; AUC, 98.3%) and highlighted the significant contribution of light scatter properties.
CONCLUSIONS: Our findings underscore the potential for ML to automatically identify and prioritize FC specimens that have critical results, including APL and other acute leukemias.
Affiliation(s)
- Sara A Monaghan
  - Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
  - UPMC Presbyterian, Pittsburgh, PA, USA
- Jeng-Lin Li
  - Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan
- Yen-Chun Liu
  - Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
  - Department of Pathology, St Jude Children’s Research Hospital, Memphis, TN, USA
- Ming-Ya Ko
  - Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan
- Michael Boyiadzis
  - Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
  - UPMC Hillman Cancer Center, Pittsburgh, PA, USA
- Chi-Chun Lee
  - Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan
- Steven H Swerdlow
  - Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
  - UPMC Presbyterian, Pittsburgh, PA, USA
- Bor-Sheng Ko
  - Department of Hematological Oncology, National Taiwan University Cancer Center, Taipei, Taiwan
  - Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan

12
Zhai Y, Zhang J, Zhang T, Gong Y, Zhang Z, Zhang D, Zhao Y. AOPM: Application of Antioxidant Protein Classification Model in Predicting the Composition of Antioxidant Drugs. Front Pharmacol 2022; 12:818115. [PMID: 35115948 PMCID: PMC8803896 DOI: 10.3389/fphar.2021.818115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 12/20/2021] [Indexed: 11/18/2022]
Abstract
Antioxidant proteins not only balance oxidative stress in the body but are also an important component of antioxidant drugs. Accurate identification of antioxidant proteins is essential for fighting disease and developing new drugs. In this paper, we developed a user-friendly method, AOPM, to identify antioxidant proteins. 188D and the Composition of k-spaced Amino Acid Pairs were adopted as the feature extraction methods. In addition, the Max-Relevance-Max-Distance algorithm (MRMD) and random forest served as the feature selection method and the classifier, respectively. We used 5-fold cross-validation and an independent test dataset to evaluate our model. On the test dataset, AOPM outperformed the state-of-the-art methods: the sensitivity, specificity, accuracy, Matthews correlation coefficient and area under the curve reached 87.3%, 94.2%, 92.0%, 0.815 and 0.972, respectively. In addition, AOPM also shows excellent performance in predicting the catalytic enzymes of antioxidant drugs. This work demonstrates the feasibility of virtual drug screening based on sequence information and provides new ideas and solutions for drug development.
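The Composition of k-spaced Amino Acid Pairs named in this abstract can be illustrated with a minimal sketch. It is not the AOPM implementation: the function name `cksaap`, the default `k_max=2`, the feature-key format, and the normalisation by the number of pair windows are assumptions that follow a common convention for this descriptor, since the abstract does not state the gap range or normalisation used.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(sequence, k_max=2):
    """Frequency of residue pairs separated by k positions, for k = 0..k_max.

    Hypothetical helper for illustration: returns a dict with 400 features
    per gap size, keyed like "AC.gap1" (A and C with one residue between).
    """
    seq = sequence.upper()
    features = {}
    for k in range(k_max + 1):
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        # Number of (i, i + k + 1) position pairs that fit in the sequence.
        n_windows = len(seq) - k - 1
        for i in range(max(n_windows, 0)):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # skip pairs with non-standard residues
                counts[pair] += 1
        for pair, count in counts.items():
            # Assumed normalisation: divide by the number of windows.
            features[f"{pair}.gap{k}"] = count / n_windows if n_windows > 0 else 0.0
    return features
```

A feature dictionary of this kind would typically be turned into a fixed-order vector before feature selection and classification; the 188D descriptor, MRMD selection and random forest steps mentioned in the abstract are not shown here.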
Affiliation(s)
- Yixiao Zhai
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Jingyu Zhang
  - Department of Neurology, the Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
- Tianjiao Zhang
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yue Gong
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Zixiao Zhang
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Dandan Zhang
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital of Harbin Medical University, Harbin, China
  - *Correspondence: Dandan Zhang; Yuming Zhao
- Yuming Zhao
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
  - *Correspondence: Dandan Zhang; Yuming Zhao

13
Abstract
Flow cytometry is a laser-based technology that generates scattered and fluorescent light signals, enabling rapid analysis of the size and granularity of a particle or single cell. In addition, it offers the opportunity to phenotypically characterize and collect cells using a variety of fluorescent reagents. These reagents include but are not limited to fluorochrome-conjugated antibodies, fluorescent expressing protein-, viability-, and DNA-binding dyes. Major developments in reagents, electronics, and software within the last 30 years have greatly expanded the ability to combine up to 50 antibodies in a single tube. However, these advances also harbor technical risks and interpretation issues in the identification of certain cell populations, which are summarized in this viewpoint article. The article further provides an overview of potential applications of flow cytometry in research and of its possible uses in the clinic.

14
Béné MC, Lacombe F, Porwit A. Unsupervised flow cytometry analysis in hematological malignancies: A new paradigm. Int J Lab Hematol 2021; 43 Suppl 1:54-64. [PMID: 34288436 DOI: 10.1111/ijlh.13548] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/01/2021] [Revised: 03/13/2021] [Accepted: 03/28/2021] [Indexed: 01/10/2023]
Abstract
Ever since hematopoietic cells became "events" enumerated and characterized in suspension by cell counters or flow cytometers, researchers and engineers have strived to refine the acquisition and display of the electronic signals generated. A large array of solutions was then developed to best identify the numerous cell subsets that can be delineated, notably among hematopoietic cells. As instruments became more and more stable and robust, the focus moved to analytic software. Almost concomitantly, the capacity increased to use large panels (both with mass and classical cytometry) and to apply artificial intelligence/machine learning for their analysis. The combination of these concepts raised new analytical possibilities, opening an unprecedented field of subtle exploration for many conditions, including hematopoiesis and hematological disorders. In this review, the general concepts and progress achieved in the development of new analytical approaches for exploring high-dimensional data sets at the single-cell level will be described as they appeared over the past few years. A larger and more practical part will detail the various steps that need to be mastered, both in data acquisition and in the preanalytical check of data files. Finally, a step-by-step explanation of the solution in development to combine the Bioconductor clustering algorithm FlowSOM and the popular and widely used software Kaluza® (Beckman Coulter) will be presented. The aim of this review was to point out that the day when these advances reach routine hematology laboratories does not seem so far away.
Affiliation(s)
- Marie C Béné
  - Hematology Biology, Nantes University Hospital, Nantes, France
  - CRCINA Inserm, Nantes, France
- Francis Lacombe
  - Hematology Biology, Cytometry Department, Bordeaux University Hospital, Bordeaux, France
- Anna Porwit
  - Department of Clinical Sciences, Oncology and Pathology, Faculty of Medicine, Lund University, Lund, Sweden
  - Department of Clinical Genetics and Pathology, Skåne University Hospital, Lund, Sweden

15
Quintelier K, Couckuyt A, Emmaneel A, Aerts J, Saeys Y, Van Gassen S. Analyzing high-dimensional cytometry data using FlowSOM. Nat Protoc 2021; 16:3775-3801. [PMID: 34172973 DOI: 10.1038/s41596-021-00550-0] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 01/04/2021] [Accepted: 03/31/2021] [Indexed: 02/06/2023]
Abstract
The dimensionality of cytometry data has strongly increased in the last decade, and in many situations the traditional manual downstream analysis becomes insufficient. The field is therefore slowly moving toward more automated approaches, and in this paper we describe the protocol for analyzing high-dimensional cytometry data using FlowSOM, a clustering and visualization algorithm based on a self-organizing map. FlowSOM is used to distinguish cell populations from cytometry data in an unsupervised way and can help to gain deeper insights in fields such as immunology and oncology. Since the original FlowSOM publication (2015), we have validated the tool on a wide variety of datasets, and to write this protocol, we made use of this experience to improve the user-friendliness of the package (e.g., comprehensive functions replacing commonly required scripts). Where the original paper focused mainly on the algorithm description, this protocol offers user guidelines on how to implement the procedure, detailed parameter descriptions and troubleshooting recommendations. The protocol provides clearly annotated R code, and is therefore relevant for all scientists interested in computational high-dimensional analyses without requiring a strong bioinformatics background. We demonstrate the complete workflow, starting from data preparation (such as compensation, transformation and quality control), including detailed discussion of the different FlowSOM parameters and visualization options, and concluding with how the results can be further used to answer biological questions, such as statistical comparison between groups of interest. An average FlowSOM analysis takes 1-3 h to complete, though quality issues can increase this time considerably.
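The self-organizing map on which this abstract says FlowSOM is based can be illustrated with a minimal sketch. This is not the FlowSOM package or its API (FlowSOM is an R/Bioconductor tool): the function names, the grid size, the linear decay schedules and the random initialisation are all illustrative assumptions, and any steps FlowSOM adds on top of the map are not shown.

```python
import math
import random


def best_matching_unit(nodes, cell):
    """Return the grid position of the node whose weights are closest to `cell`."""
    return min(
        nodes,
        key=lambda pos: sum((w - x) ** 2 for w, x in zip(nodes[pos], cell)),
    )


def train_som(cells, grid_w=3, grid_h=3, n_iter=1000, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(col, row): weight vector}."""
    rng = random.Random(seed)
    sigma0 = max(grid_w, grid_h) / 2.0
    # Initialise each node at a randomly chosen cell (an illustrative choice).
    nodes = {
        (col, row): list(rng.choice(cells))
        for col in range(grid_w)
        for row in range(grid_h)
    }
    for step in range(n_iter):
        remaining = 1.0 - step / n_iter
        lr = lr0 * remaining                  # learning rate decays linearly
        sigma = max(sigma0 * remaining, 0.5)  # neighbourhood radius shrinks
        cell = rng.choice(cells)
        bmu = best_matching_unit(nodes, cell)
        for pos, weights in nodes.items():
            # Nodes close to the winner on the grid are pulled more strongly.
            grid_dist2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
            pull = lr * math.exp(-grid_dist2 / (2.0 * sigma ** 2))
            for i, x in enumerate(cell):
                weights[i] += pull * (x - weights[i])
    return nodes


def assign_clusters(nodes, cells):
    """Label every cell with the grid position of its best-matching node."""
    return [best_matching_unit(nodes, cell) for cell in cells]
```

Each cell is a list of marker values that have already been transformed; the returned labels are grid positions, so cells sharing a label fall in the same map node.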
Affiliation(s)
- Katrien Quintelier
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - Data Mining and Modeling for Biomedicine Group, VIB Center for Inflammation Research, Ghent, Belgium
  - Department of Pulmonary Medicine, Erasmus University Medical Center, Rotterdam, the Netherlands
- Artuur Couckuyt
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - Data Mining and Modeling for Biomedicine Group, VIB Center for Inflammation Research, Ghent, Belgium
- Annelies Emmaneel
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - Data Mining and Modeling for Biomedicine Group, VIB Center for Inflammation Research, Ghent, Belgium
- Joachim Aerts
  - Department of Pulmonary Medicine, Erasmus University Medical Center, Rotterdam, the Netherlands
- Yvan Saeys
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - Data Mining and Modeling for Biomedicine Group, VIB Center for Inflammation Research, Ghent, Belgium
- Sofie Van Gassen
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - Data Mining and Modeling for Biomedicine Group, VIB Center for Inflammation Research, Ghent, Belgium

16
Honrado C, Bisegna P, Swami NS, Caselli F. Single-cell microfluidic impedance cytometry: from raw signals to cell phenotypes using data analytics. Lab Chip 2021; 21:22-54. [PMID: 33331376 PMCID: PMC7909465 DOI: 10.1039/d0lc00840k] [Citation(s) in RCA: 79] [Impact Index Per Article: 26.3] [Indexed: 05/03/2023]
Abstract
The biophysical analysis of single-cells by microfluidic impedance cytometry is emerging as a label-free and high-throughput means to stratify the heterogeneity of cellular systems based on their electrophysiology. Emerging applications range from fundamental life-science and drug assessment research to point-of-care diagnostics and precision medicine. Recently, novel chip designs and data analytic strategies are laying the foundation for multiparametric cell characterization and subpopulation distinction, which are essential to understand biological function, follow disease progression and monitor cell behaviour in microsystems. In this tutorial review, we present a comparative survey of the approaches to elucidate cellular and subcellular features from impedance cytometry data, covering the related subjects of device design, data analytics (i.e., signal processing, dielectric modelling, population clustering), and phenotyping applications. We give special emphasis to the exciting recent developments of the technique (timeframe 2017-2020) and provide our perspective on future challenges and directions. Its synergistic application with microfluidic separation, sensor science and machine learning can form an essential toolkit for label-free quantification and isolation of subpopulations to stratify heterogeneous biosystems.
Affiliation(s)
- Carlos Honrado
  - Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA 22904, USA