1
Lewis JE, Cooper LAD, Jaye DL, Pozdnyakova O. Automated Deep Learning-Based Diagnosis and Molecular Characterization of Acute Myeloid Leukemia Using Flow Cytometry. Mod Pathol 2024; 37:100373. [PMID: 37925056 DOI: 10.1016/j.modpat.2023.100373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 10/23/2023] [Accepted: 10/28/2023] [Indexed: 11/06/2023]
Abstract
The current flow cytometric analysis of blood and bone marrow samples for diagnosis of acute myeloid leukemia (AML) relies heavily on manual intervention in the processing and analysis steps, introducing significant subjectivity into resulting diagnoses and necessitating highly trained personnel. Furthermore, concurrent molecular characterization via cytogenetics and targeted sequencing can take multiple days, delaying patient diagnosis and treatment. Attention-based multi-instance learning models (ABMILMs) are deep learning models that make accurate predictions and generate interpretable insights regarding the classification of a sample from individual events/cells; nonetheless, these models have yet to be applied to flow cytometry data. In this study, we developed a computational pipeline using ABMILMs for the automated diagnosis of AML cases based exclusively on flow cytometric data. Analysis of 1820 flow cytometry samples shows that this pipeline provides accurate diagnoses of acute leukemia (area under the receiver operating characteristic curve [AUROC] 0.961) and accurately differentiates AML vs B- and T-lymphoblastic leukemia (AUROC 0.965). Models for prediction of 9 cytogenetic aberrancies and 32 pathogenic variants in AML provide accurate predictions, particularly for t(15;17)(PML::RARA) (AUROC 0.929), t(8;21)(RUNX1::RUNX1T1) (AUROC 0.814), and NPM1 variants (AUROC 0.807). Finally, we demonstrate how these models generate interpretable insights into which individual flow cytometric events and markers deliver optimal diagnostic utility, providing hematopathologists with a data visualization tool for improved data interpretation, as well as novel biological associations between flow cytometric marker expression and cytogenetic/molecular variants in AML. Our study is the first to illustrate the feasibility of using deep learning-based analysis of flow cytometric data for automated AML diagnosis and molecular characterization.
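The attention-based multi-instance idea in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the commonly used attention pooling form (a softmax over per-event scores), and the function name `attention_pool`, the matrices `V` and `w`, and the synthetic data are all hypothetical stand-ins for learned parameters and real flow cytometry events.

```python
import numpy as np

def attention_pool(events, V, w):
    """Pool per-event features into one sample-level vector.

    events: (n_events, n_markers) array, one row per cell/event.
    V: (n_hidden, n_markers) projection matrix (hypothetical; learned in practice).
    w: (n_hidden,) scoring vector (hypothetical; learned in practice).
    """
    scores = np.tanh(events @ V.T) @ w      # one scalar score per event
    scores = scores - scores.max()          # stabilise the softmax numerically
    weights = np.exp(scores) / np.exp(scores).sum()
    pooled = weights @ events               # attention-weighted average of events
    return pooled, weights

rng = np.random.default_rng(0)
events = rng.normal(size=(500, 8))          # synthetic: 500 events, 8 markers
V = rng.normal(size=(16, 8))
w = rng.normal(size=16)

pooled, weights = attention_pool(events, V, w)
top_events = np.argsort(weights)[::-1][:5]  # events receiving the most attention
```

In a full model the pooled vector would feed a classifier, and the per-event weights are what could be inspected to see which events drive a prediction.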
Affiliation(s)
- Joshua E Lewis
  - Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts
- Lee A D Cooper
  - Department of Pathology, Northwestern University, Chicago, Illinois
- David L Jaye
  - Department of Pathology and Laboratory Medicine, Emory University, Atlanta, Georgia
- Olga Pozdnyakova
  - Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts
2
Lewis JE, Cooper LA, Jaye DL, Pozdnyakova O. Automated Deep Learning-Based Diagnosis and Molecular Characterization of Acute Myeloid Leukemia using Flow Cytometry. bioRxiv: The Preprint Server for Biology 2023:2023.09.18.558289. [PMID: 37808719 PMCID: PMC10557578 DOI: 10.1101/2023.09.18.558289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
Current flow cytometric analysis of blood and bone marrow samples for diagnosis of acute myeloid leukemia (AML) relies heavily on manual intervention in both the processing and analysis steps, introducing significant subjectivity into resulting diagnoses and necessitating highly trained personnel. Furthermore, concurrent molecular characterization via cytogenetics and targeted sequencing can take multiple days, delaying patient diagnosis and treatment. Attention-based multi-instance learning models (ABMILMs) are deep learning models that make accurate predictions and generate interpretable insights regarding the classification of a sample from individual events/cells; nonetheless, these models have yet to be applied to flow cytometry data. In this study, we developed a computational pipeline using ABMILMs for the automated diagnosis of AML cases based exclusively on flow cytometric data. Analysis of 1,820 flow cytometry samples shows that this pipeline provides accurate diagnoses of acute leukemia [area under the receiver operating characteristic curve (AUROC) 0.961] and accurately differentiates AML versus B- and T-lymphoblastic leukemia [AUROC 0.965]. Models for prediction of 9 cytogenetic aberrancies and 32 pathogenic variants in AML provide accurate predictions, particularly for t(15;17)(PML::RARA) [AUROC 0.929], t(8;21)(RUNX1::RUNX1T1) [AUROC 0.814], and NPM1 variants [AUROC 0.807]. Finally, we demonstrate how these models generate interpretable insights into which individual flow cytometric events and markers deliver optimal diagnostic utility, providing hematopathologists with a data visualization tool for improved data interpretation, as well as novel biological associations between flow cytometric marker expression and cytogenetic/molecular variants in AML. Our study is the first to illustrate the feasibility of using deep learning-based analysis of flow cytometric data for automated AML diagnosis and molecular characterization.
Affiliation(s)
- Joshua E. Lewis
  - Department of Pathology, Brigham and Women’s Hospital, Boston, MA, USA
- Lee A.D. Cooper
  - Department of Pathology, Northwestern University, Chicago, IL, USA
- David L. Jaye
  - Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA, USA
- Olga Pozdnyakova
  - Department of Pathology, Brigham and Women’s Hospital, Boston, MA, USA
3
Robinson JP, Ostafe R, Iyengar SN, Rajwa B, Fischer R. Flow Cytometry: The Next Revolution. Cells 2023; 12:1875. [PMID: 37508539 PMCID: PMC10378642 DOI: 10.3390/cells12141875] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 06/21/2023] [Revised: 07/06/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023] Open
Abstract
Unmasking the subtleties of the immune system requires both a comprehensive knowledge base and the ability to interrogate that system with intimate sensitivity. That task, to a considerable extent, has been handled by an iterative expansion in flow cytometry methods, both in technological capability and in accompanying advances in informatics. As the field of fluorescence-based cytomics matured, it reached a technological barrier at around 30-parameter analyses, which stalled the field until spectral flow cytometry created a fundamental transformation that will likely make analyses of 100 simultaneous parameters possible within a few years. The simultaneous advance in informatics has now become a watershed moment for the field as it competes with mature systematic approaches such as genomics and proteomics, allowing cytomics to take a seat at the multi-omics table. Moreover, recent technological advances seek to combine the speed of flow systems with detection methods beyond fluorescence alone, which will make flow-based instruments even more indispensable in any biological laboratory. This paper outlines current approaches in cell analysis and detection methods, discusses traditional and microfluidic sorting approaches as well as next-generation instruments, and provides an early look at future opportunities that are likely to arise.
Affiliation(s)
- J Paul Robinson
  - Department of Basic Medical Sciences, Purdue University, West Lafayette, IN 47907, USA
  - Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
- Raluca Ostafe
  - Molecular Evolution, Protein Engineering and Production Facility (PI4D), Purdue University, West Lafayette, IN 47907, USA
- Bartek Rajwa
  - Bindley Bioscience Center, Purdue University, West Lafayette, IN 47907, USA
- Rainer Fischer
  - Department of Comparative Pathobiology, College of Veterinary Medicine, Purdue University, West Lafayette, IN 47907, USA
  - Purdue Institute of Inflammation, Immunology and Infectious Diseases, Purdue University, West Lafayette, IN 47907, USA
4
Hu Z, Bhattacharya S, Butte AJ. Application of Machine Learning for Cytometry Data. Front Immunol 2022; 12:787574. [PMID: 35046945 PMCID: PMC8761933 DOI: 10.3389/fimmu.2021.787574] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 09/30/2021] [Accepted: 12/14/2021] [Indexed: 01/23/2023] Open
Abstract
Modern cytometry technologies present opportunities to profile the immune system at a single-cell resolution with more than 50 protein markers, and have been widely used in both research and clinical settings. The number of publicly available cytometry datasets is growing. However, the analysis of cytometry data remains a bottleneck due to its high dimensionality, large cell numbers, and heterogeneity between datasets. Machine learning techniques are well suited to analyze complex cytometry data and have been used in multiple facets of cytometry data analysis, including dimensionality reduction, cell population identification, and sample classification. Here, we review the existing machine learning applications for analyzing cytometry data and highlight the importance of publicly available cytometry data that enable researchers to develop and validate machine learning methods.
Affiliation(s)
- Zicheng Hu
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA, United States
- Sanchita Bhattacharya
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States
- Atul J. Butte
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States
5
Wong N, Kim D, Robinson Z, Huang C, Conboy IM. K-means quantization for a web-based open-source flow cytometry analysis platform. Sci Rep 2021; 11:6735. [PMID: 33762594 PMCID: PMC7991430 DOI: 10.1038/s41598-021-86015-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2020] [Accepted: 03/03/2021] [Indexed: 11/20/2022] Open
Abstract
Flow cytometry (FCM) is an analytic technique that is capable of detecting and recording the emission of fluorescence and light scattering of cells or particles (that are collectively called “events”) in a population [1]. A typical FCM experiment can produce a large array of data, making the analysis computationally intensive [2]. Current FCM data analysis platforms (FlowJo [3], etc.), while very useful, do not allow interactive data processing online due to the data size limitations. Here we report a more effective way to analyze FCM data on the web. Freecyto is a free and intuitive Python Flask-based web application that uses a weighted k-means clustering algorithm to facilitate the interactive analysis of flow cytometry data. A key limitation of web browsers is their inability to interactively display large amounts of data. Freecyto addresses this bottleneck through the use of the k-means algorithm to quantize the data, allowing the user to access a representative set of data points for interactive visualization of complex datasets. Moreover, Freecyto enables the interactive analyses of large complex datasets while preserving the standard FCM visualization features, such as the generation of scatterplots (dotplots), histograms, heatmaps, boxplots, as well as a SQL-based sub-population gating feature [2]. We also show that Freecyto can be applied to the analysis of various experimental setups that frequently require the use of FCM. Finally, we demonstrate that the data accuracy is preserved when Freecyto is compared to conventional FCM software.
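The quantization step described in this abstract can be sketched as follows. This is a generic k-means (Lloyd's algorithm) illustration, not Freecyto's actual code: the function names, the parameters `k`, `n_iter`, and `seed`, and the synthetic two-channel data are assumptions. Each centroid is returned together with the number of events it represents, which is one plausible reading of the "weighted" representative set.

```python
import numpy as np

def _assign(points, centroids):
    """Return the index of the nearest centroid for every point."""
    # Squared Euclidean distances, shape (n_points, k).
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_quantize(points, k, n_iter=20, seed=0):
    """Reduce `points` to k representative centroids plus per-centroid counts."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct randomly chosen events.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = _assign(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members):                # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    labels = _assign(points, centroids)     # final assignment against final centroids
    counts = np.bincount(labels, minlength=k)
    return centroids, counts

rng = np.random.default_rng(1)
points = rng.normal(size=(5000, 2))         # synthetic: 5000 events, 2 channels
centroids, counts = kmeans_quantize(points, k=50)
```

A browser-side plot would then draw only the 50 centroids, for example sizing or shading each by its count, instead of all 5000 events.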
Affiliation(s)
- Nathan Wong
  - Department of Bioengineering and QB3, UC Berkeley, Berkeley, CA, 94720, USA
- Daehwan Kim
  - Department of Bioengineering and QB3, UC Berkeley, Berkeley, CA, 94720, USA
- Zachery Robinson
  - Department of Bioengineering and QB3, UC Berkeley, Berkeley, CA, 94720, USA
- Connie Huang
  - Department of Bioengineering and QB3, UC Berkeley, Berkeley, CA, 94720, USA
- Irina M Conboy
  - Department of Bioengineering and QB3, UC Berkeley, Berkeley, CA, 94720, USA
6
Hunter-Schlichting D, Lane J, Cole B, Flaten Z, Barcelo H, Ramasubramanian R, Cassidy E, Faul J, Crimmins E, Pankratz N, Thyagarajan B. Validation of a hybrid approach to standardize immunophenotyping analysis in large population studies: The Health and Retirement Study. Sci Rep 2020; 10:8759. [PMID: 32472068 PMCID: PMC7260195 DOI: 10.1038/s41598-020-65016-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/21/2019] [Accepted: 04/01/2020] [Indexed: 12/11/2022] Open
Abstract
Traditional manual gating strategies are often time-intensive, place a high burden on the analyzer, and are susceptible to bias between analyzers. Several automated gating methods have been shown to exceed the performance of manual gating for a limited number of cell subsets. However, many of the automated algorithms still require significant manual intervention or have yet to demonstrate their utility in large datasets. Therefore, we developed an approach that combines a previously published automated algorithm (OpenCyto framework), implemented with a manually created hierarchical cell gating template, with custom-developed visualization software (FlowAnnotator) to rapidly and efficiently analyze immunophenotyping data in large population studies. This approach allows pre-defining populations that can be analyzed solely by automated analysis and incorporating manual refinement for smaller downstream populations. We validated this method against traditional manual gating strategies for 24 subsets of T cells, B cells, NK cells, monocytes and dendritic cells in 931 participants from the Health and Retirement Study (HRS). Our results show a high degree of correlation (r ≥ 0.80) for 18 (78%) of the 24 cell subsets. For the remaining subsets, the correlation was low (<0.80) primarily because of the low numbers of events recorded in these subsets. Mean differences in absolute counts for these cell subsets showed that the hybrid method gives results very similar to those of traditional manual gating. We describe a practical method for standardization of immunophenotyping methods in large-scale population studies that provides a rapid, accurate and reproducible alternative to labor-intensive manual gating strategies.
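The per-subset comparison reported in this abstract can be sketched as a correlation check against the r ≥ 0.80 cutoff. The abstract does not name the correlation statistic, so Pearson's r is assumed here; the function name `flag_subsets`, the subset labels, and the synthetic counts are likewise hypothetical. The simulated data only illustrate why a subset with few recorded events tends to correlate poorly: the same absolute disagreement between methods is large relative to its spread.

```python
import numpy as np

def flag_subsets(manual, hybrid, names, threshold=0.80):
    """Return {subset: (r, passes)} comparing two gating methods per subset.

    manual, hybrid: (n_participants, n_subsets) arrays of event counts.
    """
    results = {}
    for j, name in enumerate(names):
        r = np.corrcoef(manual[:, j], hybrid[:, j])[0, 1]   # Pearson correlation
        results[name] = (r, bool(r >= threshold))
    return results

rng = np.random.default_rng(0)
# Synthetic counts for 200 participants: one abundant and one rare subset.
manual = rng.poisson(lam=[2000, 5], size=(200, 2)).astype(float)
# Hypothetical method disagreement: the same absolute noise on both subsets.
hybrid = manual + rng.normal(scale=5.0, size=manual.shape)

results = flag_subsets(manual, hybrid, names=["abundant", "rare"])
for name, (r, passes) in results.items():
    print(f"{name}: r = {r:.2f}, meets threshold: {passes}")
```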
Affiliation(s)
- John Lane
  - Division of Computational Pathology, Department of Laboratory Medicine and Pathology, Minneapolis, MN, USA
- Benjamin Cole
  - Division of Computational Pathology, Department of Laboratory Medicine and Pathology, Minneapolis, MN, USA
- Zachary Flaten
  - Division of Computational Pathology, Department of Laboratory Medicine and Pathology, Minneapolis, MN, USA
- Helene Barcelo
  - Division of Computational Pathology, Department of Laboratory Medicine and Pathology, Minneapolis, MN, USA
- Ramya Ramasubramanian
  - Division of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN, USA
- Erin Cassidy
  - Division of Computational Pathology, Department of Laboratory Medicine and Pathology, Minneapolis, MN, USA
- Jessica Faul
  - Institute for Social Research, Survey Research Center, University of Michigan, Ann Arbor, MI, USA
- Eileen Crimmins
  - Davis School of Gerontology, University of Southern California Davis, Los Angeles, CA, USA
- Nathan Pankratz
  - Division of Computational Pathology, Department of Laboratory Medicine and Pathology, Minneapolis, MN, USA
- Bharat Thyagarajan
  - Division of Molecular Pathology and Genomics, Department of Laboratory Medicine and Pathology, Minneapolis, MN, USA
7
Scheuermann RH, Bui J, Wang HY, Qian Y. Automated Analysis of Clinical Flow Cytometry Data: A Chronic Lymphocytic Leukemia Illustration. Clin Lab Med 2017; 37:931-944. [PMID: 29128077 PMCID: PMC5766345 DOI: 10.1016/j.cll.2017.07.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022]
Abstract
Flow cytometry is used in cell-based diagnostic evaluation for blood-borne malignancies including leukemia and lymphoma. The current practice for cytometry data analysis relies on manual gating to identify cell subsets in complex mixtures, which is subjective, labor-intensive, and poorly reproducible. This article reviews recent efforts to develop, validate, and disseminate automated computational methods and pipelines for cytometry data analysis that could help overcome the limitations of manual analysis and provide for efficient and data-driven diagnostic applications. It demonstrates the performance of an optimized computational pipeline in a pilot study of chronic lymphocytic leukemia data from the authors' clinical diagnostic laboratory.
Affiliation(s)
- Richard H Scheuermann
  - Department of Informatics, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, CA 92037, USA
- Jack Bui
  - Department of Pathology, University of California, San Diego, Biomedical Sciences Building Room 1028, 9500 Gilman Drive, La Jolla, CA 92093-0612, USA
- Huan-You Wang
  - Department of Pathology, School of Medicine, University of California, San Diego, 3855 Health Sciences Drive, La Jolla, CA 92093-0987, USA
- Yu Qian
  - Department of Informatics, J. Craig Venter Institute, 4120 Capricorn Lane, La Jolla, CA 92037, USA