1
Maurits MP, Korsunsky I, Raychaudhuri S, Murphy SN, Smoller JW, Weiss ST, Petukhova LM, Weng C, Wei WQ, Huizinga TWJ, Reinders MJT, Karlson EW, van den Akker EB, Knevel R. A framework for employing longitudinally collected multicenter electronic health records to stratify heterogeneous patient populations on disease history. J Am Med Inform Assoc 2022; 29:761-769. [PMID: 35139533 PMCID: PMC9122640 DOI: 10.1093/jamia/ocac008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/28/2021] [Revised: 11/24/2021] [Accepted: 01/27/2022] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE To facilitate patient disease subset and risk factor identification by constructing a pipeline which is generalizable, provides easily interpretable results, and allows replication by overcoming electronic health records (EHRs) batch effects. MATERIAL AND METHODS We used 1872 billing codes in EHRs of 102 880 patients from 12 healthcare systems. Using tools borrowed from single-cell omics, we mitigated center-specific batch effects and performed clustering to identify patients with highly similar medical history patterns across the various centers. Our visualization method (PheSpec) depicts the phenotypic profile of clusters, applies a novel filtering of noninformative codes (Ranked Scope Pervasion), and indicates the most distinguishing features. RESULTS We observed 114 clinically meaningful profiles, for example, linking prostate hyperplasia with cancer and diabetes with cardiovascular problems and grouping pediatric developmental disorders. Our framework identified disease subsets, exemplified by 6 "other headache" clusters, where phenotypic profiles suggested different underlying mechanisms: migraine, convulsion, injury, eye problems, joint pain, and pituitary gland disorders. Phenotypic patterns replicated well, with high correlations of ≥0.75 to an average of 6 (2-8) of the 12 different cohorts, demonstrating the consistency with which our method discovers disease history profiles. DISCUSSION Costly clinical research ventures should be based on solid hypotheses. We repurpose methods from single-cell omics to build these hypotheses from observational EHR data, distilling useful information from complex data. CONCLUSION We establish a generalizable pipeline for the identification and replication of clinically meaningful (sub)phenotypes from widely available high-dimensional billing codes. This approach overcomes datatype problems and produces comprehensive visualizations of validation-ready phenotypes.
Affiliation(s)
- Marc P Maurits
  - Department of Rheumatology, Leiden University Medical Center, Leiden, The Netherlands
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Ilya Korsunsky
  - Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Soumya Raychaudhuri
  - Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Shawn N Murphy
  - Research Information Science and Computing, Mass General Brigham, Boston, MA, USA
- Jordan W Smoller
  - Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
  - Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Scott T Weiss
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Lynn M Petukhova
  - Department of Dermatology, NewYork-Presbyterian/Columbia University Medical Center (CUMC)
- Chunhua Weng
  - Biomedical Informatics, Columbia University
- Wei-Qi Wei
  - Biomedical Informatics, School of Medicine, Vanderbilt University
- Thomas W J Huizinga
  - Department of Rheumatology, Leiden University Medical Center, Leiden, The Netherlands
- Marcel J T Reinders
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
  - The Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Elizabeth W Karlson
  - Division of Rheumatology, Inflammation and Immunity, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Erik B van den Akker
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
  - Section of Molecular Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
- Rachel Knevel
  - Department of Rheumatology, Leiden University Medical Center, Leiden, The Netherlands
  - Division of Rheumatology, Inflammation and Immunity, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
2
Bae J, Kim JE, Perumalsamy H, Park S, Kim Y, Jun DW, Yoon TH. Mass Cytometry Study on Hepatic Fibrosis and Its Drug-Induced Recovery Using Mouse Peripheral Blood Mononuclear Cells. Front Immunol 2022; 13:814030. [PMID: 35222390 PMCID: PMC8863676 DOI: 10.3389/fimmu.2022.814030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Accepted: 01/03/2022] [Indexed: 01/10/2023] Open
Abstract
The number of patients with liver diseases has increased significantly with the progress of global industrialization. Hepatic fibrosis, one of the most common liver diseases diagnosed in many developed countries, occurs in response to chronic liver injury and is primarily driven by the development of inflammation. Earlier immunological studies have been focused on the importance of the innate immune response in the pathophysiology of steatohepatitis and fibrosis, but recently, it has also been reported that adaptive immunity, particularly B cells, plays an essential role in hepatic inflammation and fibrosis. However, despite recent data showing the importance of adaptive immunity, relatively little is known about the role of B cells in the pathogenesis of steatohepatitis fibrosis. In this study, a single-cell-based, high-dimensional mass cytometric investigation of the peripheral blood mononuclear cells collected from mice belonging to three groups [normal chow (NC), thioacetamide (TAA), and 11beta-HSD inhibitor drug] was conducted to further understand the pathogenesis of liver fibrosis through reliable noninvasive biomarkers. Firstly, major immune cell types and their population changes were qualitatively analyzed using UMAP dimensionality reduction and two-dimensional visualization technique combined with a conventional manual gating strategy. The population of B cells displayed a twofold increase in the TAA group compared to that in the NC group, which was recovered slightly after treatment with the 11beta-HSD inhibitor drug. In contrast, the populations of NK cells, effector CD4+ T cells, and memory CD8+ T cells were significantly reduced in the TAA group compared with those in the NC group. Further identification and quantification of the major immune cell types and their subsets were conducted based on automated clustering approaches [PhenoGraph (PG) and FlowSOM]. 
The B-cell subsets corresponding to PhenoGraph clusters PG#2 (CD62L-high, CD44-high, Ly6c-high B cells) and PG#3 (CD62L-high, CD44-high, Ly6c-low B cells) appear to play a major role in both the development of hepatic fibrosis and recovery via treatment, whereas PG#1 (CD62L-low, CD44-high, Ly6c-low B cells) seems to play a dominant role in the development of hepatic fibrosis. These findings provide insights into the roles of cellular subsets of B cells during the progression of, and recovery from, hepatic fibrosis.
Affiliation(s)
- Jiwon Bae
  - Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul, South Korea
- Ji Eun Kim
  - Department of Internal Medicine, Hanyang University Hospital, Seoul, South Korea
- Haribalan Perumalsamy
  - Research Institute for Convergence of Basic Science, Hanyang University, Seoul, South Korea
- Sehee Park
  - Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul, South Korea
- Yun Kim
  - Hanyang Medicine-Engineering-Bio Collaborative & Comprehensive Center for Drug Development, Hanyang University, Seoul, South Korea
  - Department of Clinical Pharmacology and Therapeutics, Hanyang University Hospital, Seoul, South Korea
- Dae Won Jun
  - Department of Internal Medicine, Hanyang University Hospital, Seoul, South Korea
  - Hanyang Medicine-Engineering-Bio Collaborative & Comprehensive Center for Drug Development, Hanyang University, Seoul, South Korea
  - Department of Medical and Digital Engineering, College of Engineering, Hanyang University, Seoul, South Korea
- Tae Hyun Yoon
  - Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul, South Korea
  - Research Institute for Convergence of Basic Science, Hanyang University, Seoul, South Korea
  - Institute of Next Generation Material Design, Hanyang University, Seoul, South Korea
  - Yoon Idea Lab. Co. Ltd, Seoul, South Korea
3
Yang ZK, Pan L, Zhang Y, Luo H, Gao F. Data-driven identification of SARS-CoV-2 subpopulations using PhenoGraph and binary-coded genomic data. Brief Bioinform 2021; 22:bbab307. [PMID: 34382087 PMCID: PMC8385964 DOI: 10.1093/bib/bbab307] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/16/2021] [Revised: 07/01/2021] [Accepted: 07/17/2021] [Indexed: 01/08/2023] Open
Abstract
For epidemic prevention and control, the identification of SARS-CoV-2 subpopulations sharing similar micro-epidemiological patterns and evolutionary histories is necessary for a more targeted investigation into the links among COVID-19 outbreaks caused by SARS-CoV-2 with similar genetic backgrounds. Genomic sequencing analysis has demonstrated the ability to uncover viral genetic diversity. However, an objective analysis is necessary for the identification of SARS-CoV-2 subpopulations. Herein, we detected all the mutations in 186 682 SARS-CoV-2 isolates. We found that the GC content of the SARS-CoV-2 genome had evolved to be lower, which may be conducive to viral spread, and that frameshift mutations were rare in the global population. Next, we encoded the genomic mutations in binary form and used an unsupervised learning classifier, namely PhenoGraph, to classify this information. Consequently, PhenoGraph successfully identified 303 SARS-CoV-2 subpopulations, and we found that the PhenoGraph classification was consistent with, but more detailed and precise than, the known GISAID clades (S, L, V, G, GH, GR, GV and O). Through trend analysis, we found that the growth rate of SARS-CoV-2 diversity has slowed down significantly. We also analyzed the temporal, spatial and phylogenetic relationships among the subpopulations and revealed the evolutionary trajectory of SARS-CoV-2 to a certain extent. Hence, our results provide a better understanding of the patterns and trends in the genomic evolution and epidemiology of SARS-CoV-2.
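As an illustration of what "encoding the genomic mutations in binary form" could look like, here is a minimal Python sketch. The abstract does not specify the encoding, so the presence/absence layout, the function name `encode_binary`, and the mutation labels are all assumptions made for illustration, not the authors' pipeline.

```python
def encode_binary(isolate_mutations):
    """Build a presence/absence matrix: one row per isolate, one column per mutation.

    `isolate_mutations` maps an isolate name to the set of mutation labels
    detected in it. This layout is an assumed encoding, not the one used in
    the cited study.
    """
    # Fix a stable column order over every mutation seen in any isolate.
    labels = sorted({m for muts in isolate_mutations.values() for m in muts})
    column = {m: i for i, m in enumerate(labels)}

    matrix = {}
    for isolate, muts in isolate_mutations.items():
        row = [0] * len(labels)
        for m in muts:
            row[column[m]] = 1  # 1 = mutation present in this isolate
        matrix[isolate] = row
    return labels, matrix


# Made-up toy input; the labels are placeholders, not real variant calls.
toy = {
    "isolate_A": {"A100G", "C200T"},
    "isolate_B": {"A100G"},
    "isolate_C": set(),
}
labels, matrix = encode_binary(toy)
for name, row in matrix.items():
    print(name, row)
```

Rows of such a matrix could then be handed to a graph-based clustering tool such as PhenoGraph; that step is not shown here.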
Affiliation(s)
- Zhi-Kai Yang
  - Fifth Affiliated Hospital of Guangzhou Medical University, Guangzhou 510700, China
- Lingyu Pan
  - Guangzhou Nanxin Pharmaceutical Co., Ltd., Guangzhou 510700, China
- Yanming Zhang
  - SinoGenoMax Co., Ltd./Chinese National Human Genome Center, Guangzhou 510700, China
- Hao Luo
  - Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
- Feng Gao
  - Department of Physics, School of Science, and the Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
4
Yang B, Davis JM, Gomez TH, Younes M, Zhao X, Shen Q, Wang R, Ko TC, Cao Y. Characteristic pancreatic and splenic immune cell infiltration patterns in mouse acute pancreatitis. Cell Biosci 2021; 11:28. [PMID: 33531047 PMCID: PMC7852096 DOI: 10.1186/s13578-021-00544-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/17/2020] [Accepted: 01/21/2021] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND A systemic evaluation of immune cell infiltration patterns in experimental acute pancreatitis (AP) is lacking. Using multi-dimensional flow cytometry, this study profiled infiltrating immune cell types in multiple AP mouse models. METHODS Three AP models were generated in C57BL/6 mice via cerulein (CAE) injection, alcohol and palmitoleic acid (EtOH + POA) injection, and alcohol diet feeding and cerulein (EtOH + CAE) injection. Primary pancreatic cells and splenocytes were prepared, and multi-dimensional flow cytometry was performed and analyzed by manual gating and computerized PhenoGraph, followed by visualization with t-distributed stochastic neighbor embedding (t-SNE). RESULTS CAE treatment induced a time-dependent increase of major innate immune cells and a decrease of follicular B cells and TCD4+ cells and their subtypes in the pancreas, whereas it elicited a reversed pattern in the spleen. EtOH + POA treatment resulted in weaker effects than CAE treatment. EtOH feeding enhanced CAE-induced amylase secretion, but unexpectedly attenuated CAE-induced immune cell regulation. In comparison with manual gating analysis, computerized analysis demonstrated remarkable time efficiency and reproducibility for the innate immune cells and B cells. CONCLUSIONS The reverse pattern of increased innate and decreased adaptive immune cells was consistent in the pancreas in CAE and EtOH + POA treatments. Alcohol feeding opposed the CAE effect on immune cell regulation. Together, the immune profiling approach utilized in this study provides a better understanding of overall immune responses in AP, which may facilitate the identification of intervention windows and new therapeutic strategies. Computerized analysis is superior to manual gating by dramatically reducing analysis time.
Affiliation(s)
- Baibing Yang
  - Department of Surgery, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Joy M Davis
  - Department of Surgery, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Thomas H Gomez
  - Center of Laboratory Animal Medicine and Care, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Mamoun Younes
  - Department of Pathology & Laboratory Medicine, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
  - Department of Pathology, George Washington University School of Medicine and Health Sciences, George Washington University Hospital, Washington, DC, 20037, USA
- Xiurong Zhao
  - Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Qiang Shen
  - Department of Genetics, Louisiana State University Health Sciences Center, New Orleans, LA, 70112, USA
- Run Wang
  - Department of Surgery, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Tien C Ko
  - Department of Surgery, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Yanna Cao
  - Department of Surgery, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
5
Bandyopadhyay S, Fowles JS, Yu L, Fisher DAC, Oh ST. Identification of functionally primitive and immunophenotypically distinct subpopulations in secondary acute myeloid leukemia by mass cytometry. Cytometry B Clin Cytom 2018; 96:46-56. [PMID: 30426661 DOI: 10.1002/cyto.b.21743] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 02/22/2018] [Revised: 09/09/2018] [Accepted: 09/18/2018] [Indexed: 01/01/2023]
Abstract
BACKGROUND Mass cytometry (CyTOF) is a powerful tool for analyzing cellular networks at the single cell level. Due to the high-dimensional nature of this approach, analysis algorithms have been developed to visualize and interpret mass cytometry data. In this study, we applied these approaches to a cohort of patients with secondary acute myeloid leukemia (sAML). METHODS We utilized mass cytometry to interrogate localization and intensity of thrombopoietin-mediated intracellular signaling in sAML. Extracellular and intracellular phenotypes were dissected using SPADE, viSNE, and PhenoGraph. RESULTS Healthy controls exhibited highly localized signaling responses largely restricted to the hematopoietic stem/progenitor cell (HSPC) compartment. In contrast, sAML samples contained subpopulations outside the HSPC compartment exhibiting thrombopoietin (TPO) sensitivity comparable to or greater than immunophenotypically defined HSPCs. We employed unsupervised clustering by PhenoGraph to elucidate distinct subpopulations within these heterogeneous samples. One metacluster composed almost exclusively of Lin- CD61+ CD34- CD38- CD45low cells was identified. This subpopulation was not readily identified by established manual gating approaches, and generally exhibited greater STAT phosphorylation in response to TPO stimulation than did Lin- CD61- CD34+ CD38- cells. Lin- CD61+ CD34- CD38- CD45low cells were identified in three additional sAML patients analyzed independently using a manual gating approach based upon PhenoGraph results. Each patient exhibited a similar TPO hypersensitivity to the PhenoGraph metacluster. CONCLUSIONS The identification of this cellular subpopulation highlights the limitations of manual gating in sAML. Our study demonstrates the potential for mass cytometry to elucidate rare subpopulations in highly heterogeneous tumors by utilizing unsupervised high dimensional analysis. © 2018 International Clinical Cytometry Society.
Affiliation(s)
- Shovik Bandyopadhyay
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
  - Division of Hematology, Washington University School of Medicine, St Louis, Missouri
- Jared S Fowles
  - Division of Hematology, Washington University School of Medicine, St Louis, Missouri
- Liyang Yu
  - Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, St. Louis, Missouri
- Daniel A C Fisher
  - Division of Hematology, Washington University School of Medicine, St Louis, Missouri
- Stephen T Oh
  - Division of Hematology, Washington University School of Medicine, St Louis, Missouri
  - Center for Human Immunology and Immunotherapy Programs, Washington University School of Medicine, St. Louis, Missouri
6
DiGiuseppe JA, Cardinali JL, Rezuke WN, Pe’er D. PhenoGraph and viSNE facilitate the identification of abnormal T-cell populations in routine clinical flow cytometric data. Cytometry B Clin Cytom 2018; 94:588-601. [PMID: 28865188 PMCID: PMC5834343 DOI: 10.1002/cyto.b.21588] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/20/2017] [Revised: 07/23/2017] [Accepted: 08/29/2017] [Indexed: 01/22/2023]
Abstract
BACKGROUND Flow cytometric identification of neoplastic T-cell populations is complicated by the wide range of phenotypic abnormalities in T-cell neoplasia, and the diverse repertoire of reactive T-cell phenotypes. We evaluated whether a recently described clustering algorithm, PhenoGraph, and dimensionality-reduction algorithm, viSNE, might facilitate the identification of abnormal T-cell populations in routine clinical flow cytometric data. METHODS We applied PhenoGraph and viSNE to peripheral blood mononuclear cells labeled with a single 8-color T/NK-cell antibody combination. Individual peripheral blood samples containing either a T-cell neoplasm or reactive lymphocytosis were analyzed together with a cohort of 10 normal samples, which established the location and identity of normal mononuclear-cell subsets in viSNE displays. RESULTS PhenoGraph-derived subpopulations from the normal samples formed regions of phenotypic similarity in the viSNE display describing normal mononuclear-cell subsets, which correlated with those obtained by manual gating (r² = 0.99, P < 0.0001). In 24 of 24 cases of T-cell neoplasia with an aberrant phenotype, compared with 4 of 17 cases of reactive lymphocytosis (P = 1.4 × 10⁻⁷, Fisher exact test), PhenoGraph-derived subpopulations originating exclusively from the abnormal sample formed one or more distinct phenotypic regions in the viSNE display, which represented the neoplastic T cells, and reactive T-cell subpopulations not present in the normal cohort, respectively. The numbers of neoplastic T cells identified using PhenoGraph/viSNE correlated with those obtained by manual gating (r² = 0.99; P < 0.0001). CONCLUSIONS PhenoGraph and viSNE may facilitate the identification of abnormal T-cell populations in routine clinical flow cytometric data. © 2017 Clinical Cytometry Society.
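The Fisher exact P value quoted above can be reproduced from the reported counts. The sketch below assumes a 2×2 table of 24 positive and 0 negative neoplasia cases against 4 positive and 13 negative reactive cases, and a two-sided test; both the table layout and the sidedness are assumptions, since the abstract states only the counts and the test name. The helper `fisher_exact_two_sided` is written here for illustration using only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Assumed layout: rows = neoplasia / reactive, columns = distinct region yes / no.
p_value = fisher_exact_two_sided(24, 0, 4, 13)
print(f"P = {p_value:.2e}")
```

Under these assumptions the result is about 1.35 × 10⁻⁷, which rounds to the 1.4 × 10⁻⁷ given in the abstract.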
Affiliation(s)
- Joseph A. DiGiuseppe
  - Department of Pathology & Laboratory Medicine, Hartford Hospital, Hartford, Connecticut
- Jolene L. Cardinali
  - Department of Pathology & Laboratory Medicine, Hartford Hospital, Hartford, Connecticut
- William N. Rezuke
  - Department of Pathology & Laboratory Medicine, Hartford Hospital, Hartford, Connecticut
- Dana Pe’er
  - Program in Computational and Systems Biology, Sloan Kettering Institute, New York, New York

Correspondence to: Joseph A. DiGiuseppe, Department of Pathology & Laboratory Medicine, Hartford Hospital, 80 Seymour St, Hartford, CT 06102-5037, USA or Dana Pe’er, Program in Computational and Systems Biology, Sloan Kettering Institute, 417 East 68th Street, New York, NY 10065, USA.