1
Palomino-Echeverria S, Huergo E, Ortega-Legarreta A, Uson Raposo EM, Aguilar F, Peña-Ramirez CDL, López-Vicario C, Alessandria C, Laleman W, Queiroz Farias A, Moreau R, Fernandez J, Arroyo V, Caraceni P, Lagani V, Sánchez-Garrido C, Clària J, Tegner J, Trebicka J, Kiani NA, Planell N, Rautou PE, Gomez-Cabrero D. A robust clustering strategy for stratification unveils unique patient subgroups in acutely decompensated cirrhosis. J Transl Med 2024; 22:599. [PMID: 38937846 PMCID: PMC11210156 DOI: 10.1186/s12967-024-05386-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 06/10/2024] [Indexed: 06/29/2024] Open
Abstract
BACKGROUND: Patient heterogeneity poses significant challenges for managing individuals and designing clinical trials, especially in complex diseases. Existing classifications rely on outcome-predicting scores, potentially overlooking crucial elements that contribute to heterogeneity without necessarily affecting prognosis.
METHODS: To address patient heterogeneity, we developed ClustALL, a computational pipeline that simultaneously addresses diverse clinical data challenges such as mixed data types, missing values, and collinearity. ClustALL enables the unsupervised identification of patient stratifications while filtering for stratifications that are robust against minor variations in the population (population-based) and against limited adjustments in the algorithm's parameters (parameter-based).
RESULTS: Applied to a European cohort of patients with acutely decompensated cirrhosis (n = 766), ClustALL identified five robust stratifications using only data at hospital admission. All stratifications included markers of impaired liver function and the number of organ dysfunctions or failures, and most included precipitating events. When focusing on one of these stratifications, patients were categorized into three clusters characterized by typical clinical features; notably, the 3-cluster stratification showed prognostic value. Re-assessment of patient stratification during follow-up delineated patients' outcomes, further improving the prognostic value of the stratification. We validated these findings in an independent prospective multicentre cohort of patients from Latin America (n = 580).
CONCLUSIONS: By applying ClustALL to patients with acutely decompensated cirrhosis, we identified three patient clusters. Following these clusters over time offers insights that could guide future clinical trial design. ClustALL is a novel and robust stratification method capable of addressing the multiple challenges of patient stratification in most complex diseases.
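The population-based robustness idea in the abstract above (a stratification should survive minor changes in the patient population) can be illustrated with a small resampling check. This is only a sketch under stated assumptions and not the ClustALL implementation: the 1-D k-means, the bootstrap resampling, the Jaccard score on co-clustered pairs, and all function names are stand-ins chosen for illustration.

```python
import random
from itertools import combinations


def kmeans_1d(values, k, iters=20):
    """Tiny deterministic 1-D k-means (illustrative stand-in clusterer)."""
    ordered = sorted(values)
    n = len(ordered)
    # Start the centres at evenly spaced quantiles so runs are repeatable.
    centres = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = [0] * n
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = sum(members) / len(members)
    return labels


def co_clustered_pairs(ids, labels):
    """Set of id pairs that share a cluster."""
    return {
        (a, b)
        for (a, la), (b, lb) in combinations(sorted(zip(ids, labels)), 2)
        if la == lb
    }


def jaccard(ids, labels_a, labels_b):
    """Jaccard index of co-clustered pairs; invariant to label renaming."""
    pairs_a = co_clustered_pairs(ids, labels_a)
    pairs_b = co_clustered_pairs(ids, labels_b)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0


def bootstrap_stability(values, k, n_boot=100, seed=0):
    """Mean agreement between the full-data clustering and bootstrap re-clusterings."""
    rng = random.Random(seed)
    n = len(values)
    reference = kmeans_1d(values, k)
    scores = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        boot_labels = kmeans_1d([values[i] for i in sample], k)
        # Every copy of a patient has the same value, hence the same label.
        label_of = dict(zip(sample, boot_labels))
        ids = sorted(label_of)
        scores.append(
            jaccard(ids, [reference[i] for i in ids], [label_of[i] for i in ids])
        )
    return sum(scores) / len(scores)


# Hypothetical one-variable "cohort" with three well-separated groups.
values = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
print(f"mean Jaccard over resamples: {bootstrap_stability(values, k=3):.2f}")
```

A score near 1 would suggest the grouping changes little when the population is perturbed; a parameter-based check could follow the same pattern by varying `k` or other settings instead of the sample.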
Affiliation(s)
- Estefania Huergo
  - Unit of Translational Bioinformatics, Navarrabiomed - Fundación Miguel Servet, Pamplona, Spain
- Asier Ortega-Legarreta
  - Unit of Translational Bioinformatics, Navarrabiomed - Fundación Miguel Servet, Pamplona, Spain
- Eva M Uson Raposo
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
- Ferran Aguilar
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
- Cristina López-Vicario
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
  - Biochemistry and Molecular Genetics Service, Hospital Clínic-IDIBAPS, Barcelona, Spain
- Carlo Alessandria
  - Division of Gastroenterology and Hepatology, A.O.U. Città della Salute e della Scienza di Torino, Torino, Italy
- Wim Laleman
  - Department of Gastroenterology & Hepatology, Section of Liver & Biliopancreatic disorders and Liver Transplantation, University Hospitals Leuven, KU LEUVEN, Leuven, Belgium
- Alberto Queiroz Farias
  - Department of Gastroenterology, Hospital das Clínicas, University of São Paulo School of Medicine, Paulo School, Brazil
- Richard Moreau
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
  - Université Paris-Cité, Inserm, Centre de recherche sur l'inflammation, UMR 1149, Paris, France
  - Assistance Publique-Hôpitaux de Paris (AP-HP), Paris, France
  - Hôpital Beaujon, Service d'Hépatologie, Clichy, France
- Javier Fernandez
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
- Vicente Arroyo
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
- Paolo Caraceni
  - Department of Medical and Surgical Science, University of Bologna, Bologna, Italy
  - IRCCS Azienda Ospedaliera-Universitaria di Bologna, Bologna, Italy
- Vincenzo Lagani
  - Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, Thuwal, Saudi Arabia
  - Institute of Chemical Biology, Ilia State University, Tbilisi, 0162, Georgia
- Joan Clària
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
  - Biochemistry and Molecular Genetics Service, Hospital Clínic-IDIBAPS, Barcelona, Spain
  - CIBERehd, Barcelona, Spain
  - Department of Biomedical Sciences, University of Barcelona, Barcelona, Spain
- Jesper Tegner
  - Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, Thuwal, Saudi Arabia
  - Unit of Computational Medicine, Department of Medicine, Center for Molecular Medicine, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
  - Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Jonel Trebicka
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
  - Department of internal medicine B, University of Münster, Münster, Germany
- Narsis A Kiani
  - Algorithmic Dynamics Lab, Center for Molecular Medicine, Karolinska Institutet, Solna, Sweden
  - Department of Oncology-Pathology, Karolinska Institutet, Solna, Sweden
- Nuria Planell
  - Unit of Translational Bioinformatics, Navarrabiomed - Fundación Miguel Servet, Pamplona, Spain
  - Computational Biology Program, Universidad de Navarra, CIMA, Instituto de Investigación Sanitaria de Navarra (IdiSNA), Navarra, 31008, Spain
- Pierre-Emmanuel Rautou
  - Université Paris-Cité, Inserm, Centre de recherche sur l'inflammation, UMR 1149, Paris, France
  - AP-HP, Hôpital Beaujon, Service d'Hépatologie, DMU DIGEST, Centre de Référence des Maladies Vasculaires du Foie, FILFOIE, ERN RARE-LIVER, Clichy, France
- David Gomez-Cabrero
  - Unit of Translational Bioinformatics, Navarrabiomed - Fundación Miguel Servet, Pamplona, Spain
  - Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
2
He T, Belouali A, Patricoski J, Lehmann H, Ball R, Anagnostou V, Kreimeyer K, Botsis T. Trends and opportunities in computable clinical phenotyping: A scoping review. J Biomed Inform 2023; 140:104335. [PMID: 36933631 DOI: 10.1016/j.jbi.2023.104335] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/09/2022] [Revised: 03/07/2023] [Accepted: 03/09/2023] [Indexed: 03/18/2023]
Abstract
Identifying patient cohorts meeting the criteria of specific phenotypes is essential in biomedicine and particularly timely in precision medicine. Many research groups deliver pipelines that automatically retrieve and analyze data elements from one or more sources to automate this task and deliver high-performing computable phenotypes. We applied a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines to conduct a thorough scoping review on computable clinical phenotyping. Five databases were searched using a query that combined the concepts of automation, clinical context, and phenotyping. Subsequently, four reviewers screened 7960 records (after removing over 4000 duplicates) and selected 139 that satisfied the inclusion criteria. This dataset was analyzed to extract information on target use cases, data-related topics, phenotyping methodologies, evaluation strategies, and portability of developed solutions. Most studies supported patient cohort selection without discussing the application to specific use cases, such as precision medicine. Electronic Health Records were the primary source in 87.1% (N = 121) of all studies, and International Classification of Diseases codes were heavily used in 55.4% (N = 77) of all studies; however, only 25.9% (N = 36) of the records described compliance with a common data model. In terms of the presented methods, traditional Machine Learning (ML) was the dominant method, often combined with natural language processing and other approaches, while external validation and portability of computable phenotypes were pursued in many cases. These findings revealed that defining target use cases precisely, moving away from sole ML strategies, and evaluating the proposed solutions in the real setting are essential opportunities for future work. There is also momentum and an emerging need for computable phenotyping to support clinical and epidemiological research and precision medicine.
Affiliation(s)
- Ting He
  - Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Anas Belouali
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Jessica Patricoski
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Harold Lehmann
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Robert Ball
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US FDA, Silver Spring, MD, USA
- Valsamo Anagnostou
  - Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kory Kreimeyer
  - Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Taxiarchis Botsis
  - Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
3
Chen J, Guo C, Lu M, Ding S. Unifying Diagnosis Identification and Prediction Method Embedding the Disease Ontology Structure From Electronic Medical Records. Front Public Health 2022; 9:793801. [PMID: 35127624 PMCID: PMC8811031 DOI: 10.3389/fpubh.2021.793801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Accepted: 12/21/2021] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE: The reasonable classification of a large number of distinct diagnosis codes can clarify patient diagnostic information and help clinicians improve their ability to assign and target treatment for primary diseases. Our objective is to identify and predict a unifying diagnosis (UD) from electronic medical records (EMRs).
METHODS: We screened 4,418 sepsis patients from the public MIMIC-III database. We extracted their diagnostic information for UD identification, and their demographic information, laboratory examination information, chief complaint, and history of present illness information for UD prediction. We proposed a data-driven UD identification and prediction method (UDIPM) embedding the disease ontology structure. First, we designed a set similarity measure embedding the disease ontology structure to generate a patient similarity matrix. Second, we applied affinity propagation clustering to divide patients into different clusters, and extracted a typical diagnosis code co-occurrence pattern from each cluster. Furthermore, we identified a UD by fusing visual analysis and a conditional co-occurrence matrix. Finally, we trained five classifiers in combination with feature fusion and feature selection methods to predict the UD.
RESULTS: The experimental results on a public electronic medical record dataset showed that the UDIPM could effectively extract typical diagnosis code co-occurrence patterns, identify and predict a UD based on patients' diagnostic and admission information, and outperform other fusion methods overall.
CONCLUSIONS: The accurate identification and prediction of the UD from a large number of distinct diagnosis codes and multi-source heterogeneous patient admission information in EMRs can provide a data-driven approach to assist better coding integration of diagnosis.
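The first step described in the METHODS above, a hierarchy-aware set similarity that yields a patient similarity matrix, can be illustrated with a toy example. This is a sketch under loud assumptions and not the UDIPM measure: the shared-prefix score is a hypothetical proxy for ontology structure, the average best-match set similarity is one common choice among many, and the codes and function names are invented for illustration.

```python
def code_similarity(a, b):
    """Share of leading characters two codes have in common (hierarchy proxy)."""
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def set_similarity(codes_a, codes_b):
    """Symmetric average best-match similarity between two code sets."""
    if not codes_a or not codes_b:
        return 0.0
    best_a = sum(max(code_similarity(a, b) for b in codes_b) for a in codes_a)
    best_b = sum(max(code_similarity(a, b) for a in codes_a) for b in codes_b)
    return (best_a + best_b) / (len(codes_a) + len(codes_b))


def similarity_matrix(patients):
    """Pairwise patient similarity from each patient's set of diagnosis codes."""
    return [[set_similarity(p, q) for q in patients] for p in patients]


# Hypothetical toy codes: a shared prefix stands in for a shared ontology ancestor.
patients = [
    ["A01", "B10"],
    ["A02"],
    ["C30", "C31"],
]
for row in similarity_matrix(patients):
    print([round(value, 2) for value in row])
```

A matrix like this could then be handed to a clustering algorithm that accepts precomputed similarities, such as the affinity propagation step named in the abstract.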
Affiliation(s)
- Jingfeng Chen
  - Health Management Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Chonghui Guo
  - School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Menglin Lu
  - School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Suying Ding
  - Health Management Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China