1
Chikamochi T, Ishiguro C, Mimura W, Maeda M, Murata F, Fukuda H. Validation Study of the Claims-Based Algorithm Using the International Classification of Diseases Codes to Identify Patients With Coronavirus Disease in Japan From 2020 to 2022: The VENUS Study. Pharmacoepidemiol Drug Saf 2024; 33:e70032. [PMID: 39449609 DOI: 10.1002/pds.70032] [Received: 06/30/2024] [Revised: 08/28/2024] [Accepted: 09/20/2024] [Indexed: 10/26/2024]
Abstract
PURPOSE We validated claims-based algorithms using International Classification of Diseases, Tenth Revision (ICD-10) codes to identify patients with first-ever onset of coronavirus disease (COVID-19) between May 2020 and August 2022. METHODS The study cohort comprised residents of one municipality enrolled in a public insurance program. This study used data provided by the municipality, including residents' insurer-based medical claims data linked to the Health Center Real-time Information-Sharing System (HER-SYS). The HER-SYS data included positive results from COVID-19 tests and were used as the reference standard. Claims-based algorithms #1 and #2 used the codes U07.1 and B34.2, with and without suspected diagnoses, respectively. Claims-based algorithms #3 and #4 used U07.1 alone, with and without suspected diagnoses, respectively. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated for each algorithm. RESULTS The study cohort included 165 038 residents, of whom 13 402 were COVID-19 cases according to the reference standard. For the entire period, the sensitivity, specificity, PPV, and NPV were 55.7% (95% confidence interval: 54.8%-56.5%), 65.4% (65.2%-65.6%), 11.5% (11.3%-11.8%), and 98.9% (98.8%-99.0%) for Algorithm #1; 67.0% (66.2%-67.8%), 88.1% (87.9%-88.3%), 31.6% (31.1%-32.2%), and 97.8% (97.7%-97.8%) for Algorithm #2; 52.9% (52.0%-53.7%), 67.1% (66.9%-67.3%), 11.5% (11.2%-11.8%), and 98.3% (98.3%-98.4%) for Algorithm #3; and 62.6% (61.8%-63.4%), 88.5% (88.3%-88.7%), 30.9% (30.3%-31.4%), and 97.3% (97.2%-97.4%) for Algorithm #4. CONCLUSIONS Our study showed that the validity of claims-based algorithms consisting of COVID-19-related ICD-10 codes to identify patients with first-onset COVID-19 is limited.
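The four accuracy measures reported in this abstract all derive from a 2×2 table of algorithm result against reference standard. The Python sketch below shows one way to compute them. The counts are hypothetical and are not the study's data. The Wilson score interval is also an assumption, since the abstract does not state which interval method the authors used.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),  # reference-positive cases flagged
        "specificity": tn / (tn + fp),  # reference-negative cases not flagged
        "ppv": tp / (tp + fp),          # flagged cases that are truly positive
        "npv": tn / (tn + fn),          # unflagged cases that are truly negative
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=20, fn=20, tn=880)
sens_low, sens_high = wilson_interval(80, 80 + 20)
```

Because PPV depends on how common the condition is in the cohort, an algorithm with moderate sensitivity and specificity can still show a low PPV when reference-standard cases are a small share of the population.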
Affiliation(s)
- Taku Chikamochi
  - Section of Clinical Epidemiology, Department of Data Science, Center for Clinical Sciences, National Center for Global Health and Medicine, Tokyo, Japan
- Chieko Ishiguro
  - Section of Clinical Epidemiology, Department of Data Science, Center for Clinical Sciences, National Center for Global Health and Medicine, Tokyo, Japan
- Wataru Mimura
  - Section of Clinical Epidemiology, Department of Data Science, Center for Clinical Sciences, National Center for Global Health and Medicine, Tokyo, Japan
- Megumi Maeda
  - Department of Health Care Administration and Management, Kyushu University Graduate School of Medical Sciences, Fukuoka, Japan
- Fumiko Murata
  - Department of Health Care Administration and Management, Kyushu University Graduate School of Medical Sciences, Fukuoka, Japan
- Haruhisa Fukuda
  - Department of Health Care Administration and Management, Kyushu University Graduate School of Medical Sciences, Fukuoka, Japan
2
Fadilah A, Putri VYS, Puling IMDR, Willyanto SE. Assessing the precision of machine learning for diagnosing pulmonary arterial hypertension: a systematic review and meta-analysis of diagnostic accuracy studies. Front Cardiovasc Med 2024; 11:1422327. [PMID: 39257851 PMCID: PMC11385608 DOI: 10.3389/fcvm.2024.1422327] [Received: 04/23/2024] [Accepted: 07/30/2024] [Indexed: 09/12/2024]
Abstract
Introduction Pulmonary arterial hypertension (PAH) is a severe cardiovascular condition characterized by pulmonary vascular remodeling, increased resistance to blood flow, and eventual right heart failure. Right heart catheterization (RHC) is the gold standard diagnostic technique, but because it is invasive, it poses risks such as vessel and valve injury. In recent years, non-invasive diagnostic methods combined with machine learning (ML) have been proposed to improve the diagnosis of PAH. Objectives The study aimed to evaluate the diagnostic performance of various methods, such as electrocardiography (ECG), echocardiography, blood biomarkers, microRNA, chest x-ray, clinical codes, computed tomography (CT) scan, and magnetic resonance imaging (MRI), combined with ML in diagnosing PAH. Methods The outcomes of interest included sensitivity, specificity, area under the curve (AUC), positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR). This study employed the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool for quality appraisal and STATA V.12.0 for the meta-analysis. Results A comprehensive search across six databases yielded 26 articles for examination. Twelve articles were categorized as low risk, nine as moderate risk, and five as high risk. The overall pooled diagnostic performance was a sensitivity of 0.81 (95% CI = 0.76-0.85, p < 0.001), a specificity of 0.84 (95% CI = 0.77-0.88, p < 0.001), and an AUC of 0.89 (95% CI = 0.85-0.91). In the subgroup analysis, echocardiography performed best, with a sensitivity of 0.83 (95% CI = 0.72-0.91), specificity of 0.93 (95% CI = 0.89-0.96), PLR of 12.4 (95% CI = 6.8-22.9), and DOR of 70 (95% CI = 23-231). ECG also performed well, with a sensitivity of 0.82 (95% CI = 0.80-0.84) and a specificity of 0.82 (95% CI = 0.78-0.84). Moreover, blood biomarkers exhibited the highest NLR value of 0.50 (95% CI = 0.42-0.59). Conclusion The implementation of echocardiography and ECG with ML for diagnosing PAH presents a promising alternative to RHC. This approach shows potential, as it achieves excellent diagnostic parameters, offering hope for more accessible and less invasive diagnostic methods. Systematic Review Registration PROSPERO (CRD42024496569).
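The likelihood ratios and diagnostic odds ratio named in this abstract are simple functions of sensitivity and specificity. The sketch below uses the standard definitions. The function name and the input values are illustrative assumptions and are not taken from the study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR, DOR) for a single sensitivity/specificity pair."""
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                         # diagnostic odds ratio
    return plr, nlr, dor


# Illustrative inputs only.
plr, nlr, dor = likelihood_ratios(sensitivity=0.80, specificity=0.90)
```

Pooled likelihood ratios in a meta-analysis come from the fitted meta-analytic model, so they need not equal a plug-in calculation from the pooled sensitivity and specificity.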
Affiliation(s)
- Akbar Fadilah
  - Brawijaya Cardiovascular Research Center, Department of Cardiology and Vascular Medicine, Faculty of Medicine, Universitas Brawijaya, Malang, Indonesia
- Valerinna Yogibuana Swastika Putri
  - Brawijaya Cardiovascular Research Center, Department of Cardiology and Vascular Medicine, Faculty of Medicine, Universitas Brawijaya, Malang, Indonesia
3
Holdefer AA, Pizarro J, Saunders-Hastings P, Beers J, Sang A, Hettinger AZ, Blumenthal J, Martinez E, Jones LD, Deady M, Ezzeldin H, Anderson SA. Development of Interoperable Computable Phenotype Algorithms for Adverse Events of Special Interest to Be Used for Biologics Safety Surveillance: Validation Study. JMIR Public Health Surveill 2024; 10:e49811. [PMID: 39008361 PMCID: PMC11287092 DOI: 10.2196/49811] [Received: 06/09/2023] [Revised: 02/24/2024] [Accepted: 05/26/2024] [Indexed: 07/16/2024]
Abstract
BACKGROUND Adverse events associated with vaccination have been evaluated by epidemiological studies and more recently have gained additional attention with the emergency use authorization of several COVID-19 vaccines. As part of its responsibility to conduct postmarket surveillance, the US Food and Drug Administration continues to monitor several adverse events of special interest (AESIs) to ensure vaccine safety, including for COVID-19. OBJECTIVE This study is part of the Biologics Effectiveness and Safety Initiative, which aims to improve the Food and Drug Administration's postmarket surveillance capabilities while minimizing public burden. This study aimed to enhance active surveillance efforts through a rules-based, computable phenotype algorithm to identify 5 AESIs being monitored by the Centers for Disease Control and Prevention for COVID-19 or other vaccines: anaphylaxis, Guillain-Barré syndrome, myocarditis/pericarditis, thrombosis with thrombocytopenia syndrome, and febrile seizure. This study examined whether these phenotypes have sufficiently high positive predictive value (PPV) to ensure that the cases selected for surveillance are reasonably likely to be a postbiologic adverse event. This allows patient privacy and security concerns about sharing data on patients who did not have an adverse event to be properly accounted for when evaluating the cost-benefit aspect of our approach. METHODS AESI phenotype algorithms were developed to apply to electronic health record data at health provider organizations across the country by querying for standard and interoperable codes. The codes queried in the rules represent symptoms, diagnoses, or treatments of the AESI sourced from published case definitions and input from clinicians. To validate the performance of the algorithms, we applied them to electronic health record data from a US academic health system and provided a sample of cases for clinicians to evaluate. Performance was assessed using PPV. RESULTS With a PPV of 93.3%, our anaphylaxis algorithm performed the best. The PPVs for our febrile seizure, myocarditis/pericarditis, thrombosis with thrombocytopenia syndrome, and Guillain-Barré syndrome algorithms were 89%, 83.5%, 70.2%, and 47.2%, respectively. CONCLUSIONS Given our algorithm design and performance, our results support continued research into using interoperable algorithms for widespread AESI postmarket detection.
Affiliation(s)
- Aaron Zachary Hettinger
  - Center for Biostatistics, Informatics and Data Science, MedStar Health Research Institute, Columbia, MD, United States
  - Department of Emergency Medicine, Georgetown University School of Medicine, Washington, DC, United States
- Joseph Blumenthal
  - Center for Biostatistics, Informatics and Data Science, MedStar Health Research Institute, Columbia, MD, United States
- Hussein Ezzeldin
  - Center for Biologics Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD, United States
- Steven A Anderson
  - Center for Biologics Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD, United States
4
Kural KC, Mazo I, Walderhaug M, Santana-Quintero L, Karagiannis K, Thompson EE, Kelman JA, Goud R. Using machine learning to improve anaphylaxis case identification in medical claims data. JAMIA Open 2024; 7:ooae037. [PMID: 38911332 PMCID: PMC11190610 DOI: 10.1093/jamiaopen/ooae037] [Received: 04/11/2024] [Accepted: 04/15/2024] [Indexed: 06/25/2024]
Abstract
Objectives Anaphylaxis is a severe life-threatening allergic reaction, and its accurate identification in healthcare databases can harness the potential of "Big Data" for healthcare or public health purposes. Materials and methods This study used claims data obtained between October 1, 2015 and February 28, 2019 from the CMS database to examine the utility of machine learning in identifying incident anaphylaxis cases. We created a feature selection pipeline to identify critical features between different datasets. Then a variety of unsupervised and supervised methods were used (eg, Sammon mapping and eXtreme Gradient Boosting) to train models on datasets of differing data quality, which reflects the varying availability and potential rarity of ground truth data in medical databases. Results Resulting machine learning model accuracies ranged from 47.7% to 94.4% when tested on ground truth data. Finally, we found new features to help experts enhance existing case-finding algorithms. Discussion Developing precise algorithms to detect medical outcomes in claims can be a laborious and expensive process, particularly for conditions presented and coded diversely. We found it beneficial to filter out highly potent codes used for data curation to identify underlying patterns and features. To improve rule-based algorithms where necessary, researchers could use model explainers to determine noteworthy features, which could then be shared with experts and included in the algorithm. Conclusion Our work suggests machine learning models can perform at similar levels as a previously published expert case-finding algorithm, while also having the potential to improve performance or streamline algorithm construction processes by identifying new relevant features for algorithm construction.
Affiliation(s)
- Kamil Can Kural
  - Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
  - School of Systems Biology, George Mason University, Manassas, VA 20110, United States
- Ilya Mazo
  - Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
- Mark Walderhaug
  - Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
- Luis Santana-Quintero
  - Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
- Konstantinos Karagiannis
  - Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
- Elaine E Thompson
  - Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
- Jeffrey A Kelman
  - Centers for Medicare & Medicaid Services, Washington, DC 20001, United States
- Ravi Goud
  - Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
5
Jambon-Barbara C, Hlavaty A, Bernardeau C, Bouvaist H, Chaumais MC, Humbert M, Montani D, Cracowski JL, Khouri C. Development and validation of a code-based algorithm using in-hospital medical records to identify patients with pulmonary arterial hypertension in a French healthcare database. ERJ Open Res 2024; 10:00109-2024. [PMID: 39135662 PMCID: PMC11317892 DOI: 10.1183/23120541.00109-2024] [Received: 02/02/2024] [Accepted: 04/11/2024] [Indexed: 08/15/2024]
Abstract
Introduction Pulmonary arterial hypertension (PAH) is a rare and severe disease for which most of the evidence about prognostic factors, evolution and treatment efficacy comes from cohorts, registries and clinical trials. We therefore aimed to develop and validate a new PAH identification algorithm that can be used in the French healthcare database "Système National des Données de Santé (SNDS)". Methods We developed and validated the algorithm using the Grenoble Alpes University Hospital medical charts. We first identified PAH patients following a previously validated algorithm, using in-hospital ICD-10 (10th revision of the International Statistical Classification of Diseases) codes, right heart catheterisation procedure and PAH-specific treatment dispensing. Then, we refined the latter with the exclusion of chronic thromboembolic pulmonary hypertension procedures and treatment, the main misclassification factor. Second, we validated this algorithm using a gold standard review of in-hospital medical charts and calculated sensitivity, specificity, positive and negative predictive value (PPV and NPV) and accuracy. Finally, we applied this algorithm in the French healthcare database and described the characteristics of the identified patients. Results In the Grenoble University Hospital, we identified 252 unique patients meeting all the algorithm's criteria between 1 January 2010 and 30 June 2022, and reviewed all medical records. The sensitivity, specificity, PPV, NPV and accuracy were 91.0%, 74.3%, 67.9%, 93.3% and 80.6%, respectively. Application of this algorithm to the SNDS yielded the identification of 9931 patients with consistent characteristics compared to PAH registries. Conclusion Overall, we propose a new PAH identification algorithm developed and adapted to the French specificities that can be used in future studies using the French healthcare database.
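The five figures reported in this abstract are linked by the identities of a 2×2 table, so they can be checked against each other. The sketch below is a reader-side consistency check and not part of the authors' method. It solves accuracy = sensitivity × p + specificity × (1 − p) for p, the proportion of true PAH among the reviewed records, and then recomputes PPV and NPV. The function names are hypothetical.

```python
def implied_prevalence(sensitivity, specificity, accuracy):
    """Solve accuracy = sens * p + spec * (1 - p) for p."""
    return (accuracy - specificity) / (sensitivity - specificity)


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Values reported in the abstract above (rounded to one decimal place there).
prev = implied_prevalence(sensitivity=0.910, specificity=0.743, accuracy=0.806)
ppv, npv = predictive_values(0.910, 0.743, prev)
```

Under these identities the recomputed PPV and NPV fall within about one percentage point of the reported 67.9% and 93.3%. The remaining gap is consistent with rounding in the published inputs.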
Affiliation(s)
- Clément Jambon-Barbara
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
  - Univ. Grenoble Alpes, HP2 Laboratory, Inserm U1300, Grenoble, France
- Alex Hlavaty
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
  - Univ. Grenoble Alpes, HP2 Laboratory, Inserm U1300, Grenoble, France
- Claire Bernardeau
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
- Hélène Bouvaist
  - Cardiology Unit, Grenoble Alpes University Hospital, Grenoble, France
- Marie-Camille Chaumais
  - INSERM UMR_S 999, Hôpital Marie Lannelongue, Le Plessis Robinson, France
  - Assistance Publique-Hôpitaux de Paris (AP-HP), Department of Pharmacy, Hôpital Bicêtre, Le Kremlin-Bicêtre, France
  - Faculty of Pharmacy, Université Paris-Saclay, Saclay, France
- Marc Humbert
  - INSERM UMR_S 999, Hôpital Marie Lannelongue, Le Plessis Robinson, France
  - Faculty of Medicine, Université Paris-Saclay, Le Kremlin-Bicêtre, France
  - AP-HP, Department of Respiratory and Intensive Care Medicine, Pulmonary Hypertension National Referral Centre, Hôpital Bicêtre, DMU 5 Thorinno, Le Kremlin-Bicêtre, France
- David Montani
  - INSERM UMR_S 999, Hôpital Marie Lannelongue, Le Plessis Robinson, France
  - Faculty of Medicine, Université Paris-Saclay, Le Kremlin-Bicêtre, France
  - AP-HP, Department of Respiratory and Intensive Care Medicine, Pulmonary Hypertension National Referral Centre, Hôpital Bicêtre, DMU 5 Thorinno, Le Kremlin-Bicêtre, France
- Jean-Luc Cracowski
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
  - Univ. Grenoble Alpes, HP2 Laboratory, Inserm U1300, Grenoble, France
- Charles Khouri
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
  - Univ. Grenoble Alpes, HP2 Laboratory, Inserm U1300, Grenoble, France
  - Grenoble Alpes University Hospital, Clinical Pharmacology Department INSERM CIC1406, Grenoble, France
6
Gao L, Skinner J, Nath T, Lin Q, Griffiths M, Damico RL, Pauciulo MW, Nichols WC, Hassoun PM, Everett AD, Johns RA. Resistin predicts disease severity and survival in patients with pulmonary arterial hypertension. Respir Res 2024; 25:235. [PMID: 38844967 PMCID: PMC11154998 DOI: 10.1186/s12931-024-02861-8] [Received: 03/05/2024] [Accepted: 05/30/2024] [Indexed: 06/10/2024]
Abstract
BACKGROUND Abnormal remodeling of distal pulmonary arteries in patients with pulmonary arterial hypertension (PAH) leads to progressively increased pulmonary vascular resistance, followed by right ventricular hypertrophy and failure. Despite considerable advancements in PAH treatment, prognosis remains poor. We aimed to evaluate the potential for using the cytokine resistin as a genetic and biological marker for disease severity and survival in a large cohort of patients with PAH. METHODS Biospecimens, clinical, and genetic data for 1121 adults with PAH, including 808 with idiopathic PAH (IPAH) and 313 with scleroderma-associated PAH (SSc-PAH), were obtained from a national repository. Serum resistin levels were measured by ELISA, and associations between resistin levels, clinical variables, and single nucleotide polymorphism genotypes were examined with multivariable regression models. Machine-learning (ML) algorithms were applied to develop and compare risk models for mortality prediction. RESULTS Resistin levels were significantly higher in all PAH samples and PAH subtype (IPAH and SSc-PAH) samples than in controls (P < .0001) and had significant discriminative abilities (AUCs of 0.84, 0.82, and 0.91, respectively; P < .001). High resistin levels (above 4.54 ng/mL) in PAH patients were associated with older age (P = .001), shorter 6-min walk distance (P = .001), and reduced cardiac performance (cardiac index, P = .016). Interestingly, mutant carriers of either rs3219175 or rs3745367 had higher resistin levels (adjusted P = .0001). High resistin levels in PAH patients were also associated with increased risk of death (hazard ratio: 2.6; 95% CI: 1.27-5.33; P < .0087). Comparisons of ML-derived survival models confirmed satisfactory prognostic value of the random forest model (AUC = 0.70, 95% CI: 0.62-0.79) for PAH. CONCLUSIONS This work establishes the importance of resistin in the pathobiology of human PAH. In line with its function in rodent models, serum resistin represents a novel biomarker for PAH prognostication and may indicate a new therapeutic avenue. ML-derived survival models highlighted the importance of including resistin levels to improve performance. Future studies are needed to develop multi-marker assays that improve noninvasive risk stratification.
Affiliation(s)
- Li Gao
  - Department of Medicine, Division of Allergy and Clinical Immunology, Johns Hopkins University School of Medicine, 5501 Hopkins Bayview Circle, Room 3B.65B, Baltimore, MD, 21224-6821, USA
- John Skinner
  - Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University School of Medicine, 720 Rutland Avenue, Ross 361, Baltimore, MD, 21287, USA
- Tanmay Nath
  - Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Qing Lin
  - Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University School of Medicine, 720 Rutland Avenue, Ross 361, Baltimore, MD, 21287, USA
- Megan Griffiths
  - Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Rachel L Damico
  - Department of Medicine, Division of Pulmonary and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Michael W Pauciulo
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  - Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- William C Nichols
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  - Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- Paul M Hassoun
  - Department of Medicine, Division of Pulmonary and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Allen D Everett
  - Division of Pediatric Cardiology, Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Roger A Johns
  - Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University School of Medicine, 720 Rutland Avenue, Ross 361, Baltimore, MD, 21287, USA
  - Department of Medicine, Division of Pulmonary and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
7
Logeart D, Doublet M, Gouysse M, Damy T, Isnard R, Roubille F. Development and validation of algorithms to predict left ventricular ejection fraction class from healthcare claims data. ESC Heart Fail 2024; 11:1688-1697. [PMID: 38438250 PMCID: PMC11098626 DOI: 10.1002/ehf2.14725] [Received: 07/11/2023] [Revised: 01/22/2024] [Accepted: 01/24/2024] [Indexed: 03/06/2024]
Abstract
AIMS The use of large medical or healthcare claims databases is very useful for population-based studies on the burden of heart failure (HF). Clinical characteristics and management of HF patients differ according to categories of left ventricular ejection fraction (LVEF), but this information is often missing in such databases. We aimed to develop and validate algorithms to identify LVEF in healthcare databases where the information is lacking. METHODS AND RESULTS Algorithms were built by machine learning with a random forest approach. Algorithms were trained and reinforced using the French national claims database [Système National des Données de Santé (SNDS)] and a French HF registry. Variables were age, gender, and comorbidities, which could be identified by medico-administrative code-based proxies, Anatomical Therapeutic Chemical codes for drug delivery, International Classification of Diseases (Tenth Revision) coding for hospitalizations, and administrative codes for any other type of reimbursed care. The algorithms were validated by cross-validation and against a subset of the SNDS that includes LVEF information. The areas under the receiver operating characteristic curve were 0.84 for the algorithm identifying LVEF ≤ 40% and 0.79 for the algorithms identifying LVEF < 50% and ≥50%. For LVEF ≤ 40%, the reinforced algorithm identified 50% of patients in the validation dataset with a positive predictive value of 0.88 and a specificity of 0.96. The most important predictive variables were delivery of HF medication, sex, age, hospitalization, and testing for natriuretic peptides with different orders of positive or negative importance according to the LVEF category. CONCLUSIONS The algorithms identify reduced or preserved LVEF in HF patients within a nationwide healthcare claims database with high positive predictive value and low rates of false positives.
Affiliation(s)
- Damien Logeart
  - Department of Cardiology, Paris Cité University, AP-HP Hôpital Lariboisière, Inserm U942, 2 rue Ambroise Paré, Paris, France
- Thibaud Damy
  - Department of Cardiology and French National Reference Centre for Cardiac Amyloidosis, Hôpitaux Universitaires Henri-Mondor AP-HP, IMRB, Inserm, Université Paris-Est Créteil, Créteil, France
- François Roubille
  - Department of Cardiology, INI-CRT PhyMedExp Inserm CNRS, CHU de Montpellier, Université de Montpellier, Montpellier, France
8
Nemati N, Burton T, Fathieh F, Gillins HR, Shadforth I, Ramchandani S, Bridges CR. Pulmonary Hypertension Detection Non-Invasively at Point-of-Care Using a Machine-Learned Algorithm. Diagnostics (Basel) 2024; 14:897. [PMID: 38732312 PMCID: PMC11083349 DOI: 10.3390/diagnostics14090897] [Received: 03/07/2024] [Revised: 04/10/2024] [Accepted: 04/22/2024] [Indexed: 05/13/2024]
Abstract
Artificial intelligence, particularly machine learning, has gained prominence in medical research due to its potential to develop non-invasive diagnostics. Pulmonary hypertension presents a diagnostic challenge due to its heterogeneous nature and similarity in symptoms to other cardiovascular conditions. Here, we describe the development of a supervised machine learning model using non-invasive signals (orthogonal voltage gradient and photoplethysmographic) and a hand-crafted library of 3298 features. The developed model achieved a sensitivity of 87% and a specificity of 83%, with an overall Area Under the Receiver Operator Characteristic Curve (AUC-ROC) of 0.93. Subgroup analysis showed consistent performance across genders, age groups and classes of PH. Feature importance analysis revealed changes in metrics that measure conduction, repolarization and respiration as significant contributors to the model. The model demonstrates promising performance in identifying pulmonary hypertension, offering potential for early detection and intervention when embedded in a point-of-care diagnostic system.
Affiliation(s)
- Navid Nemati
  - Analytics for Life, Toronto, ON M5X 1C9, Canada
- Timothy Burton
  - Analytics for Life, Toronto, ON M5X 1C9, Canada
- Farhad Fathieh
  - Analytics for Life, Toronto, ON M5X 1C9, Canada
- Horace R. Gillins
  - Analytics for Life, Bethesda, MD 20814, USA
- Ian Shadforth
  - Analytics for Life, Bethesda, MD 20814, USA
9
Didden E, Lu D, Hsi A, Brand M, Hedlin H, Zamanian RT. Clinical evaluation of code-based algorithms to identify patients with pulmonary arterial hypertension in healthcare databases. Pulm Circ 2024; 14:e12333. [PMID: 38333073 PMCID: PMC10851026 DOI: 10.1002/pul2.12333] [Received: 09/14/2023] [Revised: 11/24/2023] [Accepted: 12/21/2023] [Indexed: 02/10/2024]
Abstract
Pulmonary arterial hypertension (PAH) is a rare subgroup of pulmonary hypertension (PH). Claims and administrative databases can be particularly important for research in rare diseases; however, there is a lack of validated algorithms to identify PAH patients using administrative codes. We aimed to measure the accuracy of code-based PAH algorithms against the true clinical diagnosis by right heart catheterization (RHC). This study evaluated algorithms in patients who were recorded in two linkable data assets: the Stanford Healthcare administrative electronic health record database and the Stanford Vera Moulton Wall Center clinical PH database (which records each patient's RHC diagnosis). We assessed the sensitivity and specificity achieved by 16 algorithms (six published). In total, 720 PH patients with linked data available were included and 558 (78%) of these were PAH patients. Algorithms consisting solely of a P(A)H-specific diagnostic code classed all or almost all PH patients as PAH (sensitivity >97%, specificity <12%) while multicomponent algorithms with well-defined temporal sequences of procedure, diagnosis and treatment codes achieved a better balance of sensitivity and specificity. Specificity increased and sensitivity decreased with increasing algorithm complexity. The best-performing algorithms, in terms of fewest misclassified patients, included multiple components (e.g., PH diagnosis, PAH treatment, continuous enrollment for ≥6 months before and ≥12 months following index date) and achieved sensitivities and specificities of around 95% and 38%, respectively. Our findings help researchers tailor their choice and design of code-based PAH algorithms to their research question and demonstrate the importance of including well-defined temporal components in the algorithms.
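A multicomponent algorithm of the kind described in this abstract can be expressed as a small rule over dated records. The sketch below is a hypothetical illustration and not one of the 16 evaluated algorithms. The `PatientRecord` fields, the choice of the first PH diagnosis as the index date, and the day counts standing in for 6 and 12 months are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List

LOOKBACK = timedelta(days=183)   # assumed stand-in for ">= 6 months before"
FOLLOW_UP = timedelta(days=365)  # assumed stand-in for ">= 12 months after"


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary of administrative data."""
    enrollment_start: date
    enrollment_end: date
    ph_diagnosis_dates: List[date] = field(default_factory=list)
    pah_treatment_dates: List[date] = field(default_factory=list)


def meets_algorithm(rec: PatientRecord) -> bool:
    """PH diagnosis + later PAH treatment + continuous enrollment window."""
    if not rec.ph_diagnosis_dates or not rec.pah_treatment_dates:
        return False
    index_date = min(rec.ph_diagnosis_dates)  # assumption: first PH diagnosis
    enrolled_before = rec.enrollment_start <= index_date - LOOKBACK
    enrolled_after = rec.enrollment_end >= index_date + FOLLOW_UP
    treated = any(d >= index_date for d in rec.pah_treatment_dates)
    return enrolled_before and enrolled_after and treated
```

Each added component removes some patients, which matches the trade-off the abstract reports: specificity rises and sensitivity falls as the algorithm becomes more complex.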
Affiliation(s)
- Eva-Maria Didden
  - Global Epidemiology, Rare Disease Epicenter, Actelion Pharmaceuticals Ltd, Janssen Pharmaceutical Company of Johnson & Johnson, Allschwil, Switzerland
- Di Lu
  - Quantitative Sciences Unit, Stanford University, Stanford, California, USA
- Andrew Hsi
  - Adult PH Program, Vera Moulton Wall Center University, Stanford, California, USA
- Monika Brand
  - Global Epidemiology, Rare Disease Epicenter, Actelion Pharmaceuticals Ltd, Janssen Pharmaceutical Company of Johnson & Johnson, Allschwil, Switzerland
- Haley Hedlin
  - Quantitative Sciences Unit, Stanford University, Stanford, California, USA
- Roham T. Zamanian
  - Adult PH Program, Vera Moulton Wall Center University, Stanford, California, USA
  - Division of Pulmonary, Allergy, and Critical Care Medicine, Stanford University, Stanford, California, USA
| |
|
10
|
Kural KC, Mazo I, Walderhaug M, Santana-Quintero L, Karagiannis K, Thompson EE, Kelman JA, Goud R. Using machine learning to improve anaphylaxis case identification in medical claims data. JAMIA Open 2023; 6:ooad090. [PMID: 37900974 PMCID: PMC10611436 DOI: 10.1093/jamiaopen/ooad090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 09/29/2023] [Accepted: 10/12/2023] [Indexed: 10/31/2023] Open
Abstract
Objective: Anaphylaxis is a severe life-threatening allergic reaction, and its accurate identification in healthcare databases can harness the potential of "Big Data" for healthcare or public health purposes. Methods: This study used claims data obtained between October 1, 2015 and February 28, 2019 from the CMS database to examine the utility of machine learning in identifying incident anaphylaxis cases. We created a feature selection pipeline to identify critical features between different datasets. Then a variety of unsupervised and supervised methods were used (eg, Sammon mapping and eXtreme Gradient Boosting) to train models on datasets of differing data quality, which reflects the varying availability and potential rarity of ground truth data in medical databases. Results: The resulting machine learning model accuracies ranged between 47.7% and 94.4% when tested on ground truth data. Finally, we found new features to help experts enhance existing case-finding algorithms. Discussion: Developing precise algorithms to detect medical outcomes in claims can be a laborious and expensive process, particularly for conditions presented and coded diversely. We found it beneficial to filter out highly potent codes used for data curation to identify underlying patterns and features. To improve rule-based algorithms where necessary, researchers could use model explainers to determine noteworthy features, which could then be shared with experts and included in the algorithm. Conclusion: Our work suggests machine learning models can perform at levels similar to a previously published expert case-finding algorithm, while also having the potential to improve performance or streamline algorithm construction processes by identifying new relevant features for algorithm construction.
Affiliation(s)
- Kamil Can Kural
- Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
- School of Systems Biology, George Mason University, Manassas, VA 20110, United States
| | - Ilya Mazo
- Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
| | - Mark Walderhaug
- Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
| | - Luis Santana-Quintero
- Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
| | - Konstantinos Karagiannis
- Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
| | - Elaine E Thompson
- Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
| | - Jeffrey A Kelman
- Centers for Medicare & Medicaid Services, Washington, DC 20001, United States
| | - Ravi Goud
- Center for Biologics Evaluation and Research (CBER), Food and Drug Administration, Silver Spring, MD 20993, United States
| |
|
11
|
Morland K, Gerges C, Elwing J, Visovatti SH, Weatherald J, Gillmeyer KR, Sahay S, Mathai SC, Boucly A, Williams PG, Harikrishnan S, Minty EP, Hobohm L, Jose A, Badagliacca R, Lau EMT, Jing Z, Vanderpool RR, Fauvel C, Leonidas Alves J, Strange G, Pulido T, Qian J, Li M, Mercurio V, Zelt JGE, Moles VM, Cirulis MM, Nikkho SM, Benza RL, Elliott CG. Real-world evidence to advance knowledge in pulmonary hypertension: Status, challenges, and opportunities. A consensus statement from the Pulmonary Vascular Research Institute's Innovative Drug Development Initiative's Real-world Evidence Working Group. Pulm Circ 2023; 13:e12317. [PMID: 38144948 PMCID: PMC10739115 DOI: 10.1002/pul2.12317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 10/26/2023] [Accepted: 11/21/2023] [Indexed: 12/26/2023] Open
Abstract
This manuscript on real-world evidence (RWE) in pulmonary hypertension (PH) incorporates the broad experience of members of the Pulmonary Vascular Research Institute's Innovative Drug Development Initiative Real-World Evidence Working Group. We aim to strengthen the research community's understanding of RWE in PH to facilitate clinical research advances and ultimately improve patient care. Herein, we review real-world data (RWD) sources, discuss challenges and opportunities when using RWD sources to study PH populations, and identify resources needed to support the generation of meaningful RWE for the global PH community.
Affiliation(s)
- Kellie Morland
- Global Medical Affairs, United Therapeutics Corporation, Research Triangle Park, North Carolina, USA
| | - Christian Gerges
- Department of Internal Medicine II, Division of Cardiology, Medical University of Vienna, Vienna, Austria
| | - Jean Elwing
- Division of Pulmonary, Critical Care, and Sleep Medicine, University of Cincinnati, Cincinnati, Ohio, USA
| | - Scott H. Visovatti
- Division of Cardiovascular Medicine, The Ohio State University, Columbus, Ohio, USA
| | - Jason Weatherald
- Department of Medicine, Division of Pulmonary Medicine, University of Alberta, Edmonton, Canada
| | - Kari R. Gillmeyer
- The Pulmonary Center, Boston University Chobanian & Avedisian School of Medicine, Boston, Massachusetts, USA
- Center for Healthcare Organization & Implementation Research, VA Bedford Healthcare System and VA Boston Healthcare System, Bedford, Massachusetts, USA
| | - Sandeep Sahay
- Division of Pulmonary, Critical Care & Sleep Medicine, Houston Methodist Hospital, Houston, Texas, USA
| | - Stephen C. Mathai
- Division of Pulmonary and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Athénaïs Boucly
- Faculté de Médecine, Université Paris‐Saclay, Le Kremlin‐Bicêtre, France
- Service de Pneumologie et Soins Intensifs Respiratoires, Centre de Référence de l'Hypertension Pulmonaire, Hôpital Bicêtre, Assistance Publique Hôpitaux de Paris, Le Kremlin Bicêtre, France
- National Heart and Lung Institute, Imperial College, London, UK
| | - Paul G. Williams
- Center of Chest Diseases & Critical Care, Milpark Hospital, Johannesburg, South Africa
| | | | - Evan P. Minty
- Department of Medicine & O'Brien Institute for Public Health, University of Calgary, Calgary, Canada
| | - Lukas Hobohm
- Department of Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Center for Thrombosis and Hemostasis (CTH), University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
| | - Arun Jose
- Division of Pulmonary, Critical Care, and Sleep Medicine, University of Cincinnati, Cincinnati, Ohio, USA
| | - Roberto Badagliacca
- Department of Clinical, Anesthesiological and Cardiovascular Sciences, Sapienza University of Rome, Policlinico Umberto I, Rome, Italy
| | - Edmund M. T. Lau
- Department of Respiratory Medicine, Royal Prince Alfred Hospital, University of Sydney, Camperdown, New South Wales, Australia
- Faculty of Medicine and Health, University of Sydney, Camperdown, New South Wales, Australia
| | - Zhi‐Cheng Jing
- State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | | | - Charles Fauvel
- Service de Cardiologie, Centre de Compétence en Hypertension Pulmonaire 27/76, Centre Hospitalier Universitaire Charles Nicolle, INSERM EnVI U1096, Université de Rouen, Rouen, France
| | - Jose Leonidas Alves
- Pulmonary Division, Heart Institute, University of São Paulo Medical School, São Paulo, Brazil
| | - Geoff Strange
- School of Medicine, The University of Notre Dame Australia, Perth, Western Australia, Australia
| | - Tomas Pulido
- Ignacio Chávez National Heart Institute, México City, Mexico
| | - Junyan Qian
- Department of Rheumatology and Clinical Immunology, Chinese Academy of Medical Sciences & Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases (NCRC‐DID), Ministry of Science & Technology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital (PUMCH), Key Laboratory of Rheumatology and Clinical Immunology, Ministry of Education, Beijing, China
| | - Mengtao Li
- Department of Rheumatology and Clinical Immunology, Chinese Academy of Medical Sciences & Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases (NCRC‐DID), Ministry of Science & Technology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital (PUMCH), Key Laboratory of Rheumatology and Clinical Immunology, Ministry of Education, Beijing, China
| | - Valentina Mercurio
- Department of Translational Medical Sciences, Federico II University, Naples, Italy
| | - Jason G. E. Zelt
- Department of Medicine, Faculty of Medicine, University of Ottawa, Ottawa, Canada
| | - Victor M. Moles
- Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, Michigan, USA
| | - Meghan M. Cirulis
- Division of Pulmonary and Critical Care Medicine, University of Utah, Salt Lake City, Utah, USA
- Department of Pulmonary and Critical Care Medicine, Intermountain Medical Center Murray, Salt Lake City, Utah, USA
| | | | - Raymond L. Benza
- Mount Sinai Heart, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - C. Gregory Elliott
- Division of Pulmonary and Critical Care Medicine, University of Utah, Salt Lake City, Utah, USA
- Department of Pulmonary and Critical Care Medicine, Intermountain Medical Center Murray, Salt Lake City, Utah, USA
| |
|
12
|
Cao L, Huang YS, Wu C, Getz K, Miller TP, Ruiz J, Fisher BT, Seif AE, Aplenc R, Li Y. Leveraging machine learning to identify acute myeloid leukemia patients and their chemotherapy regimens in an administrative database. Pediatr Blood Cancer 2023; 70:e30260. [PMID: 36815580 PMCID: PMC10402395 DOI: 10.1002/pbc.30260] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/16/2022] [Revised: 01/08/2023] [Accepted: 01/30/2023] [Indexed: 02/24/2023]
Abstract
BACKGROUND Administrative datasets are useful for identifying rare disease cohorts such as pediatric acute myeloid leukemia (AML). Previously, cohorts were assembled using labor-intensive, manual reviews of patients' longitudinal chemotherapy data. METHODS We utilized a two-step machine learning (ML) method to (i) identify pediatric patients with newly diagnosed AML, and (ii) identify the chemotherapy courses of those patients, in an administrative/billing database. Using 2558 patients who had previously been manually reviewed, multiple ML algorithms were derived from 75% of the study sample, and the selected model was tested in the remaining hold-out sample. The selected model was also applied to assemble a new pediatric AML cohort and further assessed in an external validation, using a standalone cohort established by manual chart abstraction. RESULTS For patient identification, the selected Support Vector Machine model yielded a sensitivity of 0.97 and a positive predictive value (PPV) of 0.97 in the hold-out test sample. For course-specific chemotherapy regimen and start date identification, the selected Random Forest model yielded overall PPV greater than or equal to 0.88 and sensitivity greater than or equal to 0.86 across all courses in the test sample. When applied to new cohort assembly, ML identified 3016 AML patients with 10,588 treatment courses. In the external validation subset, PPV was greater than or equal to 0.75 and sensitivity was greater than or equal to 0.82 for patient identification, and PPV was greater than or equal to 0.93 and sensitivity was greater than or equal to 0.94 for regimen identifications. CONCLUSION A carefully designed ML model can accurately identify pediatric AML patients and their chemotherapy courses from administrative databases. This approach may be generalizable to other diseases and databases.
Affiliation(s)
- Lusha Cao
- Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Yuan-Shung Huang
- Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Chao Wu
- Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Kelly Getz
- Perelman School of Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
- Division of Oncology, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Tamara P. Miller
- Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia, USA
- Aflac Cancer & Blood Disorders Center, Children’s Healthcare of Atlanta, Atlanta, Georgia, USA
| | - Jenny Ruiz
- Perelman School of Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
- Division of Oncology, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Brian T. Fisher
- Perelman School of Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
- Division of Infectious Diseases, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Alix E. Seif
- Perelman School of Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
- Division of Oncology, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Richard Aplenc
- Perelman School of Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
- Division of Oncology, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Yimei Li
- Perelman School of Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
- Division of Oncology, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| |
|
13
|
Integrating Health Data-Driven Machine Learning Algorithms to Evaluate Risk Factors of Early Stage Hypertension at Different Levels of HDL and LDL Cholesterol. Diagnostics (Basel) 2022; 12:diagnostics12081965. [PMID: 36010315 PMCID: PMC9407063 DOI: 10.3390/diagnostics12081965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 08/08/2022] [Accepted: 08/11/2022] [Indexed: 11/26/2022] Open
Abstract
Purpose: Cardiovascular disease (CVD) is a major worldwide health burden. Hypertension and hyperlipidemia are the most frequently mentioned risk factors for CVD. Early stage hypertension in the population with dyslipidemia is an important public health hazard. This study applied data-driven machine learning (ML), which can capture complex relationships between risk factors and outcomes and shows promising predictive performance with vast amounts of medical data, to investigate the association between dyslipidemia and the incidence of early stage hypertension in a large cohort with normal blood pressure at baseline. Methods: This study analyzed annual health screening data for 71,108 people from 2005 to 2017, including data for 27 risk-related indicators, sourced from the MJ Group, a major health screening center in Taiwan. We used five ML methods, namely stochastic gradient boosting (SGB), multivariate adaptive regression splines (MARS), least absolute shrinkage and selection operator regression (Lasso), ridge regression (Ridge), and gradient boosting with categorical features support (CatBoost), to develop a multi-stage ML algorithm-based prediction scheme and then evaluate important risk factors at the early stage of hypertension, especially for groups with high-density lipoprotein cholesterol (HDL-C) and low-density lipoprotein cholesterol (LDL-C) levels within or outside the reference range. Results: Age, body mass index, waist circumference, waist-to-hip ratio, fasting plasma glucose, and C-reactive protein (CRP) were associated with hypertension. The hemoglobin level was also a positive contributor to blood pressure elevation and it appeared among the top three important risk factors in all LDL-C/HDL-C groups; therefore, these variables may be important in affecting blood pressure in the early stage of hypertension. A residual contribution to blood pressure elevation was found in groups with increased LDL-C. This suggests that LDL-C levels are associated with CRP levels, and that the LDL-C level may be an important factor for predicting the development of hypertension. Conclusion: The five prediction models provided similar classifications of risk factors. The results of this study show that an increase in LDL-C is more important than the start of a drop in HDL-C in health screening of sub-healthy adults. The findings of this study should be of value for raising health awareness about hypertension and for further discussion and follow-up research.
|
14
|
Schuler KP, Hemnes AR, Annis J, Farber-Eger E, Lowery BD, Halliday SJ, Brittain EL. An algorithm to identify cases of pulmonary arterial hypertension from the electronic medical record. Respir Res 2022; 23:138. [PMID: 35643554 PMCID: PMC9145474 DOI: 10.1186/s12931-022-02055-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/18/2022] [Accepted: 05/12/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Study of pulmonary arterial hypertension (PAH) in claims-based (CB) cohorts may facilitate understanding of disease epidemiology; however, previous CB algorithms to identify PAH have had limited test characteristics. We hypothesized that machine learning algorithms (MLA) could accurately identify PAH in a CB cohort. METHODS ICD-9/10 codes, CPT codes or PAH medications were used to screen an electronic medical record (EMR) for possible PAH. A subset (Development Cohort) was manually reviewed and adjudicated as PAH or "not PAH" and used to train and test MLAs. A second subset (Refinement Cohort) was manually reviewed and combined with the Development Cohort to make the Final Cohort, again divided into training and testing sets, with MLA characteristics defined on the test set. The MLA was validated using an independent EMR cohort. RESULTS 194 PAH and 786 "not PAH" in the Development Cohort trained and tested the initial MLA. In the Final Cohort test set, the final MLA sensitivity was 0.88, specificity was 0.93, positive predictive value was 0.89, and negative predictive value was 0.92. Persistence and strength of PAH medication use and CPT code for right heart catheterization were principal MLA features. Applying the MLA to the EMR cohort using a split cohort internal validation approach, we found 265 additional non-confirmed cases of suspected PAH that exhibited typical PAH demographics, comorbidities, and hemodynamics. CONCLUSIONS We developed and validated an MLA using only CB features that identified PAH in the EMR with strong test characteristics. When deployed across an entire EMR, the MLA identified cases with known features of PAH.
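For readers less familiar with the four test characteristics quoted above, the sketch below shows how they are derived from a confusion matrix. The function name and the counts are hypothetical and chosen only to illustrate the arithmetic; they are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that were flagged
        "specificity": tn / (tn + fp),  # share of non-cases that were not flagged
        "ppv": tp / (tp + fp),          # share of flagged patients who are true cases
        "npv": tn / (tn + fn),          # share of unflagged patients who are non-cases
    }


# Hypothetical counts, not study data.
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the algorithm relative to the true diagnosis, whereas PPV and NPV also depend on how common the condition is in the screened cohort.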
Affiliation(s)
- Kyle P Schuler
- Department of Internal Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Anna R Hemnes
- Division of Allergy, Pulmonary and Critical Care Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jeffrey Annis
- Division of Cardiovascular Medicine, Vanderbilt Pulmonary Circulation Center, 2525 West End Avenue, Nashville, TN, USA
- Vanderbilt Institute for Clinical and Translational Research (VICTR), Nashville, TN, USA
| | - Eric Farber-Eger
- Division of Cardiovascular Medicine, Vanderbilt Pulmonary Circulation Center, 2525 West End Avenue, Nashville, TN, USA
- Vanderbilt Institute for Clinical and Translational Research (VICTR), Nashville, TN, USA
| | - Brandon D Lowery
- Division of Cardiovascular Medicine, Vanderbilt Pulmonary Circulation Center, 2525 West End Avenue, Nashville, TN, USA
- Vanderbilt Institute for Clinical and Translational Research (VICTR), Nashville, TN, USA
| | - Stephen J Halliday
- Division of Pulmonary and Critical Care Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
| | - Evan L Brittain
- Division of Cardiovascular Medicine, Vanderbilt Pulmonary Circulation Center, 2525 West End Avenue, Nashville, TN, USA.
| |
|
15
|
Rhodes CJ, Sweatt AJ, Maron BA. Harnessing Big Data to Advance Treatment and Understanding of Pulmonary Hypertension. Circ Res 2022; 130:1423-1444. [PMID: 35482840 PMCID: PMC9070103 DOI: 10.1161/circresaha.121.319969] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Indexed: 01/22/2023]
Abstract
Pulmonary hypertension is a complex disease with multiple causes, corresponding to phenotypic heterogeneity and variable therapeutic responses. Advancing understanding of pulmonary hypertension pathogenesis is likely to hinge on integrated methods that leverage data from health records, imaging, novel molecular -omics profiling, and other modalities. In this review, we summarize key data sets generated thus far in the field and describe analytical methods that hold promise for deciphering the molecular mechanisms that underpin pulmonary vascular remodeling, including machine learning, network medicine, and functional genetics. We also detail how genetic and subphenotyping approaches enable earlier diagnosis, refined prognostication, and optimized treatment prediction. We propose strategies that identify functionally important molecular pathways, bolstered by findings across multi-omics platforms, which are well-positioned to individualize drug therapy selection and advance precision medicine in this highly morbid disease.
Affiliation(s)
- Christopher J Rhodes
- Department of Medicine, National Heart and Lung Institute, Imperial College London, United Kingdom (C.J.R.)
| | - Andrew J Sweatt
- Department of Medicine, National Heart and Lung Institute, Imperial College London, United Kingdom (C.J.R.)
| | - Bradley A Maron
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (B.A.M.); Division of Cardiology, VA Boston Healthcare System, West Roxbury, MA (B.A.M.)
| |
|
16
|
Shi B, Zhou T, Lv S, Wang M, Chen S, Heidari AA, Huang X, Chen H, Wang L, Wu P. An evolutionary machine learning for pulmonary hypertension animal model from arterial blood gas analysis. Comput Biol Med 2022; 146:105529. [PMID: 35594682 DOI: 10.1016/j.compbiomed.2022.105529] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/21/2022] [Revised: 04/11/2022] [Accepted: 04/13/2022] [Indexed: 11/03/2022]
Abstract
Pulmonary hypertension (PH) is a rare and fatal condition that leads to right heart failure and death. The pathophysiology of PH and potential therapeutic approaches are as yet unknown. The development and proper evaluation of PH animal models are critical to PH research. This work presents an effective analysis technology for PH from arterial blood gas analysis utilizing an evolutionary kernel extreme learning machine with a multiple strategies integrated slime mould algorithm (MSSMA). In MSSMA, two efficient bee-foraging learning operators are added to the original slime mould algorithm, ensuring a suitable trade-off between intensity and diversity. The proposed MSSMA is evaluated on thirty IEEE benchmarks and the statistical results show that the search performance of the MSSMA is significantly improved. The MSSMA is utilised to develop a kernel extreme learning machine (MSSMA-KELM) on PH from arterial blood gas analysis. Overall, the proposed MSSMA-KELM can be used as an effective analysis technology for PH from arterial blood gas analysis, with an accuracy of 93.31%, Matthews coefficient of 90.13%, sensitivity of 91.12%, and specificity of 90.73%. MSSMA-KELM can be treated as an effective approach for evaluating mouse PH models.
Affiliation(s)
- Beibei Shi
- Affiliated People's Hospital of Jiangsu University, 8 Dianli Road, Zhenjiang, Jiangsu, 212000, China.
| | - Tao Zhou
- The First Clinical College, Wenzhou Medical University, Wenzhou, 325000, China.
| | - Shushu Lv
- The First Clinical College, Wenzhou Medical University, Wenzhou, 325000, China.
| | - Mingjing Wang
- College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, 325035, China.
| | - Siyuan Chen
- Affiliated People's Hospital of Jiangsu University, 8 Dianli Road, Zhenjiang, Jiangsu, 212000, China.
| | - Ali Asghar Heidari
- School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran; Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore.
| | - Xiaoying Huang
- Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China.
| | - Huiling Chen
- College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, 325035, China.
| | - Liangxing Wang
- Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China.
| | - Peiliang Wu
- Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China.
| |
|
17
|
Identifying Patients with Group 3 Pulmonary Hypertension Associated with COPD or ILD Using an Administrative Claims Database. Lung 2022; 200:187-203. [PMID: 35348836 PMCID: PMC9038884 DOI: 10.1007/s00408-022-00521-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/15/2021] [Accepted: 02/14/2022] [Indexed: 10/31/2022]
Abstract
BACKGROUND Group 3 pulmonary hypertension (PH) describes a subpopulation of patients with PH due to chronic lung disease and/or hypoxia, with chronic obstructive pulmonary disease (COPD) and interstitial lung disease (ILD) being two large subgroups. Claims database studies provide insights into the real-world treatment patterns and outcomes among these patients. However, claims data do not provide sufficient detail to assign the clinical subtype of PH required for identifying these patients. METHODS A panel of PH clinical experts and researchers was convened to discuss methodologies to identify patients with Group 3 PH associated with COPD or ILD in retrospective claims databases. To inform the discussion, a literature review was conducted to identify claims-based studies of Group 3 PH associated with COPD or ILD published from 2010 through June 2020. RESULTS Targeted title and abstract review identified 11 claims-based studies and two conference abstracts (eight based in the United States [US] and five conducted outside the US) that met search criteria. Based on insights from the panel and literature review, the following components were detailed across studies in the identification of Group 3 PH associated with COPD and ILD: (a) COPD or ILD identification, (b) PH identification, (c) defining the sequence between COPD/ILD and PH, and (d) other PH Group and Group 3 PH exclusions. CONCLUSION This article provides recommended approaches and considerations for identifying and studying patients with Group 3 PH associated with COPD or ILD using administrative claims data that provide the foundation for future validation studies.
|
18
|
Applying Machine Learning in Distributed Data Networks for Pharmacoepidemiologic and Pharmacovigilance Studies: Opportunities, Challenges, and Considerations. Drug Saf 2022; 45:493-510. [PMID: 35579813 PMCID: PMC9112258 DOI: 10.1007/s40264-022-01158-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 02/13/2022] [Indexed: 01/28/2023]
Abstract
Increasing availability of electronic health databases capturing real-world experiences with medical products has garnered much interest in their use for pharmacoepidemiologic and pharmacovigilance studies. The traditional practice of having numerous groups use single databases to accomplish similar tasks and address common questions about medical products can be made more efficient through well-coordinated multi-database studies, greatly facilitated through distributed data network (DDN) architectures. Access to larger amounts of electronic health data within DDNs has created a growing interest in using data-adaptive machine learning (ML) techniques that can automatically model complex associations in high-dimensional data with minimal human guidance. However, the siloed storage and diverse nature of the databases in DDNs create unique challenges for using ML. In this paper, we discuss opportunities, challenges, and considerations for applying ML in DDNs for pharmacoepidemiologic and pharmacovigilance studies. We first discuss major types of activities performed by DDNs and how ML may be used. Next, we discuss practical data-related factors influencing how DDNs work in practice. We then combine these discussions and jointly consider how opportunities for ML are affected by practical data-related factors for DDNs, leading to several challenges. We present different approaches for addressing these challenges and highlight efforts that real-world DDNs have taken or are currently taking to help mitigate them. Despite these challenges, the time is ripe for the emerging interest to use ML in DDNs, and the utility of these data-adaptive modeling techniques in pharmacoepidemiologic and pharmacovigilance studies will likely continue to increase in the coming years.
|
19
|
Sweatt AJ, Reddy R, Rahaghi FN, Al-Naamani N. What's new in pulmonary hypertension clinical research: lessons from the best abstracts at the 2020 American Thoracic Society International Conference. Pulm Circ 2021; 11:20458940211040713. [PMID: 34471517 PMCID: PMC8404658 DOI: 10.1177/20458940211040713] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/02/2021] [Accepted: 07/26/2021] [Indexed: 12/23/2022] Open
Abstract
In this conference paper, we review the 2020 American Thoracic Society International Conference session titled, "What's New in Pulmonary Hypertension Clinical Research: Lessons from the Best Abstracts". This virtual mini-symposium took place on 21 October 2020, in lieu of the annual in-person ATS International Conference which was cancelled due to the COVID-19 pandemic. Seven clinical research abstracts were selected for presentation in the session, which encompassed five major themes: (1) standardizing diagnosis and management of pulmonary hypertension, (2) improving risk assessment in pulmonary arterial hypertension, (3) evaluating biomarkers of disease activity, (4) understanding metabolic dysregulation across the spectrum of pulmonary hypertension, and (5) advancing knowledge in chronic thromboembolic pulmonary hypertension. Focusing on these five thematic contexts, we review the current state of knowledge, summarize presented research abstracts, appraise their significance and limitations, and then discuss relevant future directions in pulmonary hypertension clinical research.
Affiliation(s)
- Andrew J. Sweatt
  - Division of Pulmonary, Allergy and Critical Care Medicine, Stanford University, Stanford, CA, USA
  - Vera Moulton Wall Center for Pulmonary Vascular Disease, Stanford, CA, USA
- Raju Reddy
  - Division of Pulmonary and Critical Care Medicine, Oregon Health and Science University, Portland, OR, USA
- Farbod N. Rahaghi
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Nadine Al-Naamani
  - Division of Pulmonary and Critical Care Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- on behalf of the American Thoracic Society Pulmonary Circulation Assembly Early Career Working Group
  - Division of Pulmonary, Allergy and Critical Care Medicine, Stanford University, Stanford, CA, USA
  - Vera Moulton Wall Center for Pulmonary Vascular Disease, Stanford, CA, USA
  - Division of Pulmonary and Critical Care Medicine, Oregon Health and Science University, Portland, OR, USA
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Pulmonary and Critical Care Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
|
20
|
Ong M, Klann JG, Lin KJ, Maron BA, Murphy SN, Natter MD, Mandl KD. Claims-Based Algorithms for Identifying Patients With Pulmonary Hypertension: A Comparison of Decision Rules and Machine-Learning Approaches. J Am Heart Assoc 2020; 9:e016648. [PMID: 32990147 PMCID: PMC7792386 DOI: 10.1161/jaha.120.016648] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 01/08/2023]
Abstract
Background Real‐world healthcare data are an important resource for epidemiologic research. However, accurate identification of patient cohorts—a crucial first step underpinning the validity of research results—remains a challenge. We developed and evaluated claims‐based case ascertainment algorithms for pulmonary hypertension (PH), comparing conventional decision rules with state‐of‐the‐art machine‐learning approaches. Methods and Results We analyzed an electronic health record‐Medicare linked database from two large academic tertiary care hospitals (years 2007–2013). Electronic health record charts were reviewed to form a gold standard cohort of patients with (n=386) and without PH (n=164). Using health encounter data captured in Medicare claims (including patients’ demographics, diagnoses, medications, and procedures), we developed and compared 2 approaches for identifying patients with PH: decision rules and machine‐learning algorithms using penalized lasso regression, random forest, and gradient boosting machine. The most optimal rule‐based algorithm—having ≥3 PH‐related healthcare encounters and having undergone right heart catheterization—attained an area under the receiver operating characteristic curve of 0.64 (sensitivity, 0.75; specificity, 0.48). All 3 machine‐learning algorithms outperformed the most optimal rule‐based algorithm (P<0.001). A model derived from the random forest algorithm achieved an area under the receiver operating characteristic curve of 0.88 (sensitivity, 0.87; specificity, 0.70), and gradient boosting machine achieved comparable results (area under the receiver operating characteristic curve, 0.85; sensitivity, 0.87; specificity, 0.70). Penalized lasso regression achieved an area under the receiver operating characteristic curve of 0.73 (sensitivity, 0.70; specificity, 0.68). Conclusions Research‐grade case identification algorithms for PH can be derived and rigorously validated using machine‐learning algorithms. Simple decision rules commonly applied in published literature performed poorly; more complex rule‐based algorithms may potentially address the limitation of this approach. PH research using claims data would be considerably strengthened through the use of validated algorithms for cohort ascertainment.
Collapse
Affiliation(s)
- Mei‐Sing Ong
  - Department of Population Medicine, Harvard Medical School & Harvard Pilgrim Health Care Institute, Boston, MA
  - Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA
- Jeffrey G. Klann
  - Laboratory of Computer Science, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Kueiyu Joshua Lin
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Bradley A. Maron
  - Cardiovascular Division, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Shawn N. Murphy
  - Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Marc D. Natter
  - Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA
  - Department of Pediatrics, Harvard Medical School, Boston, MA
- Kenneth D. Mandl
  - Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA
  - Department of Pediatrics, Harvard Medical School, Boston, MA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
|