1. Gonçalves Pereira J, Fernandes J, Mendes T, Gonzalez FA, Fernandes SM. Artificial Intelligence to Close the Gap between Pharmacokinetic/Pharmacodynamic Targets and Clinical Outcomes in Critically Ill Patients: A Narrative Review on Beta Lactams. Antibiotics (Basel) 2024; 13:853. [PMID: 39335027 PMCID: PMC11428226 DOI: 10.3390/antibiotics13090853] [Received: 07/30/2024] [Revised: 08/30/2024] [Accepted: 09/04/2024] [Indexed: 09/30/2024]
Abstract
Antimicrobial dosing can be a complex challenge. Although a solid rationale exists for a link between antibiotic exposure and outcome, conflicting data suggest a poor correlation between pharmacokinetic/pharmacodynamic targets and infection control. Different reasons may lead to this discrepancy: poor tissue penetration by β-lactams due to inflammation and inadequate tissue perfusion; different bacterial response to antibiotics and biofilms; heterogeneity of the host's immune response and drug metabolism; bacterial tolerance and acquisition of resistance during therapy. Consequently, either a fixed dose of antibiotics or a fixed target concentration may be doomed to fail. The role of biomarkers in understanding and monitoring host response to infection is also incompletely defined. Nowadays, with the ever-growing stream of data collected in hospitals, utilizing the most efficient analytical tools may lead to better personalization of therapy. The rise of artificial intelligence and machine learning has allowed large amounts of data to be rapidly accessed and analyzed. These unsupervised learning models can apprehend the data structure and identify homogeneous subgroups, facilitating the individualization of medical interventions. This review aims to discuss the challenges of β-lactam dosing, focusing on its pharmacodynamics and the new challenges and opportunities arising from integrating machine learning algorithms to personalize patient treatment.
Affiliation(s)
- João Gonçalves Pereira
  - Grupo de Investigação e Desenvolvimento em Infeção e Sépsis, Clínica Universitária de Medicina Intensiva, Faculdade de Medicina, Universidade de Lisboa, 1649-004 Lisbon, Portugal
  - Serviço de Medicina Intensiva, Hospital Vila Franca de Xira, 2600-009 Vila Franca de Xira, Portugal
- Joana Fernandes
  - Grupo de Investigação e Desenvolvimento em Infeção e Sépsis, Serviço de Medicina Intensiva, Centro Hospitalar de Trás-os-Montes e Alto Douro, 5000-508 Vila Real, Portugal
- Tânia Mendes
  - Serviço de Medicina Interna, Hospital Vila Franca de Xira, 2600-009 Vila Franca de Xira, Portugal
- Filipe André Gonzalez
  - Serviço de Medicina Intensiva, Hospital Garcia De Orta, Clínica Universitária de Medicina Intensiva, Faculdade de Medicina, Universidade de Lisboa, 1649-004 Lisbon, Portugal
- Susana M Fernandes
  - Grupo de Investigação e Desenvolvimento em Infeção e Sépsis, Serviço de Medicina Intensiva, Hospital Santa Maria, Clínica Universitária de Medicina Intensiva, Faculdade de Medicina, Universidade de Lisboa, 1649-004 Lisbon, Portugal
2. Palomino-Echeverria S, Huergo E, Ortega-Legarreta A, Uson Raposo EM, Aguilar F, Peña-Ramirez CDL, López-Vicario C, Alessandria C, Laleman W, Queiroz Farias A, Moreau R, Fernandez J, Arroyo V, Caraceni P, Lagani V, Sánchez-Garrido C, Clària J, Tegner J, Trebicka J, Kiani NA, Planell N, Rautou PE, Gomez-Cabrero D. A robust clustering strategy for stratification unveils unique patient subgroups in acutely decompensated cirrhosis. J Transl Med 2024; 22:599. [PMID: 38937846 PMCID: PMC11210156 DOI: 10.1186/s12967-024-05386-2] [Received: 02/01/2024] [Accepted: 06/10/2024] [Indexed: 06/29/2024]
Abstract
BACKGROUND Patient heterogeneity poses significant challenges for managing individuals and designing clinical trials, especially in complex diseases. Existing classifications rely on outcome-predicting scores, potentially overlooking crucial elements contributing to heterogeneity without necessarily impacting prognosis. METHODS To address patient heterogeneity, we developed ClustALL, a computational pipeline that simultaneously faces diverse clinical data challenges like mixed types, missing values, and collinearity. ClustALL enables the unsupervised identification of patient stratifications while filtering for stratifications that are robust against minor variations in the population (population-based) and against limited adjustments in the algorithm's parameters (parameter-based). RESULTS Applied to a European cohort of patients with acutely decompensated cirrhosis (n = 766), ClustALL identified five robust stratifications, using only data at hospital admission. All stratifications included markers of impaired liver function and number of organ dysfunction or failure, and most included precipitating events. When focusing on one of these stratifications, patients were categorized into three clusters characterized by typical clinical features; notably, the 3-cluster stratification showed a prognostic value. Re-assessment of patient stratification during follow-up delineated patients' outcomes, with further improvement of the prognostic value of the stratification. We validated these findings in an independent prospective multicentre cohort of patients from Latin America (n = 580). CONCLUSIONS By applying ClustALL to patients with acutely decompensated cirrhosis, we identified three patient clusters. Following these clusters over time offers insights that could guide future clinical trial design. ClustALL is a novel and robust stratification method capable of addressing the multiple challenges of patient stratification in most complex diseases.
Affiliation(s)
- Estefania Huergo
  - Unit of Translational Bioinformatics, Navarrabiomed - Fundación Miguel Servet, Pamplona, Spain
- Asier Ortega-Legarreta
  - Unit of Translational Bioinformatics, Navarrabiomed - Fundación Miguel Servet, Pamplona, Spain
- Eva M Uson Raposo
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
- Ferran Aguilar
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
- Cristina López-Vicario
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
  - Biochemistry and Molecular Genetics Service, Hospital Clínic-IDIBAPS, Barcelona, Spain
- Carlo Alessandria
  - Division of Gastroenterology and Hepatology, A.O.U. Città della Salute e della Scienza di Torino, Torino, Italy
- Wim Laleman
  - Department of Gastroenterology & Hepatology, Section of Liver & Biliopancreatic disorders and Liver Transplantation, University Hospitals Leuven, KU LEUVEN, Leuven, Belgium
- Alberto Queiroz Farias
  - Department of Gastroenterology, Hospital das Clínicas, University of São Paulo School of Medicine, São Paulo, Brazil
- Richard Moreau
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
  - Université Paris-Cité, Inserm, Centre de recherche sur l'inflammation, UMR 1149, Paris, France
  - Assistance Publique-Hôpitaux de Paris (AP-HP), Paris, France
  - Hôpital Beaujon, Service d'Hépatologie, Clichy, France
- Javier Fernandez
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
- Vicente Arroyo
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
- Paolo Caraceni
  - Department of Medical and Surgical Science, University of Bologna, Bologna, Italy
  - IRCCS Azienda Ospedaliera-Universitaria di Bologna, Bologna, Italy
- Vincenzo Lagani
  - Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, Thuwal, Saudi Arabia
  - Institute of Chemical Biology, Ilia State University, Tbilisi, 0162, Georgia
- Joan Clària
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
  - Biochemistry and Molecular Genetics Service, Hospital Clínic-IDIBAPS, Barcelona, Spain
  - CIBERehd, Barcelona, Spain
  - Department of Biomedical Sciences, University of Barcelona, Barcelona, Spain
- Jesper Tegner
  - Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, Thuwal, Saudi Arabia
  - Unit of Computational Medicine, Department of Medicine, Center for Molecular Medicine, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
  - Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Jonel Trebicka
  - European Foundation for the Study of Chronic Liver Failure, Barcelona, Spain
  - Department of internal medicine B, University of Münster, Münster, Germany
- Narsis A Kiani
  - Algorithmic Dynamics Lab, Center for Molecular Medicine, Karolinska Institutet, Solna, Sweden
  - Department of Oncology-Pathology, Karolinska Institutet, Solna, Sweden
- Nuria Planell
  - Unit of Translational Bioinformatics, Navarrabiomed - Fundación Miguel Servet, Pamplona, Spain
  - Computational Biology Program, Universidad de Navarra, CIMA, Instituto de Investigación Sanitaria de Navarra (IdiSNA), Navarra, 31008, Spain
- Pierre-Emmanuel Rautou
  - Université Paris-Cité, Inserm, Centre de recherche sur l'inflammation, UMR 1149, Paris, France
  - AP-HP, Hôpital Beaujon, Service d'Hépatologie, DMU DIGEST, Centre de Référence des Maladies Vasculaires du Foie, FILFOIE, ERN RARE-LIVER, Clichy, France
- David Gomez-Cabrero
  - Unit of Translational Bioinformatics, Navarrabiomed - Fundación Miguel Servet, Pamplona, Spain
  - Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
3. Perschinka F, Peer A, Joannidis M. [Artificial intelligence and acute kidney injury]. Med Klin Intensivmed Notfmed 2024; 119:199-207. [PMID: 38396124 PMCID: PMC10995052 DOI: 10.1007/s00063-024-01111-5] [Received: 01/15/2024] [Accepted: 01/17/2024] [Indexed: 02/25/2024]
Abstract
Digitalization is increasingly finding its way into intensive care units and with it artificial intelligence (AI) for critically ill patients. One promising area for the use of AI is in the field of acute kidney injury (AKI). The use of AI is primarily focused on the prediction of AKI, but further approaches are also being used to classify existing AKI into different phenotypes. Different AI models are used for prediction. The area under the receiver operating characteristic curve values (AUROC) achieved with these models vary and are influenced by several factors, such as the prediction time and the definition of AKI. Most models have an AUROC between 0.650 and 0.900, with lower values for predictions further into the future and when applying Acute Kidney Injury Network (AKIN) instead of KDIGO criteria. Classification into phenotypes already makes it possible to categorize patients into groups with different risks of mortality or requirement of renal replacement therapy (RRT), but the etiologies or therapeutic consequences derived from this are still lacking. However, all the models suffer from AI-specific shortcomings. The use of large databases does not make it possible to promptly include recent changes in therapy and the implementation of new biomarkers in a relevant proportion. For this reason, serum creatinine and urinary output, with their known limitations, dominate current AI models for prediction impairing the performance of the current models. On the other hand, the increasingly complex models no longer allow physicians to understand the basis on which the warning of a threatening AKI is calculated and subsequent initiation of therapy should take place. The successful use of AIs in routine clinical practice will be highly determined by the trust of the physicians in the systems and overcoming the aforementioned weaknesses. 
However, the clinician will remain irreplaceable as the decisive authority for critically ill patients by combining measurable and nonmeasurable parameters.
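The AUROC values quoted in this abstract can be read as the probability that a randomly chosen patient who develops AKI receives a higher risk score than a randomly chosen patient who does not. The following is a minimal sketch of that pairwise definition in plain Python. The function name `auroc` and the toy labels and scores are illustrative assumptions, not taken from the reviewed models.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = event, e.g. AKI; hypothetical data).
    scores: iterable of predicted risks, same length as labels.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: four patients, two of whom developed the event.
example_labels = [0, 0, 1, 1]
example_scores = [0.10, 0.40, 0.35, 0.80]
print(auroc(example_labels, example_scores))
```

The quadratic loop keeps the definition visible. For large cohorts a rank-based formulation is the usual choice.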
Affiliation(s)
- Michael Joannidis
  - Gemeinsame Einrichtung für Internistische Notfall- und Intensivmedizin, Department Innere Medizin, Medizinische Universität Innsbruck, Anichstraße 35, 6020, Innsbruck, Austria
4. Ogasawara T, Mukaino M, Matsunaga K, Wada Y, Suzuki T, Aoshima Y, Furuzawa S, Kono Y, Saitoh E, Yamaguchi M, Otaka Y, Tsukada S. Prediction of stroke patients' bedroom-stay duration: machine-learning approach using wearable sensor data. Front Bioeng Biotechnol 2024; 11:1285945. [PMID: 38234303 PMCID: PMC10791943 DOI: 10.3389/fbioe.2023.1285945] [Received: 08/30/2023] [Accepted: 12/11/2023] [Indexed: 01/19/2024]
Abstract
Background: The importance of being physically active and avoiding staying in bed has been recognized in stroke rehabilitation. However, studies have pointed out that stroke patients admitted to rehabilitation units often spend most of their day immobile and inactive, with limited opportunities for activity outside their bedrooms. To address this issue, it is necessary to record the duration of stroke patients staying in their bedrooms, but it is impractical for medical providers to do this manually during their daily work of providing care. Although an automated approach using wearable devices and access points is more practical, implementing these access points into medical facilities is costly. However, when combined with machine learning, predicting the duration of stroke patients staying in their bedrooms is possible with reduced cost. We assessed using machine learning to estimate bedroom-stay duration using activity data recorded with wearable devices. Method: We recruited 99 stroke hemiparesis inpatients and conducted 343 measurements. Data on electrocardiograms and chest acceleration were measured using a wearable device, and the location name of the access point that detected the signal of the device was recorded. We first investigated the correlation between bedroom-stay duration measured from the access point as the objective variable and activity data measured with a wearable device and demographic information as explanatory variables. To evaluate the duration predictability, we then compared machine-learning models commonly used in medical studies. Results: We conducted 228 measurements that surpassed a 90% data-acquisition rate using Bluetooth Low Energy. Among the explanatory variables, the period spent reclining and sitting/standing were correlated with bedroom-stay duration (Spearman's rank correlation coefficient (R) of 0.56 and -0.52, p < 0.001). 
Interestingly, the sum of the motor and cognitive categories of the functional independence measure, clinical indicators of the abilities of stroke patients, lacked correlation. The correlation between the actual bedroom-stay duration and predicted one using machine-learning models resulted in an R of 0.72 and p < 0.001, suggesting the possibility of predicting bedroom-stay duration from activity data and demographics. Conclusion: Wearable devices, coupled with machine learning, can predict the duration of patients staying in their bedrooms. Once trained, the machine-learning model can predict without continuously tracking the actual location, enabling more cost-effective and privacy-centric future measurements.
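Spearman's rank correlation coefficient, used in this abstract to relate activity data to bedroom-stay duration, is the Pearson correlation computed on ranks. The sketch below shows one way to compute it in plain Python, assigning average ranks to ties. The function names and the sample values are illustrative assumptions and do not reproduce the study's data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical example: hours spent reclining vs. hours spent in the bedroom.
reclining_hours = [6.0, 8.5, 7.0, 10.0, 9.0]
bedroom_hours = [14.0, 18.0, 17.0, 21.0, 19.5]
print(spearman(reclining_hours, bedroom_hours))
```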
Affiliation(s)
- Takayuki Ogasawara
  - NTT Basic Research Laboratories and Bio-Medical Informatics Research Center, NTT Corporation, Atsugi, Japan
  - Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
- Masahiko Mukaino
  - Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
  - Department of Rehabilitation Medicine, Hokkaido University Hospital, Sapporo, Japan
- Yoshitaka Wada
  - Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
- Takuya Suzuki
  - Department of Rehabilitation Medicine, Fujita Health University Hospital, Toyoake, Japan
- Yasushi Aoshima
  - Department of Rehabilitation Medicine, Fujita Health University Hospital, Toyoake, Japan
- Shotaro Furuzawa
  - Department of Rehabilitation Medicine, Fujita Health University Hospital, Toyoake, Japan
- Yuji Kono
  - Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
  - Department of Rehabilitation Medicine, Fujita Health University Hospital, Toyoake, Japan
- Eiichi Saitoh
  - Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
- Masumi Yamaguchi
  - NTT Basic Research Laboratories and Bio-Medical Informatics Research Center, NTT Corporation, Atsugi, Japan
- Yohei Otaka
  - Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
- Shingo Tsukada
  - NTT Basic Research Laboratories and Bio-Medical Informatics Research Center, NTT Corporation, Atsugi, Japan
5. Gao W, Xie J, Ke Y, Tian M, Zeng Z, Ma X, Zhi M. A two-stage prediction filling method with support vector technologies optimized competitively in stages by grey wolf optimizer and particle swarm optimization for missing fasting blood glucose. Proc Inst Mech Eng H 2023; 237:1427-1440. [PMID: 37873735 DOI: 10.1177/09544119231206456] [Indexed: 10/25/2023]
Abstract
Missing values often affect the data utilization in epidemiological survey. In this study, according to the cut-off point value of the medical diagnostic standard of fasting blood glucose for diabetes, we divide fasting blood glucose test data from the China Health and Nutrition Survey (CHNS) of Shandong province in 2009 into two classes: the normal and the abnormal. Accordingly, for missing fasting blood glucose values, we propose a two-stage prediction filling method with optimized support vector technologies competitively by particle swarm optimization (PSO) or grey wolf optimizer (GWO), which is to first predict the class of the missing data with support vector machine (SVM) in the first stage and then predict the missing value with support vector regression (SVR) within the predicted class in the second stage. In addition, we use the LIBSVM as a gold standard to train both SVM and SVR in different stages. For two kinds of competitive optimizers in stages, in the first stage GWO has the highest classification accuracy (91.1%), and in the second stage PSO has the smallest in-class mean absolute error (0.48). So, GWO-SVM-PSO-SVR is determined as the optimal model and a predicted value with it serves as a fill value. The comparison results of the models in empirical analysis also show that it outdoes any of the other filling models in terms of mean absolute error and mean absolute percentage error. In addition, the sensitivity analysis shows that it presents high tolerance as the sample size changes and has a good stability.
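The two-stage idea in this abstract (first predict the class of the missing value, then predict the value within that class) can be illustrated with a short sketch. This is not the authors' GWO-SVM-PSO-SVR pipeline: a 1-nearest-neighbour classifier stands in for the SVM and a within-class neighbour average stands in for the SVR, so that the example runs with the standard library only. The 7.0 mmol/L cut-off, the feature columns (age, BMI), the function names, and all sample values are assumptions made for illustration.

```python
CUTOFF_MMOL_L = 7.0  # assumed diagnostic cut-off for fasting blood glucose


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def two_stage_fill(train_x, train_y, query_x, cutoff=CUTOFF_MMOL_L, k=2):
    """Fill one missing glucose value in two stages.

    Stage 1 (stand-in for SVM): assign the class of the nearest training record.
    Stage 2 (stand-in for SVR): average the k nearest records of that class.
    """
    labels = [1 if y >= cutoff else 0 for y in train_y]

    # Stage 1: predicted class = class of the single nearest neighbour.
    nearest = min(range(len(train_x)),
                  key=lambda i: squared_distance(train_x[i], query_x))
    predicted_class = labels[nearest]

    # Stage 2: regress within the predicted class only.
    same_class = [i for i, lab in enumerate(labels) if lab == predicted_class]
    same_class.sort(key=lambda i: squared_distance(train_x[i], query_x))
    neighbours = same_class[:k]
    return sum(train_y[i] for i in neighbours) / len(neighbours)


def mean_absolute_error(actual, predicted):
    """Mean absolute error, the in-class accuracy measure named in the abstract."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical records: [age in years, BMI] with observed glucose in mmol/L.
features = [[50, 22.0], [55, 23.0], [60, 30.0], [65, 31.0]]
glucose = [5.0, 5.4, 8.0, 9.0]
print(two_stage_fill(features, glucose, [62, 30.5]))
```

Restricting the second stage to the predicted class is what separates this design from a single regression over all records. A stage-one error therefore propagates into the filled value.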
Affiliation(s)
- Wenlong Gao
  - Institute of Health Statistics and Intelligent Analysis, School of Public Health, Lanzhou University, Lanzhou, Gansu, P. R. China
  - Department of Epidemiology and Health Statistics, School of Public Health, Lanzhou University, Lanzhou, Gansu, P. R. China
- Jingxiang Xie
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, Gansu, P. R. China
- Yongsong Ke
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, Gansu, P. R. China
- Maoyun Tian
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, Gansu, P. R. China
- Zhimei Zeng
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, Gansu, P. R. China
- Xiaojie Ma
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, Gansu, P. R. China
- Minqian Zhi
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, Gansu, P. R. China
6. Dhingra LS, Shen M, Mangla A, Khera R. Cardiovascular Care Innovation through Data-Driven Discoveries in the Electronic Health Record. Am J Cardiol 2023; 203:136-148. [PMID: 37499593 PMCID: PMC10865722 DOI: 10.1016/j.amjcard.2023.06.104] [Received: 02/28/2023] [Revised: 05/24/2023] [Accepted: 06/29/2023] [Indexed: 07/29/2023]
Abstract
The electronic health record (EHR) represents a rich source of patient information, increasingly being leveraged for cardiovascular research. Although its primary use remains the seamless delivery of health care, the various longitudinally aggregated structured and unstructured data elements for each patient within the EHR can define the computational phenotypes of disease and care signatures and their association with outcomes. Although structured data elements, such as demographic characteristics, laboratory measurements, problem lists, and medications, are easily extracted, unstructured data are underused. The latter include free text in clinical narratives, documentation of procedures, and reports of imaging and pathology. Rapid scaling up of data storage and rapid innovation in natural language processing and computer vision can power insights from unstructured data streams. However, despite an array of opportunities for research using the EHR, specific expertise is necessary to adequately address confidentiality, accuracy, completeness, and heterogeneity challenges in EHR-based research. These often require methodological innovation and best practices to design and conduct successful research studies. Our review discusses these challenges and their proposed solutions. In addition, we highlight the ongoing innovations in federated learning in the EHR through a greater focus on common data models and discuss ongoing work that defines such an approach to large-scale, multicenter, federated studies. Such parallel improvements in technology and research methods enable innovative care and optimization of patient outcomes.
Affiliation(s)
- Miles Shen
  - Section of Cardiovascular Medicine, Department of Internal Medicine; Department of Internal Medicine
- Anjali Mangla
  - Section of Cardiovascular Medicine, Department of Internal Medicine; Department of Neuroscience, Yale School of Medicine, New Haven, Connecticut
- Rohan Khera
  - Section of Cardiovascular Medicine, Department of Internal Medicine; Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, Connecticut; Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut; Section of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut
7. Tsai YH, Hung KY, Fang WF. Use of Peak Glucose Level and Peak Glycemic Gap in Mortality Risk Stratification in Critically Ill Patients with Sepsis and Prior Diabetes Mellitus of Different Body Mass Indexes. Nutrients 2023; 15:3973. [PMID: 37764757 PMCID: PMC10534504 DOI: 10.3390/nu15183973] [Received: 08/24/2023] [Revised: 09/12/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
Sepsis remains a critical concern in healthcare, and its management is complicated when patients have pre-existing diabetes and varying body mass indexes (BMIs). This retrospective multicenter observational study, encompassing data from 15,884 sepsis patients admitted between 2012 and 2017, investigates the relationship between peak glucose levels and peak glycemic gap in the first 3 days of ICU admission, and their impact on mortality. The study reveals that maintaining peak glucose levels between 141-220 mg/dL is associated with improved survival rates in sepsis patients with diabetes. Conversely, peak glycemic gaps exceeding 146 mg/dL are linked to poorer survival outcomes. Patients with peak glycemic gaps below -73 mg/dL also experience inferior survival rates. In terms of predicting mortality, modified Sequential Organ Failure Assessment-Peak Glycemic Gap (mSOFA-pgg) scores outperform traditional SOFA scores by 6.8% for 90-day mortality in overweight patients. Similarly, the modified SOFA-Peak Glucose (mSOFA-pg) score demonstrates a 17.2% improvement over the SOFA score for predicting 28-day mortality in underweight patients. Importantly, both mSOFA-pg and mSOFA-pgg scores exhibit superior predictive power compared to traditional SOFA scores for patients at high nutritional risk. These findings underscore the importance of glycemic control in sepsis management and highlight the potential utility of the mSOFA-pg and mSOFA-pgg scores in predicting mortality risk, especially in patients with diabetes and varying nutritional statuses.
Affiliation(s)
- Yi-Hsuan Tsai
  - Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 83301, Taiwan
- Kai-Yin Hung
  - Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 83301, Taiwan
  - Department of Nutritional Therapy, Kaohsiung Chang Gung Memorial Hospital, Kaohsiung 83301, Taiwan
  - Department of Nursing, Mei Ho University, Pingtung 91202, Taiwan
- Wen-Feng Fang
  - Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 83301, Taiwan
  - Department of Respiratory Therapy, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 83301, Taiwan
  - Department of Respiratory Care, Chang Gung University of Science and Technology, Chiayi 61363, Taiwan
8. Lee JM, Hauskrecht M. Personalized event prediction for Electronic Health Records. Artif Intell Med 2023; 143:102620. [PMID: 37673563 PMCID: PMC10503594 DOI: 10.1016/j.artmed.2023.102620] [Received: 05/11/2022] [Revised: 03/01/2023] [Accepted: 04/24/2023] [Indexed: 09/08/2023]
Abstract
Clinical event sequences consist of hundreds of clinical events that represent records of patient care in time. Developing accurate predictive models of such sequences is of a great importance for supporting a variety of models for interpreting/classifying the current patient condition, or predicting adverse clinical events and outcomes, all aimed to improve patient care. One important challenge of learning predictive models of clinical sequences is their patient-specific variability. Based on underlying clinical conditions, each patient's sequence may consist of different sets of clinical events (observations, lab results, medications, procedures). Hence, simple population-wide models learned from event sequences for many different patients may not accurately predict patient-specific dynamics of event sequences and their differences. To address the problem, we propose and investigate multiple new event sequence prediction models and methods that let us better adjust the prediction for individual patients and their specific conditions. The methods developed in this work pursue refinement of population-wide models to subpopulations, self-adaptation, and a meta-level model switching that is able to adaptively select the model with the best chance to support the immediate prediction. We analyze and test the performance of these models on clinical event sequences of patients in MIMIC-III database.
Affiliation(s)
- Jeong Min Lee
  - Department of Computer Science, University of Pittsburgh, Pittsburgh, PA, USA
- Milos Hauskrecht
  - Department of Computer Science, University of Pittsburgh, Pittsburgh, PA, USA
9. Schleicher M, Unnikrishnan V, Pryss R, Schobel J, Schlee W, Spiliopoulou M. Prediction meets time series with gaps: User clusters with specific usage behavior patterns. Artif Intell Med 2023; 142:102575. [PMID: 37316098 DOI: 10.1016/j.artmed.2023.102575] [Received: 01/13/2023] [Revised: 03/25/2023] [Accepted: 04/27/2023] [Indexed: 06/16/2023]
Abstract
With mHealth apps, data can be recorded in real life, which makes them useful, for example, as an accompanying tool in treatments. However, such datasets, especially those based on apps with usage on a voluntary basis, are often affected by fluctuating engagement and by high user dropout rates. This makes it difficult to exploit the data using machine learning techniques and raises the question of whether users have stopped using the app. In this extended paper, we present a method to identify phases with varying dropout rates in a dataset and predict for each. We also present an approach to predict what period of inactivity can be expected for a user in the current state. We use change point detection to identify the phases, show how to deal with uneven misaligned time series and predict the user's phase using time series classification. In addition, we examine how the evolution of adherence develops in individual clusters of individuals. We evaluated our method on the data of an mHealth app for tinnitus, and show that our approach is appropriate for the study of adherence in datasets with uneven, unaligned time series of different lengths and with missing values.
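Change point detection, which this abstract uses to separate phases with different dropout rates, can be illustrated in its simplest form: find the single split that best divides a series into two segments with different means. The sketch below is a generic least-squares version and is not the authors' method. The function names and the weekly adherence values are illustrative assumptions.

```python
def segment_cost(segment):
    """Sum of squared deviations of a segment from its own mean."""
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment)


def best_change_point(series):
    """Index k that minimises the cost of splitting into series[:k] and series[k:].

    Assumes exactly one change in mean; real usage data may contain several
    phases and would need a multi-change-point method.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    best_k, best_cost = None, float("inf")
    for k in range(1, len(series)):
        cost = segment_cost(series[:k]) + segment_cost(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical weekly share of users still active in an app.
weekly_active_share = [0.90, 0.80, 0.85, 0.30, 0.20, 0.25]
print(best_change_point(weekly_active_share))
```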
Affiliation(s)
- Miro Schleicher
  - Knowledge Management & Discovery Lab, Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
- Vishnu Unnikrishnan
  - Knowledge Management & Discovery Lab, Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
- Rüdiger Pryss
  - Institute of Clinical Epidemiology and Biometry, University of Würzburg, Würzburg, Germany
- Johannes Schobel
  - Institute DigiHealth, Neu-Ulm University of Applied Sciences, Neu-Ulm, Germany
- Winfried Schlee
  - Eastern Switzerland University of Applied Sciences, St. Gallen, Switzerland
- Myra Spiliopoulou
  - Knowledge Management & Discovery Lab, Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
10. Liu Z, Chen C, Ma Q. Category-aware optimal transport for incomplete data classification. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.03.107] [Indexed: 03/29/2023]
11
|
Zhou Y, Shi J, Stein R, Liu X, Baldassano RN, Forrest CB, Chen Y, Huang J. Missing data matter: an empirical evaluation of the impacts of missing EHR data in comparative effectiveness research. J Am Med Inform Assoc 2023; 30:1246-1256. [PMID: 37337922 PMCID: PMC10280351 DOI: 10.1093/jamia/ocad066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Revised: 03/20/2023] [Accepted: 04/08/2023] [Indexed: 06/21/2023] Open
Abstract
OBJECTIVES The impacts of missing data in comparative effectiveness research (CER) using electronic health records (EHRs) may vary depending on the type and pattern of missing data. In this study, we aimed to quantify these impacts and compare the performance of different imputation methods. MATERIALS AND METHODS We conducted an empirical (simulation) study to quantify the bias and power loss in estimating treatment effects in CER using EHR data. We considered various missing scenarios and used the propensity scores to control for confounding. We compared the performance of the multiple imputation and spline smoothing methods to handle missing data. RESULTS When missing data depended on the stochastic progression of disease and medical practice patterns, the spline smoothing method produced results that were close to those obtained when there were no missing data. Compared to multiple imputation, the spline smoothing generally performed similarly or better, with smaller estimation bias and less power loss. The multiple imputation can still reduce study bias and power loss in some restrictive scenarios, eg, when missing data did not depend on the stochastic process of disease progression. DISCUSSION AND CONCLUSION Missing data in EHRs could lead to biased estimates of treatment effects and false negative findings in CER even after missing data were imputed. It is important to leverage the temporal information of disease trajectory to impute missing values when using EHRs as a data resource for CER and to consider the missing rate and the effect size when choosing an imputation method.
Affiliation(s)
- Yizhao Zhou
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Jiasheng Shi
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Ronen Stein
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Xiaokang Liu
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Robert N Baldassano
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Christopher B Forrest
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Yong Chen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Jing Huang
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
|
12
|
Ogasawara T, Mukaino M, Matsuura H, Aoshima Y, Suzuki T, Togo H, Nakashima H, Saitoh E, Yamaguchi M, Otaka Y, Tsukada S. Ensemble averaging for categorical variables: Validation study of imputing lost data in 24-h recorded postures of inpatients. Front Physiol 2023; 14:1094946. [PMID: 36776969 PMCID: PMC9910696 DOI: 10.3389/fphys.2023.1094946] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/10/2022] [Accepted: 01/06/2023] [Indexed: 01/27/2023] Open
Abstract
Acceleration sensors are widely used in consumer wearable devices and smartphones. Postures estimated from recorded accelerations are commonly used as features indicating the activities of patients in medical studies. However, recording for over 24 h is more likely to result in data losses than recording for a few hours, especially when consumer-grade wearable devices are used. Here, to impute postures over a period of 24 h, we propose an imputation method that uses ensemble averaging. This method outputs a time series of postures over 24 h with less lost data by calculating the ratios of postures taken at the same time of day during several measurement-session days. Whereas conventional imputation methods are based on approaches with groups of subjects having multiple variables, the proposed method imputes the lost data variables individually and does not require other variables except posture. We validated the method on 306 measurements from 99 stroke inpatients in a hospital rehabilitation ward. First, to classify postures from acceleration data measured by a wearable sensor placed on the patient's trunk, we preliminarily estimated possible thresholds for classifying postures as 'reclining' and 'sitting or standing' by investigating the valleys in the histogram of occurrences of trunk angles during a long-term recording. Next, the imputations of the proposed method were validated. The proposed method significantly reduced the missing data rate from 5.76% to 0.21%, outperforming a conventional method.
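The ensemble-averaging idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of `None` as the marker for a lost sample, and the representation of each day as a list of equally spaced time slots are all assumptions made here.

```python
from collections import Counter

MISSING = None  # assumed marker for a lost sample (hypothetical convention)


def ensemble_average(days):
    """Return, per time-of-day slot, the ratio of each posture over all days.

    `days` is a list of equal-length lists; each inner list holds one
    posture label (or MISSING) per time slot of a single measurement day.
    A slot stays None only if it was lost on every day.
    """
    n_slots = len(days[0])
    if any(len(day) != n_slots for day in days):
        raise ValueError("all days must have the same number of time slots")

    profile = []
    for slot in range(n_slots):
        # Collect the postures actually observed at this time of day.
        observed = [day[slot] for day in days if day[slot] is not MISSING]
        if not observed:
            profile.append(None)
            continue
        counts = Counter(observed)
        profile.append({label: n / len(observed) for label, n in counts.items()})
    return profile


def missing_rate(profile):
    """Fraction of time slots for which no day provided a sample."""
    return sum(entry is None for entry in profile) / len(profile)
```

Because each slot is averaged only over the days on which it was observed, the combined 24-h profile has a gap only where every day has a gap, which is why the missing rate drops as more measurement days are pooled.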
Affiliation(s)
- Takayuki Ogasawara
- NTT Basic Research Laboratories and Bio-Medical Informatics Research Center, NTT Corporation, Atsugi, Japan (corresponding author)
| | - Masahiko Mukaino
- Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan; Department of Rehabilitation Medicine, Hokkaido University Hospital, Sapporo, Japan
| | - Hirotaka Matsuura
- Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
| | - Yasushi Aoshima
- Department of Rehabilitation, Fujita Health University Hospital, Toyoake, Japan
| | - Takuya Suzuki
- Department of Rehabilitation, Fujita Health University Hospital, Toyoake, Japan
| | - Hiroyoshi Togo
- NTT Device Innovation Center, NTT Corporation, Atsugi, Japan
| | - Hiroshi Nakashima
- NTT Basic Research Laboratories and Bio-Medical Informatics Research Center, NTT Corporation, Atsugi, Japan
| | - Eiichi Saitoh
- Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
| | - Masumi Yamaguchi
- NTT Basic Research Laboratories and Bio-Medical Informatics Research Center, NTT Corporation, Atsugi, Japan
| | - Yohei Otaka
- Department of Rehabilitation Medicine I, School of Medicine, Fujita Health University, Toyoake, Japan
| | - Shingo Tsukada
- NTT Basic Research Laboratories and Bio-Medical Informatics Research Center, NTT Corporation, Atsugi, Japan
|
13
|
Mohammed H, Wang K, Wu H, Wang G. Subject-wise model generalization through pooling and patching for regression: Application on non-invasive systolic blood pressure estimation. Comput Biol Med 2022; 151:106299. [PMID: 36423530 DOI: 10.1016/j.compbiomed.2022.106299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/01/2022] [Revised: 10/19/2022] [Accepted: 11/06/2022] [Indexed: 11/13/2022]
Abstract
BACKGROUND Subject-wise modeling using machine learning is useful in many applications requiring low error and complexity, such as wearable medical devices. However, regression accuracy depends highly on the data available to train the model and the model's generalization ability. The prediction error may increase severely when the model is tested on unknown data patterns; such a model is known to be overfitted. In medicine-related applications, such as Non-Invasive Blood Pressure (NIBP) estimation, the high error renders the estimation model useless and dangerous. METHODS This paper presents a novel algorithm to handle overfitting by editing the training data to achieve generalization for subject-wise models. The pooling and patching (PaP) algorithms use a relatively short record segment of a subject as a Key-Segment (KS) to search through a larger dataset for similar subjects. Then samples taken from the matched subjects' pool records are used to patch the original subject's KS. Due to the significance of systolic blood pressure (SBP) and the complexity of its variability, non-invasive estimation of SBP from electrocardiography (ECG) and photoplethysmography (PPG) is introduced as an application to assess the algorithm. The study was performed on 2051 subjects with a wide range of age, height, weight, length, and health status. The subjects' records were taken from a large public dataset, VitalDB, which is acquired from subjects undergoing different surgeries. Finally, all the results are obtained without using other model generalization techniques. RESULTS The generalization effect of the proposed algorithm, PaP, significantly outperformed cross-validation, which is widely used in regression model generalization. Moreover, the testing results show that a KS of 200 to 2000 samples is sufficient for providing high accuracy for much longer testing data of about 12000 to 24000 samples long, which is less than 10% of the record length on average.
Furthermore, compared to other works based on the same dataset, PaP provides a significantly lower mean error of -0.75 ± 5.51 mmHg, with a small training data portion of 15% over 2051 subjects.
Affiliation(s)
- Hazem Mohammed
- Department of Micro/Nano Electronics, School of Electrical, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Electrical Engineering Department, Faculty of Engineering, Assuit University, Asyut, Egypt.
| | - Kai Wang
- Zhiyuan College, Shanghai Jiao Tong University, Shanghai, China
| | - Hao Wu
- College of Electronics and Information Engineering, Shenzhen University, Shenzhen, Guangdong, China.
| | - Guoxing Wang
- Department of Micro/Nano Electronics, School of Electrical, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China.
|
14
|
Spyreli E, McGowan L, Heery E, Kelly A, Croker H, Lawlor C, O'Neill R, Kelleher CC, McCarthy M, Wall P, Heinen MM. Public beliefs about the consequences of living with obesity in the Republic of Ireland and Northern Ireland. BMC Public Health 2022; 22:1910. [PMID: 36229815 PMCID: PMC9559245 DOI: 10.1186/s12889-022-14280-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Accepted: 09/30/2022] [Indexed: 11/17/2022] Open
Abstract
Background This study aimed to capture public beliefs about living with obesity, examine how these beliefs have changed over time and to explore whether certain characteristics were associated with them in a nationally representative sample of adults from the Republic of Ireland (RoI) and Northern Ireland (NI). Methods A cross-sectional survey employed a random quota sampling approach to recruit a nationally representative sample of 1046 adults across NI and RoI. Telephone interviews captured information on demographics; health behaviours & attitudes; and beliefs about the consequences of obesity (measured using the Obesity Beliefs Scale). Univariable analyses compared beliefs about the consequences of living with obesity between participants with a self-reported healthy weight and those living with overweight or obesity, and non-responders (those for whom weight status could not be ascertained due to missing data). Multiple linear regression examined associations between obesity-related beliefs and socio-demographics, self-rated health and perceived ability to change health behaviours. Multiple linear regression also compared changes in obesity-related beliefs between 2013 and 2020 in the RoI. Results Higher endorsement of the negative outcomes of obesity was significantly associated with living with a healthy weight, higher self-rated health, dietary quality and perceived ability to improve diet and physical activity. Those who lived with overweight, with obesity and non-responders were less likely to endorse the negative consequences of obesity. Those living with obesity and non-responders were also more likely to agree that there is an increased cost and effort in maintaining a healthy weight. Comparison with survey data from 2013 showed that currently, there is a greater endorsement of the health benefits of maintaining a healthy weight (p < 0.001), but also of the increased costs associated with it (p < 0.001).
Conclusion Beliefs about the consequences of maintaining a healthy body weight are associated with individuals’ weight, self-rated health, diet and perceived ease of adoption of dietary and exercise-related improvements. Beliefs about the health risks of obesity and perceived greater costs associated with maintaining a healthy weight appear to have strengthened over time. Present findings are pertinent to researchers and policy makers involved in the design and framing of interventions to address obesity. Supplementary information The online version contains supplementary material available at 10.1186/s12889-022-14280-9.
Affiliation(s)
- Eleni Spyreli
- Centre for Public Health, School of Medicine, Dentistry and Biomedical Science, Queen's University Belfast, Belfast, UK.
| | - L McGowan
- Centre for Public Health, School of Medicine, Dentistry and Biomedical Science, Queen's University Belfast, Belfast, UK
| | - E Heery
- Library and Research Service, Oireachtas, Houses of the Oireachtas Service, Dublin, Ireland
| | - A Kelly
- Centre for Public Health, School of Medicine, Dentistry and Biomedical Science, Queen's University Belfast, Belfast, UK
| | - H Croker
- Population, Policy and Practice Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London, UK
| | - C Lawlor
- National Nutrition Surveillance Centre, University College Dublin, Dublin, Ireland
| | - R O'Neill
- Centre for Public Health, School of Medicine, Dentistry and Biomedical Science, Queen's University Belfast, Belfast, UK
| | - C C Kelleher
- National Nutrition Surveillance Centre, University College Dublin, Dublin, Ireland
| | - M McCarthy
- Cork University Business School, University College Cork, Cork, Ireland
| | - P Wall
- National Nutrition Surveillance Centre, University College Dublin, Dublin, Ireland
| | - M M Heinen
- National Nutrition Surveillance Centre, University College Dublin, Dublin, Ireland
|
15
|
Duan F, Yang Y. Recognizing Missing Electromyography Signal by Data Split Reorganization Strategy and Weight-Based Multiple Neural Network Voting Method. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:2070-2079. [PMID: 34460399 DOI: 10.1109/tnnls.2021.3105595] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Surface electromyography (sEMG) signals have been applied widely in prosthetic hand controlling. In the sEMG signal acquisition, wireless devices bring convenience, but also introduce signal missing due to interference or failure during data transmission. The missing signal may only last for tens of milliseconds, but have a great impact on the recognition. Researchers have employed various methods to complete missing sEMG data, but the completed signal may not totally fit the origins, and more extra calculation time will be spent. When recognizing hand gestures by sEMG from few sensors, to recognize the slightly or not serious signal missing, this study proposed a data split reorganization (DSR) strategy and a weight-based multiple neural network voting (WMV) method. To validate the proposed methods, controllable missing sEMG signals are generated artificially. Three time domain features are extracted based on non-overlapping sliding windows. The DSR is employed to make full use of the features, and then the WMV is utilized to recognize them. Nine subjects participated in the experiments, and the results indicate that the accuracy of the proposed methods is higher. For 5%, 10%, and 15% data missing ratios, the accuracy is 93.66%, 92.55%, and 91.19%, respectively. The Wilcoxon signed-rank test also demonstrates that these results are significantly superior to the situations in which the proposed methods are not applied. In the future, we will optimize the proposed methods to recognize the seriously missing sEMG signal.
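Two building blocks mentioned in this abstract, feature extraction over non-overlapping sliding windows and weight-based voting across several classifiers, can be sketched as below. The abstract does not name the three time-domain features, so mean absolute value, waveform length and zero crossings are assumed here purely for illustration; the data split reorganization step is not reproduced, and all function names are hypothetical.

```python
def windows(signal, size):
    """Split a signal into non-overlapping windows; a trailing partial window is dropped."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def mean_absolute_value(window):
    return sum(abs(x) for x in window) / len(window)


def waveform_length(window):
    # Cumulative absolute change between consecutive samples.
    return sum(abs(b - a) for a, b in zip(window, window[1:]))


def zero_crossings(window):
    # Count sign changes between consecutive samples.
    return sum(1 for a, b in zip(window, window[1:]) if a * b < 0)


def extract_features(signal, size):
    """One (MAV, WL, ZC) tuple per non-overlapping window (assumed feature set)."""
    return [
        (mean_absolute_value(w), waveform_length(w), zero_crossings(w))
        for w in windows(signal, size)
    ]


def weighted_vote(predictions, weights):
    """Return the label whose voters carry the largest total weight."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)
```

In a setting like the one described, each entry of `predictions` would come from one neural network and each weight from some measure of that network's reliability; how the weights are actually derived in the paper is not stated in the abstract.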
|
16
|
When Can I Expect the mHealth User to Return? Prediction Meets Time Series with Gaps. Artif Intell Med 2022. [DOI: 10.1007/978-3-031-09342-5_30] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
17
|
Steif J, Brant R, Sreepada RS, West N, Murthy S, Görges M. Prediction Model Performance With Different Imputation Strategies: A Simulation Study Using a North American ICU Registry. Pediatr Crit Care Med 2022; 23:e29-e44. [PMID: 34560774 PMCID: PMC8719509 DOI: 10.1097/pcc.0000000000002835] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/26/2022]
Abstract
OBJECTIVES To evaluate the performance of pragmatic imputation approaches when estimating model coefficients using datasets with varying degrees of data missingness. DESIGN Performance in predicting observed mortality in a registry dataset was evaluated using simulations of two simple logistic regression models with age-specific criteria for abnormal vital signs (mentation, systolic blood pressure, respiratory rate, WBC count, heart rate, and temperature). Starting with a dataset with complete information, increasing degrees of biased missingness of WBC and mentation were introduced, depending on the values of temperature and systolic blood pressure, respectively. Missing data approaches evaluated included analysis of complete cases only, assuming missing data are normal, and multiple imputation by chained equations. Percent bias and root mean square error, in relation to parameter estimates obtained from the original data, were evaluated as performance indicators. SETTING Data were obtained from the Virtual Pediatric Systems, LLC, database (Los Angeles, CA), which provides clinical markers and outcomes in prospectively collected records from 117 PICUs in the United States and Canada. PATIENTS Children admitted to a participating PICU in 2017, for whom all required data were available. INTERVENTIONS None. MEASUREMENTS AND MAIN RESULTS Simulations demonstrated that multiple imputation by chained equations is an effective strategy and that even a naive implementation of multiple imputation by chained equations significantly outperforms traditional approaches: the root mean square error for model coefficients was lower using multiple imputation by chained equations in 90 of 99 of all simulations (91%) compared with discarding cases with missing data and lower in 97 of 99 (98%) compared with models assuming missing values are in the normal range. Assuming missing data to be abnormal was inferior to all other approaches. 
CONCLUSIONS Analyses of large observational studies are likely to encounter the issue of missing data, which are likely not missing at random. Researchers should always consider multiple imputation by chained equations (or similar imputation approaches) when encountering even only small proportions of missing data in their work.
Affiliation(s)
- Jonathan Steif
- Department of Statistics, University of British Columbia, Vancouver, BC, Canada
| | - Rollin Brant
- Department of Statistics, University of British Columbia, Vancouver, BC, Canada
- Research Institute, BC Children's Hospital, Vancouver, BC, Canada
| | - Rama Syamala Sreepada
- Research Institute, BC Children's Hospital, Vancouver, BC, Canada
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, BC, Canada
| | - Nicholas West
- Research Institute, BC Children's Hospital, Vancouver, BC, Canada
| | - Srinivas Murthy
- Research Institute, BC Children's Hospital, Vancouver, BC, Canada
- Department of Pediatrics, Division of Critical Care, University of British Columbia, Vancouver, BC, Canada
| | - Matthias Görges
- Research Institute, BC Children's Hospital, Vancouver, BC, Canada
- Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, BC, Canada
|
18
|
Wang S, Celebi ME, Zhang YD, Yu X, Lu S, Yao X, Zhou Q, Miguel MG, Tian Y, Gorriz JM, Tyukin I. Advances in Data Preprocessing for Biomedical Data Fusion: An Overview of the Methods, Challenges, and Prospects. INFORMATION FUSION 2021; 76:376-421. [DOI: 10.1016/j.inffus.2021.07.001] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Indexed: 08/30/2023]
|
19
|
An Ensemble Method for Missing Data of Environmental Sensor Considering Univariate and Multivariate Characteristics. SENSORS 2021; 21:s21227595. [PMID: 34833670 PMCID: PMC8621076 DOI: 10.3390/s21227595] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/11/2021] [Revised: 11/12/2021] [Accepted: 11/15/2021] [Indexed: 11/25/2022]
Abstract
With rapid urbanization, awareness of environmental pollution is growing rapidly and, accordingly, interest in environmental sensors that measure atmospheric and indoor air quality is increasing. Since these IoT-based environmental sensors are sensitive and value reliability, it is essential to deal with missing values, which are one of the causes of reliability problems. Characteristics that can be used to impute missing values in environmental sensors are the time dependency of single variables and the correlation between multivariate variables. However, in the existing method of imputing missing values, only one characteristic has been used and there has been no case where both characteristics were used. In this work, we introduced a new ensemble imputation method reflecting this. First, the cases in which missing values occur frequently were divided into four cases and were generated into the experimental data: communication error (aperiodic, periodic), sensor error (rapid change, measurement range). To compare the existing method with the proposed method, five methods of univariate imputation and five methods of multivariate imputation—both of which are widely used—were used as a single model to predict missing values for the four cases. The values predicted by a single model were applied to the ensemble method. Among the ensemble methods, the weighted average and stacking methods were used to derive the final predicted values and replace the missing values. Finally, the predicted values, substituted with the original data, were evaluated by a comparison between the mean absolute error (MAE) and the root mean square error (RMSE). The proposed ensemble method generally performed better than the single method. In addition, this method simultaneously considers the correlation between variables and time dependence, which are characteristics that must be considered in the environmental sensor. 
As a result, our proposed ensemble technique can contribute to the replacement of the missing values generated by environmental sensors, which can help to increase the reliability of environmental sensor data.
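The weighted-average variant of the ensemble described in this abstract, together with the two error measures it reports, can be illustrated with a short sketch. The inverse-error weighting rule and the function names are assumptions made for this example; the abstract does not say how the weights were chosen, and the stacking variant is not shown.

```python
import math


def mae(truth, predicted):
    """Mean absolute error between true and imputed values."""
    return sum(abs(t - p) for t, p in zip(truth, predicted)) / len(truth)


def rmse(truth, predicted):
    """Root mean square error between true and imputed values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, predicted)) / len(truth))


def inverse_error_weights(errors):
    """Normalized weights proportional to 1/error (assumed weighting rule)."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be strictly positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [value / total for value in inverse]


def weighted_average(predictions, weights):
    """Combine the imputations of several single models value by value."""
    n_values = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions))
        for i in range(n_values)
    ]
```

A typical use would be to mask some known sensor readings artificially, impute them with one univariate and one multivariate model, derive weights from each model's error on those masked values, and then apply the weighted average to the genuinely missing readings.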
|
20
|
Caudai C, Galizia A, Geraci F, Le Pera L, Morea V, Salerno E, Via A, Colombo T. AI applications in functional genomics. Comput Struct Biotechnol J 2021; 19:5762-5790. [PMID: 34765093 PMCID: PMC8566780 DOI: 10.1016/j.csbj.2021.10.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 04/16/2021] [Revised: 10/05/2021] [Accepted: 10/05/2021] [Indexed: 12/13/2022] Open
Abstract
We review the current applications of artificial intelligence (AI) in functional genomics. The recent explosion of AI follows the remarkable achievements made possible by "deep learning", along with a burst of "big data" that can meet its hunger. Biology is about to overthrow astronomy as the paradigmatic representative of big data producer. This has been made possible by huge advancements in the field of high throughput technologies, applied to determine how the individual components of a biological system work together to accomplish different processes. The disciplines contributing to this bulk of data are collectively known as functional genomics. They consist in studies of: i) the information contained in the DNA (genomics); ii) the modifications that DNA can reversibly undergo (epigenomics); iii) the RNA transcripts originated by a genome (transcriptomics); iv) the ensemble of chemical modifications decorating different types of RNA transcripts (epitranscriptomics); v) the products of protein-coding transcripts (proteomics); and vi) the small molecules produced from cell metabolism (metabolomics) present in an organism or system at a given time, in physiological or pathological conditions. After reviewing main applications of AI in functional genomics, we discuss important accompanying issues, including ethical, legal and economic issues and the importance of explainability.
Affiliation(s)
- Claudia Caudai
- CNR, Institute of Information Science and Technologies “A. Faedo” (ISTI), Pisa, Italy
| | - Antonella Galizia
- CNR, Institute of Applied Mathematics and Information Technologies (IMATI), Genoa, Italy
| | - Filippo Geraci
- CNR, Institute for Informatics and Telematics (IIT), Pisa, Italy
| | - Loredana Le Pera
- CNR, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Bari, Italy
- CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
| | - Veronica Morea
- CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
| | - Emanuele Salerno
- CNR, Institute of Information Science and Technologies “A. Faedo” (ISTI), Pisa, Italy
| | - Allegra Via
- CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
| | - Teresa Colombo
- CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
|
21
|
Shahpari M, Hajji M, Mirnajafi-Zadeh J, Setoodeh P. Modeling plasticity during epileptogenesis by long short term memory neural networks. Cogn Neurodyn 2021; 16:401-409. [PMID: 35401870 PMCID: PMC8934824 DOI: 10.1007/s11571-021-09698-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2020] [Revised: 05/30/2021] [Accepted: 07/07/2021] [Indexed: 10/20/2022] Open
Abstract
Understanding the pathogenesis of epilepsy including changes in synaptic pathways can improve our knowledge about epilepsy and development of new treatments. In this regard, data-driven models such as artificial neural networks, which are able to capture the effects of synaptic plasticity, can play an important role. This paper proposes long short term memory (LSTM) as the ideal architecture for modeling plasticity changes, and validates this proposal via experimental data. As a special class of recurrent neural networks (RNNs), LSTM is able to track information through time and control its flow via several gating mechanisms, which allow for maintaining the relevant and forgetting the irrelevant information. In our experiments, potentiation and depotentiation of motor circuit and perforant pathway as two forms of plasticity were respectively induced by kindled and kindled + transcranial magnetic stimulation of animal groups. In kindling, both procedure duration and gradual synaptic changes play critical roles. The stimulation of both groups continued for six days. Both after-discharge (AD) and seizure behavior as two biologically measurable effects of plasticity were recorded immediately after each stimulation. Three classes of artificial neural networks (LSTM, RNN, and feedforward neural network (FFNN)) were trained to predict AD and seizure behavior as indicators of plasticity during these six days. Results obtained from the collected data confirm the superiority of LSTM. For seizure behavior, the prediction accuracies achieved by these three models were 0.91 ± 0.01, 0.77 ± 0.02, and 0.59 ± 0.02, respectively, and for AD, the prediction accuracies were 0.82 ± 0.01, 0.74 ± 0.08 and 0.42 ± 0.1, respectively.
|
22
|
E Moura FS, Amin K, Ekwobi C. Artificial intelligence in the management and treatment of burns: a systematic review. BURNS & TRAUMA 2021; 9:tkab022. [PMID: 34423054 PMCID: PMC8375569 DOI: 10.1093/burnst/tkab022] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/25/2020] [Revised: 03/08/2021] [Accepted: 04/30/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND Artificial intelligence (AI) is an innovative field with potential for improving burn care. This article provides an updated review on machine learning in burn care and discusses future challenges and the role of healthcare professionals in the successful implementation of AI technologies. METHODS A systematic search was carried out on MEDLINE, Embase and PubMed databases for English-language articles studying machine learning in burns. Articles were reviewed quantitatively and qualitatively for clinical applications, key features, algorithms, outcomes and validation methods. RESULTS A total of 46 observational studies were included for review. Assessment of burn depth (n = 26), support vector machines (n = 19) and 10-fold cross-validation (n = 11) were the most common application, algorithm and validation tool used, respectively. CONCLUSION AI should be incorporated into clinical practice as an adjunct to the experienced burns provider once direct comparative analysis to current gold standards outlining its benefits and risks have been studied. Future considerations must include the development of a burn-specific common framework. Authors should use common validation tools to allow for effective comparisons. Level I/II evidence is required to produce robust proof about clinical and economic impacts.
Affiliation(s)
- Kavit Amin
- Department of Plastic Surgery, Manchester University NHS Foundation Trust, UK
- Department of Plastic Surgery, Lancashire Teaching Hospitals NHS Foundation Trust, Royal Preston Hospital, Preston, UK
- Chidi Ekwobi
- Department of Plastic Surgery, Lancashire Teaching Hospitals NHS Foundation Trust, Royal Preston Hospital, Preston, UK
|
23
|
Arfat Y, Mittone G, Esposito R, Cantalupo B, DE Ferrari GM, Aldinucci M. A review of machine learning for cardiology. Minerva Cardiol Angiol 2021; 70:75-91. [PMID: 34338485 DOI: 10.23736/s2724-5683.21.05709-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/08/2022]
Abstract
This paper reviews recent cardiology literature and reports how Artificial Intelligence Tools (specifically, Machine Learning techniques) are being used by physicians in the field. Each technique is introduced with enough details to allow the understanding of how it works and its intent, but without delving into details that do not add immediate benefits and require expertise in the field. We specifically focus on the principal Machine Learning based risk scores used in cardiovascular research. After introducing them and summarizing their assumptions and biases, we discuss their merits and shortcomings. We report on how frequently they are adopted in the field and suggest why this is the case based on our expertise in Machine Learning. We complete the analysis by reviewing how corresponding statistical approaches compare with them. Finally, we discuss the main open issues in applying Machine Learning tools to cardiology tasks, also drafting possible future directions. Despite the growing interest in these tools, we argue that there are many still underutilized techniques: while Neural Networks are slowly being incorporated in cardiovascular research, other important techniques such as Semi-Supervised Learning and Federated Learning are still underutilized. The former would allow practitioners to harness the information contained in large datasets that are only partially labeled, while the latter would foster collaboration between institutions allowing building larger and better models.
Affiliation(s)
- Yasir Arfat
- Computer Science Department, University of Turin, Turin, Italy
- Gaetano M DE Ferrari
- Division of Cardiology, Cardiovascular and Thoracic Department, Città della Salute e della Scienza, Turin, Italy; Cardiology, Department of Medical Sciences, University of Turin, Turin, Italy
- Marco Aldinucci
- Computer Science Department, University of Turin, Turin, Italy
|
24
|
O'Hara C, Gibney ER. Meal Pattern Analysis in Nutritional Science: Recent Methods and Findings. Adv Nutr 2021; 12:1365-1378. [PMID: 33460431 PMCID: PMC8321870 DOI: 10.1093/advances/nmaa175] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 10/20/2020] [Revised: 12/01/2020] [Accepted: 12/15/2020] [Indexed: 11/14/2022] Open
Abstract
There is a scarcity of dietary intake research focusing on the intake of whole meals rather than on the nutrients and foods of which those meals are composed. This growing area of research has recently begun to utilize advanced statistical techniques to manage the large number of variables and permutations associated with these complex meal patterns. The aim of this narrative review was to evaluate those techniques and the meal patterns they detect. The 10 observational studies identified used techniques such as principal components analysis, clustering, latent class analysis, and decision trees. They examined meal patterns under 3 categories: temporal patterns (relating to the timing and distribution of meals), content patterns (relating to combinations of foods within a meal and combinations of those meals over a day), and context patterns (relating to external elements of the meal, such as location, activities while eating, and the presence or absence of others). The most common temporal meal patterns were the 3 meals/d pattern, the skipped breakfast pattern, and a grazing pattern consisting of smaller but more frequent meals. The 3 meals/d pattern was associated with increased diet quality compared with the other 2 patterns. Studies identified between 7 and 12 content patterns with limited similarities between studies and no clear associations between the patterns and diet quality or health. One study simultaneously examined temporal and context meal patterns, finding limited associations with diet quality. No study simultaneously examined other combinations of meal patterns. Future research that further develops the statistical techniques required for meal pattern analysis is necessary to clarify the relations between meal patterns and diet quality and health.
Affiliation(s)
- Cathal O'Hara
- Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland
- UCD Institute of Food and Health, University College Dublin, Dublin, Ireland
- School of Agriculture and Food Science, University College Dublin, Dublin, Ireland
- Eileen R Gibney
- Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland
- UCD Institute of Food and Health, University College Dublin, Dublin, Ireland
- School of Agriculture and Food Science, University College Dublin, Dublin, Ireland
|
25
|
Bibicheva TS, Skazkina VV, Ogneva MV, Simonyan MA, Gridnev VI, Karavaev AS. Missing value imputation with linear methods in the database of cardiological patients in prediction of mortality. CARDIO-IT 2021. [DOI: 10.15275/cardioit.2021.0101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
This study examines missing value imputation in the Russian Acute Coronary Syndrome Registry (RusACSR) and assesses the probability of predicting mortality. Linear methods using the most probable or the average value were used for imputation. The prediction problem was solved using the k-nearest neighbors method. This work reveals that the imputation methods, despite their simplicity, increase the probability of predicting mortality by 6%.
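The pipeline this abstract describes (fill gaps with a column average, then classify with k-nearest neighbors) can be sketched in a few lines. This is only an illustration under stated assumptions: the toy records, the two feature columns, the function names `impute_mean` and `knn_predict`, and the choice k = 3 are all hypothetical and are not taken from the RusACSR study.

```python
import math
from collections import Counter

def impute_mean(rows):
    """Replace None in each column with the mean of that column's observed values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training rows closest to `query` (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy records: [age, systolic blood pressure]; None marks a missing value.
X = [[70, 95], [65, None], [72, 90], [40, 130], [45, None], [38, 125]]
y = [1, 1, 1, 0, 0, 0]  # hypothetical labels: 1 = died, 0 = survived

X_filled = impute_mean(X)  # both gaps in column 1 become 110.0
prediction = knn_predict(X_filled, y, query=[68, 92], k=3)
print(X_filled[1], prediction)
```

Without the imputation step, rows 1 and 4 could not take part in the distance computation at all, which is the practical reason even a simple fill can help a neighbor-based classifier.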
|
26
|
Cabitza F, Campagner A. The need to separate the wheat from the chaff in medical informatics: Introducing a comprehensive checklist for the (self)-assessment of medical AI studies. Int J Med Inform 2021; 153:104510. [PMID: 34108105 DOI: 10.1016/j.ijmedinf.2021.104510] [Citation(s) in RCA: 116] [Impact Index Per Article: 38.7] [Received: 05/11/2021] [Revised: 05/26/2021] [Accepted: 05/27/2021] [Indexed: 12/23/2022]
Abstract
This editorial aims to contribute to the current debate about the quality of studies that apply machine learning (ML) methodologies to medical data to extract value from them and provide clinicians with viable and useful tools supporting everyday care practices. We propose a practical checklist to help authors to self assess the quality of their contribution and to help reviewers to recognize and appreciate high-quality medical ML studies by distinguishing them from the mere application of ML techniques to medical data.
Affiliation(s)
- Federico Cabitza
- DISCo, University of Milano-Bicocca, viale Sarca 336, Milano 20126, Italy
- Andrea Campagner
- DISCo, University of Milano-Bicocca, viale Sarca 336, Milano 20126, Italy
|
27
|
|
28
|
Ngueilbaye A, Wang H, Mahamat DA, Junaidu SB. Modulo 9 model-based learning for missing data imputation. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107167] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 10/22/2022]
|
29
|
Shin J, Yoon S, Kim Y, Kim T, Go B, Cha Y. Effects of class imbalance on resampling and ensemble learning for improved prediction of cyanobacteria blooms. ECOL INFORM 2021. [DOI: 10.1016/j.ecoinf.2020.101202] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 11/16/2022]
|
30
|
Nonlinear compensation algorithm for multidimensional temporal data: A missing value imputation for the power grid applications. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106743] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 10/22/2022]
|
31
|
The visual outcomes of idiopathic epiretinal membrane removal in eyes with ectopic inner foveal layers and preserved macular segmentation. Graefes Arch Clin Exp Ophthalmol 2021; 259:2193-2201. [PMID: 33528646 DOI: 10.1007/s00417-021-05102-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/29/2020] [Revised: 01/18/2021] [Accepted: 01/25/2021] [Indexed: 10/22/2022] Open
Abstract
PURPOSE To analyze the functional impact of ectopic inner foveal layers (EIFL), along with other clinical and optical coherence tomography biomarkers, on patients with epiretinal membrane (ERM) and preserved foveal layers' segmentation undergoing ERM removal. METHODS Retrospective review of consecutive patients with ERM who underwent pars plana vitrectomy with ERM peeling from December 2018 to December 2019. Baseline factors including age, gender, lens status, phacoemulsification at the time of surgery, tamponade agent, dye used for ERM and internal limiting membrane (ILM) enhancement, ILM peeling, best-corrected visual acuity (BCVA) and central macular thickness (CMT), presence and thickness of EIFL, thickness of outer nuclear layer (ONL), presence of a cotton ball, subfoveal state of photoreceptors, and presence of cystoid macular edema were included in a multivariable model having the BCVA at 12 months as the main outcome. The changes in EIFL and ONL thickness over time were also analyzed. RESULTS Fifty-one patients (58 eyes, 23 eyes in the no EIFL group, and 35 eyes in the EIFL group) were enrolled. The BCVA significantly improved over 12 months after surgery, regardless of the presence of EIFL (p < 0.001). Eyes with no EIFL had better BCVA at month 3 (p = 0.04), but this difference was no longer detectable at 6 and 12 months. The presence of EIFL was not associated with the final BCVA (p = 0.9), while the CMT at 12 months correlated with EIFL thickness (r = 0.8, p = 0.008). CONCLUSION Patients with EIFL could reach optimal visual acuity in the absence of disorganization of the inner retinal layers but should be warned of potentially longer healing times. None of the morphologic signs included in this study precluded good visual recovery on long-term follow-up.
|
32
|
Mostafa SM. Towards improving machine learning algorithms accuracy by benefiting from similarities between cases. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-201077] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]
Abstract
Data preprocessing is a necessary core step in data mining. Preprocessing involves handling missing values, outlier and noise removal, data normalization, etc. The problem with existing methods for handling missing values is that they treat the data as a whole, ignoring its characteristics (e.g., similarities and differences between cases). This paper focuses on handling missing values using machine learning methods that take the characteristics of the data into account. The proposed preprocessing method clusters the data, then imputes the missing values in each cluster based on the data belonging to that cluster rather than on the whole dataset. The author performed a comparative study of the proposed method and ten popular imputation methods, namely mean, median, mode, KNN, IterativeImputer, IterativeSVD, Softimpute, Mice, Forimp, and Missforest. The experiments were done on four datasets with different numbers of clusters, sizes, and shapes. The empirical study showed better effectiveness in terms of imputation time, Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and coefficient of determination (R2 score) (i.e., the similarity of the original removed value to the imputed one).
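The cluster-then-impute idea in this abstract can be illustrated with a deliberately small sketch. It is not the author's algorithm: it assumes clustering is done on a single fully observed column with a tiny 1-D k-means (k of at least 2), and it fills gaps in one target column with the cluster mean. The names `kmeans_1d` and `impute_by_cluster` and the toy rows are hypothetical.

```python
def kmeans_1d(values, k=2, iters=20):
    """Tiny 1-D k-means (k >= 2); centroids start spread evenly over the sorted values."""
    srt = sorted(values)
    centroids = [srt[i * (len(srt) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels

def impute_by_cluster(rows, cluster_col, target_col, k=2):
    """Fill None in target_col with the mean of observed values in the same cluster."""
    labels = kmeans_1d([r[cluster_col] for r in rows], k)
    filled = [list(r) for r in rows]
    for c in set(labels):
        observed = [r[target_col] for r, lab in zip(rows, labels)
                    if lab == c and r[target_col] is not None]
        if not observed:
            continue  # nothing to learn from in this cluster; leave gaps as they are
        cluster_mean = sum(observed) / len(observed)
        for r, lab in zip(filled, labels):
            if lab == c and r[target_col] is None:
                r[target_col] = cluster_mean
    return filled

# Hypothetical rows: column 0 is fully observed, column 1 has gaps (None).
rows = [[1.0, 10.0], [1.2, None], [0.9, 12.0], [8.0, 50.0], [8.3, None], [7.9, 54.0]]
filled = impute_by_cluster(rows, cluster_col=0, target_col=1)
print(filled[1][1], filled[4][1])
```

On this toy data a whole-dataset mean would fill both gaps with 31.5, a value that resembles neither group, whereas the cluster-wise fill gives 11.0 and 52.0.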
Affiliation(s)
- Samih M. Mostafa
- Faculty of Computers and Information, South Valley University, Qena, Egypt
|
33
|
Roland D, Suzen N, Coats TJ, Levesley J, Gorban AN, Mirkes EM. What can the randomness of missing values tell you about clinical practice in large data sets of children's vital signs? Pediatr Res 2021; 89:16-21. [PMID: 32294665 DOI: 10.1038/s41390-020-0861-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/02/2019] [Revised: 01/27/2020] [Accepted: 02/26/2020] [Indexed: 11/09/2022]
Affiliation(s)
- Damian Roland
- SAPPHIRE Group, Health Sciences, University of Leicester, Leicester, UK; Paediatric Emergency Medicine Leicester Academic (PEMLA) Group, Children's Emergency Department, Leicester Royal Infirmary, Leicester, UK
- Neslihan Suzen
- School of Mathematics and Actuarial Science, University of Leicester, Leicester, UK
- Timothy J Coats
- Emergency Medicine Academic Group, Emergency Department, Leicester Royal Infirmary, Leicester, UK
- Jeremy Levesley
- School of Mathematics and Actuarial Science, University of Leicester, Leicester, UK
- Alexander N Gorban
- School of Mathematics and Actuarial Science, University of Leicester, Leicester, UK
- Evgeny M Mirkes
- School of Mathematics and Actuarial Science, University of Leicester, Leicester, UK
|
34
|
Zhang X, Yan C, Gao C, Malin BA, Chen Y. Predicting Missing Values in Medical Data via XGBoost Regression. JOURNAL OF HEALTHCARE INFORMATICS RESEARCH 2020; 4:383-394. [PMID: 33283143 DOI: 10.1007/s41666-020-00077-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 01/04/2023]
Abstract
Purpose: The data in a patient's laboratory test results are a notable resource to support clinical investigation and enhance medical research. However, for a variety of reasons, this type of data often contains a non-trivial number of missing values. For example, physicians may neglect to order tests or document the results. Such a phenomenon reduces the degree to which this data can be utilized to learn efficient and effective predictive models. To address this problem, various approaches have been developed to impute missing laboratory values; however, their performance has been limited. This is due, in part, to the fact that no approaches effectively leverage the contextual information 1) within individual laboratory test variables or 2) between them. Method: We introduce an approach that combines an unsupervised prefilling strategy with a supervised machine learning approach, in the form of extreme gradient boosting (XGBoost), to leverage both types of context for imputation purposes. We evaluated the methodology through a series of experiments on approximately 8,200 patients' records in the MIMIC-III dataset. Result: The results demonstrate that the new model outperforms baseline and state-of-the-art models on 13 commonly collected laboratory test variables. In terms of the normalized root mean square deviation (nRMSD), our model exhibits an imputation improvement of over 20%, on average. Conclusion: Missing data imputation on temporal variables can be largely improved via a prefilling strategy and a supervised training technique that leverages both the longitudinal and cross-sectional context simultaneously.
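For readers unfamiliar with the evaluation metric named in this abstract, a minimal sketch of an nRMSD computation follows. It assumes the root mean square deviation is normalized by the range of the true values; the abstract does not state which normalization the authors used, so that choice, the function name `nrmsd`, and the toy numbers are assumptions.

```python
import math

def nrmsd(true_values, imputed_values):
    """Root mean square deviation divided by the range of the true values (assumed normalization)."""
    n = len(true_values)
    rmsd = math.sqrt(sum((t - p) ** 2 for t, p in zip(true_values, imputed_values)) / n)
    return rmsd / (max(true_values) - min(true_values))

# Hypothetical held-out laboratory values and the values an imputer produced for them.
truth = [4.0, 5.0, 6.0, 8.0]
imputed = [4.0, 6.0, 6.0, 6.0]

score = nrmsd(truth, imputed)  # sqrt((0 + 1 + 0 + 4) / 4) / (8 - 4)
print(round(score, 4))
```

Lower is better, and a perfect imputation scores 0. Dividing by the range makes scores comparable across laboratory variables measured on different scales.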
Affiliation(s)
- Chao Yan
- Vanderbilt University, Nashville, TN, USA
- Cheng Gao
- Vanderbilt University Medical Center, Nashville, TN, USA
- You Chen
- Vanderbilt University Medical Center, Nashville, TN, USA
|
35
|
Coito T, Martins MS, Viegas JL, Firme B, Figueiredo J, Vieira SM, Sousa JM. A Middleware Platform for Intelligent Automation: An Industrial Prototype Implementation. COMPUT IND 2020. [DOI: 10.1016/j.compind.2020.103329] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 10/23/2022]
|
36
|
Lung PY, Zhong D, Pang X, Li Y, Zhang J. Maximizing the reusability of gene expression data by predicting missing metadata. PLoS Comput Biol 2020; 16:e1007450. [PMID: 33156882 PMCID: PMC7673503 DOI: 10.1371/journal.pcbi.1007450] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/28/2019] [Revised: 11/18/2020] [Accepted: 10/09/2020] [Indexed: 11/18/2022] Open
Abstract
Reusability is part of the FAIR data principle, which aims to make data Findable, Accessible, Interoperable, and Reusable. One of the current efforts to increase the reusability of public genomics data has been to focus on the inclusion of quality metadata associated with the data. When necessary metadata are missing, most researchers will consider the data useless. In this study, we developed a framework to predict the missing metadata of gene expression datasets to maximize their reusability. We found that when using predicted data to conduct other analyses, it is not optimal to use all the predicted data. Instead, one should only use the subset of data, which can be predicted accurately. We proposed a new metric called Proportion of Cases Accurately Predicted (PCAP), which is optimized in our specifically-designed machine learning pipeline. The new approach performed better than pipelines using commonly used metrics such as F1-score in terms of maximizing the reusability of data with missing values. We also found that different variables might need to be predicted using different machine learning methods and/or different data processing protocols. Using differential gene expression analysis as an example, we showed that when missing variables are accurately predicted, the corresponding gene expression data can be reliably used in downstream analyses.
Affiliation(s)
- Pei-Yau Lung
- Department of Statistics, Florida State University, Tallahassee, United States of America
- Dongrui Zhong
- Department of Statistics, Florida State University, Tallahassee, United States of America
- Xiaodong Pang
- Insilicom LLC, Tallahassee, United States of America
- Yan Li
- Department of Breast Surgery, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
- Jinfeng Zhang
- Department of Statistics, Florida State University, Tallahassee, United States of America
|
37
|
Missing data techniques in classification for cardiovascular dysautonomias diagnosis. Med Biol Eng Comput 2020; 58:2863-2878. [PMID: 32970269 DOI: 10.1007/s11517-020-02266-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/22/2018] [Accepted: 09/08/2020] [Indexed: 10/23/2022]
Abstract
Missing data (MD) is a common and inevitable problem facing data mining (DM)-based decision systems in e-health since many medical historical datasets contain a huge number of missing values. Therefore, a pre-processing stage is usually required to deal with missing values before building any DM-based decision system. The purpose of this paper is to evaluate the impact of MD techniques on classification systems in cardiovascular dysautonomias diagnosis. We analyzed and compared the accuracy rates of four classification techniques: random forest (RF), support vector machines (SVM), C4.5 decision tree, and Naive Bayes (NB), using two MD techniques: deletion or imputation with k-nearest neighbors (KNN). A total of 216 experiments were therefore carried out using three missingness mechanisms (MCAR: missing completely at random, MAR: missing at random and NMAR: not missing at random), two MD techniques (deletion and KNN imputation), nine MD percentages from 10 to 90% over a dataset collected from the autonomic nervous system (ANS) unit of the University Hospital Avicenne in Morocco. The results obtained suggest that using KNN imputation rather than deletion enhances the accuracy rates of the four classifiers. Moreover, the MD percentages have a negative impact on the performance of classification techniques regardless of the MD mechanisms and MD techniques used. In fact, the accuracy rates of the four classifiers decrease as the MD percentage increases.
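The figure of 216 experiments is consistent with a full factorial design over the factors the abstract lists. The sketch below enumerates that grid. Treating the four classifiers as the fourth factor is an inference from the arithmetic (3 × 2 × 9 × 4 = 216), since the abstract does not spell it out, and the variable names are illustrative only.

```python
from itertools import product

mechanisms = ["MCAR", "MAR", "NMAR"]            # three missingness mechanisms
md_techniques = ["deletion", "KNN imputation"]  # two missing-data techniques
md_percentages = list(range(10, 100, 10))       # nine levels: 10%, 20%, ..., 90%
classifiers = ["RF", "SVM", "C4.5", "NB"]       # four classifiers (inferred fourth factor)

# One experiment per combination of the four factors.
experiments = list(product(mechanisms, md_techniques, md_percentages, classifiers))
print(len(experiments))  # 216
```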
|
38
|
Pan Y, Liu M, Lian C, Xia Y, Shen D. Spatially-Constrained Fisher Representation for Brain Disease Identification With Incomplete Multi-Modal Neuroimages. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:2965-2975. [PMID: 32217472 PMCID: PMC7485604 DOI: 10.1109/tmi.2020.2983085] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 05/10/2023]
Abstract
Multi-modal neuroimages, such as magnetic resonance imaging (MRI) and positron emission tomography (PET), can provide complementary structural and functional information of the brain, thus facilitating automated brain disease identification. Incomplete data problem is unavoidable in multi-modal neuroimage studies due to patient dropouts and/or poor data quality. Conventional methods usually discard data-missing subjects, thus significantly reducing the number of training samples. Even though several deep learning methods have been proposed, they usually rely on pre-defined regions-of-interest in neuroimages, requiring disease-specific expert knowledge. To this end, we propose a spatially-constrained Fisher representation framework for brain disease diagnosis with incomplete multi-modal neuroimages. We first impute missing PET images based on their corresponding MRI scans using a hybrid generative adversarial network. With the complete (after imputation) MRI and PET data, we then develop a spatially-constrained Fisher representation network to extract statistical descriptors of neuroimages for disease diagnosis, assuming that these descriptors follow a Gaussian mixture model with a strong spatial constraint (i.e., images from different subjects have similar anatomical structures). Experimental results on three databases suggest that our method can synthesize reasonable neuroimages and achieve promising results in brain disease identification, compared with several state-of-the-art methods.
Affiliation(s)
- Yongsheng Pan
- Y. Pan and Y. Xia are with the National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, China. M. Liu, C. Lian, and D. Shen are with the Department of Radiology and BRIC, University of North Carolina, Chapel Hill, NC 27599, USA. D. Shen is also with the Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, South Korea
- Mingxia Liu
- Y. Pan and Y. Xia are with the National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, China. M. Liu, C. Lian, and D. Shen are with the Department of Radiology and BRIC, University of North Carolina, Chapel Hill, NC 27599, USA. D. Shen is also with the Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, South Korea
- Chunfeng Lian
- Y. Pan and Y. Xia are with the National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, China. M. Liu, C. Lian, and D. Shen are with the Department of Radiology and BRIC, University of North Carolina, Chapel Hill, NC 27599, USA. D. Shen is also with the Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, South Korea
- Yong Xia
- Y. Pan and Y. Xia are with the National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, China. M. Liu, C. Lian, and D. Shen are with the Department of Radiology and BRIC, University of North Carolina, Chapel Hill, NC 27599, USA. D. Shen is also with the Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, South Korea
- Dinggang Shen
- Y. Pan and Y. Xia are with the National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, China. M. Liu, C. Lian, and D. Shen are with the Department of Radiology and BRIC, University of North Carolina, Chapel Hill, NC 27599, USA. D. Shen is also with the Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, South Korea
|
39
|
Vilardell M, Buxó M, Clèries R, Martínez JM, Garcia G, Ameijide A, Font R, Civit S, Marcos-Gragera R, Vilardell ML, Carulla M, Espinàs JA, Galceran J, Izquierdo A, Borràs JM. Missing data imputation and synthetic data simulation through modeling graphical probabilistic dependencies between variables (ModGraProDep): An application to breast cancer survival. Artif Intell Med 2020; 107:101875. [DOI: 10.1016/j.artmed.2020.101875] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/31/2019] [Revised: 02/12/2020] [Accepted: 05/02/2020] [Indexed: 12/29/2022]
|
40
|
Ferrão JC, Oliveira MD, Janela F, Martins HMG, Gartner D. Can structured EHR data support clinical coding? A data mining approach. Health Syst (Basingstoke) 2020; 10:138-161. [PMID: 34104432 PMCID: PMC8143604 DOI: 10.1080/20476965.2020.1729666] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/25/2019] [Accepted: 10/22/2019] [Indexed: 10/24/2022] Open
Abstract
Structured data formats are gaining momentum in electronic health records and can be leveraged for decision support and research. Nevertheless, such structured data formats have not been explored for clinical coding, which is an essential process requiring significant manual workload in health organisations. This article explores the extent to which fully structured clinical data can support assignment of clinical codes to inpatient episodes, through a methodology that tackles high dimensionality issues, addresses the multi-label nature of coding and optimises model parameters. The methodology encompasses transforming raw data to define a feature set, building a data matrix representation, and testing combinations of feature selection methods with machine learning models to predict code assignment. The methodology was tested with a real hospital dataset and showed varying predictive power across codes, while demonstrating the potential of leveraging structured data to reduce workload and increase efficiency in clinical coding.
Affiliation(s)
- José Carlos Ferrão
- CEG-IST, Centre for Management Studies of Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
- Mónica Duarte Oliveira
- CEG-IST, Centre for Management Studies of Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
- Filipe Janela
- Investigação, Desenvolvimento e Inovação, SIEMENS Healthineers, Amadora, Portugal
- Henrique M. G. Martins
- Centre for Research and Creativity in Informatics (CI), Hospital Prof. Doutor Fernando Fonseca, Amadora, Portugal
|
41
|
|
42
|
Traina AJ, Brinis S, Pedrosa GV, Avalhais LP, Traina C. Querying on large and complex databases by content: Challenges on variety and veracity regarding real applications. INFORM SYST 2019. [DOI: 10.1016/j.is.2019.03.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
|
43
|
Mostafa SM. Imputing missing values using cumulative linear regression. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY 2019. [DOI: 10.1049/trit.2019.0032] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 11/20/2022] Open
Affiliation(s)
- Samih M. Mostafa
- Faculty of Science, Mathematics Department, Computer Science, South Valley University, Qena 83523, Egypt
|
44
|
Venugopalan J, Chanani N, Maher K, Wang MD. Novel Data Imputation for Multiple Types of Missing Data in Intensive Care Units. IEEE J Biomed Health Inform 2019; 23:1243-1250. [DOI: 10.1109/jbhi.2018.2883606] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/07/2022]
|
45
|
Imani F, Cheng C, Chen R, Yang H. Nested Gaussian process modeling and imputation of high-dimensional incomplete data under uncertainty. ACTA ACUST UNITED AC 2019. [DOI: 10.1080/24725579.2019.1583704] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/27/2022]
Affiliation(s)
- Farhad Imani
- Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA, USA
- Changqing Cheng
- Department of Systems Science and Industrial Engineering, State University of New York, Binghamton, NY, USA
- Ruimin Chen
- Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA, USA
- Hui Yang
- Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA, USA
|
46
|
Abstract
A large variety of issues influence the success of data mining on a given problem. Two primary and important issues are the representation and the quality of the dataset. Specifically, if much redundant and unrelated or noisy and unreliable information is present, then knowledge discovery becomes a very difficult problem. It is well known that data preparation steps require significant processing time in machine learning tasks. It would be very helpful if there were preprocessing algorithms with the same reliable and effective performance across all datasets, but this is impossible. To this end, we present the most well-known and widely used up-to-date algorithms for each step of data preprocessing in the framework of predictive data mining.
|
47
|
Idri A, Benhar H, Fernández-Alemán JL, Kadi I. A systematic map of medical data preprocessing in knowledge discovery. Comput Methods Programs Biomed 2018; 162:69-85. [PMID: 29903496 DOI: 10.1016/j.cmpb.2018.05.007] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 11/08/2017] [Revised: 04/25/2018] [Accepted: 05/03/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND OBJECTIVE Data mining (DM) has, over the last decade, received increased attention in the medical domain and has been widely used to analyze medical datasets in order to extract useful knowledge and previously unknown patterns. However, historical medical data can often comprise inconsistent, noisy, imbalanced, missing and high dimensional data. These challenges lead to a serious bias in predictive modeling and reduce the performance of DM techniques. Data preprocessing is, therefore, an essential step in knowledge discovery as regards improving the quality of data and making it appropriate and suitable for DM techniques. The objective of this paper is to review the use of preprocessing techniques in clinical datasets.
METHODS We performed a systematic map of studies regarding the application of data preprocessing to healthcare and published between January 2000 and December 2017. A search string was determined on the basis of the mapping questions and the PICO categories. The search string was then applied in digital databases covering the fields of computer science and medical informatics in order to identify relevant studies. The studies were initially selected by reading their titles, abstracts and keywords. Those that were selected at that stage were then reviewed using a set of inclusion and exclusion criteria in order to eliminate any that were not relevant. This process resulted in 126 primary studies.
RESULTS Selected studies were analyzed and classified according to their publication years and channels, research type, empirical type and contribution type. The findings of this mapping study revealed that researchers have paid a considerable amount of attention to preprocessing in medical DM in the last decade. A significant number of the selected studies used data reduction and cleaning preprocessing tasks. Moreover, the disciplines in which preprocessing has received the most attention are cardiology, endocrinology and oncology.
CONCLUSIONS Researchers should develop and implement standards for an effective integration of multiple medical data types. Moreover, we identified the need to perform literature reviews.
Affiliation(s)
- A Idri: Software Project Management Research Team, ENSIAS, University Mohammed V of Rabat, Morocco.
- H Benhar: Software Project Management Research Team, ENSIAS, University Mohammed V of Rabat, Morocco.
- J L Fernández-Alemán: Department of Informatics and Systems, Faculty of Computer Science, University of Murcia, Spain.
- I Kadi: Software Project Management Research Team, ENSIAS, University Mohammed V of Rabat, Morocco.
|
48
|
Albers DJ, Elhadad N, Claassen J, Perotte R, Goldstein A, Hripcsak G. Estimating summary statistics for electronic health record laboratory data for use in high-throughput phenotyping algorithms. J Biomed Inform 2018; 78:87-101. [PMID: 29369797 PMCID: PMC5856130 DOI: 10.1016/j.jbi.2018.01.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 08/14/2017] [Revised: 12/05/2017] [Accepted: 01/14/2018] [Indexed: 01/12/2023]
Abstract
We study the question of how to represent or summarize raw laboratory data taken from an electronic health record (EHR) using parametric model selection to reduce or cope with biases induced through clinical care. It has been previously demonstrated that the health care process (Hripcsak and Albers, 2012, 2013), as defined by measurement context (Hripcsak and Albers, 2013; Albers et al., 2012) and measurement patterns (Albers and Hripcsak, 2010, 2012), can influence how EHR data are distributed statistically (Kohane and Weber, 2013; Pivovarov et al., 2014). We construct an algorithm, PopKLD, which is based on information criterion model selection (Burnham and Anderson, 2002; Claeskens and Hjort, 2008) and is intended to reduce and cope with health care process biases and to produce an intuitively understandable continuous summary. The PopKLD algorithm can be automated and is designed to be applicable in high-throughput settings; for example, the output of the PopKLD algorithm can be used as input for phenotyping algorithms. Moreover, we develop the PopKLD-CAT algorithm that transforms the continuous PopKLD summary into a categorical summary useful for applications that require categorical data such as topic modeling. We evaluate our methodology in two ways. First, we apply the method to laboratory data collected in two different health care contexts, primary versus intensive care. We show that the PopKLD preserves known physiologic features in the data that are lost when summarizing the data using more common laboratory data summaries such as mean and standard deviation. Second, for three disease-laboratory measurement pairs, we perform a phenotyping task: we use the PopKLD and PopKLD-CAT algorithms to define high and low values of the laboratory variable that are used for defining a disease state. We then compare the relationship between the PopKLD-CAT summary disease predictions and the same predictions using empirically estimated mean and standard deviation to a gold standard generated by clinical review of patient records. We find that the PopKLD laboratory data summary is substantially better at predicting disease state. The PopKLD or PopKLD-CAT algorithms are not meant to be used as phenotyping algorithms, but we use the phenotyping task to show what information can be gained when using a more informative laboratory data summary. In the process of evaluating our method we show that the different clinical contexts and laboratory measurements necessitate different statistical summaries. Similarly, leveraging the principle of maximum entropy we argue that while some laboratory data only have sufficient information to estimate a mean and standard deviation, other laboratory data captured in an EHR contain substantially more information than can be captured in higher-parameter models.
Affiliation(s)
- D J Albers: Department of Biomedical Informatics, Columbia University, 622 West 168th Street, New York, NY, USA.
- N Elhadad: Department of Biomedical Informatics, Columbia University, 622 West 168th Street, New York, NY, USA.
- J Claassen: Department of Neurology, Columbia University, 710 West 168th Street, New York, NY 10032, USA.
- R Perotte: Value Institute, New York Presbyterian Hospital, 601 West 168th Street New York, NY 10032, USA.
- A Goldstein: Department of Biomedical Informatics, Columbia University, 622 West 168th Street, New York, NY, USA.
- G Hripcsak: Department of Biomedical Informatics, Columbia University, 622 West 168th Street, New York, NY, USA.
|
49
|
Image recognition with missing-features based on gaussian mixture model and graph constrained nonnegative matrix factorization. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017:3150-3153. [PMID: 29060566 DOI: 10.1109/embc.2017.8037525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
The demand for automatically recognizing medical images for screening, reference and management is growing faster than ever. Missing data are common in medical image applications and may be unavoidable. In this paper, we address the problem of recognizing medical images with missing features via a Gaussian mixture model (GMM)-based approach. Since training a GMM directly on high-dimensional feature vectors results in instability, we propose a novel strategy to train the GMM from the corresponding reduced-dimensional one. The proposed method contains training and test phases. The former contains feature extraction, graph constrained nonnegative matrix factorization (NMF), GMM training, and the alternating expectation conditional maximization (AECM) for extending the reduced-dimensional GMM. In the test phase, two methods, marginalizing GMM using Bayesian decision (MGBD) and conditional mean imputation (CMI), are applied to impute missing features. The posterior probability of test images is calculated to identify objects. Experimental results on three real datasets demonstrate the feasibility and efficiency of the proposed scheme.
|
50
|
Nancy JY, Khanna NH, Arputharaj K. Imputing missing values in unevenly spaced clinical time series data to build an effective temporal classification framework. Comput Stat Data Anal 2017. [DOI: 10.1016/j.csda.2017.02.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022]
|