1. Arueyingho OV, Al-Taie A, McCallum C. Scoping review: Machine learning interventions in the management of healthcare systems. Digit Health 2024; 10:20552076221144095. [PMID: 39444734 PMCID: PMC11497546 DOI: 10.1177/20552076221144095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2021] [Accepted: 11/18/2022] [Indexed: 10/25/2024]
Abstract
Background: Healthcare institutions focus on improving the quality of life for end-users, with key performance indicators like access to essential medicines reflecting the effectiveness of management. Effective healthcare management involves planning, organizing, and controlling institutions built on human resources, data systems, service delivery, access to medicines, finance, and leadership. According to the World Health Organization, these elements must be balanced for an optimal healthcare system. Big data generated from healthcare institutions, including health records and genomic data, is crucial for smart staffing, decision-making, risk management, and patient engagement. Properly organizing and analysing this data is essential, and machine learning, a sub-field of artificial intelligence, can optimize these processes, leading to better overall healthcare management.
Objectives: This review examines the major applications of machine learning in healthcare management, the algorithms frequently used in data analysis, their limitations, and the evidence-based benefits of machine learning in healthcare.
Methods: Following PRISMA guidelines, databases such as IEEE Xplore, ScienceDirect, ACM Digital Library, and SCOPUS were searched for eligible articles published between 2011 and 2021. Articles had to be in English, peer-reviewed, and include relevant keywords like healthcare, management, and machine learning.
Results: Out of 51 relevant articles, 6 met the inclusion criteria. Identified algorithms include topic modelling, dynamic clustering, neural networks, decision trees, and ensemble classifiers, applied in areas such as electronic health records, chatbots, and multi-disease prediction.
Conclusion: Machine learning supports healthcare management by aiding decision-making, processing big data, and providing insights for system improvements.
Affiliation(s)
- Oritsetimeyin V Arueyingho
  - School of Computer Science, Electrical and Electronic Engineering, and Engineering Maths (SCEEM), Centre for Doctoral Training in Digital Health and Care, University of Bristol, UK
- Anmar Al-Taie
  - School of Computer Science, Electrical and Electronic Engineering, and Engineering Maths (SCEEM), Centre for Doctoral Training in Digital Health and Care, University of Bristol, UK
- Claire McCallum
  - Department of Clinical Pharmacy, Faculty of Pharmacy, Istinye University, Istanbul, Turkey
2. A novel medical text classification model with Kalman filter for clinical decision making. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
3. Prabhakar SK, Won DO. Medical Text Classification Using Hybrid Deep Learning Models with Multihead Attention. Computational Intelligence and Neuroscience 2021; 2021:9425655. [PMID: 34603437 PMCID: PMC8486521 DOI: 10.1155/2021/9425655] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/11/2021] [Accepted: 08/31/2021] [Indexed: 11/18/2022]
Abstract
Automatic medical text classification, a natural language processing (NLP) task, is highly useful for unlocking the information contained in clinical descriptions. Machine learning techniques are quite effective for medical text classification; however, they require extensive human effort to create labeled training data. For clinical and translational research, a huge quantity of detailed patient information, such as disease status, lab tests, medication history, side effects, and treatment outcomes, has been collected in electronic format, and it serves as a valuable data source for further analysis. Processing this volume of medical text efficiently is a major challenge. In this work, a medical text classification paradigm using two novel deep learning architectures is proposed to reduce the human effort required. The first approach is a quad-channel hybrid long short-term memory (QC-LSTM) deep learning model that uses four channels, and the second is a hybrid bidirectional gated recurrent unit (BiGRU) deep learning model with multihead attention. The proposed methodology is validated on two medical text datasets, and a comprehensive analysis is conducted. The best classification accuracy, 96.72%, is obtained with the proposed QC-LSTM model, and a classification accuracy of 95.76% is obtained with the proposed hybrid BiGRU model.
Affiliation(s)
- Sunil Kumar Prabhakar
  - Department of Artificial Intelligence, Korea University, Seongbuk-gu, Seoul 02841, Republic of Korea
- Dong-Ok Won
  - Department of Artificial Intelligence Convergence, Hallym University, Chuncheon, Gangwon 24252, Republic of Korea
4. Patient Mix Optimization in Admission Planning under Multitype Patients and Priority Constraints. Computational and Mathematical Methods in Medicine 2021; 2021:5588241. [PMID: 33790987 PMCID: PMC7997749 DOI: 10.1155/2021/5588241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2021] [Revised: 02/28/2021] [Accepted: 03/04/2021] [Indexed: 11/17/2022]
Abstract
Hospital beds are one of the most critical medical resources. In large hospitals in China, the long-term use of extra beds has pushed bed utilization rates above 100%. To ease the mismatch between the supply of high-quality medical resources and the demand for hospitalization, this paper addresses the choice of a case mix for a respiratory medicine department. We aim to generate an optimal admission plan for elective patients with stochastic lengths of stay and different resource consumption. We assume that elective patients can be classified according to their registration information before admission. We formulate a general integer programming model that considers heterogeneous patients and introduces patient priority constraints. The model is used to generate a reasonable admission plan that determines the best admission mix for multitype patients in a period. Compared with model II, which does not consider priority constraints, the proposed model I performs better in terms of admissions and revenue. Model I can adjust the priority parameters to reach the optimal output under different goals and scenarios. The daily admission plan that model I produces for each patient type can be used to assist patient admission management in large general hospitals.
5. Chronic Pain Diagnosis Using Machine Learning, Questionnaires, and QST: A Sensitivity Experiment. Diagnostics (Basel) 2020; 10:958. [PMID: 33212774 PMCID: PMC7697204 DOI: 10.3390/diagnostics10110958] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/09/2020] [Accepted: 11/13/2020] [Indexed: 11/17/2022]
Abstract
In the last decade, machine learning has been widely used in different fields, especially because of its capacity to work with complex data. With the support of machine learning techniques, different studies have used data-driven approaches to better understand syndromes such as mild cognitive impairment, Alzheimer’s disease, schizophrenia, and chronic pain. Chronic pain is a complex disease that is frequently misdiagnosed because of its comorbidity with other syndromes with which it shares symptoms. Within that context, several studies have suggested different machine learning algorithms to classify or predict chronic pain conditions. Those algorithms were fed with a diversity of data types, from self-report data based on questionnaires to the most advanced brain imaging techniques. In this study, we assessed the sensitivity of different algorithms and datasets in classifying chronic pain syndromes. Together with this assessment, we highlighted important methodological steps that should be taken into account when an experiment using machine learning is conducted. The best results were obtained by ensemble-based algorithms and the dataset containing the greatest diversity of information, resulting in area under the receiver operating characteristic curve (AUC) values of around 0.85. In addition, the performance of the algorithms is strongly related to the hyper-parameters. Thus, a good strategy for hyper-parameter optimization should be used to extract the most from the algorithm. These findings support the notion that machine learning can be a powerful tool to better understand chronic pain conditions.
6. A Machine-Learning-Based Approach to Predict the Health Impacts of Commuting in Large Cities: Case Study of London. Symmetry (Basel) 2020. [DOI: 10.3390/sym12050866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The daily commute represents a source of chronic stress that is positively correlated with physiological consequences, including increased blood pressure, heart rate, fatigue, and other negative mental and physical health effects. The purpose of this research is to investigate and predict the physiological effects of commuting in Greater London on the human body based on machine-learning approaches. For each participant, the data were collected for five consecutive working days, before and after the commute, using non-invasive wearable biosensor technology. The analysis and synthesis of multimodal behaviour are the subject of major efforts in the computing field to realise successful human–human and human–agent interactions, especially for developing future intuitive technologies. Current analysis approaches still focus on individuals, while we are considering methodologies addressing groups as a whole. This research paper employs a pool of machine-learning approaches to predict and analyse the effect of commuting objectively. Comprehensive experimentation has been carried out to choose the algorithmic structure that best suits the problem in question. The results from this study suggest that whether the commuting period was short or long, all objective bio-signals (heart rate and blood pressure) were higher post-commute than pre-commute. In addition, the results match both the subjective evaluation obtained from the Positive and Negative Affect Schedule and the proposed objective evaluation of this study in relation to the correlation between the effect of commuting on bio-signals. Our findings provide further support for shorter commutes and for using healthier or active modes of transportation.
7. Activities of Daily Living and Environment Recognition Using Mobile Devices: A Comparative Study. Electronics 2020. [DOI: 10.3390/electronics9010180] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022]
Abstract
Recognizing Activities of Daily Living (ADL) with high accuracy, using the sensors available in off-the-shelf mobile devices, is significant for the development of a recognition framework. Previously, a framework that comprises data acquisition, data processing, data cleaning, feature extraction, data fusion, and data classification was proposed. However, the results may be improved with the implementation of other methods. Similar to the initial proposal of the framework, this paper proposes the recognition of eight ADL (walking, running, standing, going upstairs, going downstairs, driving, sleeping, and watching television) and nine environments (bar, hall, kitchen, library, street, bedroom, living room, gym, and classroom), but using the Instance Based k-nearest neighbour (IBk) and AdaBoost methods as well. The primary purpose of this paper is to find the best machine learning method for ADL and environment recognition. The results obtained show that IBk and AdaBoost reported better results with complex data than the deep neural network methods.
8. Leviton A, Oppenheimer J, Chiujdea M, Antonetty A, Ojo OW, Garcia S, Weas S, Fleegler E, Chan E, Loddenkemper T. Characteristics of Future Models of Integrated Outpatient Care. Healthcare (Basel) 2019; 7:65. [PMID: 31035586 PMCID: PMC6627383 DOI: 10.3390/healthcare7020065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/01/2019] [Revised: 04/23/2019] [Accepted: 04/24/2019] [Indexed: 01/01/2023]
Abstract
Replacement of fee-for-service with capitation arrangements forces physicians and institutions to minimize health care costs while maintaining high-quality care. In this report we described how patients and their families (or caregivers) can work with members of the medical care team to achieve these twin goals of maintaining (and perhaps improving) high-quality care and minimizing costs. We described how increased self-management enables patients and their families/caregivers to provide electronic patient-reported outcomes (ePROs; i.e., symptoms, events) as frequently as the patient or the medical care team considers appropriate. These capabilities also allow ongoing assessments of physiological measurements/phenomena (mHealth). Remote surveillance of these communications allows longer intervals between (and therefore fewer) patient visits to the medical care team when this is appropriate, or earlier interventions when those are appropriate. Systems are now available that alert medical care providers to situations when interventions might be needed.
Affiliation(s)
- Alan Leviton
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Julia Oppenheimer
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Madeline Chiujdea
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Annalee Antonetty
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Oluwafemi William Ojo
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Stephanie Garcia
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Sarah Weas
  - Division of Developmental Medicine, Department of Medicine, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Eric Fleegler
  - Division of Emergency Medicine, Department of Medicine, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Eugenia Chan
  - Division of Developmental Medicine, Department of Medicine, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
- Tobias Loddenkemper
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA.
9. Luo L, Li J, Liu C, Shen W. Using machine-learning methods to support health-care professionals in making admission decisions. Int J Health Plann Manage 2019; 34:e1236-e1246. [PMID: 30957270 DOI: 10.1002/hpm.2769] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/14/2019] [Revised: 02/08/2019] [Accepted: 02/08/2019] [Indexed: 02/05/2023]
Abstract
BACKGROUND Large tertiary hospitals usually face long waiting lines; patients who want to receive hospitalization need to be screened in advance. The patient admission screening process involves a health-care professional ranking patients by analyzing registration information. OBJECTIVE The purpose of this study was to develop a machine-learning approach to screening, using historical data and the experience of health-care professionals to develop a set of screening rules to help health-care professionals prioritize patient needs automatically. METHODS We used five machine-learning methods to sequence and predict elective patients: logistic regression (LR), random forest (RF), gradient-boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and an ensemble model of the four models. RESULTS The results indicate that all of the five models showed a good prioritization performance with high predictive values. In particular, XGBoost had the best predictive performance compared with others in terms of the area under the receiver operating characteristic curve (AUC), with the AUC values of LR, RF, GBDT, XGBoost, and the ensemble model being 0.881, 0.816, 0.820, 0.901, and 0.897, respectively. CONCLUSION The results reported here indicate that machine-learning techniques can be valuable for automating the screening process. Our model can assist health-care professionals in automatically evaluating less complex cases by identifying important factors affecting patient admission.
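As a rough illustration of the comparison reported in this abstract, the sketch below computes AUC from predicted scores and builds a simple ensemble. The abstract does not say how the four models were combined, so the plain averaging in `average_ensemble` is an assumption, and both function names are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_ensemble(*model_scores):
    """Assumed ensemble rule: mean of the per-patient scores of each model."""
    return [sum(col) / len(col) for col in zip(*model_scores)]


# Toy data only; these are not values from the study.
labels = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]
model_b = [0.2, 0.3, 0.6, 0.7]
print(auc(labels, model_a))                             # 0.75
print(auc(labels, average_ensemble(model_a, model_b)))
```

Ranking patients by the ensemble score and reading off AUC in this way mirrors how the abstract compares LR, RF, GBDT, XGBoost, and the ensemble, without claiming to reproduce the authors' pipeline.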
Affiliation(s)
- Li Luo
  - Business School, Sichuan University, Chengdu, China
- Jialing Li
  - Business School, Sichuan University, Chengdu, China
- Chuang Liu
  - Logistics Engineering School, Chengdu Vocational & Technical College of Industry, Chengdu, China
- Wenwu Shen
  - Outpatient Department, West China Hospital of Sichuan University, Chengdu, China
10. Evans HP, Anastasiou A, Edwards A, Hibbert P, Makeham M, Luz S, Sheikh A, Donaldson L, Carson-Stevens A. Automated classification of primary care patient safety incident report content and severity using supervised machine learning (ML) approaches. Health Informatics J 2019; 26:3123-3139. [PMID: 30843455 DOI: 10.1177/1460458219833102] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 11/16/2022]
Abstract
Learning from patient safety incident reports is a vital part of improving healthcare. However, the volume of reports and their largely free-text nature pose a major analytic challenge. The objective of this study was to test the capability of automated classification of free text within patient safety incident reports to determine incident type and the severity of harm outcome. Primary care patient safety incident reports (n=31333) previously expert-categorised by clinicians (training data) were processed using J48, SVM and Naïve Bayes. The SVM classifier was the highest scoring classifier for incident type (AUROC, 0.891) and severity of harm (AUROC, 0.708). Incident reports containing deaths were most easily classified, with 72.82% of reports correctly identified. In conclusion, supervised ML can be used to classify patient safety incident report categories. The severity classifier, whilst not accurate enough to replace manual processing, could provide a valuable screening tool for this critical aspect of patient safety.
Affiliation(s)
- Peter Hibbert
  - Macquarie University, Australia; University of South Australia, Australia
11. Oppenheimer J, Leviton A, Chiujdea M, Antonetty A, Ojo OW, Garcia S, Weas S, Fleegler EW, Chan E, Loddenkemper T. Caring electronically for young outpatients who have epilepsy. Epilepsy Behav 2018; 87:226-232. [PMID: 30197227 DOI: 10.1016/j.yebeh.2018.06.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/13/2018] [Revised: 06/08/2018] [Accepted: 06/11/2018] [Indexed: 01/17/2023]
Abstract
PURPOSE The purpose of this study was to review electronic tools that might improve the delivery of epilepsy care, reduce medical care costs, and empower families to improve self-management capability. METHOD We reviewed the epilepsy-specific literature about self-management, electronic patient-reported or provider-reported outcomes, on-going remote surveillance, and alerting/warning systems. CONCLUSIONS The improved care delivery system that we envision includes self-management, electronic patient (or provider)-reported outcomes, on-going remote surveillance, and alerting/warning systems. This system and variants have the potential to reduce seizure burden through improved management, keep children out of the emergency department and hospital, and even reduce the number of outpatient visits.
Affiliation(s)
- Julia Oppenheimer
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Alan Leviton
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Madeline Chiujdea
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Annalee Antonetty
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Oluwafemi William Ojo
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Stephanie Garcia
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Sarah Weas
  - Division of Developmental Medicine, Department of Medicine, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Eric W Fleegler
  - Division of Emergency Medicine, Department of Medicine, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Eugenia Chan
  - Division of Developmental Medicine, Department of Medicine, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Tobias Loddenkemper
  - Division of Epilepsy and Clinical Neurophysiology, Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
12. Identifying people at risk of developing type 2 diabetes: A comparison of predictive analytics techniques and predictor variables. Int J Med Inform 2018; 119:22-38. [PMID: 30342683 DOI: 10.1016/j.ijmedinf.2018.08.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 01/24/2018] [Revised: 07/26/2018] [Accepted: 08/16/2018] [Indexed: 01/21/2023]
Abstract
BACKGROUND The present study aims to identify patients at risk of type 2 diabetes (T2D). There is a body of literature that uses machine learning classification algorithms to predict the development of T2D among patients. The current study compares the performance of these classification algorithms in identifying patients who are at risk of developing T2D in the short, medium and long terms. In addition, the list of predictor variables important for predicting T2D progression is provided. METHODS This study uses 10,911 records generated in 36 clinics from the 15th of November 2008 to the 15th of November 2016. Synthetic minority oversampling and random undersampling were used to create a balanced dataset. The performance of Neural Networks, Support Vector Machines, Decision Trees and Logistic Regression in identifying patients developing T2D in the short, medium and long terms was compared. The measures were Area Under Curve, Sensitivity, Specificity, Matthews correlation coefficient and Mean Calibration Error. Through importance analysis and information fusion techniques, the predictors of developing T2D were identified for short, medium and long-term risk analysis. RESULTS The findings show that the performance of analytics techniques depends on both the period and the purpose of prediction, that is, whether the prediction is to identify people who will not develop T2D or to determine at-risk patients. Oversampling, as opposed to undersampling, improved performance. Sixteen predictors and their importance in determining patients at risk of T2D in the short, medium and long terms were identified. CONCLUSIONS This study provides guidelines for an automated system to prompt patients for screening. Several predictors are reportable by patients; others can be examined by physicians or ordered for further lab examination, which offers a potential reduction of the burden placed upon clinical settings.
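Three of the measures named in this abstract (sensitivity, specificity, and the Matthews correlation coefficient) follow directly from confusion-matrix counts. The sketch below uses the standard definitions; the function name `binary_metrics` and the choice to return 0.0 for an undefined ratio are assumptions made here, not details taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts.

    Any ratio with a zero denominator is reported as 0.0 (an assumed convention).
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Toy counts only; these are not results from the study.
print(binary_metrics(tp=25, fp=25, tn=25, fn=25))  # (0.5, 0.5, 0.0)
```

Unlike sensitivity or specificity alone, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside them when classes are imbalanced.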
13. Natural Language Processing Based Instrument for Classification of Free Text Medical Records. BioMed Research International 2016; 2016:8313454. [PMID: 27668260 PMCID: PMC5030470 DOI: 10.1155/2016/8313454] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 04/28/2016] [Revised: 07/11/2016] [Accepted: 08/17/2016] [Indexed: 11/17/2022]
Abstract
According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context, the problem arises of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for the classification of medical records written in the Georgian language. It is the first attempt at such a classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results obtained demonstrated that both machine learning methods performed successfully, with SVM performing slightly better. In the process of classification, a “shrink” method based on feature selection was introduced and applied. At the first stage of classification, the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or biliary system) due to common features characterizing these subclasses. The overall results of the study were successful.
14. Macedo AA, Pollettini JT, Baranauskas JA, Chaves JCA. A Health Surveillance Software Framework to deliver information on preventive healthcare strategies. J Biomed Inform 2016; 62:159-70. [PMID: 27318270 DOI: 10.1016/j.jbi.2016.06.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/31/2015] [Revised: 06/03/2016] [Accepted: 06/07/2016] [Indexed: 11/24/2022]
Abstract
A software framework can reduce costs related to the development of an application because it allows developers to reuse both design and code. Recently, companies and research groups have announced that they have been employing health software frameworks. This paper presents the design, proof-of-concept implementations and experimentation of the Health Surveillance Software Framework (HSSF). The HSSF is a framework that tackles the demand for the recommendation of surveillance information aiming at supporting preventive healthcare strategies. Examples of such strategies are the automatic recommendation of surveillance levels to patients in need of healthcare and the automatic recommendation of scientific literature that elucidates epigenetic problems related to patients. HSSF was created from two systems we developed in our previous work on health surveillance systems: the Automatic-SL and CISS systems. The Automatic-SL system aims to assist healthcare professionals in making decisions and in identifying children with developmental problems. The CISS service associates genetic and epigenetic risk factors related to chronic diseases with patient's clinical records. Towards evaluating the HSSF framework, two new systems, CISS+ and CISS-SW, were created by means of abstractions and instantiations of the framework (design and code). We show that HSSF supported the development of the two new systems given that they both recommend scientific papers using medical records as queries even though they exploit different computational technologies. In an experiment using simulated patients' medical records, we show that CISS, CISS+, and CISS-SW systems recommended more closely related and somewhat related documents than Google, Google Scholar and PubMed. Considering recall and precision measures, CISS+ surpasses CISS-SW in terms of precision.
Affiliation(s)
- Alessandra Alaniz Macedo
  - Biomedical Informatics Group, Department of Computer Science and Mathematics, University of São Paulo (USP), Av. Bandeirantes, 3900, Ribeirão Preto, SP 14040-901, Brazil.
- Juliana Tarossi Pollettini
  - Biomedical Informatics Group, Department of Computer Science and Mathematics, University of São Paulo (USP), Av. Bandeirantes, 3900, Ribeirão Preto, SP 14040-901, Brazil.
- José Augusto Baranauskas
  - Biomedical Informatics Group, Department of Computer Science and Mathematics, University of São Paulo (USP), Av. Bandeirantes, 3900, Ribeirão Preto, SP 14040-901, Brazil.
- Julia Carmona Almeida Chaves
  - Biomedical Informatics Group, Department of Computer Science and Mathematics, University of São Paulo (USP), Av. Bandeirantes, 3900, Ribeirão Preto, SP 14040-901, Brazil.
15. Effective Automated Prediction of Vertebral Column Pathologies Based on Logistic Model Tree with SMOTE Preprocessing. J Med Syst 2014; 38:50. [DOI: 10.1007/s10916-014-0050-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 10/14/2013] [Accepted: 03/27/2014] [Indexed: 10/25/2022]
16. Pollettini JT, Baranauskas JA, Ruiz ES, da Graça Pimentel M, Macedo AA. Surveillance for the prevention of chronic diseases through information association. BMC Med Genomics 2014; 7:7. [PMID: 24479447 PMCID: PMC3938472 DOI: 10.1186/1755-8794-7-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/27/2013] [Accepted: 01/16/2014] [Indexed: 11/14/2022]
Abstract
BACKGROUND Research on genomic medicine has suggested that the exposure of patients to early-life risk factors may induce the development of chronic diseases in adulthood, as the presence of premature risk factors can influence gene expression. The large number of scientific papers published in this research area makes it difficult for healthcare professionals to keep up with individual results and to establish associations between them. Therefore, in our work we aim to build a computational system that alerts health professionals about human development problems such as cardiovascular disease, obesity and type 2 diabetes. METHODS We built a computational system called Chronic Illness Surveillance System (CISS), which retrieves scientific studies that establish associations (conceptual relationships) between chronic diseases (cardiovascular diseases, diabetes and obesity) and the risk factors described in clinical records. To evaluate our approach, we submitted ten queries to CISS as well as to three other search engines (Google™, Google Scholar™ and PubMed®); the queries were composed of terms and expressions from a list of risk factors provided by specialists. RESULTS CISS retrieved a higher number of closely related (+) and somewhat related (+/-) documents, and a smaller number of unrelated (-) and almost unrelated (-/+) documents, in comparison with the three other systems. The results of Friedman's test carried out with the post-hoc Holm procedure (95% confidence) for our system (control) versus the three other engines indicate that our system had the best performance in three of the categories: (+), (-) and (+/-). This is an important result, since these are the most relevant categories for our users. CONCLUSION Our system should be able to assist researchers and health professionals in finding relationships between potential risk factors and chronic diseases in scientific papers.
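The abstract above mentions a post-hoc Holm procedure for comparing a control system against several others. As a rough illustration of how Holm's step-down adjustment works in general, the sketch below adjusts a set of p-values. The function name `holm_adjust` and the p-values are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Visit hypotheses from the smallest to the largest raw p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for a control versus three comparison systems.
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
# Reject a hypothesis when its adjusted p-value is at most alpha = 0.05.
rejected = [p <= 0.05 for p in adj]
```

With a 95% confidence level, a comparison is declared significant only when its adjusted p-value does not exceed 0.05, which controls the family-wise error rate across all comparisons with the control.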
Affiliation(s)
- Juliana Tarossi Pollettini
- Department of Computer Science and Mathematics - FFCLRP - University of São Paulo (USP), Ribeirão Preto-SP, Brazil
- José Augusto Baranauskas
- Department of Computer Science and Mathematics - FFCLRP - University of São Paulo (USP), Ribeirão Preto-SP, Brazil
- Evandro Seron Ruiz
- Department of Computer Science and Mathematics - FFCLRP - University of São Paulo (USP), Ribeirão Preto-SP, Brazil
- Alessandra Alaniz Macedo
- Department of Computer Science and Mathematics - FFCLRP - University of São Paulo (USP), Ribeirão Preto-SP, Brazil
|
17
|
Golino HF, Amaral LSDB, Duarte SFP, Gomes CMA, Soares TDJ, dos Reis LA, Santos J. Predicting increased blood pressure using machine learning. J Obes 2014; 2014:637635. [PMID: 24669313 PMCID: PMC3941962 DOI: 10.1155/2014/637635] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 08/16/2013] [Revised: 10/12/2013] [Accepted: 11/16/2013] [Indexed: 01/21/2023] Open
Abstract
The present study investigates the prediction of increased blood pressure from body mass index (BMI), waist circumference (WC), hip circumference (HC), and waist-hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) aged 16 to 63 years. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The results show that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), the lowest misclassification (.19), and the highest pseudo-R² (.43). This model presented a sensitivity of 80.86% and a specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHR showed the best prediction, with the lowest deviance (57.25), the lowest misclassification (.16), and the highest pseudo-R² (.46). This model had a sensitivity of 72% and a specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power.
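The abstract above reports sensitivity and specificity separately for the training and test sets. As a reminder of how these two measures are derived from a binary confusion matrix, here is a minimal sketch. The function name and the counts are hypothetical, chosen only for illustration; they are not data from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion-matrix counts.

    tp: true positives, fn: false negatives,
    tn: true negatives, fp: false positives.
    """
    sensitivity = tp / (tp + fn)  # share of actual positives detected
    specificity = tn / (tn + fp)  # share of actual negatives detected
    return sensitivity, specificity


# Hypothetical counts, not taken from the paper.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=15, fp=5)
```

A large drop in either measure from the training set to the test set, as reported in the abstract, usually suggests that the fitted tree does not generalize well to unseen cases.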
Affiliation(s)
- Hudson Fernandes Golino
- Laboratório de Investigação da Arquitetura Cognitiva, Universidade Federal de Minas Gerais, 30000-000 Belo Horizonte, Minas Gerais, MG, Brazil
- *Hudson Fernandes Golino:
- Stenio Fernando Pimentel Duarte
- Núcleo de Pós-Graduação, Pesquisa e Extenção, Faculdade Independente do Nordeste, São Luís Avenue, 1305, 45000-000 Candeias, Vitória da Conquista, BA, Brazil
- Cristiano Mauro Assis Gomes
- Laboratório de Investigação da Arquitetura Cognitiva, Universidade Federal de Minas Gerais, 30000-000 Belo Horizonte, Minas Gerais, MG, Brazil
- Telma de Jesus Soares
- Instituto Multidisciplinar de Saúde, Universidade Federal da Bahia, 40000-000 Bahia, BA, Brazil
- Luciana Araujo dos Reis
- Núcleo de Pós-Graduação, Pesquisa e Extenção, Faculdade Independente do Nordeste, São Luís Avenue, 1305, 45000-000 Candeias, Vitória da Conquista, BA, Brazil
- Joselito Santos
- Núcleo de Pós-Graduação, Pesquisa e Extenção, Faculdade Independente do Nordeste, São Luís Avenue, 1305, 45000-000 Candeias, Vitória da Conquista, BA, Brazil
|