1
Khosravi S, Soltanian S, Servati A, Khademhosseini A, Zhu Y, Servati P. Screen-Printed Textile-Based Electrochemical Biosensor for Noninvasive Monitoring of Glucose in Sweat. Biosensors 2023; 13:684. [PMID: 37504083 PMCID: PMC10377550 DOI: 10.3390/bios13070684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 06/23/2023] [Accepted: 06/23/2023] [Indexed: 07/29/2023]
Abstract
Wearable sweat biosensors for noninvasive monitoring of health parameters have attracted significant attention. Having these biosensors embedded in textile substrates can provide a convenient experience due to their soft and flexible nature that conforms to the skin, creating good contact for long-term use. These biosensors can be easily integrated with everyday clothing by using textile fabrication processes to enhance affordable and scalable manufacturing. Herein, a flexible electrochemical glucose sensor that can be screen-printed onto a textile substrate has been demonstrated. The screen-printed textile-based glucose biosensor achieved a linear response in the range of 20–1000 µM of glucose concentration and high sensitivity (18.41 µA mM⁻¹ cm⁻², R² = 0.996). In addition, the biosensors show high selectivity toward glucose among other interfering analytes and excellent stability over 30 days of storage. The developed textile-based biosensor can serve as a platform for monitoring bioanalytes in sweat, and it is expected to impact the next generation of wearable devices.
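The sensitivity and R² quoted in this abstract are the kind of figures obtained from a linear calibration of current density against glucose concentration. The abstract does not describe the fitting procedure, so the following is only a minimal sketch assuming an ordinary least-squares fit; the function name `linear_fit` and the calibration points are hypothetical and are not data from the paper.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical calibration points (illustration only, NOT data from the paper).
glucose_mM = [0.02, 0.10, 0.25, 0.50, 0.75, 1.00]          # concentration, mM
current_density_uA_cm2 = [0.9, 2.3, 5.2, 9.6, 14.5, 18.8]  # response, uA/cm^2

# The slope of the calibration line is the sensitivity in uA mM^-1 cm^-2.
sensitivity, offset, r2 = linear_fit(glucose_mM, current_density_uA_cm2)
print(f"sensitivity = {sensitivity:.2f} uA mM^-1 cm^-2, R^2 = {r2:.3f}")
```

With real measurements, the fit would be restricted to the concentrations inside the reported linear range (20–1000 µM).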
Affiliation(s)
- Safoora Khosravi
  - Flexible Electronics and Energy Lab (FEEL), Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
  - Terasaki Institute for Biomedical Innovation, Los Angeles, CA 90064, USA
- Saeid Soltanian
  - Flexible Electronics and Energy Lab (FEEL), Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Amir Servati
  - Materials Engineering Department, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Ali Khademhosseini
  - Terasaki Institute for Biomedical Innovation, Los Angeles, CA 90064, USA
- Yangzhi Zhu
  - Terasaki Institute for Biomedical Innovation, Los Angeles, CA 90064, USA
- Peyman Servati
  - Flexible Electronics and Energy Lab (FEEL), Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
2
Carrera-Escalé L, Benali A, Rathert AC, Martín-Pinardel R, Bernal-Morales C, Alé-Chilet A, Barraso M, Marín-Martinez S, Feu-Basilio S, Rosinés-Fonoll J, Hernandez T, Vilá I, Castro-Dominguez R, Oliva C, Vinagre I, Ortega E, Gimenez M, Vellido A, Romero E, Zarranz-Ventura J. Radiomics-Based Assessment of OCT Angiography Images for Diabetic Retinopathy Diagnosis. Ophthalmology Science 2022; 3:100259. [PMID: 36578904 PMCID: PMC9791596 DOI: 10.1016/j.xops.2022.100259] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/30/2022] [Revised: 10/25/2022] [Accepted: 11/14/2022] [Indexed: 11/23/2022]
Abstract
Purpose: To evaluate the diagnostic accuracy of machine learning (ML) techniques applied to radiomic features extracted from OCT and OCT angiography (OCTA) images for diabetes mellitus (DM), diabetic retinopathy (DR), and referable DR (R-DR) diagnosis. Design: Cross-sectional analysis of a retinal image dataset from a previous prospective OCTA study (ClinicalTrials.gov NCT03422965). Participants: Patients with type 1 DM and controls included in the progenitor study. Methods: Radiomic features were extracted from fundus retinographies, OCT, and OCTA images in each study eye. Logistic regression, linear discriminant analysis, support vector classifier (SVC)-linear, SVC-radial basis function, and random forest models were created to evaluate their diagnostic accuracy for DM, DR, and R-DR diagnosis in all image types. Main Outcome Measures: Area under the receiver operating characteristic curve (AUC) mean and standard deviation for each ML model and each individual and combined image type. Results: A dataset of 726 eyes (439 individuals) was included. For DM diagnosis, the greatest AUC was observed for OCT (0.82, 0.03). For DR detection, the greatest AUC was observed for OCTA (0.77, 0.03), especially in the 3 × 3 mm superficial capillary plexus OCTA scan (0.76, 0.04). For R-DR diagnosis, the greatest AUC was observed for OCTA (0.87, 0.12) and the deep capillary plexus OCTA scan (0.86, 0.08). The addition of clinical variables (age, sex, etc.) improved the AUC of most models for DM, DR, and R-DR diagnosis. The performance of the models was similar in unilateral- and bilateral-eye image datasets. Conclusions: Radiomics extracted from OCT and OCTA images allow identification of patients with DM, DR, and R-DR using standard ML classifiers. OCT was the best test for DM diagnosis and OCTA for DR and R-DR diagnosis, and the addition of clinical variables improved most models. This pioneering study demonstrates that radiomics-based ML techniques applied to OCT and OCTA images may be an option for DR screening in patients with type 1 DM. Financial Disclosures: Proprietary or commercial disclosure may be found after the references.
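The main outcome measure above is the mean and standard deviation of the AUC for each model. The abstract does not say how the repeated AUC values were produced, so the sketch below simply assumes one AUC per evaluation fold and uses the rank-based (Mann-Whitney) definition of AUC. The names `auc` and `summarize_auc` and the fold data are hypothetical, and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_auc(folds):
    """Return (mean, sample standard deviation) of the per-fold AUC values."""
    values = [auc(labels, scores) for labels, scores in folds]
    return mean(values), stdev(values)


# Hypothetical per-fold (labels, scores) pairs, for illustration only.
folds = [
    ([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]),
    ([0, 0, 1, 1], [0.20, 0.30, 0.60, 0.90]),
]
auc_mean, auc_sd = summarize_auc(folds)
print(f"AUC = {auc_mean:.3f} (SD {auc_sd:.3f})")
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch but would normally be replaced by a sort-based computation on a dataset of several hundred eyes.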
Key Words
- AI, artificial intelligence
- AUC, area under the curve
- Artificial intelligence
- DCP, deep capillary plexus
- DM, diabetes mellitus
- DR, diabetic retinopathy
- Diabetic retinopathy
- FR, fundus retinographies
- LDA, linear discriminant analysis
- LR, logistic regression
- ML, machine learning
- Machine learning
- OCT angiography
- OCTA, OCT angiography
- R-DR, referable DR
- RF, random forest
- Radiomics
- SCP, superficial capillary plexus
- SVC, support vector classifier
- rbf, radial basis function
Affiliation(s)
- Laura Carrera-Escalé
  - Intelligent Data Science and Artificial Intelligence (IDEAI) Research Center, Department of Computer Science, Facultat d’Informàtica de Barcelona (FIB), Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
- Anass Benali
  - Intelligent Data Science and Artificial Intelligence (IDEAI) Research Center, Department of Computer Science, Facultat d’Informàtica de Barcelona (FIB), Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
- Ann-Christin Rathert
  - Intelligent Data Science and Artificial Intelligence (IDEAI) Research Center, Department of Computer Science, Facultat d’Informàtica de Barcelona (FIB), Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
- Ruben Martín-Pinardel
  - Intelligent Data Science and Artificial Intelligence (IDEAI) Research Center, Department of Computer Science, Facultat d’Informàtica de Barcelona (FIB), Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
- Anibal Alé-Chilet
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
- Marina Barraso
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
- Sara Marín-Martinez
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
- Silvia Feu-Basilio
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
- Josep Rosinés-Fonoll
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
- Teresa Hernandez
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
- Irene Vilá
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
- Cristian Oliva
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
- Irene Vinagre
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
  - Diabetes Unit, Hospital Clínic de Barcelona, Spain
  - Institut Clínic de Malalties Digestives i Metaboliques (ICMDM), Hospital Clínic de Barcelona, Spain
- Emilio Ortega
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
  - Diabetes Unit, Hospital Clínic de Barcelona, Spain
  - Institut Clínic de Malalties Digestives i Metaboliques (ICMDM), Hospital Clínic de Barcelona, Spain
- Marga Gimenez
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
  - Diabetes Unit, Hospital Clínic de Barcelona, Spain
  - Institut Clínic de Malalties Digestives i Metaboliques (ICMDM), Hospital Clínic de Barcelona, Spain
- Alfredo Vellido
  - Intelligent Data Science and Artificial Intelligence (IDEAI) Research Center, Department of Computer Science, Facultat d’Informàtica de Barcelona (FIB), Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
- Enrique Romero
  - Intelligent Data Science and Artificial Intelligence (IDEAI) Research Center, Department of Computer Science, Facultat d’Informàtica de Barcelona (FIB), Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
- Javier Zarranz-Ventura
  - August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
  - Institut Clínic d’Oftalmología (ICOF), Hospital Clínic de Barcelona, Barcelona, Spain
  - Diabetes Unit, Hospital Clínic de Barcelona, Spain
  - School of Medicine, Universitat de Barcelona, Spain
  - Correspondence: Javier Zarranz-Ventura, MD, PhD, C/ Sabino Arana 1, Barcelona 08028, Spain.
3
Han N, He J, Shi L, Zhang M, Zheng J, Fan Y. Identification of biomarkers in nonalcoholic fatty liver disease: A machine learning method and experimental study. Front Genet 2022; 13:1020899. [PMID: 36419827 PMCID: PMC9676265 DOI: 10.3389/fgene.2022.1020899] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/16/2022] [Accepted: 10/24/2022] [Indexed: 10/13/2023] Open
Abstract
Nonalcoholic fatty liver disease (NAFLD) has become the most common chronic liver disease. However, the early diagnosis of NAFLD is challenging. Thus, the purpose of this study was to identify diagnostic biomarkers of NAFLD using machine learning algorithms. Differentially expressed genes between NAFLD and normal samples were identified separately from the GEO database. The key DEGs were selected through a protein‒protein interaction network, and their biological functions were analysed. Next, three machine learning algorithms were selected to construct models of NAFLD separately, and the model with the smallest sample residual was determined to be the best model. Then, logistic regression analysis was used to judge the accuracy of the five genes in predicting the risk of NAFLD. A single-sample gene set enrichment analysis algorithm was used to evaluate the immune cell infiltration of NAFLD, and the correlation between diagnostic biomarkers and immune cell infiltration was analysed. Finally, 10 pairs of peripheral blood samples from NAFLD patients and normal controls were collected for RNA isolation and quantitative real-time polymerase chain reaction for validation. Taken together, CEBPD, H4C11, CEBPB, GATA3, and KLF4 were identified as diagnostic biomarkers of NAFLD by machine learning algorithms and were related to immune cell infiltration in NAFLD. These key genes provide novel insights into the mechanisms and treatment of patients with NAFLD.
Affiliation(s)
- Na Han
  - Department of Endocrinology, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
- Juan He
  - Department of Endocrinology, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
- Lixin Shi
  - Department of Endocrinology, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
- Miao Zhang
  - Department of Endocrinology, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
- Jing Zheng
  - Department of Endocrinology, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
- Yuanshuo Fan
  - Department of Endocrinology, Guizhou Provincial People's Hospital, Guiyang, China
4
Abstract
Cancer survival prediction is one of the three major tasks of cancer prognosis. To improve the accuracy of cancer survival prediction, in this paper, we propose a priori knowledge- and stability-based feature selection (PKSFS) method and develop a novel two-stage heterogeneous stacked ensemble learning model (BQAXR) to predict the survival status of cancer patients. Specifically, PKSFS first obtains the optimal feature subsets from the high-dimensional cancer datasets to guide the subsequent model construction. Then, BQAXR seeks to generate five high-quality heterogeneous learners, among which the shortcomings of the learners are overcome by using improved methods, and integrates them in two stages through the stacked generalization strategy based on optimal feature subsets. To verify the merits of PKSFS and BQAXR, this paper collected the real survival datasets of gastric cancer and skin cancer from the Surveillance, Epidemiology, and End Results (SEER) database of the National Cancer Institute, and conducted extensive numerical experiments from different perspectives based on these two datasets. The accuracy and AUC of the proposed method are 0.8209 and 0.8203 in the gastric cancer dataset, and 0.8336 and 0.8214 in the skin cancer dataset. The results show that PKSFS has marked advantages over popular feature selection methods in processing high-dimensional datasets. By taking full advantage of heterogeneous high-quality learners, BQAXR is not only superior to mainstream machine learning methods, but also outperforms improved machine learning methods, which indicates that it can effectively improve the accuracy of cancer survival prediction and provide a reference for doctors to make medical decisions.
5
Hao J, Luo S, Pan L. Rule extraction from biased random forest and fuzzy support vector machine for early diagnosis of diabetes. Sci Rep 2022; 12:9858. [PMID: 35701587 PMCID: PMC9198101 DOI: 10.1038/s41598-022-14143-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/04/2021] [Accepted: 04/15/2022] [Indexed: 12/04/2022] Open
Abstract
Due to concealed initial symptoms, many diabetic patients are not diagnosed in time, which delays treatment. Machine learning methods have been applied to increase the diagnosis rate, but most of them are black boxes lacking interpretability. Rule extraction is usually used to open up the black box. As the number of diabetic patients is far smaller than that of healthy people, the rules obtained by the existing rule extraction methods tend to identify healthy people rather than diabetic patients. To address the problem, a method for extracting reduced rules based on biased random forest and fuzzy support vector machine is proposed. Biased random forest uses the k-nearest neighbor (k-NN) algorithm to identify critical samples and generates more trees that tend to diagnose diabetes based on critical samples, so that the generated rules lean more toward diabetic patients. In addition, the conditions and rules are reduced based on the error rate and coverage rate to enhance interpretability. Experiments on the Diabetes Medical Examination Data collected by Beijing Hospital (DMED-BH) dataset demonstrate that the proposed approach has outstanding results (MCC = 0.8802) when the rules are similar in number. Moreover, experiments on the Pima Indian Diabetes (PID) and China Health and Nutrition Survey (CHNS) datasets demonstrate the generalizability of the proposed method.
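The result above is reported as a Matthews correlation coefficient (MCC), a metric often preferred over accuracy when one class is much rarer than the other. As a minimal sketch, MCC can be computed from the four cells of a binary confusion matrix. The function name `matthews_corrcoef_binary` and the counts are hypothetical, and returning 0.0 for a zero denominator is a common convention assumed here, not something the abstract specifies.

```python
from math import sqrt


def matthews_corrcoef_binary(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical imbalanced test set: 7 diabetic patients, 93 healthy people.
tp, tn, fp, fn = 5, 90, 3, 2
accuracy = (tp + tn) / (tp + tn + fp + fn)
mcc = matthews_corrcoef_binary(tp, tn, fp, fn)
print(f"accuracy = {accuracy:.2f}, MCC = {mcc:.2f}")
```

In this made-up example the accuracy is dominated by the healthy majority, while MCC also penalizes the errors made on the small diabetic class.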
Affiliation(s)
- Jingwei Hao
  - Information System and Security and Countermeasures Experiments Center, Beijing Institute of Technology, Beijing, 100081, People's Republic of China
- Senlin Luo
  - Information System and Security and Countermeasures Experiments Center, Beijing Institute of Technology, Beijing, 100081, People's Republic of China
- Limin Pan
  - Information System and Security and Countermeasures Experiments Center, Beijing Institute of Technology, Beijing, 100081, People's Republic of China
6
Supervised Machine Learning Empowered Multifactorial Genetic Inheritance Disorder Prediction. Computational Intelligence and Neuroscience 2022; 2022:1051388. [PMID: 35685134 PMCID: PMC9173933 DOI: 10.1155/2022/1051388] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/10/2022] [Revised: 04/15/2022] [Accepted: 05/03/2022] [Indexed: 12/18/2022]
Abstract
Diseases such as cancer, dementia, and diabetes are dangerous and carry a risk of death if they are not diagnosed at an early stage. Computer science draws on biomedical studies to diagnose cancer, dementia, and diabetes. With the advancement of machine learning, various techniques are available to predict and prognose these diseases from different datasets. These datasets vary in type (image datasets and CSV datasets) around the world, so machine learning classifiers are needed to predict cancer, dementia, and diabetes in humans. In this paper, we used a multifactorial genetic inheritance disorder dataset to predict cancer, dementia, and diabetes. Several studies have used different machine learning classifiers to predict cancer, dementia, and diabetes separately with the help of different types of datasets. In this paper, the proposed multiclass classification methodology uses support vector machine (SVM) and K-nearest neighbor (KNN) machine learning techniques to predict the three diseases and compares these techniques based on accuracy. Simulation results show that the proposed SVM and KNN models for prediction of dementia, cancer, and diabetes from multifactorial genetic inheritance disorder achieved 92.8% and 92.5%, 92.8% and 91.2% accuracy during training and testing, respectively. It is therefore observed that the proposed SVM-based multifactorial genetic inheritance disorder prediction (MGIDP) model for dementia, cancer, and diabetes gives attractive results compared with the proposed KNN model. Applying the proposed model supports early prognosis and prediction of cancer, dementia, and diabetes and can play a vital role in reducing the death rate around the world.
7
Fitzsimmons L, Dewan M, Dexheimer JW. Diversity in Machine Learning: A Systematic Review of Text-Based Diagnostic Applications. Appl Clin Inform 2022; 13:569-582. [PMID: 35613914 DOI: 10.1055/s-0042-1749119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022] Open
Abstract
OBJECTIVE: As the storage of clinical data has transitioned into electronic formats, medical informatics has become increasingly relevant in providing diagnostic aid. The purpose of this review is to evaluate machine learning models that use text data for diagnosis and to assess the diversity of the included study populations. METHODS: We conducted a systematic literature review on three public databases. Two authors reviewed every abstract for inclusion. Articles were included if they used or developed machine learning algorithms to aid in diagnosis. Articles focusing on imaging informatics were excluded. RESULTS: From 2,260 identified papers, we included 78. Of the machine learning models used, neural networks were relied upon most frequently (44.9%). Studies had a median population of 661.5 patients, and diseases and disorders of 10 different body systems were studied. Of the 35.9% (N = 28) of papers that included race data, 57.1% (N = 16) of study populations were majority White, 14.3% were majority Asian, and 7.1% were majority Black. In 75% (N = 21) of papers, White was the largest racial group represented. Of the papers included, 43.6% (N = 34) included the sex ratio of the patient population. DISCUSSION: With the power to build robust algorithms supported by massive quantities of clinical data, machine learning is shaping the future of diagnostics. Limitations of the underlying data create potential biases, especially if patient demographics are unknown or not included in the training. CONCLUSION: As the movement toward clinical reliance on machine learning accelerates, both recording demographic information and using diverse training sets should be emphasized. Extrapolating algorithms to demographics beyond the original study population leaves large gaps for potential biases.
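The percentages in the results above mix two denominators: the 78 included papers and the 28 papers that reported race data. The short check below reproduces the stated figures from the stated counts. The helper name `pct` is hypothetical, and the counts of 4 and 2 papers behind 14.3% and 7.1% are inferred from the percentages, not given in the abstract.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


included_papers = 78
papers_with_race_data = 28

# Counts stated in the abstract.
share_with_race_data = pct(papers_with_race_data, included_papers)  # stated as 35.9%
share_majority_white = pct(16, papers_with_race_data)               # stated as 57.1%
share_white_largest = pct(21, papers_with_race_data)                # stated as 75%
share_with_sex_ratio = pct(34, included_papers)                     # stated as 43.6%

# Counts inferred from the stated percentages (an assumption, not in the source).
share_majority_asian = pct(4, papers_with_race_data)                # stated as 14.3%
share_majority_black = pct(2, papers_with_race_data)                # stated as 7.1%

print(share_with_race_data, share_majority_white, share_white_largest,
      share_with_sex_ratio, share_majority_asian, share_majority_black)
```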
Affiliation(s)
- Lane Fitzsimmons
  - College of Agriculture and Life Science, Cornell University, Ithaca, New York, United States
- Maya Dewan
  - Division of Critical Care Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States
- Judith W Dexheimer
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States
  - Division of Emergency Medicine; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States
8
Tuppad A, Patil SD. Machine learning for diabetes clinical decision support: a review. Advances in Computational Intelligence 2022; 2:22. [PMID: 35434723 PMCID: PMC9006199 DOI: 10.1007/s43674-022-00034-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/17/2021] [Revised: 02/27/2022] [Accepted: 03/03/2022] [Indexed: 12/14/2022]
Abstract
Type 2 diabetes has recently acquired the status of an epidemic silent killer, though it is non-communicable. There are two main reasons behind this perception of the disease. First, a gradual but exponential growth in the disease prevalence has been witnessed irrespective of age group, geography, or gender. Second, the disease dynamics are very complex in terms of the multifactorial risks involved, the initial asymptomatic period, the different short-term and long-term complications that pose serious health threats, and related co-morbidities. The majority of its risk factors are lifestyle habits such as physical inactivity, lack of exercise, high body mass index (BMI), poor diet, and smoking, apart from some unavoidable ones such as family history of diabetes, ethnic predisposition, and ageing. Nowadays, machine learning (ML) is increasingly being applied to alleviate the health burden of diabetes, and many research works have been proposed in the literature to offer clinical decision support in different application areas. In this paper, we present a review of such efforts for the prevention and management of type 2 diabetes. First, we present the medical gaps in the diabetes knowledge base, guidelines, and medical practice identified from relevant articles and highlight those that can be addressed by ML. Further, we review the ML research works in three different application areas, namely (1) risk assessment (statistical risk scores and ML-based risk models), (2) diagnosis (using non-invasive and invasive features), and (3) prognosis (from normoglycemia/prior morbidity to incident diabetes and prognosis of incident diabetes to related complications). We discuss and summarize the shortcomings or gaps in the existing ML methodologies for diabetes to be addressed in future. This review provides the breadth of ML predictive modeling applications for diabetes while highlighting the medical and technological gaps as well as various aspects involved in ML-based diabetes clinical decision support.
Affiliation(s)
- Ashwini Tuppad
  - School of Computer Science and Engineering, REVA University, Rukmini Knowledge Park, Kattigenahalli, Bangalore, Karnataka, India
- Shantala Devi Patil
  - School of Computer Science and Engineering, REVA University, Rukmini Knowledge Park, Kattigenahalli, Bangalore, Karnataka, India
9
Delpino F, Costa Â, Farias S, Chiavegatto Filho A, Arcêncio R, Nunes B. Machine learning for predicting chronic diseases: a systematic review. Public Health 2022; 205:14-25. [DOI: 10.1016/j.puhe.2022.01.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/27/2021] [Revised: 10/26/2021] [Accepted: 01/11/2022] [Indexed: 12/12/2022]
10
Nagpal MS, Barbaric A, Sherifali D, Morita PP, Cafazzo JA. Patient-Generated Data Analytics of Health Behaviors of People Living With Type 2 Diabetes: Scoping Review. JMIR Diabetes 2021; 6:e29027. [PMID: 34783668 PMCID: PMC8726031 DOI: 10.2196/29027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2021] [Revised: 08/01/2021] [Accepted: 10/31/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Complications due to type 2 diabetes (T2D) can be mitigated through proper self-management that can positively change health behaviors. Technological tools are available to help people living with, or at risk of developing, T2D to manage their condition, and such tools provide a large repository of patient-generated health data (PGHD). Analytics can provide insights into the health behaviors of people living with T2D. Objective: The aim of this review is to investigate what can be learned about the health behaviors of those living with, or at risk of developing, T2D through analytics from PGHD. Methods: A scoping review using the Arksey and O’Malley framework was conducted, with a comprehensive literature search performed by 2 reviewers. In all, 3 electronic databases (PubMed, IEEE Xplore, and ACM Digital Library) were searched using keywords associated with diabetes, behaviors, and analytics. Several rounds of screening using predetermined inclusion and exclusion criteria were conducted, after which studies were selected. Critical examination took place through a descriptive-analytical narrative method, and data extracted from the studies were classified into thematic categories. These categories reflect the findings of this study as per our objective. Results: We identified 43 studies that met the inclusion criteria for this review. Although 70% (30/43) of the studies examined PGHD independently, 30% (13/43) combined PGHD with other data sources. Most of these studies used machine learning algorithms to perform their analysis. The themes identified through this review include predicting diabetes or obesity, deriving factors that contribute to diabetes or obesity, obtaining insights from social media or web-based forums, predicting glycemia, improving adherence and outcomes, analyzing sedentary behaviors, deriving behavior patterns, discovering clinical correlations from behaviors, and developing design principles. Conclusions: The increased volume and availability of PGHD have the potential to yield analytical insights into the health behaviors of people living with T2D. From the literature, we determined that analytics can predict outcomes and identify granular behavior patterns from PGHD. This review determined the broad range of insights that can be examined through PGHD, which constitutes a unique source of data for these applications that would not be possible through the use of other data sources.
Affiliation(s)
- Meghan S Nagpal
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
  - Centre for Global eHealth Innovation, Techna Institute, University Health Network, Toronto, ON, Canada
- Antonia Barbaric
  - Centre for Global eHealth Innovation, Techna Institute, University Health Network, Toronto, ON, Canada
  - Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada
- Diana Sherifali
  - School of Nursing, McMaster University, Hamilton, ON, Canada
- Plinio P Morita
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
  - Centre for Global eHealth Innovation, Techna Institute, University Health Network, Toronto, ON, Canada
  - School of Public Health and Health Systems, University of Waterloo, Waterloo, ON, Canada
- Joseph A Cafazzo
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
  - Centre for Global eHealth Innovation, Techna Institute, University Health Network, Toronto, ON, Canada
  - Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
11
Jeyafzam F, Vaziri B, Suraki MY, Hosseinabadi AAR, Slowik A. Improvement of grey wolf optimizer with adaptive middle filter to adjust support vector machine parameters to predict diabetes complications. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06143-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
In medical science, collecting and classifying data from various diseases is a vital task. Disorganized and large volumes of data are problems that prevent us from achieving acceptable results. One of the major problems for diabetic patients is a failure to properly diagnose the disease. As a result of a mistaken diagnosis or a failure of early diagnosis, the patient may suffer from complications such as blindness, kidney failure, and toe amputation. Nowadays, doctors diagnose the disease by relying on their experience and knowledge and performing complex and time-consuming tests. One of the problems with current diabetes diagnostic methods is the lack of appropriate features to diagnose the disease and, consequently, weakness in its diagnosis, especially in its early stages. Since diabetes diagnosis relies on large amounts of data with many parameters, it is necessary to use machine learning methods such as support vector machine (SVM) to predict the complications of diabetes. One of the disadvantages of SVM is its parameter adjustment, which can be accomplished using metaheuristic algorithms such as the particle swarm optimization algorithm (PSO), genetic algorithm, or grey wolf optimizer (GWO). In this paper, after preprocessing and preparing the dataset for data mining, we use SVM to predict complications of diabetes based on selected parameters of a patient acquired by laboratory tests, using an improved GWO. We improve the selection process of GWO by employing a dynamic adaptive middle filter, a nonlinear filter that assigns an appropriate weight to each value based on the data value. Comparison of the final results of the proposed algorithm with classification methods such as a multilayer perceptron neural network, decision tree, simple Bayes, and temporal fuzzy min–max neural network (TFMM-PSO) shows the superiority of the proposed method over the comparable ones.
12
Stiglic G, Wang F, Sheikh A, Cilar L. Development and validation of the type 2 diabetes mellitus 10-year risk score prediction models from survey data. Prim Care Diabetes 2021; 15:699-705. [PMID: 33896755 DOI: 10.1016/j.pcd.2021.04.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/20/2020] [Accepted: 04/13/2021] [Indexed: 12/23/2022]
Abstract
AIMS In this paper, we demonstrate the development and validation of 10-year type 2 diabetes mellitus (T2DM) risk prediction models based on large survey data. METHODS The Survey of Health, Ageing and Retirement in Europe (SHARE) data, collected in 12 European countries using 53 variables representing behavioural as well as physical and mental health characteristics of participants aged 50 or older, were used to build and validate prediction models. To account for strongly unbalanced outcome variables, each instance was assigned a weight according to the inverse proportion of the outcome label when the regularized logistic regression model was built. RESULTS A pooled sample of 16,363 individuals was used to build and validate a global regularized logistic regression model that achieved an area under the receiver operating characteristic curve (AUROC) of 0.702 (95% CI: 0.698-0.706). Additionally, we measured the performance of local country-specific models, where AUROC ranged from 0.578 (0.565-0.592) to 0.768 (0.749-0.787). CONCLUSIONS We have developed and validated a survey-based 10-year T2DM risk prediction model for use across 12 European countries. Our results demonstrate the importance of re-calibration of the models as well as the strengths of pooling data from multiple countries to reduce the variance and consequently increase the precision of the results.
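The instance weighting mentioned in the METHODS can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `inverse_proportion_weights` and the toy labels are assumptions, and the abstract does not state how (or whether) the weights were normalized.

```python
from collections import Counter


def inverse_proportion_weights(labels):
    """Weight each instance by 1 / (proportion of its outcome label)."""
    counts = Counter(labels)
    total = len(labels)
    # total / counts[y] is the same as 1 / (counts[y] / total).
    return [total / counts[y] for y in labels]


# Made-up outcome labels: four negatives and one positive.
labels = [0, 0, 0, 0, 1]
weights = inverse_proportion_weights(labels)
print(weights)
```

With this scheme every outcome class contributes the same total weight to the loss, so the rare class is not swamped by the frequent one.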
Affiliation(s)
- Gregor Stiglic
- University of Maribor, Faculty of Health Sciences, Zitna ulica 15, 2000 Maribor, Slovenia; University of Maribor, Faculty of Electrical Engineering and Computer Science, Koroska cesta 46, 2000 Maribor, Slovenia; Usher Institute, University of Edinburgh, Old Medical School, Teviot Place, Edinburgh EH8 9AG, UK.
| | - Fei Wang
- Department of Population Health Sciences, Weill Cornell Medicine, 425 East 61 Street, New York, NY 10065
| | - Aziz Sheikh
- Usher Institute, University of Edinburgh, Old Medical School, Teviot Place, Edinburgh EH8 9AG, UK
| | - Leona Cilar
- University of Maribor, Faculty of Health Sciences, Zitna ulica 15, 2000 Maribor, Slovenia
|
13
|
Wang Z, Yin Z, Argyris YA. Detecting Medical Misinformation on Social Media Using Multimodal Deep Learning. IEEE J Biomed Health Inform 2021; 25:2193-2203. [PMID: 33170786 DOI: 10.1109/jbhi.2020.3037027] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/08/2022]
Abstract
In 2019, outbreaks of vaccine-preventable diseases reached the highest number in the US since 1992. Medical misinformation, such as antivaccine content propagating through social media, is associated with increases in vaccine delay and refusal. Our overall goal is to develop an automatic detector for antivaccine messages to counteract the negative impact that antivaccine messages have on the public health. Very few extant detection systems have considered multimodality of social media posts (images, texts, and hashtags), and instead focus on textual components, despite the rapid growth of photo-sharing applications (e.g., Instagram). As a result, existing systems are not sufficient for detecting antivaccine messages with heavy visual components (e.g., images) posted on these newer platforms. To solve this problem, we propose a deep learning network that leverages both visual and textual information. A new semantic- and task-level attention mechanism was created to help our model to focus on the essential contents of a post that signal antivaccine messages. The proposed model, which consists of three branches, can generate comprehensive fused features for predictions. Moreover, an ensemble method is proposed to further improve the final prediction accuracy. To evaluate the proposed model's performance, a real-world social media dataset that consists of more than 30,000 samples was collected from Instagram between January 2016 and October 2019. Our 30 experiment results demonstrate that the final network achieves above 97% testing accuracy and outperforms other relevant models, demonstrating that it can detect a large amount of antivaccine messages posted daily. The implementation code is available at https://github.com/wzhings/antivaccine_detection.
|
14
|
Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning. BIOMED RESEARCH INTERNATIONAL 2021; 2021:1984690. [PMID: 34104645 PMCID: PMC8162250 DOI: 10.1155/2021/1984690] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/22/2020] [Revised: 04/01/2021] [Accepted: 05/06/2021] [Indexed: 12/13/2022]
Abstract
Background Gestational diabetes mellitus (GDM) is the most prevalent metabolic disease during pregnancy, but the diagnosis is controversial and lagging partly due to the lack of useful biomarkers. CpG methylation is involved in the development of GDM. However, the specific CpG methylation sites serving as diagnostic biomarkers of GDM remain unclear. Here, we aimed to explore CpG signatures and establish the predicting model for the GDM diagnosis. Methods DNA methylation data of GSE88929 and GSE102177 were obtained from the GEO database, followed by the epigenome-wide association study (EWAS). GO and KEGG pathway analyses were performed by using the clusterProfiler package of R. The PPI network was constructed in the STRING database and Cytoscape software. The SVM model was established, in which the β-values of selected CpG sites were the predictor variable and the occurrence of GDM was the outcome variable. Results We identified 62 significant CpG methylation sites in the GDM samples compared with the control samples. GO and KEGG analyses based on the 62 CpG sites demonstrated that several essential cellular processes and signaling pathways were enriched in the system. A total of 12 hub genes related to the identified CpG sites were found in the PPI network. The SVM model based on the selected CpGs within the promoter region, including cg00922748, cg05216211, cg05376185, cg06617468, cg17097119, and cg22385669, was established, and the AUC values of the training set and testing set in the model were 0.8138 and 0.7576. The AUC value of the independent validation set of GSE102177 was 0.6667. Conclusion We identified potential diagnostic CpG signatures by EWAS integrated with the SVM model. The SVM model based on the identified 6 CpG sites reliably predicted the GDM occurrence, contributing to the diagnosis of GDM. Our finding provides new insights into the cross-application of EWAS and machine learning in GDM investigation.
|
15
|
Feng Y, Wang X, Zhang J. A heterogeneous ensemble learning method for neuroblastoma survival prediction. IEEE J Biomed Health Inform 2021; 26:1472-1483. [PMID: 33848254 DOI: 10.1109/jbhi.2021.3073056] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/10/2022]
Abstract
Neuroblastoma is a pediatric cancer with high morbidity and mortality. Accurate survival prediction of patients with neuroblastoma plays an important role in the formulation of treatment plans. In this study, we proposed a heterogeneous ensemble learning method to predict the survival of neuroblastoma patients and extract decision rules from the proposed method to assist doctors in making decisions. After data preprocessing, five heterogeneous base learners were developed, which consisted of decision tree, random forest, support vector machine based on genetic algorithm, extreme gradient boosting and light gradient boosting machine. Subsequently, a heterogeneous feature selection method was devised to obtain the optimal feature subset of each base learner, and the optimal feature subset of each base learner guided the construction of the base learners as a priori knowledge. Furthermore, an area under curve-based ensemble mechanism was proposed to integrate the five heterogeneous base learners. Finally, the proposed method was compared with mainstream machine learning methods from different indicators, and valuable information was extracted by using the partial dependency plot analysis method and rule-extracted method from the proposed method. Experimental results show that the proposed method achieves an accuracy of 91.64%, recall of 91.14%, and AUC of 91.35% and is significantly better than the mainstream machine learning methods. In addition, interpretable rules with accuracy higher than 0.900 and predicted responses are extracted from the proposed method. Our study can effectively improve the performance of the clinical decision support system to improve the survival of neuroblastoma patients.
|
16
|
Yavari A, Rajabzadeh A, Abdali-Mohammadi F. Profile-based assessment of diseases affective factors using fuzzy association rule mining approach: A case study in heart diseases. J Biomed Inform 2021; 116:103695. [PMID: 33549658 DOI: 10.1016/j.jbi.2021.103695] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/31/2020] [Revised: 12/15/2020] [Accepted: 02/01/2021] [Indexed: 10/22/2022]
Abstract
The existing data mining solutions to identify risk factors associated with diseases are burdened with quite a few shortcomings. They usually use crisp partitions for numerical features and also do not use patient-specific profiles. These shortcomings create limitations for solving real problems. Discretizing a numerical feature through crisp partitions can also generate substantial partitioning errors, particularly for features whose values are closer to crisp boundaries. Since the normal range of each numerical feature varies according to the age, gender, and medical conditions of the patients, then ignoring these differences can undermine the accuracy of the extracted itemsets and rules. This paper presents a profile-based fuzzy association rule mining (PB-FARM) approach for the assessment of risk factors highly correlated with diseases. The proposed approach has three phases. Phase I involves creating profiles for patients based on their age, gender, and medical conditions, to determine a normal range of each numerical feature. Then fuzzy partitioning is done for all features (namely, numerical and categorical), and consequently, a structure, called FirstScan, is created. In Phase II, the FirstScan structure is utilized to mine for large fuzzy k-itemsets. Ultimately, in Phase III, the given k-itemsets are employed to generate fuzzy rules for associations between risk factors and diseases. To evaluate the performance of the proposed method the Z-Alizadeh Sani coronary artery disease (CAD) dataset, containing 303 records and 54 features, was used. The results show a positive correlation between typical chest pain and old age with the incidence of CAD. The comparisons made in this study showed that, firstly, the proposed algorithm has a higher partitioning accuracy than other methods, and secondly, it has a reasonably short execution time.
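The contrast the abstract draws between crisp and fuzzy partitioning can be illustrated with a small sketch. Everything here is an assumption for illustration: the abstract does not give the membership functions used by PB-FARM, so a simple linear ramp is used, and the names `crisp_label` and `fuzzy_memberships` and all thresholds are hypothetical.

```python
def crisp_label(value, boundary):
    """Crisp partition: a single hard boundary decides the label."""
    return "high" if value >= boundary else "normal"


def fuzzy_memberships(value, low, high):
    """Fuzzy partition with a linear ramp between `low` and `high`.

    Returns degrees of membership in 'normal' and 'high' that sum to 1.
    In a profile-based setting, `low` and `high` would depend on the
    patient's profile (age, gender, medical conditions).
    """
    if value <= low:
        high_degree = 0.0
    elif value >= high:
        high_degree = 1.0
    else:
        high_degree = (value - low) / (high - low)
    return {"normal": 1.0 - high_degree, "high": high_degree}


# Two nearly identical measurements on either side of a crisp boundary.
for measurement in (129, 131):
    print(measurement,
          crisp_label(measurement, boundary=130),
          fuzzy_memberships(measurement, low=120, high=140))
```

The crisp rule assigns the two measurements to different partitions, while the fuzzy memberships differ only slightly, which is the partitioning error near boundaries that the abstract describes.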
Affiliation(s)
- Ali Yavari
- Department of Electrical and Computer Engineering, Razi University, Kermanshah, Iran.
| | - Amir Rajabzadeh
- Department of Electrical and Computer Engineering, Razi University, Kermanshah, Iran.
|
17
|
De Silva K, Jönsson D, Demmer RT. A combined strategy of feature selection and machine learning to identify predictors of prediabetes. J Am Med Inform Assoc 2021; 27:396-406. [PMID: 31889178 DOI: 10.1093/jamia/ocz204] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/15/2019] [Revised: 11/07/2019] [Accepted: 11/13/2019] [Indexed: 02/07/2023] Open
Abstract
OBJECTIVE To identify predictors of prediabetes using feature selection and machine learning on a nationally representative sample of the US population. MATERIALS AND METHODS We analyzed n = 6346 men and women enrolled in the National Health and Nutrition Examination Survey 2013-2014. Prediabetes was defined using American Diabetes Association guidelines. The sample was randomly partitioned to training (n = 3174) and internal validation (n = 3172) sets. Feature selection algorithms were run on training data containing 156 preselected exposure variables. Four machine learning algorithms were applied on 46 exposure variables in original and resampled training datasets built using 4 resampling methods. Predictive models were tested on internal validation data (n = 3172) and external validation data (n = 3000) prepared from National Health and Nutrition Examination Survey 2011-2012. Model performance was evaluated using area under the receiver operating characteristic curve (AUROC). Predictors were assessed by odds ratios in logistic models and variable importance in others. The Centers for Disease Control (CDC) prediabetes screening tool was the benchmark to compare model performance. RESULTS Prediabetes prevalence was 23.43%. The CDC prediabetes screening tool produced 64.40% AUROC. Seven optimal (≥ 70% AUROC) models identified 25 predictors including 4 potentially novel associations; 20 by both logistic and other nonlinear/ensemble models and 5 solely by the latter. All optimal models outperformed the CDC prediabetes screening tool (P < 0.05). DISCUSSION Combined use of feature selection and machine learning increased predictive performance outperforming the recommended screening tool. A range of predictors of prediabetes was identified. CONCLUSION This work demonstrated the value of combining feature selection with machine learning to identify a wide range of predictors that could enhance prediabetes prediction and clinical decision-making.
Affiliation(s)
- Kushan De Silva
- Department of Clinical Sciences, Faculty of Medicine, Lund University, Lund,Sweden.,Department of General Practice, School of Primary and Allied Health Care, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Notting Hill, Australia
| | - Daniel Jönsson
- Department of Periodontology, Malmö University, Malmö and Swedish Dental Service of Skane, Lund, Sweden
| | - Ryan T Demmer
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
|
18
|
Lin TH, Jhang JY, Huang CR, Tsai YC, Cheng HC, Sheu BS. Deep Ensemble Feature Network for Gastric Section Classification. IEEE J Biomed Health Inform 2021; 25:77-87. [PMID: 32750926 DOI: 10.1109/jbhi.2020.2999731] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022]
Abstract
In this paper, we propose a novel deep ensemble feature (DEF) network to classify gastric sections from endoscopic images. Different from recent deep ensemble learning methods, which need to train deep features and classifiers individually to obtain fused classification results, the proposed method can simultaneously learn the deep ensemble feature from arbitrary number of convolutional neural networks (CNNs) and the decision classifier in an end-to-end trainable manner. It comprises two sub networks, the ensemble feature network and the decision network. The former sub network learns the deep ensemble feature from multiple CNNs to represent endoscopic images. The latter sub network learns to obtain the classification labels by using the deep ensemble feature. Both sub networks are optimized based on the proposed ensemble feature loss and the decision loss which guide the learning of deep features and decisions. As shown in the experimental results, the proposed method outperforms the state-of-the-art deep learning, ensemble learning, and deep ensemble learning methods.
|
19
|
Sowah RA, Bampoe-Addo AA, Armoo SK, Saalia FK, Gatsi F, Sarkodie-Mensah B. Design and Development of Diabetes Management System Using Machine Learning. Int J Telemed Appl 2020; 2020:8870141. [PMID: 32724304 PMCID: PMC7381989 DOI: 10.1155/2020/8870141] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/29/2020] [Revised: 06/25/2020] [Accepted: 07/04/2020] [Indexed: 11/18/2022] Open
Abstract
This paper describes the design and implementation of a software system to improve the management of diabetes using a machine learning approach and to demonstrate and evaluate its effectiveness in controlling diabetes. The proposed approach for this management system handles the various factors that affect the health of people with diabetes by combining multiple artificial intelligence algorithms. The proposed framework factors the diabetes management problem into subgoals: building a Tensorflow neural network model for food classification; thus, it allows users to upload an image to determine if a meal is recommended for consumption; implementing K-Nearest Neighbour (KNN) algorithm to recommend meals; using cognitive sciences to build a diabetes question and answer chatbot; tracking user activity, user geolocation, and generating pdfs of logged blood sugar readings. The food recognition model was evaluated with cross-entropy metrics that support validation using Neural networks with a backpropagation algorithm. The model learned features of the images fed from local Ghanaian dishes with specific nutritional value and essence in managing diabetics and provided accurate image classification with given labels and corresponding accuracy. The model achieved specified goals by predicting with high accuracy, labels of new images. The food recognition and classification model achieved over 95% accuracy levels for specific calorie intakes. The performance of the meal recommender model and question and answer chatbot was tested with a designed cross-platform user-friendly interface using Cordova and Ionic Frameworks for software development for both mobile and web applications. The system recommended meals to meet the calorific needs of users successfully using KNN (with k = 5) and answered questions asked in a human-like way. The implemented system would solve the problem of managing activity, dieting recommendations, and medication notification of diabetics.
Affiliation(s)
- Robert A. Sowah
- Department of Computer Engineering, University of Ghana, P.O. Box LG 77, Legon, Accra-, Ghana
| | - Adelaide A. Bampoe-Addo
- Department of Computer Engineering, University of Ghana, P.O. Box LG 77, Legon, Accra-, Ghana
| | - Stephen K. Armoo
- Department of Computer Engineering, University of Ghana, P.O. Box LG 77, Legon, Accra-, Ghana
| | - Firibu K. Saalia
- Department of Food Process Engineering, And Department of Nutrition and Food Science, University of Ghana, P.O. Box LG 77, Legon, Accra-, Ghana
| | - Francis Gatsi
- Department of Engineering and Computer Science, Ashesi University, Berekuso, Eastern Region, Ghana
| | - Baffour Sarkodie-Mensah
- Department of Computer Engineering, University of Ghana, P.O. Box LG 77, Legon, Accra-, Ghana
|
20
|
Yang T, Zhang L, Yi L, Feng H, Li S, Chen H, Zhu J, Zhao J, Zeng Y, Liu H. Ensemble Learning Models Based on Noninvasive Features for Type 2 Diabetes Screening: Model Development and Validation. JMIR Med Inform 2020; 8:e15431. [PMID: 32554386 PMCID: PMC7333074 DOI: 10.2196/15431] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/10/2019] [Revised: 12/22/2019] [Accepted: 02/07/2020] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Early diabetes screening can effectively reduce the burden of disease. However, natural population-based screening projects require a large number of resources. With the emergence and development of machine learning, researchers have started to pursue more flexible and efficient methods to screen or predict type 2 diabetes. OBJECTIVE The aim of this study was to build prediction models based on the ensemble learning method for diabetes screening to further improve the health status of the population in a noninvasive and inexpensive manner. METHODS The dataset for building and evaluating the diabetes prediction model was extracted from the National Health and Nutrition Examination Survey from 2011-2016. After data cleaning and feature selection, the dataset was split into a training set (80%, 2011-2014), test set (20%, 2011-2014) and validation set (2015-2016). Three simple machine learning methods (linear discriminant analysis, support vector machine, and random forest) and easy ensemble methods were used to build diabetes prediction models. The performance of the models was evaluated through 5-fold cross-validation and external validation. The Delong test (2-sided) was used to test the performance differences between the models. RESULTS We selected 8057 observations and 12 attributes from the database. In the 5-fold cross-validation, the three simple methods yielded highly predictive performance models with areas under the curve (AUCs) over 0.800, wherein the ensemble methods significantly outperformed the simple methods. When we evaluated the models in the test set and validation set, the same trends were observed. The ensemble model of linear discriminant analysis yielded the best performance, with an AUC of 0.849, an accuracy of 0.730, a sensitivity of 0.819, and a specificity of 0.709 in the validation set. CONCLUSIONS This study indicates that efficient screening using machine learning methods with noninvasive tests can be applied to a large population and achieve the objective of secondary prevention.
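The accuracy, sensitivity, and specificity reported in the RESULTS follow the standard confusion-matrix definitions, sketched below. The function name `screening_metrics` and the counts are made up for illustration and are not taken from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    tp/fn: people with the condition who were flagged / missed.
    tn/fp: people without the condition who were cleared / flagged.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts with far more negatives than positives.
metrics = screening_metrics(tp=8, fp=20, tn=70, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so when positives are rare it sits close to specificity. That is consistent with the reported accuracy (0.730) lying between the specificity (0.709) and the sensitivity (0.819), nearer the former.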
Affiliation(s)
- Tianzhou Yang
- School of Life Science, Liaoning University, Shenyang, China
| | - Li Zhang
- School of Life Science, Liaoning University, Shenyang, China
| | - Liwei Yi
- School of Information, Liaoning University, Shenyang, China
| | - Huawei Feng
- School of Life Science, Liaoning University, Shenyang, China
| | - Shimeng Li
- School of Life Science, Liaoning University, Shenyang, China
| | - Haoyu Chen
- School of Information, Liaoning University, Shenyang, China
| | - Junfeng Zhu
- School of Life Science, Liaoning University, Shenyang, China
| | - Jian Zhao
- School of Life Science, Liaoning University, Shenyang, China
| | - Yingyue Zeng
- School of Life Science, Liaoning University, Shenyang, China
| | - Hongsheng Liu
- School of Life Science, Liaoning University, Shenyang, China.,Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, Liaoning University, Shenyang, China.,Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, China
|
21
|
Wang X, Yang Y, Xu Y, Chen Q, Wang H, Gao H. Predicting hypoglycemic drugs of type 2 diabetes based on weighted rank support vector machine. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105868] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022]
|
22
|
Lin GM, Nagamine M, Yang SN, Tai YM, Lin C, Sato H. Machine Learning Based Suicide Ideation Prediction for Military Personnel. IEEE J Biomed Health Inform 2020; 24:1907-1916. [PMID: 32324581 DOI: 10.1109/jbhi.2020.2988393] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/30/2022]
Abstract
Military personnel have greater psychological stress and are at higher suicide attempt risk compared with the general population. High mental stress may cause suicide ideations which are crucially driving suicide attempts. However, traditional statistical methods could only find a moderate degree of correlation between psychological stress and suicide ideation in non-psychiatric individuals. This article utilizes machine learning techniques including logistic regression, decision tree, random forest, gradient boosting regression tree, support vector machine and multilayer perceptron to predict the presence of suicide ideation by six important psychological stress domains of the military males and females. The accuracies of all the six machine learning methods are over 98%. Among them, the multilayer perceptron and support vector machine provide the best predictions of suicide ideation approximately to 100%. As compared with the BSRS-5 score ≥7, a conventional criterion, for the presence of suicide ideation ≥1, the proposed algorithms can improve the performances of accuracy, sensitivity, specificity, precision, the AUC of ROC curve and the AUC of PR curve up to 5.7%, 35.9%, 4.6%, 65.2%, 4.3% and 53.2%, respectively; and for the presence of more severely intense suicide ideation ≥2, the improvements are 6.1%, 26.2%, 5.8%, 83.5%, 2.8% and 64.7%, respectively.
|
23
|
Abstract
PURPOSE OF REVIEW Machine learning (ML) is increasingly being studied for the screening, diagnosis, and management of diabetes and its complications. Although various models of ML have been developed, most have not led to practical solutions for real-world problems. There has been a disconnect between ML developers, regulatory bodies, health services researchers, clinicians, and patients in their efforts. Our aim is to review the current status of ML in various aspects of diabetes care and identify key challenges that must be overcome to leverage ML to its full potential. RECENT FINDINGS ML has led to impressive progress in development of automated insulin delivery systems and diabetic retinopathy screening tools. Compared with these, use of ML in other aspects of diabetes is still at an early stage. The Food & Drug Administration (FDA) is adopting some innovative models to help bring technologies to the market in an expeditious and safe manner. ML has great potential in managing diabetes and the future is in furthering the partnership of regulatory bodies with health service researchers, clinicians, developers, and patients to improve the outcomes of populations and individual patients with diabetes.
Affiliation(s)
- David T Broome
- Department of Endocrinology, Diabetes & Metabolism, Cleveland Clinic Foundation, F-20 9500 Euclid Avenue, Cleveland, OH, 44195, USA
| | - C Beau Hilton
- Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, 9500 Euclid Ave, Cleveland, OH, 44195, USA
| | - Neil Mehta
- Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, EC-40 9500 Euclid Ave, Cleveland, OH, 44195, USA.
|
24
|
Mienye ID, Sun Y, Wang Z. An improved ensemble learning approach for the prediction of heart disease risk. INFORMATICS IN MEDICINE UNLOCKED 2020. [DOI: 10.1016/j.imu.2020.100402] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Indexed: 11/29/2022] Open
|
25
|
Predicting long-term type 2 diabetes with support vector machine using oral glucose tolerance test. PLoS One 2019; 14:e0219636. [PMID: 31826018 PMCID: PMC6905529 DOI: 10.1371/journal.pone.0219636] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 06/17/2019] [Accepted: 11/08/2019] [Indexed: 12/13/2022] Open
Abstract
Diabetes is a large healthcare burden worldwide. There is substantial evidence that lifestyle modifications and drug intervention can prevent diabetes; therefore, early identification of high-risk individuals is important for designing targeted prevention strategies. In this paper, we present an automatic tool that uses machine learning techniques to predict the development of type 2 diabetes mellitus (T2DM). Data generated from an oral glucose tolerance test (OGTT) was used to develop a predictive model based on the support vector machine (SVM). We trained and validated the models using the OGTT and demographic data of 1,492 healthy individuals collected during the San Antonio Heart Study. This study collected plasma glucose and insulin concentrations before glucose intake and at three time-points thereafter (30, 60 and 120 min). Furthermore, personal information such as age, ethnicity and body-mass index was also a part of the data-set. Using 11 OGTT measurements, we have deduced 61 features, which are then assigned a rank, and the top ten features are shortlisted using the minimum redundancy maximum relevance feature selection algorithm. All possible combinations of the 10 best ranked features were used to generate SVM based prediction models. This research shows that an individual’s plasma glucose levels, and the information derived therefrom, have the strongest predictive performance for the future development of T2DM. Significantly, insulin and demographic features do not provide additional performance improvement for diabetes prediction. The results of this work identify the parsimonious clinical data needed to be collected for an efficient prediction of T2DM. Our approach shows an average accuracy of 96.80% and a sensitivity of 80.09% obtained on a holdout set.
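To give a sense of scale for "all possible combinations of the 10 best ranked features", the sketch below enumerates them. It assumes that every non-empty subset counts as a combination, which the abstract does not spell out; the helper `all_feature_subsets` and the placeholder feature names are hypothetical, and no SVM is trained here.

```python
from itertools import combinations


def all_feature_subsets(features):
    """Yield every non-empty combination of the given features."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)


# Placeholder names standing in for the ten top-ranked features.
top_features = [f"feature_{i}" for i in range(1, 11)]
subsets = list(all_feature_subsets(top_features))

# Each subset would define the inputs of one candidate model.
print(len(subsets))
```

Under that assumption there are 2^10 - 1 = 1023 candidate feature sets, which is small enough to evaluate exhaustively; the same approach over all 61 derived features would not be feasible, which is one reason to shortlist first.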
|
26
|
Sohail A, Arif F. Supervised and unsupervised algorithms for bioinformatics and data science. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2019; 151:14-22. [PMID: 31816343 DOI: 10.1016/j.pbiomolbio.2019.11.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 08/26/2019] [Revised: 09/25/2019] [Accepted: 11/27/2019] [Indexed: 01/16/2023]
Abstract
Bioinformatics refers to an ever evolving huge field of research based on millions of algorithms, designated to several data banks. Such algorithms are either supervised or unsupervised. In this article, a detailed overview of the supervised and unsupervised techniques is presented with the aid of examples. The aim of this article is to provide the readers with the basic understanding of the state of the art models, which are key ingredients of explainable machine learning in the field of bioinformatics.
Affiliation(s)
- Ayesha Sohail
- Department of Mathematics, Comsats University Islamabad, Lahore Campus, 54000, Pakistan.
| | - Fatima Arif
- Department of Mathematics, Comsats University Islamabad, Lahore Campus, 54000, Pakistan
|
27
|
A Dynamic Multi-Reduction Algorithm for Brain Functional Connection Pathways Analysis. Symmetry (Basel) 2019. [DOI: 10.3390/sym11050701] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022] Open
Abstract
Revealing brain functional connection pathways is of great significance in understanding the cognitive mechanism of the brain. In this paper, we present a novel rough set based dynamic multi-reduction algorithm (DMRA) to analyze brain functional connection pathways. First, a binary discernibility matrix is introduced to obtain a reduction, and a reduction equivalence theorem is proposed and proved to verify the feasibility of reduction algorithm. Based on this idea, we propose a dynamic single-reduction algorithm (DSRA) to obtain a seed reduction, in which two dynamical acceleration mechanisms are presented to reduce the size of the binary discernibility matrix dynamically. Then, the dynamic multi-reduction algorithm is proposed, and multi-reductions can be obtained by replacing the non-core attributes in seed reduction. Comparative performance experiments were carried out on the UCI datasets to illustrate the superiority of DMRA in execution time and classification accuracy. A memory cognitive experiment was designed and three brain functional connection pathways were successfully obtained from brain functional Magnetic Resonance Imaging (fMRI) by employing the proposed DMRA. The theoretical and empirical results both illustrate the potentials of DMRA for brain functional connection pathways analysis.
|
28
|
Jahangir M, Afzal H, Ahmed M, Khurshid K, Amjad MF, Nawaz R, Abbas H. Auto-MeDiSine: an auto-tunable medical decision support engine using an automated class outlier detection method and AutoMLP. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04137-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/20/2022]
|
29
|
Nirala N, Periyasamy R, Singh BK, Kumar A. Detection of type-2 diabetes using characteristics of toe photoplethysmogram by applying support vector machine. Biocybern Biomed Eng 2019. [DOI: 10.1016/j.bbe.2018.09.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 12/20/2022]
|
30
|
Zou Q, Qu K, Luo Y, Yin D, Ju Y, Tang H. Predicting Diabetes Mellitus With Machine Learning Techniques. Front Genet 2018; 9:515. [PMID: 30459809 PMCID: PMC6232260 DOI: 10.3389/fgene.2018.00515] [Citation(s) in RCA: 188] [Impact Index Per Article: 31.3] [Received: 07/29/2018] [Accepted: 10/12/2018] [Indexed: 12/30/2022] Open
Abstract
Diabetes mellitus is a chronic disease characterized by hyperglycemia. It may cause many complications. According to the growing morbidity in recent years, in 2040, the world’s diabetic patients will reach 642 million, which means that one in ten adults in the future will be suffering from diabetes. There is no doubt that this alarming figure needs great attention. With the rapid development of machine learning, machine learning has been applied to many aspects of medical health. In this study, we used decision tree, random forest and neural network to predict diabetes mellitus. The dataset is the hospital physical examination data in Luzhou, China. It contains 14 attributes. In this study, five-fold cross validation was used to examine the models. In order to verify the universal applicability of the methods, we chose some methods that have the better performance to conduct independent test experiments. We randomly selected 68994 healthy people and diabetic patients’ data, respectively as training set. Due to the data imbalance, we randomly extracted data 5 times, and the result is the average of these five experiments. In this study, we used principal component analysis (PCA) and minimum redundancy maximum relevance (mRMR) to reduce the dimensionality. The results showed that prediction with random forest could reach the highest accuracy (ACC = 0.8084) when all the attributes were used.
Affiliation(s)
- Quan Zou
- School of Computer Science and Technology, Tianjin University, Tianjin, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
| | - Kaiyang Qu
- School of Computer Science and Technology, Tianjin University, Tianjin, China
| | - Yamei Luo
- School of Medical Information and Engineering, Southwest Medical University, Luzhou, China
| | - Dehui Yin
- School of Medical Information and Engineering, Southwest Medical University, Luzhou, China
| | - Ying Ju
- School of Information Science and Technology, Xiamen University, Xiamen, China
| | - Hua Tang
- Department of Pathophysiology, School of Basic Medicine, Southwest Medical University, Luzhou, China
| |
|
31
|
Dankwa-Mullan I, Rivo M, Sepulveda M, Park Y, Snowdon J, Rhee K. Transforming Diabetes Care Through Artificial Intelligence: The Future Is Here. Popul Health Manag 2018; 22:229-242. [PMID: 30256722 PMCID: PMC6555175 DOI: 10.1089/pop.2018.0129] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Indexed: 01/22/2023] Open
Abstract
An estimated 425 million people globally have diabetes, accounting for 12% of the world's health expenditures, and yet 1 in 2 persons remain undiagnosed and untreated. Applications of artificial intelligence (AI) and cognitive computing offer promise in diabetes care. The purpose of this article is to better understand what AI advances may be relevant today to persons with diabetes (PWDs), their clinicians, family, and caregivers. The authors conducted a predefined, online PubMed search of publicly available sources of information from 2009 onward using the search terms "diabetes" and "artificial intelligence." The study included clinically-relevant, high-impact articles, and excluded articles whose purpose was technical in nature. A total of 450 published diabetes and AI articles met the inclusion criteria. The studies represent a diverse and complex set of innovative approaches that aim to transform diabetes care in 4 main areas: automated retinal screening, clinical decision support, predictive population risk stratification, and patient self-management tools. Many of these new AI-powered retinal imaging systems, predictive modeling programs, glucose sensors, insulin pumps, smartphone applications, and other decision-support aids are on the market today with more on the way. AI applications have the potential to transform diabetes care and help millions of PWDs to achieve better blood glucose control, reduce hypoglycemic episodes, and reduce diabetes comorbidities and complications. AI applications offer greater accuracy, efficiency, ease of use, and satisfaction for PWDs, their clinicians, family, and caregivers.
Affiliation(s)
- Marc Rivo
- Population Health Innovations, Inc., Miami Beach, Florida
| | - Yoonyoung Park
- IBM Corporation, IBM Research, Cambridge, Massachusetts
| | - Jane Snowdon
- IBM Corporation, Watson Health, Yorktown Heights, New York
| | - Kyu Rhee
- IBM Corporation, Watson Health, Cambridge, Massachusetts
| |
|
32
|
Personalized prediction of drug efficacy for diabetes treatment via patient-level sequential modeling with neural networks. Artif Intell Med 2018; 85:1-6. [DOI: 10.1016/j.artmed.2018.02.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/25/2017] [Revised: 01/15/2018] [Accepted: 02/15/2018] [Indexed: 01/24/2023]
|
33
|
Sobrinho A, da Silva LD, Perkusich A, Pinheiro ME, Cunha P. Design and evaluation of a mobile application to assist the self-monitoring of the chronic kidney disease in developing countries. BMC Med Inform Decis Mak 2018; 18:7. [PMID: 29329530 PMCID: PMC5767024 DOI: 10.1186/s12911-018-0587-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 08/11/2017] [Accepted: 01/05/2018] [Indexed: 12/13/2022] Open
Abstract
Background: The chronic kidney disease (CKD) is a worldwide critical problem, especially in developing countries. CKD patients usually begin their treatment in advanced stages, which requires dialysis and kidney transplantation, and consequently, affects mortality rates. This issue is faced by a mobile health (mHealth) application (app) that aims to assist the early diagnosis and self-monitoring of the disease progression. Methods: A user-centered design (UCD) approach involving health professionals (nurse and nephrologists) and target users guided the development process of the app between 2012 and 2016. In-depth interviews and prototyping were conducted along with healthcare professionals throughout the requirements elicitation process. Elicited requirements were translated into a native mHealth app targeting the Android platform. Afterward, the Cohen’s Kappa coefficient statistics was applied to evaluate the agreement between the app and three nephrologists who analyzed test results collected from 60 medical records. Finally, eight users tested the app and were interviewed about usability and user perceptions. Results: A mHealth app was designed to assist the CKD early diagnosis and self-monitoring considering quality attributes such as safety, effectiveness, and usability. A global Kappa value of 0.7119 showed a substantial degree of agreement between the app and three nephrologists. Results of face-to-face interviews with target users indicated a good user satisfaction. However, the task of CKD self-monitoring proved difficult because most of the users did not fully understand the meaning of specific biomarkers (e.g., creatinine). Conclusion: The UCD approach provided mechanisms to develop the app based on the real needs of users. Even with no perfect Kappa degree of agreement, results are satisfactory because it aims to refer patients to nephrologists in early stages, where they may confirm the CKD diagnosis.
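As an illustration of the agreement statistic mentioned in this abstract, the following is a minimal sketch of Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the example labels are hypothetical and are not data from the study; how the authors combined three nephrologists into a "global" kappa is not stated in the abstract, so only the two-rater case is shown.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for ten records (illustrative only, not study data).
app = ["ckd", "ckd", "ok", "ok", "ckd", "ok", "ok", "ckd", "ok", "ok"]
doc = ["ckd", "ckd", "ok", "ok", "ok", "ok", "ok", "ckd", "ok", "ckd"]
kappa = cohens_kappa(app, doc)
```

With these made-up labels the raters agree on 8 of 10 records (p_o = 0.8) and the chance agreement is p_e = 0.52, so kappa is 0.28 / 0.48, roughly 0.58.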
Affiliation(s)
- Alvaro Sobrinho
- Federal Rural University of the Semiarid, Rodovia BR-226, Pau dos Ferros, 59900-000, Brazil.
| | - Leandro Dias da Silva
- Federal University of Alagoas, Av. Lourival Melo Mota, S/N Tabuleiro do Martins, Maceió, 57072-900, Brazil
| | - Angelo Perkusich
- Federal University of Campina Grande, R. Aprígio Veloso, 882, Universitário, Paraíba, 58429-900, Brazil
| | - Maria Eliete Pinheiro
- Federal University of Alagoas, Av. Lourival Melo Mota, S/N Tabuleiro do Martins, Maceió, 57072-900, Brazil
| | - Paulo Cunha
- Federal Institute of Alagoas, R. Prof. Domingos Correia, 1207, Ouro Preto, Alagoas, 57300-010, Brazil
| |
|
34
|
Disease Diagnosis in Smart Healthcare: Innovation, Technologies and Applications. SUSTAINABILITY 2017. [DOI: 10.3390/su9122309] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Indexed: 11/16/2022]
|
35
|
Han L, Luo S, Wang H, Pan L, Ma X, Zhang T. An Intelligible Risk Stratification Model Based on Pairwise and Size Constrained Kmeans. IEEE J Biomed Health Inform 2017; 21:1288-1296. [DOI: 10.1109/jbhi.2016.2633403] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/10/2022]
|
36
|
Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. Machine Learning and Data Mining Methods in Diabetes Research. Comput Struct Biotechnol J 2017; 15:104-116. [PMID: 28138367 PMCID: PMC5257026 DOI: 10.1016/j.csbj.2016.12.005] [Citation(s) in RCA: 332] [Impact Index Per Article: 47.4] [Received: 09/15/2016] [Revised: 12/20/2016] [Accepted: 12/27/2016] [Indexed: 12/14/2022] Open
Abstract
The remarkable advances in biotechnology and health sciences have led to a significant production of data, such as high throughput genetic data and clinical information, generated from large Electronic Health Records (EHRs). To this end, application of machine learning and data mining methods in biosciences is presently, more than ever before, vital and indispensable in efforts to transform intelligently all available information into valuable knowledge. Diabetes mellitus (DM) is defined as a group of metabolic disorders exerting significant pressure on human health worldwide. Extensive research in all aspects of diabetes (diagnosis, etiopathophysiology, therapy, etc.) has led to the generation of huge amounts of data. The aim of the present study is to conduct a systematic review of the applications of machine learning, data mining techniques and tools in the field of diabetes research with respect to a) Prediction and Diagnosis, b) Diabetic Complications, c) Genetic Background and Environment, and d) Health Care and Management, with the first category appearing to be the most popular. A wide range of machine learning algorithms were employed. In general, 85% of those used were characterized by supervised learning approaches and 15% by unsupervised ones, and more specifically, association rules. Support vector machines (SVM) arise as the most successful and widely used algorithm. Concerning the type of data, clinical datasets were mainly used. The title applications in the selected articles project the usefulness of extracting valuable knowledge leading to new hypotheses targeting deeper understanding and further investigation in DM.
Affiliation(s)
- Ioannis Kavakiotis
- Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
- Institute of Applied Biosciences, CERTH, Thessaloniki, Greece
| | - Olga Tsave
- Laboratory of Inorganic Chemistry, Department of Chemical Engineering, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
| | - Athanasios Salifoglou
- Laboratory of Inorganic Chemistry, Department of Chemical Engineering, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
| | - Nicos Maglaveras
- Institute of Applied Biosciences, CERTH, Thessaloniki, Greece
- Lab of Computing and Medical Informatics, Medical School, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
| | - Ioannis Vlahavas
- Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
| | - Ioanna Chouvarda
- Institute of Applied Biosciences, CERTH, Thessaloniki, Greece
- Lab of Computing and Medical Informatics, Medical School, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
| |
|
37
|
|
38
|
Yilmaz T, Kılıç MA, Erdoğan M, Çayören M, Tunaoğlu D, Kurtoğlu İ, Yaslan Y, Çayören H, Arkan AE, Teksöz S, Cancan G, Kepil N, Erdamar S, Özcan M, Akduman İ, Kalkan T. Machine learning aided diagnosis of hepatic malignancies through in vivo dielectric measurements with microwaves. Phys Med Biol 2016; 61:5089-5102. [PMID: 27321132 DOI: 10.1088/0031-9155/61/13/5089] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 02/07/2023]
Abstract
In the past decade, extensive research on dielectric properties of biological tissues led to characterization of dielectric property discrepancy between the malignant and healthy tissues. Such discrepancy enabled the development of microwave therapeutic and diagnostic technologies. Traditionally, dielectric property measurements of biological tissues are performed with the well-known contact probe (open-ended coaxial probe) technique. However, the technique suffers from limited accuracy and low loss resolution for permittivity and conductivity measurements, respectively. Therefore, despite the inherent dielectric property discrepancy, a rigorous measurement routine with open-ended coaxial probes is required for accurate differentiation of malignant and healthy tissues. In this paper, we propose to eliminate the need for multiple measurements with open-ended coaxial probe for malignant and healthy tissue differentiation by applying support vector machine (SVM) classification algorithm to the dielectric measurement data. To do so, first, in vivo malignant and healthy rat liver tissue dielectric property measurements are collected with open-ended coaxial probe technique between 500 MHz and 6 GHz. Cole-Cole functions are fitted to the measured dielectric properties and measurement data are verified against the literature. Malignant tissue classification is realized by applying SVM to the open-ended coaxial probe measurements, where an accuracy (F1 score) as high as 99.2% is obtained.
Affiliation(s)
- Tuba Yilmaz
- Department of Electronics and Communication Engineering, Istanbul Technical University, Istanbul, Turkey. MITOS Medical Technologies A.S, Istanbul, Turkey
| |
|
39
|
Luo G. Automatically explaining machine learning prediction results: a demonstration on type 2 diabetes risk prediction. Health Inf Sci Syst 2016; 4:2. [PMID: 26958341 PMCID: PMC4782293 DOI: 10.1186/s13755-016-0015-4] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 01/06/2016] [Accepted: 03/01/2016] [Indexed: 11/10/2022] Open
Abstract
Background: Predictive modeling is a key component of solutions to many healthcare problems. Among all predictive modeling approaches, machine learning methods often achieve the highest prediction accuracy, but suffer from a long-standing open problem precluding their widespread use in healthcare. Most machine learning models give no explanation for their prediction results, whereas interpretability is essential for a predictive model to be adopted in typical healthcare settings. Methods: This paper presents the first complete method for automatically explaining results for any machine learning predictive model without degrading accuracy. We did a computer coding implementation of the method. Using the electronic medical record data set from the Practice Fusion diabetes classification competition containing patient records from all 50 states in the United States, we demonstrated the method on predicting type 2 diabetes diagnosis within the next year. Results: For the champion machine learning model of the competition, our method explained prediction results for 87.4% of patients who were correctly predicted by the model to have type 2 diabetes diagnosis within the next year. Conclusions: Our demonstration showed the feasibility of automatically explaining results for any machine learning predictive model without degrading accuracy.
Affiliation(s)
- Gang Luo
- Department of Biomedical Informatics, University of Utah, Suite 140, 421 Wakara Way, Salt Lake City, UT 84108 USA
| |
|
40
|
Shar PA, Tao W, Gao S, Huang C, Li B, Zhang W, Shahen M, Zheng C, Bai Y, Wang Y. Pred-binding: large-scale protein-ligand binding affinity prediction. J Enzyme Inhib Med Chem 2016; 31:1443-50. [PMID: 26888050 DOI: 10.3109/14756366.2016.1144594] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 01/17/2023] Open
Abstract
Drug target interactions (DTIs) are crucial in pharmacology and drug discovery. Presently, experimental determination of compound-protein interactions remains challenging because of funding investment and difficulties of purifying proteins. In this study, we proposed two in silico models based on support vector machine (SVM) and random forest (RF), using 1589 molecular descriptors and 1080 protein descriptors in 9948 ligand-protein pairs to predict DTIs that were quantified by Ki values. Cross-validation coefficients of determination of 0.6079 for SVM and 0.6267 for RF were obtained. In addition, the two-dimensional (2D) autocorrelation, topological charge indices and three-dimensional (3D)-MoRSE descriptors of compounds, the autocorrelation descriptors and the amphiphilic pseudo-amino acid composition of protein are found most important for Ki predictions. These models provide a new opportunity for the prediction of ligand-receptor interactions that will facilitate the target discovery and toxicity evaluation in drug development.
Affiliation(s)
- Piar Ali Shar
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Weiyang Tao
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Shuo Gao
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Chao Huang
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Bohui Li
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Wenjuan Zhang
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Mohamed Shahen
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Chunli Zheng
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Yaofei Bai
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| | - Yonghua Wang
- Bioinformatics Center, College of Life Sciences, Northwest A & F University, Yangling, Shaanxi, China
| |
|