1
Marateb HR, Mansourian M, Koochekian A, Shirzadi M, Zamani S, Mansourian M, Mañanas MA, Kelishadi R. Prevention of Cardiometabolic Syndrome in Children and Adolescents Using Machine Learning and Noninvasive Factors: The CASPIAN-V Study. INFORMATION 2024; 15:564. [DOI: 10.3390/info15090564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2024] Open
Abstract
Cardiometabolic syndrome (CMS) is a growing concern in children and adolescents, marked by obesity, hypertension, insulin resistance, and dyslipidemia. This study aimed to predict CMS using machine learning based on data from the CASPIAN-V study, which involved 14,226 participants aged 7–18 years, with a CMS prevalence of 82.9%. We applied the XGBoost algorithm, evaluated with five-fold cross-validation, to key noninvasive variables: self-rated health, sunlight exposure, screen time, consanguinity, healthy and unhealthy dietary habits, discretionary salt and sugar consumption, birthweight, birth order, paternal and maternal education, oral hygiene behavior, and family history of dyslipidemia, obesity, hypertension, and diabetes. The model achieved high sensitivity (94.7% ± 4.8) and specificity (78.8% ± 13.7), with an area under the ROC curve (AUC) of 0.867 ± 0.087, indicating strong predictive performance, and it significantly outperformed the triponderal mass index (TMI) (adjusted paired t-test; p < 0.05). The most critical selected modifiable factors were sunlight exposure, screen time, consanguinity, healthy and unhealthy diet, dietary fat type, and discretionary salt consumption. This study emphasizes the clinical importance of early identification of at-risk individuals so that timely interventions can be implemented, and it offers a promising tool for CMS risk screening. These findings support using predictive analytics in clinical settings to address the rising CMS epidemic in children and adolescents.
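The evaluation protocol named in this abstract (five-fold cross-validation, with sensitivity and specificity reported as mean ± standard deviation across folds) can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the "model" is a toy midpoint-threshold rule on a single feature rather than XGBoost, and the helper names `stratified_folds`, `fit_threshold`, and `sens_spec` are hypothetical and not taken from the study.

```python
import random
import statistics

def stratified_folds(labels, k, seed=0):
    """Split indices into k folds with similar class proportions in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal indices round-robin
    return folds

def fit_threshold(x, y):
    """Toy stand-in for a classifier: midpoint between the two class means."""
    mean_pos = statistics.mean(v for v, t in zip(x, y) if t == 1)
    mean_neg = statistics.mean(v for v, t in zip(x, y) if t == 0)
    return (mean_pos + mean_neg) / 2

def sens_spec(x, y, threshold):
    """Sensitivity and specificity when 'x >= threshold' predicts the positive class."""
    tp = sum(1 for v, t in zip(x, y) if t == 1 and v >= threshold)
    fn = sum(1 for v, t in zip(x, y) if t == 1 and v < threshold)
    tn = sum(1 for v, t in zip(x, y) if t == 0 and v < threshold)
    fp = sum(1 for v, t in zip(x, y) if t == 0 and v >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Synthetic, imbalanced data (assumption: about 80% positives, one noisy feature).
rng = random.Random(42)
y = [1 if rng.random() < 0.8 else 0 for _ in range(500)]
x = [rng.gauss(1.5 if t == 1 else 0.0, 1.0) for t in y]

k = 5
folds = stratified_folds(y, k)
sens_per_fold, spec_per_fold = [], []
for held_out in folds:
    held = set(held_out)
    train = [i for i in range(len(y)) if i not in held]
    # Fit on the k-1 training folds only, then score the held-out fold.
    thr = fit_threshold([x[i] for i in train], [y[i] for i in train])
    se, sp = sens_spec([x[i] for i in held_out], [y[i] for i in held_out], thr)
    sens_per_fold.append(se)
    spec_per_fold.append(sp)

print(f"sensitivity: {statistics.mean(sens_per_fold):.3f} ± {statistics.stdev(sens_per_fold):.3f}")
print(f"specificity: {statistics.mean(spec_per_fold):.3f} ± {statistics.stdev(spec_per_fold):.3f}")
```

Stratifying the folds matters when one class dominates, as with the prevalence reported here, because an unstratified split can leave a fold with very few cases of the minority class and inflate the spread of the specificity estimate.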
Affiliation(s)
- Hamid Reza Marateb
  - Biomedical Engineering Research Centre (CREB), Automatic Control Department (ESAII), Universitat Politècnica de Catalunya-Barcelona Tech (UPC), 08028 Barcelona, Spain
- Mahsa Mansourian
  - Department of Medical Physics, School of Medicine, Isfahan University of Medical Sciences, Isfahan 81746-73461, Iran
- Amirhossein Koochekian
  - Child Growth and Development Research Center, Research Institute for Primordial Prevention of Non-Communicable Disease, Isfahan University of Medical Sciences, Isfahan 81746-73461, Iran
- Mehdi Shirzadi
  - Biomedical Engineering Research Centre (CREB), Automatic Control Department (ESAII), Universitat Politècnica de Catalunya-Barcelona Tech (UPC), 08028 Barcelona, Spain
- Shadi Zamani
  - Biomedical Engineering Department, Engineering Faculty, University of Isfahan, Isfahan 81746-73441, Iran
- Marjan Mansourian
  - Biomedical Engineering Research Centre (CREB), Automatic Control Department (ESAII), Universitat Politècnica de Catalunya-Barcelona Tech (UPC), 08028 Barcelona, Spain
- Miquel Angel Mañanas
  - Biomedical Engineering Research Centre (CREB), Automatic Control Department (ESAII), Universitat Politècnica de Catalunya-Barcelona Tech (UPC), 08028 Barcelona, Spain
  - Biomedical Research Networking Center in Bioengineering, Biomaterials, and Nanomedicine (CIBER-BBN), 28029 Madrid, Spain
- Roya Kelishadi
  - Child Growth and Development Research Center, Research Institute for Primordial Prevention of Non-Communicable Disease, Isfahan University of Medical Sciences, Isfahan 81746-73461, Iran
2
Naderian S, Nikniaz Z, Farhangi MA, Nikniaz L, Sama-Soltani T, Rostami P. Predicting dyslipidemia incidence: unleashing machine learning algorithms on Lifestyle Promotion Project data. BMC Public Health 2024; 24:1777. [PMID: 38961394 PMCID: PMC11223414 DOI: 10.1186/s12889-024-19261-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 06/25/2024] [Indexed: 07/05/2024] Open
Abstract
BACKGROUND Dyslipidemia, characterized by variations in plasma lipid profiles, poses a global health threat linked to millions of deaths annually. OBJECTIVES This study focuses on predicting dyslipidemia incidence using machine learning methods, addressing the crucial need for early identification and intervention. METHODS The dataset, derived from the Lifestyle Promotion Project (LPP) in East Azerbaijan Province, Iran, underwent comprehensive preprocessing, merging, and null handling. Five distinct dyslipidemia-related variables were selected as targets. Normalization techniques and three feature selection algorithms were applied to enhance predictive modeling. RESULT The results underscore the potential of machine learning algorithms, specifically the multi-layer perceptron (MLP) neural network, which reached higher accuracy, F1 score, sensitivity, and specificity than the other methods. Random Forest also showed notable accuracy and outperformed K-Nearest Neighbors (KNN) in precision, recall, and F1 score. Feature selection detected meaningful patterns across the five dyslipidemia-related target variables, indicating fundamental commonalities among dyslipidemia-related factors. Features such as waist circumference, serum vitamin D, blood pressure, sex, age, diabetes, and physical activity were related to dyslipidemia. CONCLUSION These results collectively highlight the complex nature of dyslipidemia and its connections with numerous factors, reinforcing the importance of applying machine learning methods to understand and predict its incidence precisely.
Affiliation(s)
- Senobar Naderian
  - Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
  - Student Research Committee, Tabriz University of Medical Sciences, Tabriz, Iran
- Zeinab Nikniaz
  - Liver and Gastrointestinal Diseases Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Leila Nikniaz
  - Tabriz Health Services Management Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Taha Sama-Soltani
  - Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
- Parisa Rostami
  - Student Research Committee, Tabriz University of Medical Sciences, Tabriz, Iran
3
Sahoo H, Dhillon P, Anand E, Srivastava A, Usman M, Agrawal PK, Johnston R, Unisa S. Status and correlates of non-communicable diseases among children and adolescents in slum and non-slum areas of India's four metropolitan cities. J Biosoc Sci 2023; 55:1064-1085. [PMID: 36698328 DOI: 10.1017/s0021932022000530] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/27/2023]
Abstract
The emergence of non-communicable diseases (NCDs) in childhood poses a serious risk to a healthy adult life. The present study aimed to estimate the prevalence of NCDs among children and adolescents in slum and non-slum areas of four metropolitan cities of India, and in rural areas of the respective states. The study further investigated the effect of place of residence (slum vs. non-slum) and other risk factors on NCDs. Nationally representative data from the Comprehensive National Nutrition Survey (CNNS) were used. Estimates were based on children (5-9 years) and adolescents (10-19 years) for whom biomarkers predicting diabetes, high total cholesterol, high triglycerides and hypertension were determined. Weight, height and age data were used to calculate z-scores of the body mass index. Overweight and obesity were more prevalent in urban areas than in rural areas among children and adolescents. Regional differences in the prevalence of diseases were observed; children in Delhi and Chennai had a higher likelihood of being diabetic, while children in Kolkata were at a greater risk of high total cholesterol and high triglycerides. The risk of hypertension was strikingly high among non-slum children in Delhi. Children from slums were at a higher risk of diabetes than children from non-slums, while children and adolescents from non-slums were at a greater risk of high triglycerides and hypertension, respectively, than their counterparts from slums. Male children and adolescents had a higher risk of diabetes and high cholesterol. Screening of children for early detection of NCDs should be integrated with the existing child and adolescent development schemes in schools and the community to help prevent and control NCDs in childhood.
Affiliation(s)
- Harihar Sahoo
  - Department of Family and Generations, International Institute for Population Sciences (IIPS), Mumbai, India
- Preeti Dhillon
  - Department of Survey Research and Data Analytics, IIPS, Mumbai, India
- Enu Anand
  - Doctoral Fellow, IIPS, Mumbai, India
- Sayeed Unisa
  - Department of Biostatistics and Epidemiology, IIPS, Mumbai, India
4
Hai Y, Zhao W, Meng Q, Liu L, Wen Y. Bayesian linear mixed model with multiple random effects for family-based genetic studies. Front Genet 2023; 14:1267704. [PMID: 37928242 PMCID: PMC10620972 DOI: 10.3389/fgene.2023.1267704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 09/25/2023] [Indexed: 11/07/2023] Open
Abstract
Motivation: Family-based study design is one of the popular designs used in genetic research, and the whole-genome sequencing data obtained from family-based studies offer many unique features for risk prediction studies. They can not only provide a more comprehensive view of many complex diseases, but also utilize information in the design to further improve the prediction accuracy. While promising, existing analytical methods often ignore the information embedded in the study design and overlook the predictive effects of rare variants, leading to a prediction model with sub-optimal performance. Results: We proposed a Bayesian linear mixed model for the prediction analysis of sequencing data obtained from family-based studies. Our method can not only capture predictive effects from both common and rare variants, but also easily accommodate various disease model assumptions. It uses information embedded in the study design to form surrogates, where the predictive effects from unmeasured/unknown genetic and environmental risk factors can be modelled. Through extensive simulation studies and the analysis of sequencing data obtained from the Michigan State University Twin Registry study, we have demonstrated that the proposed method outperforms commonly adopted techniques. Availability: R package is available at https://github.com/yhai943/FBLMM.
Affiliation(s)
- Yang Hai
  - Department of Statistics, University of Auckland, Auckland, New Zealand
- Wenxuan Zhao
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Qingyu Meng
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Long Liu
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Yalu Wen
  - Department of Statistics, University of Auckland, Auckland, New Zealand
5
Moradifar P, Amiri MM. Prediction of hypercholesterolemia using machine learning techniques. J Diabetes Metab Disord 2023; 22:255-265. [PMID: 37255802 PMCID: PMC10225453 DOI: 10.1007/s40200-022-01125-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2022] [Revised: 08/15/2022] [Accepted: 09/06/2022] [Indexed: 06/01/2023]
Abstract
Purpose Hypercholesterolemia is a major risk factor for a wide range of cardiovascular diseases. Developing countries are more susceptible to hypercholesterolemia and its complications due to the increasing prevalence and the lack of adequate resources for conducting screening and/or prevention programs. Using machine learning techniques to identify factors contributing to hypercholesterolemia and developing predictive models can help early detection of hypercholesterolemia, especially in developing countries. Methods Data from the nationwide 2016 STEPs study in Iran were used to identify socioeconomic, lifestyle, and metabolic risk factors associated with hypercholesterolemia. Furthermore, the predictive power of the identified risk factors was assessed using five commonly used machine learning algorithms (random forest; gradient boosting; support vector machine; logistic regression; artificial neural network) and 10-fold cross-validation in terms of specificity, sensitivity, and the area under the receiver operating characteristic curve (AUC). Results A total of 14,667 individuals were included in this study, of whom 12.8% (n = 1879) had (undiagnosed) hypercholesterolemia. Based on multivariate logistic regression analysis, the five most important risk factors for hypercholesterolemia were: older age (for the elderly group: OR = 2.243; for the middle-aged group: OR = 1.869), obesity-related factors including high BMI status (morbidly obese: OR = 1.884; obese: OR = 1.499; overweight: OR = 1.426) and abdominal obesity (AO) (OR = 1.339), raised BP (hypertension: OR = 1.729; prehypertension: OR = 1.577), consuming fish once or twice per week (OR = 1.261), and having a risky diet (OR = 1.163). Furthermore, all five hypercholesterolemia prediction models achieved an AUC of around 0.62, and the models based on random forest (AUC = 0.6282; specificity = 65.14%; sensitivity = 60.51%) and gradient boosting (AUC = 0.6263; specificity = 64.11%; sensitivity = 61.15%) performed best. Conclusion The study shows that socioeconomic inequalities, unhealthy lifestyle, and metabolic syndrome (including obesity and hypertension) are significant predictors of hypercholesterolemia. Therefore, controlling these factors is necessary to reduce the burden of hypercholesterolemia. Furthermore, machine learning algorithms such as random forest and gradient boosting can be employed for hypercholesterolemia screening and its timely diagnosis. Applying deep learning algorithms as well as techniques for handling the class overlap problem seems necessary to improve the performance of the models.
6
Chen XY, Fang L, Zhang J, Zhong JM, Lin JJ, Lu F. The association of body mass index and its interaction with family history of dyslipidemia towards dyslipidemia in patients with type 2 diabetes: a cross-sectional study in Zhejiang Province, China. Front Public Health 2023; 11:1188212. [PMID: 37255759 PMCID: PMC10225544 DOI: 10.3389/fpubh.2023.1188212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 04/17/2023] [Indexed: 06/01/2023] Open
Abstract
Objectives This study aimed to investigate the association between body mass index (BMI) and dyslipidemia and to explore the interaction between BMI and family history of dyslipidemia towards dyslipidemia in patients with type 2 diabetes. Methods This cross-sectional study was conducted between March and November 2018 in Zhejiang Province, China. A total of 1,756 patients with type 2 diabetes were included; physical examination data, fasting blood samples and face-to-face questionnaire survey data were collected. Restricted cubic spline analysis was used to evaluate the association between BMI and the risk of dyslipidemia. Unconditional multivariable logistic regression was used to estimate the interaction between BMI and family history of dyslipidemia towards dyslipidemia. Results The prevalence of dyslipidemia was 53.7% in the study population. The risk of dyslipidemia rose with increasing BMI (p for non-linearity <0.05). After adjusting for covariates, individuals with high BMI (≥24 kg/m2) and a family history of dyslipidemia had a 4.50-fold (95% CI: 2.99-6.78) increased risk of dyslipidemia compared to the normal reference group, which was higher than the risk associated with high BMI alone (OR = 1.83, 95% CI: 1.47-2.28) or family history of dyslipidemia alone (OR = 1.79, 95% CI: 1.14-2.83). Significant additive interaction between high BMI and a family history of dyslipidemia was detected, with RERI, AP, and SI values of 1.88 (95% CI: 0.17-4.10), 0.42 (95% CI: 0.02-0.62), and 2.16 (95% CI: 1.07-4.37), respectively. However, when stratified by status of diabetes control, this additive interaction was only found to be significant among patients with controlled diabetes. Conclusion Both high BMI and a family history of dyslipidemia were associated with a high risk of dyslipidemia. Moreover, there was a synergistic interaction between these two factors. Patients with type 2 diabetes who had a family history of dyslipidemia were more susceptible to the negative impact of overweight or obesity on dyslipidemia.
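The three additive-interaction measures quoted in this abstract can be recomputed from its odds ratios. The sketch below assumes the standard definitions (relative excess risk due to interaction, attributable proportion, and synergy index, with odds ratios standing in for relative risks); the abstract itself does not spell out the formulas, and the function name `additive_interaction` is hypothetical. Only the point estimates are reproduced; the confidence intervals would need a variance method such as the delta method or bootstrapping, which is not shown.

```python
def additive_interaction(or_both, or_a_only, or_b_only):
    """Additive-interaction measures from three odds ratios.

    or_both   : OR for exposure to both factors vs. neither
    or_a_only : OR for factor A alone vs. neither
    or_b_only : OR for factor B alone vs. neither
    """
    # Relative excess risk due to interaction: 0 means no additive interaction.
    reri = or_both - or_a_only - or_b_only + 1
    # Attributable proportion: share of the joint effect due to interaction.
    ap = reri / or_both
    # Synergy index: 1 means no additive interaction.
    si = (or_both - 1) / ((or_a_only - 1) + (or_b_only - 1))
    return reri, ap, si

# Odds ratios as reported in the abstract above:
# high BMI + family history = 4.50, high BMI alone = 1.83, family history alone = 1.79.
reri, ap, si = additive_interaction(4.50, 1.83, 1.79)
print(f"RERI = {reri:.2f}, AP = {ap:.2f}, SI = {si:.2f}")
# prints: RERI = 1.88, AP = 0.42, SI = 2.16
```

The recomputed values agree with the reported point estimates, which is consistent with the authors having used these standard definitions.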
7
Ahsan MM, Siddique Z. Machine learning-based heart disease diagnosis: A systematic literature review. Artif Intell Med 2022; 128:102289. [DOI: 10.1016/j.artmed.2022.102289] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/13/2021] [Accepted: 03/22/2022] [Indexed: 01/01/2023]
8
Abedi M, Marateb HR, Mohebian MR, Aghaee-Bakhtiari SH, Nassiri SM, Gheisari Y. Systems biology and machine learning approaches identify drug targets in diabetic nephropathy. Sci Rep 2021; 11:23452. [PMID: 34873190 PMCID: PMC8648918 DOI: 10.1038/s41598-021-02282-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2021] [Accepted: 11/12/2021] [Indexed: 11/15/2022] Open
Abstract
Diabetic nephropathy (DN), the leading cause of end-stage renal disease, has become a massive global health burden. Despite considerable efforts, the underlying mechanisms have not yet been comprehensively understood. In this study, a systematic approach was utilized to identify the microRNA signature in DN and to introduce novel drug targets (DTs) in DN. Using microarray profiling followed by qPCR confirmation, 13 and 6 differentially expressed (DE) microRNAs were identified in the kidney cortex and medulla, respectively. The microRNA-target interaction networks for each anatomical compartment were constructed and central nodes were identified. Moreover, enrichment analysis was performed to identify key signaling pathways. To develop a strategy for DT prediction, the human proteome was annotated with 65 biochemical characteristics and 23 network topology parameters. Furthermore, all proteins targeted by at least one FDA-approved drug were identified. Next, mGMDH-AFS, a high-performance machine learning algorithm capable of tolerating severe class imbalance, was developed to classify DT and non-DT proteins. The sensitivity, specificity, accuracy, and precision of the proposed method were 90%, 86%, 88%, and 89%, respectively. Moreover, it significantly outperformed the state-of-the-art (P-value ≤ 0.05) and showed very good diagnostic accuracy and high agreement between predicted and observed class labels. The cortex and medulla networks were then analyzed with this validated classifier to identify potential DTs. Among the high-ranking DT candidates are Egfr, Prkce, clic5, Kit, and Agtr1a, the last of which is a currently well-known target in DN. In conclusion, a combination of experimental and computational approaches was exploited to provide a holistic insight into the disorder for introducing novel therapeutic targets.
Affiliation(s)
- Maryam Abedi
  - Regenerative Medicine Research Center, Isfahan University of Medical Sciences, Isfahan, Iran (GRID: grid.411036.1; ISNI: 0000 0001 1498 685X)
- Hamid Reza Marateb
  - Biomedical Engineering Department, Engineering Faculty, University of Isfahan, Isfahan, Iran (GRID: grid.411750.6; ISNI: 0000 0001 0454 365X)
  - Department of Automatic Control, Biomedical Engineering Research Center, Universitat Politècnica de Catalunya, BarcelonaTech (UPC), Barcelona, Spain (GRID: grid.6835.8; ISNI: 0000 0004 1937 028X)
- Mohammad Reza Mohebian
  - Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatoon, Canada (GRID: grid.25152.31; ISNI: 0000 0001 2154 235X)
- Seyed Hamid Aghaee-Bakhtiari
  - Bioinformatics Research Group, Mashhad University of Medical Sciences, Mashhad, Iran (GRID: grid.411583.a; ISNI: 0000 0001 2198 6209)
  - Department of Medical Biotechnology and Nanotechnology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran (GRID: grid.411583.a; ISNI: 0000 0001 2198 6209)
- Seyed Mahdi Nassiri
  - Department of Clinical Pathology, Faculty of Veterinary Medicine, University of Tehran, Tehran, Iran (GRID: grid.46072.37; ISNI: 0000 0004 0612 7950)
- Yousof Gheisari
  - Regenerative Medicine Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
  - Department of Genetics and Molecular Biology, Isfahan University of Medical Sciences, Isfahan, Iran
9
Marateb HR, Ziaie Nezhad F, Mohebian MR, Sami R, Haghjooy Javanmard S, Dehghan Niri F, Akafzadeh-Savari M, Mansourian M, Mañanas MA, Wolkewitz M, Binder H. Automatic Classification Between COVID-19 and Non-COVID-19 Pneumonia Using Symptoms, Comorbidities, and Laboratory Findings: The Khorshid COVID Cohort Study. Front Med (Lausanne) 2021; 8:768467. [PMID: 34869483 PMCID: PMC8640954 DOI: 10.3389/fmed.2021.768467] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/31/2021] [Accepted: 10/06/2021] [Indexed: 01/08/2023] Open
Abstract
Coronavirus disease-2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was a disaster in 2020. Accurate and early diagnosis of COVID-19 is still essential for health policymaking. Reverse transcriptase-polymerase chain reaction (RT-PCR) has been performed as the operational gold standard for COVID-19 diagnosis. We aimed to design and implement a reliable COVID-19 diagnosis method to provide the risk of infection using demographics, symptoms and signs, blood markers, and family history of diseases to have excellent agreement with the results obtained by the RT-PCR and CT-scan. Our study primarily used sample data from a 1-year hospital-based prospective COVID-19 open-cohort, the Khorshid COVID Cohort (KCC) study. A sample of 634 patients with COVID-19 and 118 patients with pneumonia with similar characteristics whose RT-PCR and chest CT scan were negative (as the control group) (dataset 1) was used to design the system and for internal validation. Two other online datasets, namely, some symptoms (dataset 2) and blood tests (dataset 3), were also analyzed. A combination of one-hot encoding, stability feature selection, over-sampling, and an ensemble classifier was used. Ten-fold stratified cross-validation was performed. In addition to gender and symptom duration, signs and symptoms, blood biomarkers, and comorbidities were selected. Performance indices of the cross-validated confusion matrix for dataset 1 were as follows: sensitivity of 96% [95% confidence interval (CI): 94-98], specificity of 95% [90-99], positive predictive value (PPV) of 99% [98-100], negative predictive value (NPV) of 82% [76-89], diagnostic odds ratio (DOR) of 496 [198-1,245], area under the ROC curve (AUC) of 0.96 [0.94-0.97], Matthews Correlation Coefficient (MCC) of 0.87 [0.85-0.88], accuracy of 96% [94-98], and Cohen's Kappa of 0.86 [0.81-0.91]. The proposed algorithm showed excellent diagnostic accuracy and class-labeling agreement, and fair discriminant power. The AUC on datasets 2 and 3 was 0.97 [0.96-0.98] and 0.92 [0.91-0.94], respectively. The most important features were white blood cell count, shortness of breath, and C-reactive protein for datasets 1, 2, and 3, respectively. The proposed algorithm is, thus, a promising COVID-19 diagnosis method, which could be an amendment to simple blood tests and screening of symptoms. However, the RT-PCR and chest CT-scan, performed as the gold standard, are not 100% accurate.
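Most of the performance indices listed in this abstract are simple functions of a 2x2 confusion matrix. The sketch below uses the standard textbook definitions; the function name `diagnostic_metrics` is hypothetical, and the four counts are illustrative assumptions chosen only so that the row totals match the reported group sizes (634 COVID-19 patients, 118 controls) and the results land near the reported figures. They are not the study's actual cross-validated confusion matrix.

```python
import math

def diagnostic_metrics(tp, fn, tn, fp):
    """Standard indices of a 2x2 confusion matrix (positive class = disease)."""
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the predicted and observed totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        # Diagnostic odds ratio: odds of a positive test in cases vs. controls.
        "dor": (tp * tn) / (fp * fn),
        # Matthews correlation coefficient between predicted and observed labels.
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        # Cohen's kappa: agreement beyond chance.
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }

# Hypothetical counts (assumption, not from the paper): 634 cases, 118 controls.
metrics = diagnostic_metrics(tp=611, fn=23, tn=112, fp=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a large majority of positive cases, PPV stays high even with a few false positives while NPV is more sensitive to false negatives, which is the pattern the reported 99% PPV and 82% NPV show.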
Affiliation(s)
- Hamid Reza Marateb
  - The Biomedical Engineering Department, Engineering Faculty, University of Isfahan, Isfahan, Iran
- Farzad Ziaie Nezhad
  - The Biomedical Engineering Department, Engineering Faculty, University of Isfahan, Isfahan, Iran
- Mohammad Reza Mohebian
  - Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Ramin Sami
  - Department of Internal Medicine, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Shaghayegh Haghjooy Javanmard
  - Department of Physiology, Applied Physiology Research Center, School of Medicine, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Mahsa Akafzadeh-Savari
  - Isfahan Clinical Toxicology Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Marjan Mansourian
  - Automatic Control Department (ESAII), Biomedical Engineering Research Centre (CREB), Universitat Politècnica de Catalunya-Barcelona Tech (UPC), Barcelona, Spain
  - Department of Epidemiology and Biostatistics, School of Health, Isfahan University of Medical Sciences, Isfahan, Iran
- Miquel Angel Mañanas
  - Automatic Control Department (ESAII), Biomedical Engineering Research Centre (CREB), Universitat Politècnica de Catalunya-Barcelona Tech (UPC), Barcelona, Spain
  - Biomedical Research Networking Center in Bioengineering, Biomaterials, and Nanomedicine (CIBER-BBN), Madrid, Spain
- Martin Wolkewitz
  - Faculty of Medicine and Medical Center, Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
- Harald Binder
  - Faculty of Medicine and Medical Center, Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
10
Niu M, Zhang L, Wang Y, Tu R, Liu X, Hou J, Huo W, Mao Z, Wang Z, Wang C. Genetic factors increase the identification efficiency of predictive models for dyslipidaemia: a prospective cohort study. Lipids Health Dis 2021; 20:11. [PMID: 33579296 PMCID: PMC7881493 DOI: 10.1186/s12944-021-01439-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/15/2020] [Accepted: 01/27/2021] [Indexed: 11/10/2022] Open
Abstract
Background Few studies have developed risk models for dyslipidaemia, especially for rural populations. Furthermore, the performance of genetic factors in predicting dyslipidaemia has not been explored. The purpose of this study is to develop and evaluate prediction models with and without genetic factors for dyslipidaemia in rural populations. Methods A total of 3596 individuals from the Henan Rural Cohort Study were included in this study. All individuals were divided into a training set and a testing set at a ratio of 7:3. The conventional models and conventional+GRS (genetic risk score) models were developed with Cox regression, artificial neural network (ANN), random forest (RF), and gradient boosting machine (GBM) classifiers in the training set. The area under the receiver operating characteristic curve (AUC), net reclassification index (NRI), and integrated discrimination index (IDI) were used to assess the discrimination ability of the models, and the calibration curve was used to show calibration ability in the testing set. Results Compared to the lowest quartile of GRS, the hazard ratio (HR) (95% confidence interval (CI)) of individuals in the highest quartile of GRS was 1.23 (1.07, 1.41) in the total population. Age, family history of diabetes, physical activity, body mass index (BMI), triglycerides (TGs), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C) were used to develop the conventional models, and the AUCs of the Cox, ANN, RF, and GBM classifiers were 0.702 (0.673, 0.729), 0.736 (0.708, 0.762), 0.787 (0.762, 0.811), and 0.816 (0.792, 0.839), respectively. After adding GRS, the AUCs increased by 0.005, 0.018, 0.023, and 0.015 with the Cox, ANN, RF, and GBM classifiers, respectively. The corresponding NRIs were 25.6, 7.8, 14.1, and 18.1%, and the corresponding IDIs were 2.3, 1.0, 2.5, and 1.8%. Conclusion Genetic factors could improve the predictive ability of the dyslipidaemia risk model, suggesting that genetic information could be provided as a potential predictor to screen for clinical dyslipidaemia. Trial registration The Henan Rural Cohort Study has been registered at the Chinese Clinical Trial Register. (Trial registration: ChiCTR-OOC-15006699. Registered 6 July 2015 - Retrospectively registered). Supplementary Information The online version contains supplementary material available at 10.1186/s12944-021-01439-3.
Affiliation(s)
- Miaomiao Niu
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
- Liying Zhang
  - School of Information Engineering, Zhengzhou University, Zhengzhou, Henan, People's Republic of China
- Yikang Wang
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
- Runqi Tu
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
- Xiaotian Liu
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
- Jian Hou
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
- Wenqian Huo
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
- Zhenxing Mao
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
- Zhenfei Wang
  - School of Information Engineering, Zhengzhou University, Zhengzhou, Henan, People's Republic of China
- Chongjian Wang
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
11
Naghavi A, Teismann T, Asgari Z, Mohebbian MR, Mansourian M, Mañanas MÁ. Accurate Diagnosis of Suicide Ideation/Behavior Using Robust Ensemble Machine Learning: A University Student Population in the Middle East and North Africa (MENA) Region. Diagnostics (Basel) 2020; 10:E956. [PMID: 33207776 PMCID: PMC7696788 DOI: 10.3390/diagnostics10110956] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/12/2020] [Revised: 11/10/2020] [Accepted: 11/13/2020] [Indexed: 12/14/2022] Open
Abstract
Suicide is one of the most critical public health concerns in the world and the second leading cause of death among young people in many countries. However, to date, no study has diagnosed suicide ideation/behavior among university students in the Middle East and North Africa (MENA) region using a machine learning approach. Therefore, stability feature selection and stacked ensemble decision trees were employed in this classification problem. A total of 573 university students responded to a battery of questionnaires. Three-fold cross-validation with a variety of performance indices was used. The proposed diagnostic system had excellent balanced diagnosis accuracy (AUC = 0.90 [CI 95%: 0.86-0.93]) with a high correlation between predicted and observed class labels, fair discriminant power, and excellent class labeling agreement rate. Results showed that 23 of the items could accurately diagnose suicide ideation/behavior. These items were psychological problems and how trauma was experienced (from the demographic variables), nine items from the Post-Traumatic Stress Disorder Checklist (PCL-5), two items from Post Traumatic Growth (PTG), two items from the Patient Health Questionnaire (PHQ), six items from the Positive Mental Health (PMH) questionnaire, and one item related to social support. Such features could be used as a screening tool to identify young adults who are at risk of suicide ideation/behavior.
Affiliation(s)
- Azam Naghavi
- Department of Counseling, Faculty of Education and Psychology, University of Isfahan, Azadi Sq, Isfahan 8174673441, Iran
| | - Tobias Teismann
- Department of Clinical Psychology and Psychotherapy, Ruhr-Universität Bochum, 44787 Bochum, Germany
| | - Zahra Asgari
- Department of Counseling, Faculty of Education and Psychology, University of Isfahan, Isfahan 8174673441, Iran
| | - Mohammad Reza Mohebbian
- Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatoon, SK S7N5A9, Canada
| | - Marjan Mansourian
- Biomedical Engineering Research Centre (CREB), Automatic Control Department (ESAII), Universitat Politècnica de Catalunya-Barcelona Tech (UPC), 08028 Barcelona, Spain;
- Epidemiology and Biostatistics Department, Health School, Isfahan University of Medical Sciences, Isfahan 81746-73461, Iran
| | - Miguel Ángel Mañanas
- Biomedical Engineering Research Centre (CREB), Automatic Control Department (ESAII), Universitat Politècnica de Catalunya-Barcelona Tech (UPC), 08028 Barcelona, Spain;
- Biomedical Research Networking Center in Bioengineering, Biomaterials, and Nanomedicine (CIBER-BBN), 28029 Madrid, Spain
| |
|
12
|
Bogari NM, Aljohani A, Dannoun A, Elkhateeb O, Porqueddu M, Amin AA, Bogari DN, Taher MM, Buba F, Allam RM, Bogari MN, Alamanni F. Association between HindIII (rs320) variant in the lipoprotein lipase gene and the presence of coronary artery disease and stroke among the Saudi population. Saudi J Biol Sci 2020; 27:2018-2024. [PMID: 32714026 PMCID: PMC7376116 DOI: 10.1016/j.sjbs.2020.06.029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/07/2020] [Revised: 06/13/2020] [Accepted: 06/16/2020] [Indexed: 12/14/2022] Open
Abstract
Lipoprotein lipase (LPL) is a key enzyme in lipid metabolism: an enzymatic glycoprotein that provides tissues with fatty acids and clears triglycerides (TG) from the circulation. Mutations in LPL have been shown to alter lipoprotein fractions, promoting the development of atherosclerosis, which predisposes to debilitating coronary artery disease (CAD) and stroke. We examined the association of the HindIII genetic variant in LPL with lipoprotein fractions, stroke occurrence, and CAD. In this case-control study, we recruited 315 CAD cases and 205 age-matched controls. Purified PCR products from a total of 520 genomic DNA samples were digested with the HindIII restriction enzyme for restriction fragment length polymorphism analysis. In decreasing order, the genotype distribution in the CAD group was TT 148 (47%), GT 135 (42.9%), and GG 32 (10.2%), whereas in controls it was GT 91 (44.4%), TT 86 (42%), and GG 28 (13.7%). None of the allele or genotype frequencies differed significantly between the groups (p > 0.05), while the biochemical levels of both TG and LDL-c differed in CAD patients when compared with the controls. Furthermore, strokes occurred more often in the CAD group than in controls: 72 (22.9%) vs. 7 (3.4%) (p < 0.001). This could indicate the influence of the HindIII variant on plasma lipid levels, and the possibility of considering it a risk factor for atherosclerosis leading to CAD and stroke occurrence.
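The genotype counts reported in this abstract are enough to rederive the allele counts and to check the non-significant result. The sketch below is only an illustration: the abstract does not name the test the authors used, so a Pearson chi-square test without continuity correction is an assumption here, and the function names `allele_counts` and `chi_square` are hypothetical helpers, not part of the study.

```python
# Genotype counts (TT, GT, GG) as reported in the abstract.
cad = (148, 135, 32)        # 315 CAD cases
controls = (86, 91, 28)     # 205 controls


def allele_counts(tt, gt, gg):
    """Return (T, G) allele counts: each homozygote carries two copies."""
    return 2 * tt + gt, 2 * gg + gt


def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts.

    Assumption: no continuity correction; the study's exact test is not stated.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Critical values of the chi-square distribution at alpha = 0.05.
CRITICAL_DF1 = 3.841   # 2 x 2 allele table, 1 degree of freedom
CRITICAL_DF2 = 5.991   # 2 x 3 genotype table, 2 degrees of freedom

genotype_chi2 = chi_square([cad, controls])
allele_chi2 = chi_square([allele_counts(*cad), allele_counts(*controls)])

print("genotype chi-square:", round(genotype_chi2, 3),
      "significant:", genotype_chi2 > CRITICAL_DF2)
print("allele chi-square:", round(allele_chi2, 3),
      "significant:", allele_chi2 > CRITICAL_DF1)
```

Under these assumptions both statistics fall below their critical values, which agrees with the reported p > 0.05 for genotype and allele frequencies.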
Affiliation(s)
- Neda M Bogari
- Department of Medical Genetics, Faculty of Medicine, Umm Al-Qura University (UQU), Saudi Arabia
| | - Ashwag Aljohani
- Department of Medical Genetics, Faculty of Medicine, Umm Al-Qura University (UQU), Saudi Arabia
| | - Anas Dannoun
- Department of Medical Genetics, Faculty of Medicine, Umm Al-Qura University (UQU), Saudi Arabia
| | - Osama Elkhateeb
- Department of Cardiology, King Abdulla Medical city, Makkah, Saudi Arabia; Department of Cardiology, Dalhousie University Halifax, Nova Scotia, Canada
| | - Masimo Porqueddu
- Department of Cardiac Surgery, King Fahd Armed Medical Forces Hospitals, Jeddah, KSA, Saudi Arabia; Department of Cardiac Surgery, Monzino Heart Center, University of Milan, Milan, Italy
| | - Amr A Amin
- Biochemistry Department, Faculty of medicine, UQU, Saudi Arabia; Faculty of Medicine, Ain-Shams University, Egypt
| | - Dema N Bogari
- Biomedical Sciences, University of Brighton, England, UK
| | - Mohiuddin M Taher
- Department of Medical Genetics, Faculty of Medicine, Umm Al-Qura University (UQU), Saudi Arabia; Science and technology Unit, UQU, Saudi Arabia
| | - Faruk Buba
- Department of Internal Medicine, College of Medical Sciences, University of Maiduguri, Nigeria
| | - Reem M Allam
- Clinical Pathology Department, Faculty of Medicine, Zagazig University, Egypt
| | | | - Francesco Alamanni
- Department of Cardiac Surgery, Monzino Heart Center, University of Milan, Milan, Italy
| |
|
13
|
Blind, Cuff-less, Calibration-Free and Continuous Blood Pressure Estimation using Optimized Inductive Group Method of Data Handling. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2019.101682] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/28/2022]
|
14
|
Zhang X, Tang F, Ji J, Han W, Lu P. Risk Prediction of Dyslipidemia for Chinese Han Adults Using Random Forest Survival Model. Clin Epidemiol 2019; 11:1047-1055. [PMID: 31849535 PMCID: PMC6911320 DOI: 10.2147/clep.s223694] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/18/2019] [Accepted: 11/29/2019] [Indexed: 12/24/2022] Open
Abstract
OBJECTIVE Dyslipidemia has been recognized as a major risk factor for several diseases, and early prevention and management of dyslipidemia is effective in the primary prevention of cardiovascular events. The present study aims to develop risk models for predicting dyslipidemia using Random Survival Forest (RSF), which takes the complex relationships between the variables into account. METHODS We used data from 6328 participants aged between 19 and 90 years who were free of dyslipidemia at baseline, with a maximum follow-up of 5 years. RSF was applied to develop gender-specific risk models for predicting dyslipidemia using variables from anthropometric measurements and laboratory tests in the cohort. Cox regression was also adopted for comparison with the RSF model, and Harrell's concordance statistic with 10-fold cross-validation was used to validate the models. RESULTS The incidence density of dyslipidemia was 101/1000 in total, and subgroup incidence densities were 121/1000 for men and 69/1000 for women. Twenty-four predictors were identified in the prediction model for males and 23 in that for females. The C-statistics of the prediction models for males and females were 0.731 and 0.801, respectively. The RSF model showed better discriminative performance than the Cox proportional hazards (CPH) model (0.719 for males and 0.787 for females). Moreover, some predictors were observed to have a nonlinear effect on dyslipidemia. CONCLUSION The RSF model is a promising method for identifying high-risk individuals for the prevention of dyslipidemia and related diseases.
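Harrell's concordance statistic, which this abstract uses to compare the RSF and Cox models, can be illustrated with a short sketch. The code below is a minimal version under stated assumptions: the function name `harrell_c` and the toy follow-up data are hypothetical, pairs with tied event times are simply skipped, and nothing here reproduces the study's data or its 10-fold cross-validation.

```python
def harrell_c(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    A pair (i, j) is comparable when subject i had the event (events[i] == 1)
    strictly before subject j's observed time. The pair is concordant when the
    earlier-failing subject has the higher predicted risk; tied risk scores
    count as half. Simplification: pairs with equal times are ignored.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: follow-up in years, 1 = dyslipidemia onset, 0 = censored.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.7, 0.6, 0.4, 0.1]  # higher score = higher predicted risk

print(harrell_c(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-statistics (0.731 and 0.801 for RSF) should be read.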
Affiliation(s)
- Xiaoshuai Zhang
- School of Statistics, Shandong University of Finance and Economics, Jinan, People’s Republic of China
| | - Fang Tang
- Center for Data Science in Health and Medicine, Shandong Provincial Qianfoshan Hospital, The First Hospital Affiliated with Shandong First Medical University, Jinan, People’s Republic of China
| | - Jiadong Ji
- School of Statistics, Shandong University of Finance and Economics, Jinan, People’s Republic of China
| | - Wenting Han
- Department of Preventive Medicine, School of Public Health and Management, Binzhou Medical University, Yantai, People’s Republic of China
| | - Peng Lu
- Department of Preventive Medicine, School of Public Health and Management, Binzhou Medical University, Yantai, People’s Republic of China
| |
|