1
Sadegh-Zadeh SA, Soleimani Mamalo A, Kavianpour K, Atashbar H, Heidari E, Hajizadeh R, Roshani AS, Habibzadeh S, Saadat S, Behmanesh M, Saadat M, Gargari SS. Artificial intelligence approaches for tinnitus diagnosis: leveraging high-frequency audiometry data for enhanced clinical predictions. Front Artif Intell 2024; 7:1381455. [PMID: 38774833 PMCID: PMC11106786 DOI: 10.3389/frai.2024.1381455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 04/22/2024] [Indexed: 05/24/2024] Open
Abstract
This research investigates the application of machine learning to improve the diagnosis of tinnitus using high-frequency audiometry data. A Logistic Regression (LR) model was developed alongside an Artificial Neural Network (ANN) and various baseline classifiers to identify the most effective approach for classifying tinnitus presence. The methodology encompassed data preprocessing, feature extraction focused on point detection, and rigorous model evaluation through performance metrics including accuracy, Area Under the ROC Curve (AUC), precision, recall, and F1 scores. The main findings reveal that the LR model, supported by the ANN, significantly outperformed other machine learning models, achieving an accuracy of 94.06%, an AUC of 97.06%, and high precision and recall scores. These results demonstrate the efficacy of the LR model and ANN in accurately diagnosing tinnitus, surpassing traditional diagnostic methods that rely on subjective assessments. The implications of this research are substantial for clinical audiology, suggesting that machine learning, particularly advanced models like ANNs, can provide a more objective and quantifiable tool for tinnitus diagnosis, especially when utilizing high-frequency audiometry data not typically assessed in standard hearing tests. The study underscores the potential for machine learning to facilitate earlier and more accurate tinnitus detection, which could lead to improved patient outcomes. Future work should aim to expand the dataset diversity, explore a broader range of algorithms, and conduct clinical trials to validate the models' practical utility. The research highlights the transformative potential of machine learning, including the LR model and ANN, in audiology, paving the way for advancements in the diagnosis and treatment of tinnitus.
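The evaluation metrics named in this abstract can be illustrated with a minimal sketch. The labels below are hypothetical (1 = tinnitus present) and the function name is an assumption for illustration; this is not the authors' evaluation code.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Precision and recall are reported alongside accuracy because accuracy alone can look high when one class dominates the sample.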
Affiliation(s)
- Seyed-Ali Sadegh-Zadeh
  - Department of Computing, School of Digital, Technologies and Arts, Staffordshire University, Stoke-on-Trent, United Kingdom
- Kaveh Kavianpour
  - Department of Computer Science and Mathematics, Amirkabir University of Technology, Tehran, Iran
- Hamed Atashbar
  - Department of Computer Science and Mathematics, Amirkabir University of Technology, Tehran, Iran
- Elham Heidari
  - Department of Computer Science and Mathematics, Amirkabir University of Technology, Tehran, Iran
- Reza Hajizadeh
  - Department of Cardiology, School of Medicine, Urmia University of Medical Sciences, Urmia, Iran
- Amir Sam Roshani
  - Department of Otorhinolaryngology - Head and Neck Surgery, Imam Khomeini University Hospital, Urmia, Iran
- Shima Habibzadeh
  - Department of Audiology, Tabriz University of Medical Sciences, Tabriz, Iran
- Shayan Saadat
  - Hull York Medical School, University of York, York, United Kingdom
- Majid Behmanesh
  - Student Research Committee, Urmia University of Medical Sciences, Urmia, Iran
- Mozafar Saadat
  - Department of Mechanical Engineering, School of Engineering, University of Birmingham, Birmingham, United Kingdom
2
Mlakar M, Gradišek A, Luštrek M, Jurak G, Sorić M, Leskošek B, Starc G. Adult height prediction using the growth curve comparison method. PLoS One 2023; 18:e0281960. [PMID: 36795791 PMCID: PMC9934345 DOI: 10.1371/journal.pone.0281960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 02/04/2023] [Indexed: 02/17/2023] Open
Abstract
Understanding growth patterns is important for child and adolescent development. Because the tempo of growth and the timing of the adolescent growth spurt differ, individuals reach their adult height at different ages. Accurate models for assessing growth involve invasive radiological methods, whereas predictive models based solely on height data are typically limited to percentiles and are therefore rather inaccurate, especially at the onset of puberty. There is a need for more accurate non-invasive methods for height prediction that are easily applicable in the fields of sports and physical education, as well as in endocrinology. We developed a novel method, called Growth Curve Comparison (GCC), for height prediction, based on a large cohort of > 16,000 Slovenian schoolchildren followed yearly from ages 8 to 18. We compared the GCC method to the percentile method, linear regressor, decision tree regressor, and extreme gradient boosting. The GCC method outperformed the other methods over the entire age span in both boys and girls. The method was incorporated into a publicly available web application. We anticipate that our method will also be applicable to other models predicting developmental outcomes of children and adolescents, such as comparison of any developmental curves of anthropometric as well as fitness data. It can serve as a useful tool for assessment, planning, implementation, and monitoring of somatic and motor development of children and youth.
Affiliation(s)
- Miha Mlakar
  - Department of Intelligent Systems, Jožef Stefan Institute, Ljubljana, Slovenia
- Anton Gradišek
  - Department of Intelligent Systems, Jožef Stefan Institute, Ljubljana, Slovenia
- Mitja Luštrek
  - Department of Intelligent Systems, Jožef Stefan Institute, Ljubljana, Slovenia
- Gregor Jurak
  - Faculty of Sport, University of Ljubljana, Ljubljana, Slovenia
- Maroje Sorić
  - Faculty of Sport, University of Ljubljana, Ljubljana, Slovenia
  - Faculty of Kinesiology, University of Zagreb, Zagreb, Croatia
- Bojan Leskošek
  - Faculty of Sport, University of Ljubljana, Ljubljana, Slovenia
- Gregor Starc
  - Faculty of Sport, University of Ljubljana, Ljubljana, Slovenia
3
Ma Y, Lu Q, Yuan F, Chen H. Comparison of the effectiveness of different machine learning algorithms in predicting new fractures after PKP for osteoporotic vertebral compression fractures. J Orthop Surg Res 2023; 18:62. [PMID: 36683045 PMCID: PMC9869614 DOI: 10.1186/s13018-023-03551-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/16/2022] [Accepted: 01/19/2023] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND Machine learning has the potential to estimate the probability of a second classification event more accurately than traditional statistical methods, and few previous studies on predicting new fractures after osteoporotic vertebral compression fractures (OVCFs) have focussed on this point. The aim of this study was to explore whether several different machine learning models could produce better predictions than logistic regression models and to select an optimal model. METHODS A retrospective analysis of 529 patients who underwent percutaneous kyphoplasty (PKP) for OVCFs at our institution between June 2017 and June 2020 was performed. The patient data were used to create machine learning models (decision trees (DT), random forests (RF), support vector machines (SVM), gradient boosting machines (GBM), neural networks (NNET), and regularized discriminant analysis (RDA)) and logistic regression (LR) models to estimate the probability of new fractures occurring after surgery. The dataset was divided into a training set (75%) and a test set (25%); machine learning models were built on the training set after ten cross-validations, each model was then evaluated on the test set, and model performance was assessed by comparing the area under the curve (AUC) of each model. RESULTS Among the six machine learning algorithms, only DT [0.775 (95% CI 0.728-0.822)] had a lower AUC than LR [0.831 (95% CI 0.783-0.878)]; RF [0.953 (95% CI 0.927-0.980)], GBM [0.941 (95% CI 0.911-0.971)], SVM [0.869 (95% CI 0.827-0.910)], NNET [0.869 (95% CI 0.826-0.912)], and RDA [0.890 (95% CI 0.851-0.929)] all outperformed LR. CONCLUSIONS For prediction of the probability of new fracture after PKP, machine learning algorithms outperformed logistic regression, with random forest having the strongest predictive power.
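The evaluation protocol in this abstract (a 75%/25% split, then ranking models by test-set AUC) can be sketched as follows. The function names, the seed, and the score values are assumptions for illustration; the AUC here is computed with the pairwise (Mann-Whitney) definition rather than with whatever software the study used.

```python
import random


def split_train_test(items, train_fraction=0.75, seed=0):
    """Shuffle a copy of items and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 529 hypothetical patient indices, matching the cohort size in the abstract.
train_ids, test_ids = split_train_test(range(529))

# Hypothetical test-set labels and predicted probabilities from two models.
labels = [0, 0, 1, 1]
model_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.2, 0.3, 0.6, 0.9],
}
ranking = sorted(model_scores,
                 key=lambda name: auc(labels, model_scores[name]),
                 reverse=True)
```

The pairwise definition is quadratic in the sample size, which is acceptable for a test set of roughly 130 patients but would be replaced by a rank-based formula for large datasets.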
Affiliation(s)
- Yiming Ma
  - Department of Orthopaedic Surgery, Affiliated Hospital of Xuzhou Medical University, 99 Huaihai Road, Xuzhou, 221006, Jiangsu, China
  - Xuzhou Medical University, 209 Tongshan Road, Xuzhou, 221004, Jiangsu, China
- Qi Lu
  - Department of Orthopaedic Surgery, Affiliated Hospital of Xuzhou Medical University, 99 Huaihai Road, Xuzhou, 221006, Jiangsu, China
  - Xuzhou Medical University, 209 Tongshan Road, Xuzhou, 221004, Jiangsu, China
- Feng Yuan
  - Department of Orthopaedic Surgery, Affiliated Hospital of Xuzhou Medical University, 99 Huaihai Road, Xuzhou, 221006, Jiangsu, China
- Hongliang Chen
  - Department of Orthopaedic Surgery, Affiliated Hospital of Xuzhou Medical University, 99 Huaihai Road, Xuzhou, 221006, Jiangsu, China
4
Wang H, Zhou Z, Li H, Xiang W, Lan Y, Dou X, Zhang X. Blood Biomarkers Panels for Screening of Colorectal Cancer and Adenoma on a Machine Learning-Assisted Detection Platform. Cancer Control 2023; 30:10732748231222109. [PMID: 38146088 PMCID: PMC10750512 DOI: 10.1177/10732748231222109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 11/01/2023] [Accepted: 11/21/2023] [Indexed: 12/27/2023] Open
Abstract
OBJECTIVE A minimally invasive program with good compliance is critical to broaden colorectal cancer (CRC) screening and reduce CRC-related mortality. Blood testing combined with imaging examination has been shown to be feasible for multicancer screening and for guiding intervention. The study aims to construct a machine learning-assisted detection platform with available multi-targets for CRC and colorectal adenoma (CRA) screening. METHODS This was a retrospective study in which blood test data from 204 patients with CRC, 384 patients with CRA, and 229 healthy controls were extracted. The classification models were constructed with 4 machine learning (ML) algorithms, including support vector machine (SVM), random forest (RF), decision tree (DT), and eXtreme Gradient Boosting (XGB), based on the candidate biomarkers. SHapley Additive exPlanations (SHAP) analysis was used to compute an importance index and identify the dominant characteristics. The performance of the classification models was evaluated. The most dominant features from the proposed panel were used in a logistic regression (LR) model to distinguish CRC from controls. RESULTS The candidate biomarkers consisted of a 26 multi-target panel including CEA, AFP, and so on. Among the 4 models, the SVM classifier for CRA yielded the best predictive performance (area under the receiver operating characteristic curve, AUC: .925, sensitivity: .904, and specificity: .771). For CRC classification, the RF model with 26 candidate biomarkers provided the best predictive parameters (AUC: .941, sensitivity: .902, and specificity: .912). Compared with CEA and CA199, the predictive performance was significantly improved. The streamlined model with 6 biomarkers for CRC also performed well (AUC: .946, sensitivity: .885, and specificity: .913). CONCLUSIONS The predictive models based on the 26 multi-target panel could be used as a non-invasive, economical, and effective risk stratification platform, which is expected to be applied for auxiliary screening of CRA and CRC in clinical practice.
Affiliation(s)
- Hui Wang
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Medical Laboratory of the Third Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Zhiwei Zhou
  - Shenzhen Luohu People's Hospital, The Third Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Haijun Li
  - Shenzhen Luohu People's Hospital, The Third Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Weiguang Xiang
  - Shenzhen Luohu People's Hospital, The Third Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Yilin Lan
  - Shenzhen Luohu People's Hospital, The Third Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Xiaowen Dou
  - Medical Laboratory of the Third Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
- Xiuming Zhang
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Medical Laboratory of the Third Affiliated Hospital of Shenzhen University, Shenzhen, Guangdong, China
5
Bhimala KR, Patra GK, Mopuri R, Mutheneni SR. Prediction of COVID-19 cases using the weather integrated deep learning approach for India. Transbound Emerg Dis 2022. [PMID: 33837675 DOI: 10.1111/tbed.14102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/16/2023]
Abstract
Advanced and accurate forecasting of COVID-19 cases plays a crucial role in planning and supplying resources effectively. Artificial Intelligence (AI) techniques have proved their capability in forecasting non-linear time series problems. In the present study, the relationship between weather factors and COVID-19 cases was assessed, and a forecasting model was developed using long short-term memory (LSTM), a deep learning model. The study found that specific humidity has a strong positive correlation with COVID-19 cases, whereas maximum temperature has a negative correlation and minimum temperature a positive correlation, in various geographic locations of India. The weather data and COVID-19 confirmed case data (1 April to 30 June 2020) were used to optimize univariate and multivariate LSTM time series forecast models. The optimized models were used to forecast the daily COVID-19 cases for the period 1 July 2020 to 31 July 2020 with 1 to 14 days of lead time. The results showed that the univariate LSTM model was reasonably good for the short-term (1 day lead) forecast of COVID-19 cases (relative error <20%). Moreover, the multivariate LSTM model improved the medium-range forecast skill (1-7 days lead) after including the weather factors. The study observed that specific humidity played a crucial role in improving the forecast skill, mainly in the western and northwestern regions of India. Similarly, temperature played a significant role in model enhancement in the southern and eastern regions of India.
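Two ideas in this abstract, forecasting with a lead time and scoring forecasts by relative error, can be sketched without the LSTM itself. The window length, the case counts, and the function names below are hypothetical; the sketch only shows how a univariate series might be arranged into training samples and how a percentage error is computed.

```python
def make_windows(series, n_lags, lead):
    """Pair each window of n_lags past values with the value `lead` days ahead.

    lead=1 means the target is the day immediately after the window.
    """
    samples = []
    for end in range(n_lags, len(series) - lead + 1):
        window = series[end - n_lags:end]   # the n_lags most recent values
        target = series[end + lead - 1]     # value `lead` days after the window
        samples.append((window, target))
    return samples


def relative_error(actual, forecast):
    """Absolute forecast error as a percentage of the observed value."""
    return abs(forecast - actual) * 100 / actual


# Hypothetical daily confirmed-case counts.
daily_cases = [10, 12, 15, 19, 24, 30]
one_day_lead = make_windows(daily_cases, n_lags=3, lead=1)
two_day_lead = make_windows(daily_cases, n_lags=3, lead=2)
error_pct = relative_error(actual=200, forecast=230)
```

A multivariate variant would attach the weather variables for each day to the window; longer lead times leave fewer usable samples from the same series, as the two lists above show.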
Affiliation(s)
- Rajasekhar Mopuri
  - ENVIS Resource Partner on Climate Change and Public Health, Applied Biology Division, CSIR-Indian Institute of Chemical Technology (CSIR-IICT), Hyderabad, Telangana, India
- Srinivasa Rao Mutheneni
  - ENVIS Resource Partner on Climate Change and Public Health, Applied Biology Division, CSIR-Indian Institute of Chemical Technology (CSIR-IICT), Hyderabad, Telangana, India
6
Development of a multi-stage model for intelligent and quantitative appraising of skeletal maturity using cervical vertebras cone-beam CT images of Chinese girls. Int J Comput Assist Radiol Surg 2022; 17:761-773. [PMID: 34982398 DOI: 10.1007/s11548-021-02550-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2021] [Accepted: 12/17/2021] [Indexed: 11/05/2022]
Abstract
PURPOSE The integration of artificial intelligence algorithms with quantified radiographic imaging-based diagnostic procedures is receiving increasing attention, particularly in the assessment of skeletal maturity. We therefore aimed to formulate a logistic regression model for intelligent and quantitative estimation of the Fishman skeletal maturation index (SMI) based on parameters obtained from cervical vertebrae CBCT images of Chinese girls. METHODS From 709 hand-wrist radiographs and CBCT images, 447 samples were randomly selected (designated G1) to build a logistic regression model. Reliability and reproducibility were assessed by the intraclass correlation coefficient (ICC) and weighted Cohen's kappa, followed by Spearman's rank correlation coefficient to identify the parameters significantly associated with the SMI. The other 262 subjects (designated G2) were used for external examination of the models by direct visual comparison and the receiver operating characteristic (ROC) curve. In cases of confusion and mispredictions, the model was modified to improve consistency. RESULTS Five significant parameters (chronological age, C3 height (H3)[Formula: see text], C4 upper width (UW4), C4 lower width (LW4), and the ratio of posterior height to lower width of C4 ([Formula: see text])) were entered into the logistic regression model. Although the total agreement percentage was 84% (total AUC = 0.92), performance was unsatisfactory for the 6th and 8th stages, which were confused with their neighboring stages. After adjustment of the models, the total agreement percentage and AUC improved to 88% and 0.96, respectively. CONCLUSION Consistency and fitness evaluation of our models demonstrated adequate prediction percentage and reliability for automated classification of skeletal maturation. The constructed logistic regression model has the potential to serve as a maturity evaluation index in clinical craniofacial orthopedics in Chinese girls. The proposed model showed promise for being extended to other clinical multi-stage conditions.
7
Patient Factors That Matter in Predicting Hip Arthroplasty Outcomes: A Machine-Learning Approach. J Arthroplasty 2021; 36:2024-2032. [PMID: 33558044 DOI: 10.1016/j.arth.2020.12.038] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 10/29/2020] [Revised: 12/09/2020] [Accepted: 12/22/2020] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND Despite the success of total hip arthroplasty (THA), approximately 10%-15% of patients will be dissatisfied with their outcome. Identifying patients at risk of not achieving meaningful gains postoperatively is critical to pre-surgical counseling and clinical decision support. Machine learning has shown promise in creating predictive models. This study used a machine-learning model to identify patient-specific variables that predict the postoperative functional outcome in THA. METHODS A prospective longitudinal cohort of 160 consecutive patients undergoing total hip replacement for the treatment of degenerative arthritis completed self-reported measures preoperatively and at 3 months postoperatively. Using four types of independent variables (patient demographics, patient-reported health, cognitive appraisal processes and surgical approach), a machine-learning model utilizing the Least Absolute Shrinkage and Selection Operator (LASSO) was constructed to predict postoperative Hip Disability and Osteoarthritis Outcome Score (HOOS) at 3 months. RESULTS The most predictive independent variables of postoperative HOOS were cognitive appraisal processes. Variables that predicted a worse HOOS were frequent thoughts of work (β = -0.34), frequent comparison to healthier peers (β = -0.26), increased body mass index (β = -0.17), increased medical comorbidities (β = -0.19), and the anterior surgical approach (β = -0.15). Variables that predicted a better HOOS were employment at the time of surgery (β = 0.17), thoughts related to family interaction (β = 0.12), trying not to complain (β = 0.13), and helping others (β = 0.22). CONCLUSIONS This clinical prediction model in THA revealed that the factors most predictive of outcome were cognitive appraisal processes, demonstrating their importance to outcome-based research. LEVEL OF EVIDENCE Prognostic Level 1.
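LASSO selects variables by shrinking weak coefficients exactly to zero, which is why the abstract can report a short list of retained predictors. The sketch below shows the mechanism with cyclic coordinate descent on the objective (1/2n)·||y − Xβ||² + λ·||β||₁. The data, the penalty value, and the function names are assumptions for illustration, not the study's model or software.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k]
                                     for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / scale if scale else 0.0
    return beta


# Hypothetical standardized predictors: the outcome depends only on the first.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_coordinate_descent(X, y, lam=0.5)
```

In this toy case the first coefficient is shrunk from its least-squares value of 2 and the irrelevant second coefficient is set to exactly zero, which is the selection behaviour the method's name refers to.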
8
Bhimala KR, Patra GK, Mopuri R, Mutheneni SR. Prediction of COVID-19 cases using the weather integrated deep learning approach for India. Transbound Emerg Dis 2021; 69:1349-1363. [PMID: 33837675 PMCID: PMC8250893 DOI: 10.1111/tbed.14102] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/10/2020] [Revised: 03/31/2021] [Accepted: 04/04/2021] [Indexed: 12/30/2022]
Abstract
Advanced and accurate forecasting of COVID-19 cases plays a crucial role in planning and supplying resources effectively. Artificial Intelligence (AI) techniques have proved their capability in forecasting non-linear time series problems. In the present study, the relationship between weather factors and COVID-19 cases was assessed, and a forecasting model was developed using long short-term memory (LSTM), a deep learning model. The study found that specific humidity has a strong positive correlation with COVID-19 cases, whereas maximum temperature has a negative correlation and minimum temperature a positive correlation, in various geographic locations of India. The weather data and COVID-19 confirmed case data (1 April to 30 June 2020) were used to optimize univariate and multivariate LSTM time series forecast models. The optimized models were used to forecast the daily COVID-19 cases for the period 1 July 2020 to 31 July 2020 with 1 to 14 days of lead time. The results showed that the univariate LSTM model was reasonably good for the short-term (1 day lead) forecast of COVID-19 cases (relative error <20%). Moreover, the multivariate LSTM model improved the medium-range forecast skill (1–7 days lead) after including the weather factors. The study observed that specific humidity played a crucial role in improving the forecast skill, mainly in the western and northwestern regions of India. Similarly, temperature played a significant role in model enhancement in the southern and eastern regions of India.
Affiliation(s)
- Rajasekhar Mopuri
  - ENVIS Resource Partner on Climate Change and Public Health, Applied Biology Division, CSIR-Indian Institute of Chemical Technology (CSIR-IICT), Hyderabad, Telangana, India
- Srinivasa Rao Mutheneni
  - ENVIS Resource Partner on Climate Change and Public Health, Applied Biology Division, CSIR-Indian Institute of Chemical Technology (CSIR-IICT), Hyderabad, Telangana, India
9
Hobbs M, Moltchanova E, Wicks C, Pringle A, Griffiths C, Radley D, Zwolinsky S. Investigating the environmental, behavioural, and sociodemographic determinants of attendance at a city-wide public health physical activity intervention: Longitudinal evidence over one year from 185,245 visits. Prev Med 2021; 143:106334. [PMID: 33227345 DOI: 10.1016/j.ypmed.2020.106334] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/25/2020] [Revised: 09/23/2020] [Accepted: 11/18/2020] [Indexed: 10/23/2022]
Abstract
Understanding the determinants of attendance at public health interventions is critical for effective policy development. Most research focuses on individual-level determinants of attendance, while less is known about environmental-level determinants. Data were obtained from the Leeds Let's Get Active public health intervention in Leeds, England. Longitudinal data (April 2015-March 2016) on attendance were obtained for n = 25,745 individuals (n = 185,245 total visits) with baseline data on sociodemographic determinants and lifestyle practices obtained for n = 3621 individuals. This resulted in a total of n = 744,468 days of attendance and non-attendance. Random forests were used to explore the relative importance of the determinants on attendance, while generalised linear models were applied to examine specific associations (n = 3621). The probability that a person will attend more than once, the number of return visits, and the probability that a person will attend on a particular day were investigated. When considering if a person returned to the same leisure centre after one visit, the most influential determinant was the distance from their home. When considering number of return visits overall however, age group was the most influential. While distance to a leisure centre was less important for predicting the number of return visits, the difference between estimates for 300 m and 15,000 m was 7-10 visits per year. Finally, calendar month was the most important determinant of daily attendance. This longitudinal study highlights the importance of both individual and environmental determinants in predicting various aspects of attendance. It has implications for strategies aiming to increase attendance at public health interventions.
Affiliation(s)
- M Hobbs
  - GeoHealth Laboratory, Geospatial Research Institute, University of Canterbury, Christchurch, Canterbury, New Zealand; Health Sciences, University of Canterbury, Christchurch, Canterbury, New Zealand
- E Moltchanova
  - School of Mathematics & Statistics, University of Canterbury, Christchurch, Canterbury, New Zealand
- C Wicks
  - School of Health and Social Care, University of Essex, Colchester, United Kingdom
- A Pringle
  - Sport, Outdoor & Exercise Sciences, University of Derby, Derby, United Kingdom
- C Griffiths
  - Leeds Beckett University, Leeds, United Kingdom
- D Radley
  - Leeds Beckett University, Leeds, United Kingdom
- S Zwolinsky
  - West Yorkshire & Harrogate Cancer Alliance, Wakefield, United Kingdom
10
Philibert R, Long JD, Mills JA, Beach SRH, Gibbons FX, Gerrard M, Simons R, Pinho PB, Ingle D, Dawes K, Dogan T, Dogan M. A simple, rapid, interpretable, actionable and implementable digital PCR based mortality index. Epigenetics 2020; 16:1135-1149. [PMID: 33138668 DOI: 10.1080/15592294.2020.1841874] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022] Open
Abstract
Mortality assessments are conducted for both civil and commercial purposes. Recent advances in epigenetics have resulted in DNA methylation tools to assess risk and aid in this task. However, widely available array-based algorithms are not readily translatable into clinical tools and do not provide a good foundation for clinical recommendations. Further, recent work shows evidence of heritability and possible racial bias in these indices. Using a publicly available array data set, the Framingham Heart Study (FHS), we develop and test a five-locus mortality-risk algorithm using only previously validated methylation biomarkers that have been shown to be free of racial bias, and that provide specific assessments of smoking, alcohol consumption, diabetes and heart disease. We show that a model using age, sex and methylation measurements at these five loci outperforms the 513 probe Levine index and approximates the predictive power of the 1030 probe GrimAge index. We then show each of the five loci in our algorithm can be assessed using a more powerful, reference-free digital PCR approach, further demonstrating that it is readily clinically translatable. Finally, we show the loci do not reflect ethnically specific variation. We conclude that this algorithm is a simple, yet powerful tool for assessing mortality risk. We further suggest that the output from this or similarly derived algorithms using either array or digital PCR can be used to provide powerful feedback to patients, guide recommendations for additional medical assessments, and help monitor the effect of public health prevention interventions.
Affiliation(s)
- Robert Philibert
  - Department of Psychiatry, University of Iowa, Iowa City, IA, USA; Behavioral Diagnostics LLC, Coralville, IA, USA
- Jeffrey D Long
  - Department of Psychiatry, University of Iowa, Iowa City, IA, USA; Department of Biostatistics, University of Iowa, Iowa City, IA, USA
- James A Mills
  - Department of Psychiatry, University of Iowa, Iowa City, IA, USA
- S R H Beach
  - Center for Family Research, University of Georgia, Athens, GA, USA
- Meg Gerrard
  - Department of Psychology, University of Connecticut, Storrs, CT, USA
- Ron Simons
  - Department of Sociology, University of Georgia, Athens, GA, USA
- Douglas Ingle
  - Association of Home Office Underwriters, Washington, DC, USA
- Kelsey Dawes
  - Department of Psychiatry, University of Iowa, Iowa City, IA, USA
- Timur Dogan
  - Behavioral Diagnostics LLC, Coralville, IA, USA; Cardio Diagnostics Inc, Coralville, IA, USA
- Meeshanthini Dogan
  - Department of Psychiatry, University of Iowa, Iowa City, IA, USA; Behavioral Diagnostics LLC, Coralville, IA, USA; Cardio Diagnostics Inc, Coralville, IA, USA
11
Using OpenStreetMap Data and Machine Learning to Generate Socio-Economic Indicators. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2020. [DOI: 10.3390/ijgi9090498] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/16/2022]
Abstract
Socio-economic indicators are key to understanding societal challenges. They disassemble complex phenomena to gain insights and deepen understanding. Specific subsets of indicators have been developed to describe sustainability, human development, vulnerability, risk, resilience and climate change adaptation. Nonetheless, insufficient quality and availability of data often limit their explanatory power. Spatial and temporal resolution are often not at a scale appropriate for monitoring. Socio-economic indicators are mostly provided by governmental institutions and are therefore limited to administrative boundaries. Furthermore, different methodological computation approaches for the same indicator impair comparability between countries and regions. OpenStreetMap (OSM) provides an unparalleled standardized global database with a high spatiotemporal resolution. Surprisingly, the potential of OSM seems largely unexplored in this context. In this study, we used machine learning to predict four exemplary socio-economic indicators for municipalities based on OSM. By comparing the predictive power of neural networks to statistical regression models, we evaluated the untapped resources of OSM for indicator development. OSM provides prospects for monitoring across administrative boundaries, interdisciplinary topics, and semi-quantitative factors like social cohesion. Further research is still required to, for example, determine the impact of regional and international differences in user contributions on the outputs. Nonetheless, this database can provide meaningful insight into otherwise unknown spatial differences in social, environmental or economic inequalities.
|
12
|
Missing Value Imputation in Stature Estimation by Learning Algorithms Using Anthropometric Data: A Comparative Study. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10145020] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Estimating stature is essential in the process of personal identification. Because human remains are rarely found intact at crime scenes and disaster sites, methods are needed for estimating stature from different body parts. The upper and lower limbs, for instance, may vary depending on ancestry and sex, and it is of great importance to design adequate methodology for incorporating these in estimating stature. In addition, it is necessary to use machine learning rather than simple linear regression to improve the accuracy of stature estimation. In this study, the accuracy of statures estimated based on anthropometric data was compared using three imputation methods. In addition, by comparing the accuracy among linear and nonlinear classification methods, the best method was derived for estimating stature based on anthropometric data. For both sexes, multiple imputation was superior when the missing data ratio was low, and mean imputation performed well when the ratio was high. The support vector machine recorded the highest accuracy at all ratios of missing data. The findings of this study identify appropriate imputation methods for estimating stature with missing anthropometric data. In particular, machine learning algorithms can be used effectively for estimating stature in humans.
|
13
|
Salami D, Sousa CA, Martins MDRO, Capinha C. Predicting dengue importation into Europe, using machine learning and model-agnostic methods. Sci Rep 2020; 10:9689. [PMID: 32546771 PMCID: PMC7298036 DOI: 10.1038/s41598-020-66650-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/29/2019] [Accepted: 05/21/2020] [Indexed: 01/28/2023] Open
Abstract
The geographical spread of dengue is a global public health concern. This is largely mediated by the importation of dengue from endemic to non-endemic areas via the increasing connectivity of the global air transport network. The dynamic nature and intrinsic heterogeneity of the air transport network make it challenging to predict dengue importation. Here, we explore the capabilities of state-of-the-art machine learning algorithms to predict dengue importation. We trained four machine learning classification algorithms using 6 years of historical dengue importation data for 21 countries in Europe, together with connectivity indices mediating importation and air transport network centrality measures. Predictive performance for the classifiers was evaluated using the area under the receiver operating characteristic curve, sensitivity, and specificity measures. Finally, we applied practical model-agnostic methods to provide an in-depth explanation of our optimal model's predictions on a global and local scale. Our best performing model achieved high predictive accuracy, with an area under the receiver operating characteristic score of 0.94 and a maximized sensitivity score of 0.88. The predictor variables identified as most important were the source country's dengue incidence rate, population size, and volume of air passengers. Network centrality measures, describing the positioning of European countries within the air travel network, were also influential to the predictions. We demonstrated the high predictive performance of a machine learning model in predicting dengue importation and the utility of the model-agnostic methods to offer a comprehensive understanding of the reasons behind the predictions. Similar approaches can be utilized in the development of an operational early warning surveillance system for dengue importation.
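The three evaluation measures named in this abstract can be illustrated with a minimal sketch. The function names and the toy scores below are assumptions made for illustration; they are not taken from the study, which does not describe its implementation here. The area under the ROC curve is computed in its rank form, as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking. Hypothetical helper for illustration.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sensitivity_specificity(pos_scores, neg_scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    true_pos = sum(1 for p in pos_scores if p >= threshold)
    true_neg = sum(1 for n in neg_scores if n < threshold)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


# Toy scores (made up): two importation events, two non-events.
positives = [0.9, 0.8]
negatives = [0.1, 0.85]
auc = auc_from_scores(positives, negatives)
sens, spec = sensitivity_specificity(positives, negatives, threshold=0.5)
```

Lowering the threshold raises sensitivity at the cost of specificity, which is one way a "maximized sensitivity" operating point could be chosen.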
Affiliation(s)
- Donald Salami
  - Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Global Health and Tropical Medicine, Lisbon, 1349-008, Portugal.
- Carla Alexandra Sousa
  - Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Global Health and Tropical Medicine, Lisbon, 1349-008, Portugal.
- Maria do Rosário Oliveira Martins
  - Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Global Health and Tropical Medicine, Lisbon, 1349-008, Portugal.
- César Capinha
  - Centro de Estudos Geográficos, Instituto de Geografia e Ordenamento do Território - IGOT, Universidade de Lisboa, 1600-276, Lisboa, Portugal.
|
14
|
Lo KC, Lin HH, Lin CS. A novel method for assessing oral mixing ability based on the spatial clusters quantified by variogram. J Oral Rehabil 2020; 47:951-960. [PMID: 32347574 DOI: 10.1111/joor.12954] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/13/2019] [Revised: 01/11/2020] [Accepted: 02/15/2020] [Indexed: 12/11/2022]
Abstract
BACKGROUND The two-colour chewing test (TCCT) has been widely used for assessing oral mixing ability, a critical component of masticatory performance. Most studies focused on quantifying the evenness of colour distribution. It remained unknown if the variation of colour clustering was a valid index of oral mixing ability. OBJECTIVE The study aims to investigate the oral mixing ability based on the spatial clusters quantified by variogram. METHODS Fifty older people (15 male/35 female, age: 66.0 ± 7.8 years) were assessed for the TCCT and the colour-changeable chewing gum test (CCGT). For the CCGT, we quantified the degree of colour change (ΔE). For the TCCT, the highest peak in colour histogram (HP), the standard deviation of colour values (SDC) and the range of variogram from colour spatial distribution (VARG) were quantified. The participants were grouped according to the contacts of posterior teeth, as assessed by Eichner Index (EI). RESULTS HP, SDC and VARG showed statistically significant differences between the EI groups (two-tailed independent t test, P < .05). Higher VARG (i.e. a lower degree of clustering) was significantly negatively correlated with ΔE (r = -.36, one-tailed P < .01). The binary logistic regression revealed that among the spatial indices (HP, SDC and VARG), only VARG achieved statistical significance in predicting the EI group. Eliminating the other indices did not significantly change model performance. CONCLUSIONS Our results show that the averaged cluster sizes, quantified by variogram, are a valid index for quantifying the TCCT. Compared with other spatial indices, it had the best predictability for the condition of posterior contact.
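To make the variogram idea concrete, the following is a minimal one-dimensional sketch of the empirical semivariance, the quantity from which a variogram range is read off (the lag at which the curve levels out). The study works on two-dimensional colour images and the abstract does not specify its estimator, so the function name, the 1D simplification and the toy values are all assumptions for illustration only.

```python
def semivariance(values, lag):
    """Empirical semivariance at a given lag for an evenly spaced 1D series.

    gamma(lag) = sum((z[i] - z[i + lag])**2) / (2 * number_of_pairs)
    Hypothetical helper; a 1D stand-in for a 2D image variogram.
    """
    pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
    if not pairs:
        raise ValueError("lag is too large for this series")
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


# Toy colour values (made up): alternating colours versus two clusters.
alternating = [0, 1, 0, 1]
clustered = [0, 0, 1, 1]
gamma_alternating = semivariance(alternating, lag=1)
gamma_clustered = semivariance(clustered, lag=1)
```

At lag 1 the clustered series has the lower semivariance, because neighbouring values mostly agree; how quickly semivariance rises with lag is what carries the information about cluster size.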
Affiliation(s)
- Kuang-Chuan Lo
  - Department of Dentistry, School of Dentistry, National Yang-Ming University, Taipei, Taiwan
- Hsiao-Han Lin
  - Department of Dentistry, School of Dentistry, National Yang-Ming University, Taipei, Taiwan
- Chia-Shu Lin
  - Department of Dentistry, School of Dentistry, National Yang-Ming University, Taipei, Taiwan
  - Institute of Brain Science, National Yang-Ming University, Taipei, Taiwan
  - Brain Research Center, National Yang-Ming University, Taipei, Taiwan
|
15
|
Rawashdeh H, Awawdeh S, Shannag F, Henawi E, Faris H, Obeid N, Hyett J. Intelligent system based on data mining techniques for prediction of preterm birth for women with cervical cerclage. Comput Biol Chem 2020; 85:107233. [PMID: 32106071 DOI: 10.1016/j.compbiolchem.2020.107233] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/24/2019] [Revised: 02/07/2020] [Accepted: 02/08/2020] [Indexed: 02/02/2023]
Abstract
Preterm birth, defined as a delivery before 37 weeks' gestation, continues to affect 8-15% of all pregnancies and is associated with significant neonatal morbidity and mortality. Effective prediction of timing of delivery among women identified to be at significant risk for preterm birth would allow proper implementation of prophylactic therapeutic interventions. This paper aims first to develop a model that acts as a decision support system for pregnant women at high risk of delivering prematurely before having cervical cerclage. The model will predict whether the pregnancy will continue beyond 26 weeks' gestation and the potential value of adding the cerclage in prolonging the pregnancy. The second aim is to develop a model that predicts the timing of spontaneous delivery in this high risk cohort after cerclage. The model will help treating physicians to define the chronology of management in relation to the risk of preterm birth, reducing the neonatal complications associated with it. Data from 274 pregnancies managed with cervical cerclage were included. 29 of the procedures involved multiple pregnancies. To build the first model, a data balancing technique called SMOTE was applied to overcome the problem of highly imbalanced class distribution in the dataset. After that, four classification models, namely Decision Tree, Random Forest, K-Nearest Neighbors (K-NN), and Neural Network (NN) were used to build the prediction model. The results showed that Random Forest classifier gave the best results in terms of G-mean and sensitivity with values of 0.96 and 1.00, respectively. These results were achieved at an oversampling ratio of 200%. For the second prediction model, five classification models were used to predict the time of spontaneous delivery; linear regression, Gaussian process, Random Forest, K-star, and LWL classifier. The Random Forest classifier performed best, with 0.752 correlation value. 
In conclusion, computational models can be developed to predict the need for cerclage and the gestation of delivery after this procedure. These models have moderate/high sensitivity for clinical application.
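Two ideas from this abstract lend themselves to a short sketch: the interpolation step at the core of SMOTE, and the G-mean metric. The code below is an illustration under stated assumptions, not the authors' pipeline; the function names are hypothetical, and a full SMOTE implementation would also pick the neighbour from the k nearest minority-class samples and draw the interpolation factor at random.

```python
import math


def smote_point(sample, neighbour, u):
    """One synthetic minority sample on the segment between two real ones.

    u is the interpolation factor in [0, 1]; SMOTE draws it at random.
    Hypothetical helper for illustration.
    """
    return [s + u * (n - s) for s, n in zip(sample, neighbour)]


def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity * specificity)


# Toy feature vectors (made up) for two minority-class cases.
synthetic = smote_point([1.0, 2.0], [3.0, 6.0], u=0.5)

# With perfect sensitivity, G-mean is the square root of specificity.
example_g_mean = g_mean(1.0, 0.9216)
```

Because G-mean multiplies the two rates, a classifier that ignores the minority class scores zero regardless of overall accuracy, which is why it suits imbalanced data. As a matter of arithmetic, the reported G-mean of 0.96 with sensitivity 1.00 corresponds to a specificity of about 0.92.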
Affiliation(s)
- Hasan Rawashdeh
  - Department of Obstetrics and Gynaecology, Jordan University of Science and Technology, Jordan.
- Shatha Awawdeh
  - King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan.
- Fatima Shannag
  - King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan.
- Esraa Henawi
  - King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan.
- Hossam Faris
  - King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan.
- Nadim Obeid
  - King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan.
- Jon Hyett
  - Discipline of Obstetrics, Gynaecology and Neonatology, University of Sydney, Sydney, Australia.
|
16
|
Stark GF, Hart GR, Nartowt BJ, Deng J. Predicting breast cancer risk using personal health data and machine learning models. PLoS One 2019; 14:e0226765. [PMID: 31881042 PMCID: PMC6934281 DOI: 10.1371/journal.pone.0226765] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 06/27/2019] [Accepted: 12/03/2019] [Indexed: 12/23/2022] Open
Abstract
Among women, breast cancer is a leading cause of death. Breast cancer risk predictions can inform screening and preventative actions. Previous works found that adding inputs to the widely-used Gail model improved its ability to predict breast cancer risk. However, these models used simple statistical architectures and the additional inputs were derived from costly and / or invasive procedures. By contrast, we developed machine learning models that used highly accessible personal health data to predict five-year breast cancer risk. We created machine learning models using only the Gail model inputs and models using both Gail model inputs and additional personal health data relevant to breast cancer risk. For both sets of inputs, six machine learning models were trained and evaluated on the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial data set. The area under the receiver operating characteristic curve metric quantified each model’s performance. Since this data set has a small percentage of positive breast cancer cases, we also reported sensitivity, specificity, and precision. We used Delong tests (p < 0.05) to compare the testing data set performance of each machine learning model to that of the Breast Cancer Risk Prediction Tool (BCRAT), an implementation of the Gail model. None of the machine learning models with only BCRAT inputs were significantly stronger than the BCRAT. However, the logistic regression, linear discriminant analysis, and neural network models with the broader set of inputs were all significantly stronger than the BCRAT. These results suggest that relative to the BCRAT, additional easy-to-obtain personal health inputs can improve five-year breast cancer risk prediction. 
Our models could be used as non-invasive and cost-effective risk stratification tools to increase early breast cancer detection and prevention, motivating both immediate actions like screening and long-term preventative measures such as hormone replacement therapy and chemoprevention.
Affiliation(s)
- Gigi F. Stark
  - Department of Therapeutic Radiology, Yale University, New Haven, CT, United States of America
- Gregory R. Hart
  - Department of Therapeutic Radiology, Yale University, New Haven, CT, United States of America
- Bradley J. Nartowt
  - Department of Therapeutic Radiology, Yale University, New Haven, CT, United States of America
- Jun Deng
  - Department of Therapeutic Radiology, Yale University, New Haven, CT, United States of America
|
17
|
Huber M, Kurz C, Leidl R. Predicting patient-reported outcomes following hip and knee replacement surgery using supervised machine learning. BMC Med Inform Decis Mak 2019; 19:3. [PMID: 30621670 PMCID: PMC6325823 DOI: 10.1186/s12911-018-0731-6] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 07/27/2018] [Accepted: 12/27/2018] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Machine-learning classifiers mostly offer good predictive performance and are increasingly used to support shared decision-making in clinical practice. Focusing on performance and practicability, this study evaluates prediction of patient-reported outcomes (PROs) by eight supervised classifiers including a linear model, following hip and knee replacement surgery. METHODS NHS PRO data (130,945 observations) from April 2015 to April 2017 were used to train and test eight classifiers to predict binary postoperative improvement based on minimal important differences. Area under the receiver operating characteristic, J-statistic and several other metrics were calculated. The dependent outcomes were generic and disease-specific improvement based on the EQ-5D-3L visual analogue scale (VAS) as well as the Oxford Hip and Knee Score (Q score). RESULTS The area under the receiver operating characteristic of the best training models was around 0.87 (VAS) and 0.78 (Q score) for hip replacement, while it was around 0.86 (VAS) and 0.70 (Q score) for knee replacement surgery. Extreme gradient boosting, random forests, multistep elastic net and linear model provided the highest overall J-statistics. Based on variable importance, the most important predictors for post-operative outcomes were preoperative VAS, Q score and single Q score dimensions. Sensitivity analysis for hip replacement VAS evaluated the influence of minimal important difference, patient selection criteria as well as additional data years. Together with a small benchmark of the NHS prediction model, robustness of our results was confirmed. CONCLUSIONS Supervised machine-learning implementations, like extreme gradient boosting, can provide better performance than linear models and should be considered, when high predictive performance is needed. Preoperative VAS, Q score and specific dimensions like limping are the most important predictors for postoperative hip and knee PROMs.
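The J-statistic mentioned above (Youden's J) is sensitivity plus specificity minus one. The sketch below computes it from confusion-matrix counts; the function name and the counts are made up for illustration and do not come from the NHS data.

```python
def youden_j(true_pos, false_neg, true_neg, false_pos):
    """Youden's J = sensitivity + specificity - 1.

    Ranges from -1 to 1; 0 means no better than chance, 1 is perfect.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity + specificity - 1.0


# Made-up counts: 80 of 100 improvers and 70 of 100 non-improvers classified correctly.
j = youden_j(true_pos=80, false_neg=20, true_neg=70, false_pos=30)
```

Unlike accuracy, J weights both classes equally, so it is not inflated when most patients fall into one outcome group.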
Affiliation(s)
- Manuel Huber
  - German Research Center for Environmental Health, Institute for Health Economics and Health Care Management, Helmholtz Zentrum München, Postfach 1129, 85758 Neuherberg, Germany
- Christoph Kurz
  - German Research Center for Environmental Health, Institute for Health Economics and Health Care Management, Helmholtz Zentrum München, Postfach 1129, 85758 Neuherberg, Germany
- Reiner Leidl
  - German Research Center for Environmental Health, Institute for Health Economics and Health Care Management, Helmholtz Zentrum München, Postfach 1129, 85758 Neuherberg, Germany
  - Munich Center of Health Sciences, Ludwig-Maximilians-University, Ludwigstr. 28, 80539 Munich, RG Germany
|
18
|
Pan Y, Gao H, Lin H, Liu Z, Tang L, Li S. Identification of Bacteriophage Virion Proteins Using Multinomial Naïve Bayes with g-Gap Feature Tree. Int J Mol Sci 2018; 19:E1779. [PMID: 29914091 PMCID: PMC6032154 DOI: 10.3390/ijms19061779] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/24/2018] [Revised: 06/12/2018] [Accepted: 06/12/2018] [Indexed: 01/29/2023] Open
Abstract
Bacteriophages, which are tremendously important to the ecology and evolution of bacteria, play a key role in the development of genetic engineering. Bacteriophage virion proteins are essential materials of the infectious viral particles and are responsible for several biological functions. The correct identification of bacteriophage virion proteins is of great importance for understanding both life at the molecular level and genetic evolution. However, few computational methods are available for identifying bacteriophage virion proteins. In this paper, we proposed a new method to predict bacteriophage virion proteins using a Multinomial Naïve Bayes classification model based on discrete features generated from the g-gap feature tree. The accuracy of the proposed model reaches 98.37% with an MCC of 96.27% in 10-fold cross-validation. This result suggests that the proposed method can be a useful approach in identifying bacteriophage virion proteins from sequence information. For the convenience of experimental scientists, a web server (PhagePred) that implements the proposed predictor is available, which can be freely accessed on the Internet.
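As a rough illustration of the kind of discrete feature involved, the sketch below counts g-gap residue pairs, a common sequence encoding in which two residues separated by g intervening positions are tallied as one feature. The abstract does not describe how the paper's feature tree is built, so this is only an assumed, simplified stand-in; the function names and the toy sequence are hypothetical. Count vectors of this form are the sort of input a Multinomial Naïve Bayes model expects. The Matthews correlation coefficient (MCC) reported above is included for reference.

```python
import math
from collections import Counter


def g_gap_pair_counts(sequence, gap):
    """Count residue pairs separated by `gap` intervening positions.

    gap=0 gives adjacent dipeptides. Hypothetical helper for illustration.
    """
    return Counter(
        sequence[i] + sequence[i + gap + 1]
        for i in range(len(sequence) - gap - 1)
    )


def mcc(true_pos, true_neg, false_pos, false_neg):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt(
        (true_pos + false_pos)
        * (true_pos + false_neg)
        * (true_neg + false_pos)
        * (true_neg + false_neg)
    )
    if denominator == 0:
        return 0.0
    return (true_pos * true_neg - false_pos * false_neg) / denominator


# Toy peptide (made up): pairs one position apart are A_D and C_A.
features = g_gap_pair_counts("ACDA", gap=1)
```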
Affiliation(s)
- Yanyuan Pan
  - School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Hui Gao
  - School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Hao Lin
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Zhen Liu
  - School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Lixia Tang
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Songtao Li
  - School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
|