1
|
Idris NF, Ismail MA, Jaya MIM, Ibrahim AO, Abulfaraj AW, Binzagr F. Stacking with Recursive Feature Elimination-Isolation Forest for classification of diabetes mellitus. PLoS One 2024; 19:e0302595. [PMID: 38718024 PMCID: PMC11078423 DOI: 10.1371/journal.pone.0302595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Accepted: 04/08/2024] [Indexed: 05/12/2024]
Abstract
Diabetes mellitus is one of the oldest diseases known to humankind, dating back to ancient Egypt. The disease is a chronic metabolic disorder that heavily burdens healthcare providers worldwide because the number of patients rises steadily every year. Worryingly, diabetes affects not only the aging population but also children. It is imperative to control this problem, as diabetes can lead to many health complications. As technology has evolved, humankind has begun integrating computer technology with the healthcare system. Artificial intelligence helps healthcare become more efficient in diagnosing diabetes patients, improves healthcare delivery, and makes care more patient-centric. Among the advanced data mining techniques in artificial intelligence, stacking is one of the most prominent methods applied in the diabetes domain. Hence, this study investigates the potential of stacking ensembles. The aim of this study is twofold: to reduce the high complexity inherent in stacking, which contributes to longer training time, and to reduce the outliers in the diabetes data to improve classification performance. To address these concerns, a novel machine learning method called the Stacking Recursive Feature Elimination-Isolation Forest was introduced for diabetes prediction. Stacking is combined with Recursive Feature Elimination to design an efficient model for diabetes diagnosis that uses fewer features. The method also uses Isolation Forest to remove outliers. The study uses accuracy, precision, recall, F1 measure, training time, and standard deviation to assess classification performance. The proposed method achieved an accuracy of 79.077% on the PIMA Indians Diabetes dataset and 97.446% on the Diabetes Prediction dataset, outperforming many existing methods and demonstrating effectiveness in the diabetes domain.
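The abstract above names accuracy, precision, recall, and F1 measure as evaluation metrics. As a rough illustration of how those four quantities relate to one another, here is a minimal standard-library sketch for a binary task. The function name `classification_metrics` and the toy labels are assumptions made for this example; they are not taken from the paper and do not reproduce its results.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells relative to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only: 1 = diabetic, 0 = non-diabetic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Guarding each ratio against a zero denominator keeps the helper usable on folds where a class is never predicted, which can happen on small or imbalanced data.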
Affiliation(s)
- Nur Farahaina Idris
- Faculty of Computing, Universiti Malaysia Pahang Al-Sultan Abdullah, Pekan, Pahang, Malaysia
| | - Mohd Arfian Ismail
- Faculty of Computing, Universiti Malaysia Pahang Al-Sultan Abdullah, Pekan, Pahang, Malaysia
- Centre of Excellence for Artificial Intelligence & Data Science, Universiti Malaysia Pahang Al-Sultan Abdullah, Lebuhraya Tun Razak, Gambang, Malaysia
| | - Mohd Izham Mohd Jaya
- Faculty of Computing, Universiti Malaysia Pahang Al-Sultan Abdullah, Pekan, Pahang, Malaysia
| | - Ashraf Osman Ibrahim
- Creative Advanced Machine Intelligence Research Centre, Faculty of Computing and Informatics, Universiti Malaysia Sabah, Sabah, Malaysia
| | - Anas W. Abulfaraj
- Department of Information Systems, King Abdulaziz University, Rabigh, Saudi Arabia
| | - Faisal Binzagr
- Department of Computer Science, King Abdulaziz University, Rabigh, Saudi Arabia
| |
|
2
|
Nian Y, Su X, Yue H, Zhu Y, Li J, Wang W, Sheng Y, Ma Q, Liu J, Li X. Estimation of the rice aboveground biomass based on the first derivative spectrum and Boruta algorithm. Frontiers in Plant Science 2024; 15:1396183. [PMID: 38726299 PMCID: PMC11079175 DOI: 10.3389/fpls.2024.1396183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 04/11/2024] [Indexed: 05/12/2024]
Abstract
Aboveground biomass (AGB) is regarded as a critical variable in monitoring crop growth and yield. The use of hyperspectral remote sensing has emerged as a viable method for the rapid and precise monitoring of AGB. Due to the extensive dimensionality and volume of hyperspectral data, it is crucial to effectively reduce data dimensionality and select sensitive spectral features to enhance the accuracy of rice AGB estimation models. At present, derivative transforms and feature selection algorithms have become important means of solving this problem. However, few studies have systematically evaluated the impact of combining the derivative spectrum with a feature selection algorithm on rice AGB estimation. To this end, at the Xiaogang Village (Chuzhou City, China) Experimental Base in 2020, this study used an ASD FieldSpec handheld 2 ground spectrometer (Analytical Spectroscopy Devices, Boulder, Colorado, USA) to obtain canopy spectral data at the critical growth stages of rice (tillering, jointing, booting, heading, and maturity), and evaluated the performance of the recursive feature elimination (RFE) and Boruta feature selection algorithms through partial least squares regression (PLSR), principal component regression (PCR), support vector machine (SVM) and ridge regression (RR). Moreover, we analyzed the importance of the optimal derivative spectrum. The findings indicate that (1) as the growth stage progresses, the correlation between the rice canopy spectrum and AGB shows a trend from high to low, among which the first derivative spectrum (FD) has the strongest correlation with AGB. (2) The number of feature bands selected by the Boruta algorithm is 19 to 35, which has a good dimensionality reduction effect. (3) The combination of FD-Boruta-PCR (FB-PCR) demonstrated the best performance in estimating rice AGB, with an increase in R² of approximately 10% to 20% and a decrease in RMSE of approximately 0.08% to 14%. (4) The best estimation stage is the booting stage, with R² values between 0.60 and 0.74 and RMSE values between 1288.23 and 1554.82 kg/hm². This study confirms the accuracy of hyperspectral remote sensing in estimating vegetation biomass and further explores the theoretical foundation and future direction for monitoring rice growth dynamics.
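The abstract above relies on a first derivative spectrum and reports model quality as R² and RMSE. The following standard-library sketch shows one common way to compute each: a finite-difference derivative of reflectance with respect to wavelength, and the usual definitions of R² and RMSE. The function names, the central-difference scheme, and all numbers are assumptions for illustration; the abstract does not state which derivative formula the authors used.

```python
import math


def first_derivative(wavelengths, reflectance):
    """Finite-difference derivative of reflectance with respect to wavelength.

    Central differences are used at interior bands and one-sided
    differences at the two end bands (an assumed scheme).
    """
    n = len(wavelengths)
    if n != len(reflectance) or n < 2:
        raise ValueError("need two equally long sequences with >= 2 bands")
    derivative = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slope = ((reflectance[hi] - reflectance[lo])
                 / (wavelengths[hi] - wavelengths[lo]))
        derivative.append(slope)
    return derivative


def r2_and_rmse(observed, predicted):
    """Coefficient of determination and root-mean-square error."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(observed))
    return r2, rmse


# Made-up canopy spectrum: wavelengths in nm, reflectance as a fraction.
wavelengths = [400, 410, 420, 430]
reflectance = [0.10, 0.12, 0.18, 0.20]
derivative = first_derivative(wavelengths, reflectance)

# Made-up AGB values in kg/hm² for three hypothetical plots.
r2, rmse = r2_and_rmse([1000, 1200, 1400], [1100, 1200, 1300])
print(derivative, r2, rmse)
```

The derivative keeps the same number of bands as the input, so it can be passed to a feature selector in place of the raw spectrum without re-indexing the wavelengths.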
Affiliation(s)
- Ying Nian
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
| | - Xiangxiang Su
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
| | - Hu Yue
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
| | - Yongji Zhu
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
| | - Jun Li
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
| | - Weiqiang Wang
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
| | - Yali Sheng
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
| | - Qiang Ma
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
| | - Jikai Liu
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
- Anhui Province Crop Intelligent Planting and Processing Technology Engineering Research Center, Anhui Science and Technology University, Chuzhou, Anhui, China
| | - Xinwei Li
- College of Resource and Environment, Anhui Science and Technology University, Chuzhou, China
- Anhui Province Crop Intelligent Planting and Processing Technology Engineering Research Center, Anhui Science and Technology University, Chuzhou, Anhui, China
- Anhui Province Agricultural Waste Fertilizer Utilization and Cultivated Land Quality Improvement Engineering Research Center, Anhui Science and Technology University, Chuzhou, China
| |
|
3
|
Shvetcov A, Thomson S, Spathos J, Cho AN, Wilkins HM, Andrews SJ, Delerue F, Couttas TA, Issar JK, Isik F, Kaur S, Drummond E, Dobson-Stone C, Duffy SL, Rogers NM, Catchpoole D, Gold WA, Swerdlow RH, Brown DA, Finney CA. Blood-Based Transcriptomic Biomarkers Are Predictive of Neurodegeneration Rather Than Alzheimer's Disease. Int J Mol Sci 2023; 24:15011. [PMID: 37834458 PMCID: PMC10573468 DOI: 10.3390/ijms241915011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 10/06/2023] [Accepted: 10/07/2023] [Indexed: 10/15/2023]
Abstract
Alzheimer's disease (AD) is a growing global health crisis affecting millions and incurring substantial economic costs. However, clinical diagnosis remains challenging, with misdiagnoses and underdiagnoses being prevalent. There is an increased focus on putative, blood-based biomarkers that may be useful for the diagnosis as well as early detection of AD. In the present study, we used an unbiased combination of machine learning and functional network analyses to identify blood gene biomarker candidates in AD. Using supervised machine learning, we also determined whether these candidates were indeed unique to AD or whether they were indicative of other neurodegenerative diseases, such as Parkinson's disease (PD) and amyotrophic lateral sclerosis (ALS). Our analyses showed that genes involved in spliceosome assembly, RNA binding, transcription, protein synthesis, mitoribosomes, and NADH dehydrogenase were the best-performing genes for identifying AD patients relative to cognitively healthy controls. This transcriptomic signature, however, was not unique to AD, and subsequent machine learning showed that this signature could also predict PD and ALS relative to controls without neurodegenerative disease. Combined, our results suggest that mRNA from whole blood can indeed be used to screen for patients with neurodegeneration but may be less effective in diagnosing the specific neurodegenerative disease.
Affiliation(s)
- Artur Shvetcov
- Department of Psychological Medicine, Sydney Children’s Hospitals Network, Sydney, NSW 2031, Australia
- Discipline of Psychiatry and Mental Health, School of Clinical Medicine, Faculty of Medicine and Health, University of New South Wales, Sydney, NSW 2052, Australia
| | - Shannon Thomson
- Neuroinflammation Research Group, Centre for Immunology and Allergy Research, Westmead Institute for Medical Research, Sydney, NSW 2145, Australia
- School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
| | - Jessica Spathos
- Neuroinflammation Research Group, Centre for Immunology and Allergy Research, Westmead Institute for Medical Research, Sydney, NSW 2145, Australia
| | - Ann-Na Cho
- Dementia Research Centre, Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, NSW 2109, Australia
| | - Heather M. Wilkins
- University of Kansas Alzheimer’s Disease Research Centre, Kansas City, KS 66160, USA
- Department of Biochemistry and Molecular Biology, University of Kansas Medical Centre, Kansas City, KS 66160, USA
- Department of Neurology, University of Kansas Medical Centre, Kansas City, KS 66160, USA
| | - Shea J. Andrews
- Department of Psychiatry & Behavioral Sciences, University of California San Francisco, San Francisco, CA 94143, USA
| | - Fabien Delerue
- Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Timothy A. Couttas
- Brain and Mind Centre, Translational Research Collective, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
| | - Jasmeen Kaur Issar
- Molecular Neurobiology Research Laboratory, Kids Research, Children’s Medical Research Institute, Children’s Hospital at Westmead, Westmead, NSW 2145, Australia
- Kids Neuroscience Centre, Kids Research, Children’s Hospital at Westmead, Westmead, NSW 2145, Australia
- Sydney Medical School, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
| | - Finula Isik
- Neuroinflammation Research Group, Centre for Immunology and Allergy Research, Westmead Institute for Medical Research, Sydney, NSW 2145, Australia
- School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
| | - Simranpreet Kaur
- Murdoch Children’s Research Institute, Royal Children’s Hospital, Parkville, VIC 3052, Australia
- Department of Pediatrics, University of Melbourne, Parkville, VIC 3010, Australia
| | - Eleanor Drummond
- School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
- Brain and Mind Centre, The University of Sydney, Sydney, NSW 2050, Australia
| | - Carol Dobson-Stone
- School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
- Brain and Mind Centre, The University of Sydney, Sydney, NSW 2050, Australia
| | - Shantel L. Duffy
- Allied Health, Research and Strategic Partnerships, Nepean Blue Mountains Local Health District, Penrith, NSW 2750, Australia
| | - Natasha M. Rogers
- Centre for Transplant and Renal Research, Westmead Institute for Medical Research, Sydney, NSW 2145, Australia
- Renal and Transplant Medicine Unit, Westmead Hospital, Westmead, NSW 2145, Australia
- Westmead Clinical School, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
| | - Daniel Catchpoole
- The Tumor Bank, Kids Research, Children’s Hospital at Westmead, Westmead, NSW 2145, Australia
- Children’s Cancer Research Institute, Children’s Hospital at Westmead, Westmead, NSW 2145, Australia
| | - Wendy A. Gold
- School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
- Molecular Neurobiology Research Laboratory, Kids Research, Children’s Medical Research Institute, Children’s Hospital at Westmead, Westmead, NSW 2145, Australia
- Kids Neuroscience Centre, Kids Research, Children’s Hospital at Westmead, Westmead, NSW 2145, Australia
| | - Russell H. Swerdlow
- University of Kansas Alzheimer’s Disease Research Centre, Kansas City, KS 66160, USA
- Department of Biochemistry and Molecular Biology, University of Kansas Medical Centre, Kansas City, KS 66160, USA
- Department of Neurology, University of Kansas Medical Centre, Kansas City, KS 66160, USA
- Department of Molecular and Integrative Physiology, University of Kansas Medical Centre, Kansas City, KS 66160, USA
| | - David A. Brown
- Neuroinflammation Research Group, Centre for Immunology and Allergy Research, Westmead Institute for Medical Research, Sydney, NSW 2145, Australia
- Westmead Clinical School, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
- Department of Immunopathology, Institute for Clinical Pathology and Medical Research-New South Wales Health Pathology, Sydney, NSW 2145, Australia
| | - Caitlin A. Finney
- Neuroinflammation Research Group, Centre for Immunology and Allergy Research, Westmead Institute for Medical Research, Sydney, NSW 2145, Australia
- School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2050, Australia
| |
|
4
|
Yu L, Yu Z, Sun L, Zhu L, Geng D. A brain tumor computer-aided diagnosis method with automatic lesion segmentation and ensemble decision strategy. Front Med (Lausanne) 2023; 10:1232496. [PMID: 37841015 PMCID: PMC10576559 DOI: 10.3389/fmed.2023.1232496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 09/08/2023] [Indexed: 10/17/2023]
Abstract
Objectives: Gliomas and brain metastases (Mets) are the most common brain malignancies. The treatment strategy and clinical prognosis of patients are different, requiring accurate diagnosis of tumor types. However, the traditional radiomics diagnostic pipeline requires manual annotation and lacks integrated methods for segmentation and classification. To improve the diagnosis process, a gliomas and Mets computer-aided diagnosis method with automatic lesion segmentation and ensemble decision strategy on multi-center datasets was proposed. Methods: Overall, 1,022 high-grade gliomas and 775 Mets patients' preoperative MR images were adopted in the study, including contrast-enhanced T1-weighted (T1-CE) and T2-fluid attenuated inversion recovery (T2-flair) sequences from three hospitals. Two segmentation models trained on the gliomas and Mets datasets, respectively, were used to automatically segment tumors. Multiple radiomics features were extracted after automatic segmentation. Several machine learning classifiers were used to measure the impact of feature selection methods. A weight soft voting (RSV) model and ensemble decision strategy based on prior knowledge (EDPK) were introduced in the radiomics pipeline. Accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC) were used to evaluate the classification performance. Results: The proposed pipeline improved the diagnosis of gliomas and Mets with ACC reaching 0.8950 and AUC reaching 0.9585 after automatic lesion segmentation, which was higher than those of the traditional radiomics pipeline (ACC: 0.8850, AUC: 0.9450). Conclusion: The proposed model accurately classified gliomas and Mets patients using MRI radiomics. The novel pipeline showed great potential in diagnosing gliomas and Mets with high generalizability and interpretability.
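The abstract above mentions a weighted soft voting model. The general idea can be sketched as a weighted average of each classifier's class probabilities, followed by picking the class with the highest combined probability. This is only a generic illustration: the function name `weighted_soft_vote`, the weights, and the probabilities are all assumed here, and the abstract does not say how the authors derive their weights.

```python
def weighted_soft_vote(probabilities, weights):
    """Combine per-classifier class probabilities by weighted averaging.

    probabilities: one list of class probabilities per classifier.
    weights: one non-negative weight per classifier (assumed given).
    Returns the winning class index and the combined probabilities.
    """
    if len(probabilities) != len(weights) or not probabilities:
        raise ValueError("need one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probabilities[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=combined.__getitem__)
    return label, combined


# Made-up outputs of three hypothetical classifiers for one patient.
# Class index 0 = glioma, class index 1 = Mets.
probabilities = [[0.60, 0.40], [0.30, 0.70], [0.55, 0.45]]
weights = [0.5, 0.2, 0.3]  # assumed, e.g. proportional to validation scores

label, combined = weighted_soft_vote(probabilities, weights)
equal_label, _ = weighted_soft_vote(probabilities, [1, 1, 1])
print(label, combined, equal_label)
```

In this toy case the weighted vote selects class 0 while an unweighted average selects class 1, which shows how the weights can change the final decision.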
Affiliation(s)
- Liheng Yu
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Center for Shanghai Intelligent Imaging for Critical Brain Diseases Engineering and Technology Research, Huashan Hospital, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Nansha District, Guangzhou, Guangdong, China
| | - Zekuan Yu
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Center for Shanghai Intelligent Imaging for Critical Brain Diseases Engineering and Technology Research, Huashan Hospital, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Nansha District, Guangzhou, Guangdong, China
| | - Linlin Sun
- Department of Radiology, Shanghai Chest Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
| | - Li Zhu
- Department of Radiology, Shanghai Chest Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
| | - Daoying Geng
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Center for Shanghai Intelligent Imaging for Critical Brain Diseases Engineering and Technology Research, Huashan Hospital, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Nansha District, Guangzhou, Guangdong, China
- Department of Radiology, Huashan Hospital, Fudan University, Shanghai, China
- Institute of Functional and Molecular Medical Imaging, Fudan University, Shanghai, China
| |
|
5
|
Abbas Q, Hina S, Sajjad H, Zaidi KS, Akbar R. Optimization of predictive performance of intrusion detection system using hybrid ensemble model for secure systems. PeerJ Comput Sci 2023; 9:e1552. [PMID: 37705624 PMCID: PMC10496009 DOI: 10.7717/peerj-cs.1552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 08/03/2023] [Indexed: 09/15/2023]
Abstract
Network intrusion is one of the main threats to organizational networks and systems. Its timely detection is a profound challenge for the security of networks and systems. The situation is even more challenging for small and medium enterprises (SMEs) of developing countries, where limited resources and investment in deploying foreign security controls and developing indigenous security solutions are big hurdles. A robust yet cost-effective network intrusion detection system is required to secure traditional and Internet of Things (IoT) networks and to confront such escalating security challenges in SMEs. In the present research, a novel hybrid ensemble model using the random forest-recursive feature elimination (RF-RFE) method is proposed to increase the predictive performance of intrusion detection systems (IDS). Compared to the deep learning paradigm, the proposed machine learning ensemble method could yield state-of-the-art results with lower computational cost and less training time. The evaluation of the proposed ensemble machine learning model shows 99%, 98.53% and 99.9% overall accuracy for the NSL-KDD, UNSW-NB15 and CSE-CIC-IDS2018 datasets, respectively. The results show that the proposed ensemble method successfully optimizes the performance of intrusion detection systems. The outcome of the research is significant and contributes to the performance efficiency of intrusion detection systems and to developing secure systems and applications.
Affiliation(s)
- Qaiser Abbas
- University of Engineering and Technology, Lahore, Pakistan
| | | | - Hamza Sajjad
- University of Engineering and Technology Lahore, Lahore, Pakistan
| | | | - Rehan Akbar
- Computer and Information Sciences Department, Universiti Teknologi PETRONAS, Seri Iskandar, Malaysia
| |
|
6
|
Ebrahimi A, Wiil UK, Baskaran R, Peimankar A, Andersen K, Nielsen AS. AUD-DSS: a decision support system for early detection of patients with alcohol use disorder. BMC Bioinformatics 2023; 24:329. [PMID: 37658294 PMCID: PMC10474761 DOI: 10.1186/s12859-023-05450-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 08/21/2023] [Indexed: 09/03/2023]
Abstract
BACKGROUND: Alcohol use disorder (AUD) causes significant morbidity, mortality, and injuries. According to reports, approximately 5% of all registered deaths in Denmark could be due to AUD. The problem is compounded by the late identification of patients with AUD, a situation that can cause enormous problems, from psychological to physical to economic problems. Many individuals suffering from AUD never undergo specialist treatment during their addiction due to obstacles such as taboo and the poor performance of current screening tools. Therefore, there is a lack of rapid intervention. This can be mitigated by the early detection of patients with AUD. A clinical decision support system (DSS) powered by machine learning (ML) methods can be used to diagnose patients' AUD status earlier. METHODS: This study proposes an effective AUD prediction model (AUDPM), which can be used in a DSS. The proposed model consists of four distinct components: (1) imputation to address missing values using the k-nearest neighbours approach, (2) recursive feature elimination with cross validation to select the most relevant subset of features, (3) a hybrid synthetic minority oversampling technique-edited nearest neighbour approach to remove noise and balance the distribution of the training data, and (4) an ML model for the early detection of patients with AUD. Two data sources, including a questionnaire and electronic health records of 2571 patients, were collected from Odense University Hospital in the Region of Southern Denmark for the AUD-Dataset. Then, the AUD-Dataset was used to build ML models. The results of different ML models, such as support vector machine, K-nearest neighbour, decision tree, random forest, and extreme gradient boosting, were compared. Finally, a combination of all these models in an ensemble learning approach was selected for the AUDPM. RESULTS: The results revealed that the proposed ensemble AUDPM outperformed other single models and our previous study results, achieving 0.96, 0.94, 0.95, and 0.97 precision, recall, F1-score, and accuracy, respectively. In addition, we designed and developed an AUD-DSS prototype. CONCLUSION: It was shown that our proposed AUDPM achieved high classification performance. In addition, we identified clinical factors related to the early detection of patients with AUD. The designed AUD-DSS is intended to be integrated into the existing Danish health care system to provide novel information to clinical staff if a patient shows signs of harmful alcohol use; in other words, it gives staff a good reason for having a conversation with patients for whom a conversation is relevant.
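The first component of the pipeline described above is imputation of missing values with a k-nearest neighbours approach. A simplified standard-library sketch of that idea follows: a missing entry is filled with the mean of the same column over the k most similar rows, where similarity is measured only on the features both rows have observed. The function name `knn_impute`, the distance normalisation, and the toy data are assumptions for this example and are not the authors' implementation.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest rows.

    Distance between two rows is the root-mean-square difference over
    the features that are observed in both rows (an assumed variant).
    Distances are always computed on the original, unfilled data.
    """
    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(rows):
                # A donor row must be a different row with column j observed.
                if other_idx == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [donor_value for _, donor_value in candidates[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Made-up records with three numeric features; None marks a missing value.
records = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 9.0],
    [0.9, 1.9, 2.8],
]
filled = knn_impute(records, k=2)
print(filled[1])
```

Here the missing value is filled from the first and last rows, which are closest on the two observed features, while the distant third row is ignored. An entry with no usable donor row is left as `None` rather than guessed.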
Affiliation(s)
- Ali Ebrahimi
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
| | - Uffe Kock Wiil
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
| | - Ruben Baskaran
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
| | - Abdolrahman Peimankar
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
| | - Kjeld Andersen
- Unit for Clinical Alcohol Research, Clinical Institute, University of Southern Denmark, Odense, Denmark
| | - Anette Søgaard Nielsen
- Unit for Clinical Alcohol Research, Clinical Institute, University of Southern Denmark, Odense, Denmark
| |
|
7
|
Wertis L, Sugg MM, Runkle JD, Rao D. Socio-Environmental Determinants of Mental and Behavioral Disorders in Youth: A Machine Learning Approach. GeoHealth 2023; 7:e2023GH000839. [PMID: 37711362 PMCID: PMC10499369 DOI: 10.1029/2023gh000839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Revised: 08/17/2023] [Accepted: 08/22/2023] [Indexed: 09/16/2023]
Abstract
Growing evidence indicates that extreme environmental conditions in summer months have an adverse impact on mental and behavioral disorders (MBD), but there is limited research looking at youth populations. The objective of this study was to apply machine learning approaches to identify key variables that predict MBD-related emergency room (ER) visits among youths in select North Carolina cities. Daily MBD-related ER visits, which totaled over 42,000 records, were paired with daily environmental conditions, as well as sociodemographic variables, to determine if certain conditions lead to higher vulnerability to exacerbated mental health disorders. Four machine learning models (i.e., generalized linear model, generalized additive model, extreme gradient boosting, random forest) were used to assess the predictive performance of multiple environmental and sociodemographic variables on MBD-related ER visits for all cities. The best-performing machine learning model was then applied to each of the six individual cities. As a subanalysis, a distributed lag nonlinear model was used to confirm results. In the all cities scenario, sociodemographic variables contributed the most to the overall MBD prediction. In the individual cities scenario, four cities had a 24-hr difference in the maximum temperature as a leading predictor of MBD ER visits, and two cities had a 24-hr difference in the minimum temperature, maximum temperature, or Normalized Difference Vegetation Index. Results can inform the use of machine learning models for predicting MBD during high-temperature events and identify variables that affect youth MBD responses during these events.
Affiliation(s)
- Luke Wertis
- Department of Geography and Planning, Appalachian State University, Boone, NC, USA
| | - Margaret M. Sugg
- Department of Geography and Planning, Appalachian State University, Boone, NC, USA
| | | | - Douglas Rao
- NC Institute for Climate Studies, NC State University, Raleigh, NC, USA
| |
|
8
|
Chen R, Wang R, Fei J, Huang L, Wang J. Quantitative identification of daily mental fatigue levels based on multimodal parameters. The Review of Scientific Instruments 2023; 94:095106. [PMID: 37695118 DOI: 10.1063/5.0162312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Accepted: 08/23/2023] [Indexed: 09/12/2023]
Abstract
Fatigue has become an important health problem in modern life; excessive mental fatigue may induce various cardiovascular diseases. Most current mental fatigue recognition is based only on specific scenarios and tasks. To improve the accuracy of daily mental fatigue recognition, this paper proposes a multimodal fatigue grading method that combines three signals: electrocardiogram (ECG), photoplethysmography (PPG), and blood pressure (BP). We collected ECG, PPG, and BP from 22 subjects during three time periods: morning, afternoon, and evening. Based on these three signals, 56 characteristic parameters were extracted from multiple dimensions, which comprehensively covered the physiological information in different fatigue states. The feature optimization abilities of recursive feature elimination (RFE), maximal information coefficient, and joint mutual information were compared on the extracted parameters, and the optimum feature matrix selected was input into random forest (RF) for a three-level classification. The results showed that the accuracy of fatigue classification was 88.88% using only one physiological feature, 92.72% using a combination of two physiological features, and 94.87% using all three physiological features. This study indicates that the fusion of multiple physiological traits contains more comprehensive information and better identifies the level of mental fatigue, and that the RFE-RF model performs best in fatigue identification. The BP variability index is useful for fatigue classification.
Affiliation(s)
- Ruijuan Chen
- School of Life Sciences, TianGong University, Tianjin 300387, China
| | - Rui Wang
- School of Electrical and Information Engineering, TianGong University, Tianjin 300387, China
| | - Jieying Fei
- School of Electrical and Information Engineering, TianGong University, Tianjin 300387, China
| | - Lengjie Huang
- School of Electrical and Information Engineering, TianGong University, Tianjin 300387, China
| | - Jinhai Wang
- School of Life Sciences, TianGong University, Tianjin 300387, China
| |
|
9
|
Wang H, Doumard E, Soule-Dupuy C, Kemoun P, Aligon J, Monsarrat P. Explanations as a New Metric for Feature Selection: A Systematic Approach. IEEE J Biomed Health Inform 2023; 27:4131-4142. [PMID: 37220033 DOI: 10.1109/jbhi.2023.3279340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
With the extensive use of Machine Learning (ML) in the biomedical field, there is an increasing need for Explainable Artificial Intelligence (XAI) to improve transparency and reveal complex hidden relationships between variables for medical practitioners, while meeting regulatory requirements. Feature Selection (FS) is widely used as a part of a biomedical ML pipeline to significantly reduce the number of variables while preserving as much information as possible. However, the choice of FS method affects the entire pipeline, including the final prediction explanations, yet very few works investigate the relationship between FS and model explanations. Through a systematic workflow performed on 145 datasets and an illustration on medical data, the present work demonstrated the promising complementarity of two metrics based on explanations (using ranking and influence changes) in addition to accuracy and retention rate to select the most appropriate FS/ML models. Measuring how much explanations differ with and without FS is particularly promising for recommending FS methods. While ReliefF generally performs the best on average, the optimal choice may vary for each dataset. Positioning FS methods in a tridimensional space, integrating explanations-based metrics, accuracy and retention rate, would allow the user to choose the priorities to be given on each of the dimensions. In biomedical applications, where each medical condition may have its own preferences, this framework will make it possible to offer the healthcare professional the appropriate FS technique, to select the variables that have an important explainable impact, even if this comes at the expense of a limited drop in accuracy.
|
10
|
Osadchiy V, Bal R, Mayer EA, Kunapuli R, Dong T, Vora P, Petrasek D, Liu C, Stains J, Gupta A. Machine learning model to predict obesity using gut metabolite and brain microstructure data. Sci Rep 2023; 13:5488. [PMID: 37016129 PMCID: PMC10073225 DOI: 10.1038/s41598-023-32713-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2022] [Accepted: 03/31/2023] [Indexed: 04/06/2023] Open
Abstract
A growing body of preclinical and clinical literature suggests that brain-gut-microbiota interactions may contribute to obesity pathogenesis. In this study, we use a machine learning approach to leverage the enormous amount of microstructural neuroimaging and fecal metabolomic data to better understand key drivers of the obese compared to overweight phenotype. Our findings reveal that although gut-derived factors play a role in this distinction, it is primarily brain-directed changes that differentiate obese from overweight individuals. Of the key gut metabolites that emerged from our model, many are likely at least in part derived or influenced by the gut-microbiota, including some amino-acid derivatives. Remarkably, key regions outside of the central nervous system extended reward network emerged as important differentiators, suggesting a role for previously unexplored neural pathways in the pathogenesis of obesity.
Affiliation(s)
- Vadim Osadchiy
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA
- UCLA Microbiome Center, Los Angeles, USA
- G. Oppenheimer Center for Neurobiology of Stress and Resilience, Los Angeles, USA
- Department of Urology, David Geffen School of Medicine, Los Angeles, USA
| | - Roshan Bal
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA
| | - Emeran A Mayer
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA
- UCLA Microbiome Center, Los Angeles, USA
- G. Oppenheimer Center for Neurobiology of Stress and Resilience, Los Angeles, USA
| | - Rama Kunapuli
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA
| | - Tien Dong
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA
- UCLA Microbiome Center, Los Angeles, USA
| | - Priten Vora
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA
- G. Oppenheimer Center for Neurobiology of Stress and Resilience, Los Angeles, USA
- Division of Gastroenterology, Hepatology and Parenteral Nutrition, VA Greater Los Angeles Healthcare System, Los Angeles, CA, USA
| | - Danny Petrasek
- Department of Mathematics, California Institute of Technology, Pasadena, USA
| | - Cathy Liu
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA
- G. Oppenheimer Center for Neurobiology of Stress and Resilience, Los Angeles, USA
| | - Jean Stains
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA
- G. Oppenheimer Center for Neurobiology of Stress and Resilience, Los Angeles, USA
- Division of Gastroenterology, Hepatology and Parenteral Nutrition, VA Greater Los Angeles Healthcare System, Los Angeles, CA, USA
| | - Arpana Gupta
- Vatche and Tamar Manoukian Division of Digestive Diseases, Los Angeles, USA.
- UCLA Microbiome Center, Los Angeles, USA.
- G. Oppenheimer Center for Neurobiology of Stress and Resilience, Los Angeles, USA.
- G. Oppenheimer Family Center for Neurobiology of Stress and Resilience, Vatche and Tamar Manoukian Division of Digestive Diseases, David Geffen School of Medicine at UCLA, CHS 42-210 MC737818, 10833 Le Conte Avenue, Los Angeles, CA, USA.
| |
|
11
|
Cavalcante CHL, Primo PEO, Sales CAF, Caldas WL, Silva JHM, Souza AH, Marinho ES, Pedrosa RC, Marques JAL, Santos HS, Madeiro JPV. Sudden cardiac death multiparametric classification system for Chagas heart disease's patients based on clinical data and 24-hours ECG monitoring. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:9159-9178. [PMID: 37161238 DOI: 10.3934/mbe.2023402] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/11/2023]
Abstract
About 6.5 million people are infected with Chagas disease (CD) globally, and WHO estimates that $ > million people worldwide suffer from ChHD. Sudden cardiac death (SCD) represents one of the leading causes of death worldwide and affects approximately 65% of ChHD patients at a rate of 24 per 1000 patient-years, much greater than the SCD rate in the general population. Its occurrence in the specific context of ChHD needs to be better exploited. This paper provides the first evidence supporting the use of machine learning (ML) methods within non-invasive tests: patients' clinical data and cardiac restitution metrics (CRM) features extracted from ECG-Holter recordings as an adjunct in the SCD risk assessment in ChHD. The feature selection (FS) flows evaluated 5 different groups of attributes formed from patients' clinical and physiological data to identify relevant attributes among 57 features reported by 315 patients at HUCFF-UFRJ. The FS flow with FS techniques (variance, ANOVA, and recursive feature elimination) and Naive Bayes (NB) model achieved the best classification performance with 90.63% recall (sensitivity) and 80.55% AUC. The initial feature set is reduced to a subset of 13 features (4 Classification; 1 Treatment; 1 CRM; and 7 Heart Tests). The proposed method represents an intelligent diagnostic support system that predicts the high risk of SCD in ChHD patients and highlights the clinical and CRM data that most strongly impact the final outcome.
Affiliation(s)
- Carlos H L Cavalcante
- Federal Institute of Education and Technology of Ceara, Maracanau, Ceara, Brazil
- State University of Ceara - Center for Science and Technology, Fortaleza, Ceara, Brazil
| | - Pedro E O Primo
- Computer Science Department - Federal University of Ceara, Fortaleza, Ceara, Brazil
| | - Carlos A F Sales
- Federal Institute of Education and Technology of Ceara, Maracanau, Ceara, Brazil
| | - Weslley L Caldas
- Computer Science Department - Federal University of Ceara, Fortaleza, Ceara, Brazil
| | - João H M Silva
- Oswaldo Cruz Foundation (Fiocruz), Eusebio, Ceara, Brazil
| | - Amauri H Souza
- Federal Institute of Education and Technology of Ceara, Maracanau, Ceara, Brazil
| | - Emmanuel S Marinho
- State University of Ceara - Center for Science and Technology, Fortaleza, Ceara, Brazil
| | - Roberto C Pedrosa
- Edson Saad Heart Institute - Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - João A L Marques
- Laboratory of Applied Neurosciences -University of Saint Joseph, Macau SAR, China
| | - Hélcio S Santos
- State University of Ceara - Center for Science and Technology, Fortaleza, Ceara, Brazil
| | - João P V Madeiro
- Computer Science Department - Federal University of Ceara, Fortaleza, Ceara, Brazil
| |
|
12
|
Thanh NN, Chotpantarat S, Ha NT, Trung NH. Determination of conditioning factors for mapping nickel contamination susceptibility in groundwater in Kanchanaburi Province, Thailand, using random forest and maximum entropy. ENVIRONMENTAL GEOCHEMISTRY AND HEALTH 2023:10.1007/s10653-023-01512-z. [PMID: 36881245 DOI: 10.1007/s10653-023-01512-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2022] [Accepted: 02/10/2023] [Indexed: 05/17/2023]
Abstract
Groundwater pollution from nickel (Ni) has been a severe concern in Kanchanaburi Province, Thailand. Recent assessments revealed that the Ni concentration in groundwater, particularly in urban areas, often exceeded the permissible limit. The challenge for groundwater agencies is therefore to delineate regions with high susceptibility to Ni contamination. In this study, a novel modeling approach was applied to a dataset of 117 groundwater samples collected from Kanchanaburi Province between April and July 2021. Twenty site-specific initial variables were considered as factors influencing Ni contamination. The Random Forest (RF) algorithm with a Recursive Feature Elimination (RFE) function was used to select the fourteen most influential variables. These variables were then used as input features to train a Maximum Entropy (ME) model to delineate the Ni contamination susceptibility with high confidence (Area Under the Curve (AUC) validation value of 0.845). Ten input variables (altitude, geology, land use, slope, soil type, distance to industrial areas, distance to mining areas, electric conductivity, oxidation-reduction potential, and groundwater depth) were found to best explain the spatial variation of Ni contamination, with areas of very high (95.47 km²) and high (86.65 km²) susceptibility. This study devises a novel machine learning approach to identify the conditioning factors and map Ni contamination susceptibility in the groundwater, which provides a baseline dataset and reliable methods for the development of a sustainable groundwater management strategy.
Affiliation(s)
- Nguyen Ngoc Thanh
- Interdisciplinary Program in Environmental Science, Graduate School, Chulalongkorn University, Bangkok, 10330, Thailand
- University of Agriculture and Forestry, Hue University, 102 Phung Hung Str, Hue City, Thua Thien Hue, 53000, Vietnam
| | - Srilert Chotpantarat
- Department of Geology, Faculty of Science, Chulalongkorn University, Bangkok, 10330, Thailand.
- Center of Excellence in Environmental Innovation and Management of Metals (EnvIMM), Environmental Research Institute, Chulalongkorn University (ERIC), Bangkok, 10330, Thailand.
| | - Nam-Thang Ha
- University of Agriculture and Forestry, Hue University, 102 Phung Hung Str, Hue City, Thua Thien Hue, 53000, Vietnam
| | - Nguyen H Trung
- Centre for Agriculture and the Bioeconomy, Queensland University of Technology, 2 George St, Brisbane, QLD, 4000, Australia
| |
|
13
|
Kumari S, Singh K, Khan T, Ariffin MM, Mohan SK, Baleanu D, Ahmadian A. A Novel Approach for Continuous Authentication of Mobile Users Using Reduce Feature Elimination (RFE): A Machine Learning Approach. MOBILE NETWORKS AND APPLICATIONS 2023. [DOI: 10.1007/s11036-023-02103-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/27/2022] [Indexed: 09/15/2023]
|
14
|
Liu K, Chen Q, Huang GH. An Efficient Feature Selection Algorithm for Gene Families Using NMF and ReliefF. Genes (Basel) 2023; 14:421. [PMID: 36833348 PMCID: PMC9957060 DOI: 10.3390/genes14020421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 01/24/2023] [Accepted: 01/25/2023] [Indexed: 02/10/2023] Open
Abstract
Gene families, which are parts of a genome's information storage hierarchy, play a significant role in the development and diversity of multicellular organisms. Several studies have focused on the characteristics of gene families, such as function, homology, or phenotype. However, statistical and correlation analyses on the distribution of gene family members in the genome have yet to be conducted. Here, a novel framework incorporating gene family analysis and genome selection based on NMF-ReliefF is reported. Specifically, the proposed method starts by obtaining gene families from the TreeFam database and determining the number of gene families within the feature matrix. Then, NMF-ReliefF is used to select features from the gene feature matrix, which is a new feature selection algorithm that overcomes the inefficiencies of traditional methods. Finally, a support vector machine is utilized to classify the acquired features. The results show that the framework achieved an accuracy of 89.1% and an AUC of 0.919 on the insect genome test set. We also employed four microarray gene data sets to evaluate the performance of the NMF-ReliefF algorithm. The outcomes show that the proposed method may strike a delicate balance between robustness and discrimination. Additionally, the proposed method's categorization is superior to state-of-the-art feature selection approaches.
Affiliation(s)
- Kai Liu
- College of Plant Protection, Hunan Agricultural University, Changsha 410128, China
- Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Nongda Road, Furong District, Changsha 410128, China
- College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
| | - Qi Chen
- College of Plant Protection, Hunan Agricultural University, Changsha 410128, China
- Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Nongda Road, Furong District, Changsha 410128, China
| | - Guo-Hua Huang
- College of Plant Protection, Hunan Agricultural University, Changsha 410128, China
- Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Nongda Road, Furong District, Changsha 410128, China
| |
|
15
|
Biosecurity and antimicrobial use in broiler farms across nine European countries: toward identifying farm-specific options for reducing antimicrobial usage. Epidemiol Infect 2022; 151:e13. [PMID: 36573356 PMCID: PMC9990406 DOI: 10.1017/s0950268822001960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022] Open
Abstract
Broiler chickens are among the main livestock sectors worldwide. With individual treatments being inapplicable, contrary to many other animal species, the need for antimicrobial use (AMU) is relatively high. AMU in animals is known to drive the emergence and spread of antimicrobial resistance (AMR). High farm biosecurity is a cornerstone for animal health and welfare, as well as food safety, as it protects animals from the introduction and spread of pathogens and therefore the need for AMU. The goal of this study was to identify the main biosecurity practices associated with AMU in broiler farms and to develop a statistical model that produces customised recommendations as to which biosecurity measures could be implemented on a farm to reduce its AMU, including a cost-effectiveness analysis of the recommended measures. AMU and biosecurity data were obtained cross-sectionally in 2014 from 181 broiler farms across nine European countries (Belgium, Bulgaria, Denmark, France, Germany, Italy, the Netherlands, Poland and Spain). Using mixed-effects random forest analysis (Mix-RF), recursive feature elimination was implemented to determine the biosecurity measures that best predicted AMU at the farm level. Subsequently, an algorithm was developed to generate AMU reduction scenarios based on the implementation of these measures. In the final Mix-RF model, 21 factors were present: 10 about internal biosecurity, 8 about external biosecurity and 3 about farm size and productivity, with the latter showing the largest (Gini) importance. Other AMU predictors, in order of importance, were the number of depopulation steps, compliance with a vaccination protocol for non-officially controlled diseases, and requiring visitors to check in before entering the farm. K-means clustering on the proximity matrix of the final Mix-RF model revealed that several measures interacted with each other, indicating that high AMU levels can arise for various reasons depending on the situation. 
The algorithm utilised the AMU predictive power of biosecurity measures while accounting also for their interactions, representing a first step toward aiding the decision-making process of veterinarians and farmers who are in need of implementing on-farm biosecurity measures to reduce their AMU.
|
16
|
Sümer E, Tek E, Türe OA, Şengöz M, Dinçer A, Özcan A, Pamir MN, Özduman K, Ozturk-Isik E. The effect of tumor shape irregularity on Gamma Knife treatment plan quality and treatment outcome: an analysis of 234 vestibular schwannomas. Sci Rep 2022; 12:21809. [PMID: 36528740 PMCID: PMC9759589 DOI: 10.1038/s41598-022-25422-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 11/29/2022] [Indexed: 12/23/2022] Open
Abstract
The primary aim of Gamma Knife (GK) radiosurgery is to deliver high-dose radiation precisely to a target while conforming to the target shape. In this study, the effects of tumor shape irregularity (TSI) on GK dose-plan quality and treatment outcomes were analyzed in 234 vestibular schwannomas. TSI was quantified using seven different metrics including volumetric index of sphericity (VioS). GK treatment plans were created on a single GK-Perfexion/ICON platform. The plan quality was measured using selectivity index (SI), gradient index (GI), Paddick's conformity index (PCI), and efficiency index (EI). Correlation and linear regression analyses were conducted between shape irregularity features and dose plan indices. Machine learning was employed to identify the shape feature that predicted dose plan quality most effectively. The treatment outcome analyses, including tumor growth control and serviceable hearing preservation at 2 years, were conducted using Cox regression. All TSI features correlated significantly with the dose plan indices (P < 0.0012). With increasing tumor volume, vestibular schwannomas became more spherical (P < 0.05) and the dose plan indices varied significantly between tumor volume subgroups (P < 0.001 and P < 0.01). VioS was the most effective predictor of GK indices (P < 0.001) and we obtained 89.36% accuracy (79.17% sensitivity and 100% specificity) for predicting PCI. Our results indicated that TSI had significant effects on the plan quality but did not adversely affect treatment outcomes.
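The three figures quoted for PCI prediction are all derived from one confusion matrix. As a minimal sketch of that arithmetic, the counts below are hypothetical (the abstract does not report the confusion matrix); they are simply one set of counts that reproduces the reported percentages.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of positives that were detected
    specificity = tn / (tn + fp)          # share of negatives that were rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts, NOT taken from the paper: 24 positives, 23 negatives.
acc, sens, spec = binary_metrics(tp=19, fn=5, tn=23, fp=0)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
# prints: accuracy=89.36% sensitivity=79.17% specificity=100.00%
```

With no false positives, specificity is 100% and every error is a missed positive, which is why accuracy sits between the other two values.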
Affiliation(s)
- Esra Sümer
- Institute of Biomedical Engineering, Boğaziçi University, Kandilli Campus, Rasathane Cad, 34684 Üsküdar, Istanbul, Turkey
| | - Ece Tek
- Department of Radiation Oncology, School of Medicine, Acıbadem Mehmet Ali Aydınlar University, Istanbul, Turkey
| | - O. Artunç Türe
- Department of Radiation Oncology, School of Medicine, Acıbadem Mehmet Ali Aydınlar University, Istanbul, Turkey
| | - Meriç Şengöz
- Department of Neurosurgery, School of Medicine, Acıbadem Mehmet Ali Aydınlar University, Istanbul, Turkey
| | - Alp Dinçer
- Department of Radiology, Acıbadem Mehmet Ali Aydınlar University, Istanbul, Turkey
| | - Alpay Özcan
- Department of Electrical and Electronics Engineering, Boğaziçi University, Istanbul, Turkey
| | - M. Necmettin Pamir
- Department of Neurosurgery, School of Medicine, Acıbadem Mehmet Ali Aydınlar University, Istanbul, Turkey
| | - Koray Özduman
- Department of Neurosurgery, School of Medicine, Acıbadem Mehmet Ali Aydınlar University, Istanbul, Turkey
| | - Esin Ozturk-Isik
- Institute of Biomedical Engineering, Boğaziçi University, Kandilli Campus, Rasathane Cad, 34684 Üsküdar, Istanbul, Turkey
| |
|
17
|
Ebrahimi A, Wiil UK, Naemi A, Mansourvar M, Andersen K, Nielsen AS. Identification of clinical factors related to prediction of alcohol use disorder from electronic health records using feature selection methods. BMC Med Inform Decis Mak 2022; 22:304. [PMID: 36424597 PMCID: PMC9686074 DOI: 10.1186/s12911-022-02051-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2021] [Accepted: 11/16/2022] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND High dimensionality in electronic health records (EHR) causes a significant computational problem for any systematic search for predictive, diagnostic, or prognostic patterns. Feature selection (FS) methods have been indicated to be effective in feature reduction as well as in identifying risk factors related to prediction of clinical disorders. This paper examines the prediction of patients with alcohol use disorder (AUD) using machine learning (ML) and attempts to identify risk factors related to the diagnosis of AUD. METHODS An FS framework consisting of two operational levels, base selectors and ensemble selectors, was used. The first level consists of five FS methods: three filter methods, one wrapper method, and one embedded method. Base selector outputs are aggregated to develop four ensemble FS methods. The outputs of the FS methods were then fed into three ML algorithms: support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF) to compare and identify the best feature subset for the prediction of AUD from EHRs. RESULTS In terms of feature reduction, the embedded FS method could significantly reduce the number of features from 361 to 131. In terms of classification performance, RF based on 272 features selected by our proposed ensemble method (Union FS) achieved the highest accuracy in predicting patients with AUD (96%) and outperformed all other models in terms of AUROC, AUPRC, Precision, Recall, and F1-Score. Considering the limitations of embedded and wrapper methods, the best overall performance was achieved by our proposed Union Filter FS, which reduced the number of features to 223 and improved Precision, Recall, and F1-Score in RF from 0.77, 0.65, and 0.71 to 0.87, 0.81, and 0.84, respectively.
Our findings indicate that, besides gender, age, and length of stay at the hospital, diagnosis related to digestive organs, bones, muscles and connective tissue, and the nervous systems are important clinical factors related to the prediction of patients with AUD. CONCLUSION Our proposed FS method could improve the classification performance significantly. It could identify clinical factors related to prediction of AUD from EHRs, thereby effectively helping clinical staff to identify and treat AUD patients and improving medical knowledge of the AUD condition. Moreover, the diversity of features among female and male patients as well as gender disparity were investigated using FS methods and ML techniques.
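The abstract describes aggregating base selector outputs into ensemble selectors, one of which is named Union FS. A minimal sketch of a union-style aggregation follows; the selector names and feature names are hypothetical placeholders, and the paper's actual aggregation rules may differ.

```python
def union_fs(*selections):
    """Keep every feature chosen by at least one base selector."""
    return set().union(*selections)


# Hypothetical outputs of three base selectors (names are illustrative only).
filter_a = {"age", "gender", "length_of_stay"}
filter_b = {"age", "digestive_dx", "length_of_stay"}
filter_c = {"age", "nervous_system_dx"}

selected_features = union_fs(filter_a, filter_b, filter_c)
print(sorted(selected_features))
# prints: ['age', 'digestive_dx', 'gender', 'length_of_stay', 'nervous_system_dx']
```

A union keeps more features than any single selector, which fits the abstract's pattern of Union FS retaining 272 of 361 features while the embedded method alone kept 131. Passing only filter-type selectors to the same function is one plausible reading of "Union Filter FS", but that reading is an assumption.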
Affiliation(s)
- Ali Ebrahimi
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
| | - Uffe Kock Wiil
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
| | - Amin Naemi
- SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
| | - Marjan Mansourvar
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - Kjeld Andersen
- Unit for Clinical Alcohol Research, Clinical Institute, University of Southern Denmark, Odense, Denmark
| | - Anette Søgaard Nielsen
- Unit for Clinical Alcohol Research, Clinical Institute, University of Southern Denmark, Odense, Denmark
| |
|
18
|
Adams J, Agyenkwa-Mawuli K, Agyapong O, Wilson MD, Kwofie SK. EBOLApred: A machine learning-based web application for predicting cell entry inhibitors of the Ebola virus. Comput Biol Chem 2022; 101:107766. [DOI: 10.1016/j.compbiolchem.2022.107766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Revised: 08/10/2022] [Accepted: 08/29/2022] [Indexed: 11/03/2022]
|
19
|
Goh KL, Goto A, Lu Y. LGB-Stack: Stacked Generalization with LightGBM for Highly Accurate Predictions of Polymer Bandgap. ACS OMEGA 2022; 7:29787-29793. [PMID: 36061712 PMCID: PMC9434625 DOI: 10.1021/acsomega.2c02554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2022] [Accepted: 07/12/2022] [Indexed: 06/15/2023]
Abstract
Recently, the Ramprasad group reported a quantitative structure-property relationship (QSPR) model for predicting the E_gap values of 4209 polymers, which yielded a test set R² score of 0.90 and a test set root-mean-square error (RMSE) score of 0.44 at a train/test split ratio of 80/20. In this paper, we present a new QSPR model named LGB-Stack, which performs a two-level stacked generalization using the light gradient boosting machine. At level 1, multiple weak models are trained, and at level 2, they are combined into a strong final model. Four molecular fingerprints were generated from the simplified molecular input line entry system notations of the polymers. They were trimmed using recursive feature elimination and used as the initial input features for training the weak models. The output predictions of the weak models were used as the new input features for training the final model, which completes the LGB-Stack model training process. Our results show that the best test set R² and RMSE scores of LGB-Stack at the train/test split ratio of 80/20 were 0.92 and 0.41, respectively. The scores further improved to 0.94 and 0.34, respectively, when the train/test split ratio of 95/5 was used.
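The two-level idea in this abstract can be sketched in a few lines. The sketch below is not LGB-Stack: it swaps LightGBM for one-feature least-squares models so that it runs with the standard library only, uses synthetic data, and scores on the training set purely to show the mechanics. All function names are illustrative.

```python
import random


def fit_1d(xs, ys):
    """Closed-form least squares for y ~ a * x + b on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def out_of_fold(xs, ys, k=5):
    """Level 1: predict each sample with a weak model trained on the other folds."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        a, b = fit_1d([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = a * xs[i] + b
    return preds


def fit_meta(p1, p2, ys):
    """Level 2: least squares y ~ w1 * p1 + w2 * p2 via the 2x2 normal equations."""
    s11 = sum(a * a for a in p1)
    s22 = sum(b * b for b in p2)
    s12 = sum(a * b for a, b in zip(p1, p2))
    t1 = sum(a * y for a, y in zip(p1, ys))
    t2 = sum(b * y for b, y in zip(p2, ys))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det


def rmse(pred, ys):
    return (sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)) ** 0.5


# Synthetic data: the target depends on both features.
rng = random.Random(1)
x1 = [rng.gauss(0, 1) for _ in range(300)]
x2 = [rng.gauss(0, 1) for _ in range(300)]
y = [2.0 * a + 1.0 * b + rng.gauss(0, 0.1) for a, b in zip(x1, x2)]

p1, p2 = out_of_fold(x1, y), out_of_fold(x2, y)   # weak-model predictions
w1, w2 = fit_meta(p1, p2, y)                      # final model on those predictions
stacked = [w1 * a + w2 * b for a, b in zip(p1, p2)]
print(rmse(p1, y), rmse(p2, y), rmse(stacked, y))
```

Using out-of-fold predictions as the level-2 inputs keeps the final model from seeing predictions that a weak model made on its own training samples. A held-out test split, as in the abstract, would be needed for an honest score.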
|
20
|
Hosseinpour Z, Jonkman L, Oladosu O, Pridham G, Pike GB, Inglese M, Geurts JJ, Zhang Y. Texture analysis in brain T2 and diffusion MRI differentiates histology-verified grey and white matter pathology types in multiple sclerosis. J Neurosci Methods 2022; 379:109671. [PMID: 35820450 DOI: 10.1016/j.jneumeth.2022.109671] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/28/2022] [Revised: 06/19/2022] [Accepted: 07/07/2022] [Indexed: 11/18/2022]
Abstract
BACKGROUND Multiple sclerosis (MS) is a complex disease of the central nervous system involving several types of brain pathology that are difficult to characterize using conventional imaging methods. NEW METHOD We originated novel texture analysis and machine learning approaches for classifying MS pathology subtypes as compared with 2 common advanced MRI measures: magnetization transfer ratio (MTR) and fractional anisotropy (FA). Texture analysis used an optimized grey level co-occurrence matrix method with histology-informed 7T T2-weighted magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) from 15 MS and 12 control brain specimens. DTI analysis took an innovative approach that assessed the texture across diffusion directions upsampled from 30 to 90. Tissue types included de- and re-myelinated lesions and normal-appearing areas in both grey and white matter, and diffusely abnormal white matter. Data analyses were stepwise, including: (1) group-wise classification using random forest algorithms based on all or individual imaging parameters; (2) parameter importance ranking; and (3) pairwise analysis using top-ranked features. RESULTS Texture analysis performed better than MTR and FA, with T2 texture performing the best. T2 texture measures ranked the highest in classifying most grey and white matter tissue types, including de- versus re-myelinated lesions and among grey matter lesion subtypes (accuracy=0.86-0.59; kappa=0.60-0.41). Diffusion texture best differentiated normal appearing and control white matter. COMPARISON WITH EXISTING METHODS There is no established method in imaging for differentiating MS pathology subtypes. In combined texture analysis and machine learning studies, there is also no direct evidence comparing conventional with advanced MRI measures for assessing MS pathology. Further, this study is unique in conducting innovative texture analysis with DTI following data-augmentation using robust methods.
CONCLUSIONS T2 and diffusion MRI texture analysis integrated with machine learning may be valuable approaches for characterizing MS pathology.
Affiliation(s)
- Zahra Hosseinpour
- Biomedical Engineering Graduate Program, University of Calgary, Alberta T2N 4N, Canada; Hotchkiss Brain Institute, University of Calgary, Alberta T2N 4N1, Canada
| | - Laura Jonkman
- Department of Anatomy & Neuroscience, Amsterdam Neuroscience, Amsterdam UMC, Vrije Universiteit, Amsterdam, the Netherlands
| | - Olayinka Oladosu
- Department of Neuroscience, University of Calgary, Alberta T2N 4N1, Canada; Hotchkiss Brain Institute, University of Calgary, Alberta T2N 4N1, Canada
| | - Glen Pridham
- Department of Clinical Neurosciences, University of Calgary, Alberta T2N 4N1, Canada; Hotchkiss Brain Institute, University of Calgary, Alberta T2N 4N1, Canada
| | - G Bruce Pike
- Department of Clinical Neurosciences, University of Calgary, Alberta T2N 4N1, Canada; Department of Radiology, University of Calgary, Alberta T2N 4N1, Canada; Hotchkiss Brain Institute, University of Calgary, Alberta T2N 4N1, Canada
| | - Matilde Inglese
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA 10029; Department of Neurosciences, Rehabilitation, Ophthalmology, Genetics, Maternal and Child Health (DiNOGMI) and Center of Excellence for Biomedical Research (CEBR), University of Genoa, Genoa, Italy
| | - Jeroen J Geurts
- Department of Anatomy & Neuroscience, Amsterdam Neuroscience, Amsterdam UMC, Vrije Universiteit, Amsterdam, the Netherlands
| | - Yunyan Zhang
- Department of Clinical Neurosciences, University of Calgary, Alberta T2N 4N1, Canada; Department of Radiology, University of Calgary, Alberta T2N 4N1, Canada; Hotchkiss Brain Institute, University of Calgary, Alberta T2N 4N1, Canada.
| |
|
21
|
Jiang X, Zhang Y, Li Y, Zhang B. Forecast and analysis of aircraft passenger satisfaction based on RF-RFE-LR model. Sci Rep 2022; 12:11174. [PMID: 35778429 PMCID: PMC9247921 DOI: 10.1038/s41598-022-14566-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2021] [Accepted: 06/08/2022] [Indexed: 11/09/2022] Open
Abstract
Airplanes have always been one of the first choices for people to travel because of their convenience and safety. However, due to the outbreak of the new coronavirus epidemic in 2020, the civil aviation industry of various countries in the world has encountered severe challenges. Predicting aircraft passenger satisfaction and excavating the main influencing factors can help airlines improve their services and gain advantages in difficult situations and competition. This paper proposes a RF-RFE-Logistic feature selection model to extract the influencing factors of passenger satisfaction. First, preliminary feature selection is performed using recursive feature elimination based on random forest (RF-RFE). Second, based on different classification models, KNN, logistic regression, random forest, Gaussian Naive Bayes, and BP neural network, the classification performance of the models before and after feature selection is compared, and the prediction model with the best classification performance is selected. Finally, based on the RF-RFE feature selection, combined with the logistic model, the factors affecting customer satisfaction are further extracted. The experimental results show that the RF-RFE model selects a feature subset containing 17 variables. In the classification prediction model, the random forest after RF-RFE feature selection shows the best classification performance. Finally, combined with the four important variables extracted by RF-RFE and logistic regression, further discussion is carried out, and suggestions are given for airlines to improve passenger satisfaction.
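Recursive feature elimination, which this abstract and several others in the list rely on, repeatedly fits a model, ranks the remaining features, and drops the weakest. The sketch below illustrates that loop under stated assumptions: it ranks by the absolute coefficient of a gradient-descent linear model rather than by random forest importance as in RF-RFE, and the data are synthetic. `fit_linear` and `rfe` are illustrative names.

```python
import random


def fit_linear(X, y, cols, epochs=300, lr=0.05):
    """Least-squares fit by batch gradient descent on the chosen columns."""
    w = [0.0] * len(cols)
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(cols)
        grad_b = 0.0
        for row, target in zip(X, y):
            err = b + sum(wj * row[c] for wj, c in zip(w, cols)) - target
            grad_b += err
            for j, c in enumerate(cols):
                grad_w[j] += err * row[c]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w


def rfe(X, y, n_keep):
    """Refit on the surviving columns and drop the weakest one until n_keep remain."""
    cols = list(range(len(X[0])))
    while len(cols) > n_keep:
        w = fit_linear(X, y, cols)
        weakest = min(range(len(cols)), key=lambda j: abs(w[j]))
        cols.pop(weakest)
    return cols


# Synthetic data: only columns 0 and 3 carry signal.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0, 0.1) for row in X]

selected = rfe(X, y, n_keep=2)
print(selected)
```

Refitting after every removal is what distinguishes the recursive procedure from a one-shot ranking: importance is always measured relative to the features still in play. Coefficient magnitude is only a fair ranking here because the synthetic features share the same scale.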
Affiliation(s)
- Xuchu Jiang
  - School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, 430073, China
- Ying Zhang
  - School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, 430073, China
- Ying Li
  - Department of Scientific Research, Zhongnan University of Economics and Law, Wuhan, 430073, China
- Biao Zhang
  - School of Computer Science, Liaocheng University, Liaocheng, 252059, China
|
22
|
Peng F, Chen C, Lv D, Zhang N, Wang X, Zhang X, Wang Z. Gesture Recognition by Ensemble Extreme Learning Machine Based on Surface Electromyography Signals. Front Hum Neurosci 2022; 16:911204. [PMID: 35782048 PMCID: PMC9243223 DOI: 10.3389/fnhum.2022.911204] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/02/2022] [Accepted: 05/18/2022] [Indexed: 11/13/2022] Open
Abstract
In recent years, gesture recognition based on surface electromyography (sEMG) signals has been extensively studied. However, the accuracy and stability of gesture recognition through traditional machine learning algorithms are still insufficient for some practical application scenarios. To address this, this paper proposes a method combining feature selection and an ensemble extreme learning machine (EELM) to improve recognition performance based on sEMG signals. First, the input sEMG signals are preprocessed and 16 features are extracted from each channel. Next, the features that contribute most to gesture recognition are selected from the extracted features using the recursive feature elimination (RFE) algorithm. Then, several independent ELM base classifiers are established using the selected features. Finally, the recognition results are determined by integrating the outputs of the ELM base classifiers using majority voting. The Ninapro DB5 dataset, containing 52 different hand movements captured from 10 able-bodied subjects, was used to evaluate the performance of the proposed method. The results showed that the proposed method performed best (overall average accuracy 77.9%) compared with decision tree (DT), ELM, and random forest (RF) methods.
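The majority-voting step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `majority_vote` and `ensemble_predict`, the tie-breaking rule, and the example votes are all assumptions made for illustration, and the base classifiers are represented only by their already-computed label outputs.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most base classifiers.

    `predictions` holds one label per base classifier for a single sample.
    Ties resolve to the label that was seen first (an assumed rule).
    """
    counts = Counter(predictions)
    # most_common keeps first-seen order among labels with equal counts.
    return counts.most_common(1)[0][0]


def ensemble_predict(per_classifier_outputs):
    """Combine per-classifier label lists into one label per sample."""
    # zip(*...) regroups outputs so each tuple holds all votes for one sample.
    return [majority_vote(list(votes)) for votes in zip(*per_classifier_outputs)]


# Hypothetical outputs of three base classifiers on four samples (gesture IDs).
outputs = [
    [3, 7, 7, 12],
    [3, 5, 7, 12],
    [4, 5, 7, 9],
]
combined = ensemble_predict(outputs)
print(combined)
```

With an odd number of base classifiers and two classes, ties cannot occur; with 52 gesture classes, as in Ninapro DB5, some tie-breaking rule is needed, and the first-seen rule used here is only one possible choice.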
|
23
|
Deng J, He Z. Characterizing Risk of In-Hospital Mortality Following Subarachnoid Hemorrhage Using Machine Learning: A Retrospective Study. Front Surg 2022; 9:891984. [PMID: 36034376 PMCID: PMC9407038 DOI: 10.3389/fsurg.2022.891984] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/08/2022] [Accepted: 05/16/2022] [Indexed: 11/30/2022] Open
Abstract
Background Subarachnoid hemorrhage has a high rate of disability and mortality, and the ability to use existing disease severity scores to estimate the risk of adverse outcomes is limited. We collected relevant patient information during hospitalization to develop more accurate risk prediction models, using logistic regression (LR) and machine learning (ML) technologies combined with biochemical information. Methods Patient-level data were extracted from the MIMIC-IV database. The primary outcome was in-hospital mortality. The models were trained and tested on a data set (ratio 70:30) including age and key past medical history. The recursive feature elimination (RFE) algorithm was used to screen the characteristic variables; then, the ML algorithm was used to analyze and establish the prediction model, and the validation set was used to further verify the effectiveness of the model. Results Of the 1,787 patients included from the MIMIC database, a total of 379 died during hospitalization. Recursive feature elimination (RFE) selected 20 variables. After simplification, we determined 10 features, including the Glasgow coma score (GCS), glucose, sodium, chloride, SPO2, bicarbonate, temperature, white blood cell (WBC), heparin use, and sepsis-related organ failure assessment (SOFA) score. The validation set and Delong test showed that the simplified RF model has a high AUC of 0.949, which is not significantly different from the best model. Furthermore, in the DCA curve, the simplified GBM model has relatively higher net benefits. In the subgroup analysis of non-traumatic subarachnoid hemorrhage, the simplified GBM model has a high AUC of 0.955 and relatively higher net benefits. Conclusions ML approaches significantly enhance predictive discrimination for mortality following subarachnoid hemorrhage compared to existing illness severity scores and LR. The discriminative ability of these ML models requires validation in external cohorts to establish generalizability.
|
24
|
Wang Y, Fu Y, Luo X. Identification of Pathogenetic Brain Regions via Neuroimaging Data for Diagnosis of Autism Spectrum Disorders. Front Neurosci 2022; 16:900330. [PMID: 35655751 PMCID: PMC9152096 DOI: 10.3389/fnins.2022.900330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2022] [Accepted: 04/11/2022] [Indexed: 11/13/2022] Open
Abstract
Autism spectrum disorder (ASD) is a kind of neurodevelopmental disorder that often occurs in children and has a hidden onset. Patients usually have lagged development of communication ability and social behavior and thus suffer an unhealthy physical and mental state. Evidence has indicated that diseases related to ASD have commonalities in brain imaging characteristics. This study aims to study the pathogenesis of ASD based on brain imaging data to locate the ASD-related brain regions. Specifically, we collected the functional magnetic resonance image data of 479 patients with ASD and 478 normal subjects matched in age and gender and used a machine-learning framework named random support vector machine cluster to extract distinctive brain regions from the preprocessed data. According to the experimental results, compared with other existing approaches, the method used in this study can more accurately distinguish patients from normal individuals based on brain imaging data. At the same time, this study found that the development of ASD was highly correlated with certain brain regions, e.g., lingual gyrus, superior frontal gyrus, medial gyrus, insular lobe, and olfactory cortex. This study explores the effectiveness of a novel machine-learning approach in the study of ASD brain imaging and provides a reference brain area for the medical research and clinical treatment of ASD.
Affiliation(s)
- Yu Wang
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, China
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
  - Hunan Xiangjiang Artificial Intelligence Academy, Changsha, China
- Yu Fu
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, China
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
  - Hunan Xiangjiang Artificial Intelligence Academy, Changsha, China
  - *Correspondence: Yu Fu
- Xun Luo
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, China
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
  - Hunan Xiangjiang Artificial Intelligence Academy, Changsha, China
|
25
|
Intelligent Identification and Features Attribution of Saline–Alkali-Tolerant Rice Varieties Based on Raman Spectroscopy. PLANTS 2022; 11:plants11091210. [PMID: 35567210 PMCID: PMC9101781 DOI: 10.3390/plants11091210] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/10/2022] [Revised: 04/24/2022] [Accepted: 04/27/2022] [Indexed: 11/30/2022]
Abstract
Planting rice in saline–alkali land can effectively improve saline–alkali soil and increase grain yield, but traditional identification methods for saline–alkali-tolerant rice varieties require tedious and time-consuming field investigations based on growth indicators by rice breeders. In this study, a Python-based machine learning method was used to analyze the Raman molecular spectroscopy of rice and assist in feature attribution, in order to develop a fast and efficient identification method for saline–alkali-tolerant rice varieties. A total of 156 Raman spectra of four rice varieties (two saline–alkali-tolerant rice varieties and two saline–alkali-sensitive rice varieties) were analyzed. The wave crests were extracted by an improved signal filtering difference method, and the feature information of each wave crest was automatically extracted by scipy.signal.find_peaks. Select K Best (SKB), Recursive Feature Elimination (RFE) and Select F Model (SFM) were used to select useful molecular features. Based on these feature selection methods, a Logistic Regression Model (LRM) and Random Forests Model (RFM) were established for discriminant analysis. The experimental results showed that the RFM identification model based on the RFE method reached a higher recognition rate of 89.36%. According to the identification results of RFM and the identification of feature attribution materials, amylum was the most significant substance in the identification of saline–alkali-tolerant rice varieties. Therefore, an intelligent method for the identification of saline–alkali-tolerant rice varieties based on Raman molecular spectroscopy is proposed.
|
26
|
Al-Nafjan A. Feature selection of EEG signals in neuromarketing. PeerJ Comput Sci 2022; 8:e944. [PMID: 35634118 PMCID: PMC9138093 DOI: 10.7717/peerj-cs.944] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/07/2022] [Accepted: 03/16/2022] [Indexed: 06/15/2023]
Abstract
Brain-computer interface (BCI) technology uses electrophysiological (EEG) signals to detect user intent. Research on BCI has seen rapid advancement, with researchers proposing and implementing several signal processing and machine learning approaches for use in different contexts. BCI technology is also used in neuromarketing to study the brain's responses to marketing stimuli. This study sought to detect two preference states (like and dislike) in EEG neuromarketing data using the proposed EEG-based consumer preference recognition system. This study investigated the role of feature selection in BCI to improve the accuracy of preference detection for neuromarketing. Several feature selection methods were used for benchmark testing in multiple BCI studies. Four feature selection approaches, namely, principal component analysis (PCA), minimum redundancy maximum relevance (mRMR), recursive feature elimination (RFE), and ReliefF, were used with five different classifiers: deep neural network (DNN), support vector machine (SVM), k-nearest neighbors (KNN), linear discriminant analysis (LDA), and random forest (RF). The four approaches were compared to evaluate the importance of feature selection. Moreover, the performance of classification algorithms was evaluated before and after feature selection. It was found that feature selection for EEG signals improves the performance of all classifiers.
|
27
|
Research on Classification of Open-Pit Mineral Exploiting Information Based on OOB RFE Feature Optimization. SENSORS 2022; 22:s22051948. [PMID: 35271096 PMCID: PMC8914673 DOI: 10.3390/s22051948] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/26/2022] [Revised: 02/22/2022] [Accepted: 02/28/2022] [Indexed: 01/09/2023]
Abstract
Mineral exploiting information is an important indicator of regional mineral activities. Accurate extraction of this information is essential to mineral management and environmental protection. In recent years, a growing number of studies have classified land surface information using multi-source remote sensing data. However, selecting the optimal feature combination is the key issue in achieving the best classification result. This study creatively combines Out of Bag data with Recursive Feature Elimination (OOB RFE) to optimize the feature combination of the mineral exploiting information of non-metallic building materials in Fujian province, China. We acquired and integrated Ziyuan-1-02D (ZY-1-02D) hyperspectral imagery, Landsat-8 multispectral imagery, and Sentinel-1 Synthetic Aperture Radar (SAR) imagery to obtain spectrum, heat, polarization, and texture features; two machine learning methods were also adopted to classify the mineral exploiting information in our study area. Assessment and comparison of accuracy show that the classification generated by our new OOB RFE method, combined with random forest (RF), achieves the highest overall accuracy of 93.64% (with a kappa coefficient of 0.926). Compared with Recursive Feature Elimination (RFE) alone, OOB RFE can precisely filter the feature combination and lead to an optimal result. Under the same feature scheme, RF is effective in classifying the mineral exploiting information of the research field. The feature optimization method and optimal feature combination proposed in our study can provide technical support and theoretical reference for extraction and classification of mineral exploiting information applied in other regions.
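The two figures reported in this abstract, overall accuracy and the kappa coefficient, are both derived from a confusion matrix. The Python sketch below shows the standard calculation under stated assumptions: the function names and the 2x2 matrix are hypothetical and do not reproduce the study's data, which involve more classes.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    # Chance agreement: product of marginal proportions, summed over classes.
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical confusion matrix (rows: reference class, columns: predicted class).
confusion = [
    [45, 5],
    [10, 40],
]
print(overall_accuracy(confusion), cohen_kappa(confusion))
```

Kappa is never higher than overall accuracy because it subtracts the agreement expected by chance, which is consistent with the abstract reporting 0.926 alongside 93.64%.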
|
28
|
Salem M, Cowan MJ, Mpourmpakis G. Predicting Segregation Energy in Single Atom Alloys Using Physics and Machine Learning. ACS OMEGA 2022; 7:4471-4481. [PMID: 35155939 PMCID: PMC8830057 DOI: 10.1021/acsomega.1c06337] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/10/2021] [Accepted: 01/11/2022] [Indexed: 06/14/2023]
Abstract
Single atom alloys (SAAs) show great promise as catalysts for a wide variety of reactions due to their tunable properties, which can enhance the catalytic activity and selectivity. To design SAAs, it is imperative for the heterometal dopant to be stable on the surface as an active catalytic site. One main approach to probe SAA stability is to calculate surface segregation energy. Density functional theory (DFT) can be applied to investigate the surface segregation energy in SAAs. However, DFT is computationally expensive and time-consuming; hence, there is a need for accelerated frameworks to screen metal segregation for new SAA catalysts across combinations of metal hosts and dopants. To this end, we developed a model that predicts surface segregation energy using machine learning for a series of SAA periodic slabs. The model leverages elemental descriptors and features inspired by the previously developed bond-centric model. The initial model accurately captures surface segregation energy across a diverse series of FCC-based SAAs with various surface facets and metal-host pairs. Following our machine learning methodology, we expanded our analysis to develop a new model for SAAs formed from FCC hosts with FCC, BCC, and HCP dopants. Our final, five-feature model utilizes second-order polynomial kernel ridge regression. The model is able to predict segregation energies with a high degree of accuracy, which is due to its physically motivated features. We then expanded our data set to test the accuracy of the five features used. We find that the retrained model can accurately capture E_seg trends across different metal hosts and facets, confirming the significance of the features used in our final model. Finally, we apply our pretrained model to a series of Ir- and Pd-based SAA cuboctahedron nanoparticles (NPs), ranging in size and FCC dopants. Remarkably, our model (trained on periodic slabs) accurately predicts the DFT segregation energies of the SAA NPs. The results provide further evidence supporting the use of our model as a general tool for the rapid prediction of SAA segregation energies. By creating a framework to predict the metal segregation from bulk surfaces to NPs, we can accelerate the SAA catalyst design while simultaneously unraveling key physicochemical properties driving thermodynamic stabilization of SAAs.
|
29
|
Chu H, Cao Y, Jiang J, Yang J, Huang M, Li Q, Jiang C, Jiao X. Optimized electroencephalogram and functional near-infrared spectroscopy-based mental workload detection method for practical applications. Biomed Eng Online 2022; 21:9. [PMID: 35109879 PMCID: PMC8812267 DOI: 10.1186/s12938-022-00980-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/03/2021] [Accepted: 01/21/2022] [Indexed: 11/14/2022] Open
Abstract
Background Mental workload is a critical consideration in complex man–machine systems design. Among various mental workload detection techniques, multimodal detection techniques integrating electroencephalogram (EEG) and functional near-infrared spectroscopy (fNIRS) signals have attracted considerable attention. However, existing EEG–fNIRS-based mental workload detection methods have certain defects, such as complex signal acquisition channels and low detection accuracy, which restrict their practical application. Methods The signal acquisition configuration was optimized by analyzing the feature importance in mental workload recognition model and a more accurate and convenient EEG–fNIRS-based mental workload detection method was constructed. A classical Multi-Task Attribute Battery (MATB) task was conducted with 20 participating volunteers. Subjective scale data, 64-channel EEG data, and two-channel fNIRS data were collected. Results A higher number of EEG channels correspond to higher detection accuracy. However, there is no obvious improvement in accuracy once the number of EEG channels reaches 26, with a four-level mental workload detection accuracy of 76.25 ± 5.21%. Partial results of physiological analysis verify the results of previous studies, such as that the θ power of EEG and concentration of O2Hb in the prefrontal region increase while the concentration of HHb decreases with task difficulty. It was further observed, for the first time, that the energy of each band of EEG signals was significantly different in the occipital lobe region, and the power of β1 and β2 bands in the occipital region increased significantly with task difficulty. The changing range and the mean amplitude of O2Hb in high-difficulty tasks were significantly higher compared with those in low-difficulty tasks. Conclusions The channel configuration of EEG–fNIRS-based mental workload detection was optimized to 26 EEG channels and two frontal fNIRS channels. A four-level mental workload detection accuracy of 76.25 ± 5.21% was obtained, which is higher than previously reported results. The proposed configuration can promote the application of mental workload detection technology in military, driving, and other complex human–computer interaction systems.
Affiliation(s)
- Hongzuo Chu
  - National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, China
  - Space Engineering University, Beijing, China
- Yong Cao
  - National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, China
- Jin Jiang
  - National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, China
- Jiehong Yang
  - National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, China
  - Space Engineering University, Beijing, China
- Mengyin Huang
  - National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, China
  - Space Engineering University, Beijing, China
- Qijie Li
  - National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, China
- Changhua Jiang
  - National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, China
- Xuejun Jiao
  - National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, China
  - Space Engineering University, Beijing, China
|
30
|
Lim AJW, Lim LJ, Ooi BNS, Koh ET, Tan JWL, Chong SS, Khor CC, Tucker-Kellogg L, Leong KP, Lee CG. Functional coding haplotypes and machine-learning feature elimination identifies predictors of Methotrexate Response in Rheumatoid Arthritis patients. EBioMedicine 2022; 75:103800. [PMID: 35022146 PMCID: PMC8808170 DOI: 10.1016/j.ebiom.2021.103800] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/26/2021] [Revised: 12/19/2021] [Accepted: 12/20/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Major challenges in large scale genetic association studies include not only the identification of causative single nucleotide polymorphisms (SNPs), but also accounting for SNP-SNP interactions. This study thus proposes a novel feature engineering approach integrating potentially functional coding haplotypes (pfcHap) with machine-learning (ML) feature selection to identify biologically meaningful, possibly causative genetic factors that take into consideration potential SNP-SNP interactions within the pfcHap, to best predict methotrexate (MTX) response in rheumatoid arthritis (RA) patients. METHODS Exome sequencing data from 349 RA patients were analysed and split into a training set and an unseen test set. Inferred pfcHaps were combined with 30 non-genetic features to undergo ML recursive feature elimination with cross-validation using the training set. Predictive capacity and robustness of the selected features were assessed using six popular machine learning models through a train set cross-validation and evaluated in an unseen test set. FINDINGS Significantly, 100 features (95 pfcHaps, 5 non-genetic factors) were identified to have good predictive performance (AUC: 0.776-0.828; Sensitivity: 0.656-0.813; Specificity: 0.684-0.868) across all six ML models in an unseen test dataset for the prediction of MTX response in RA patients. INTERPRETATION The majority of the predictive pfcHap SNPs were predicted to be potentially functional, and some of the genes in which the pfcHaps reside were identified to be associated with previously reported MTX/RA pathways. FUNDING Singapore Ministry of Health's National Medical Research Council (NMRC) [NMRC/CBRG/0095/2015; CG12Aug17; CGAug16M012; NMRC/CG/017/2013]; National Cancer Center Research Fund and block funding Duke-NUS Medical School; Singapore Ministry of Education Academic Research Fund Tier 2 grant MOE2019-T2-1-138.
Affiliation(s)
- Ashley J W Lim
  - Dept of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Lee Jin Lim
  - Dept of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Brandon N S Ooi
  - Dept of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Ee Tzun Koh
  - Department of Rheumatology, Allergy and Immunology, Tan Tock Seng Hospital, Singapore
- Justina Wei Lynn Tan
  - Department of Rheumatology, Allergy and Immunology, Tan Tock Seng Hospital, Singapore
- Samuel S Chong
  - Dept of Pediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Chiea Chuen Khor
  - Division of Human Genetics, Genome Institute of Singapore, Singapore
- Lisa Tucker-Kellogg
  - Centre for Computational Biology, and Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore
- Khai Pang Leong
  - Department of Rheumatology, Allergy and Immunology, Tan Tock Seng Hospital, Singapore
  - Clinical Research & Innovation Office, Tan Tock Seng Hospital, Singapore
- Caroline G Lee
  - Dept of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  - Div of Cellular & Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, Singapore
  - Duke-NUS Medical School, Singapore
  - NUS Graduate School, National University of Singapore, Singapore
|
31
|
Antimalarial Drug Predictions Using Molecular Descriptors and Machine Learning against Plasmodium Falciparum. Biomolecules 2021; 11:biom11121750. [PMID: 34944394 PMCID: PMC8698534 DOI: 10.3390/biom11121750] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/20/2021] [Revised: 11/12/2021] [Accepted: 11/17/2021] [Indexed: 11/16/2022] Open
Abstract
Malaria, caused by the Plasmodium falciparum parasite, remains one of the most threatening and dangerous illnesses. Chloroquine (CQ) and first-line artemisinin-based combination treatment (ACT) have long been the drugs of choice for the treatment and control of malaria; however, CQ-resistant and artemisinin-resistant parasites have now emerged in most areas where malaria is endemic. In this work, we developed five machine learning models to predict the antimalarial bioactivity of a drug against Plasmodium falciparum from features (i.e., molecular descriptor values) obtained with PaDEL software from the SMILES of compounds, and compared the machine learning models by experiments with our collected data of 4794 instances. We found that three of the five models, namely artificial neural network (ANN), extreme gradient boost (XGB), and random forest (RF), outperform the others in terms of accuracy, while observing that, using roughly a quarter of the promising descriptors picked by the feature selection algorithm, the five models achieved equivalent and comparable performance. Nevertheless, the contribution of all molecular descriptors in the models was investigated by comparing their rank values from the feature selection algorithm, and we found that the most potent and relevant descriptors, which come from the ‘Autocorrelation’ module, contributed more, while the ‘Atom type electrotopological state’ descriptors contributed the least to the model.
|
32
|
Carter S, van Rees CB, Hand BK, Muhlfeld CC, Luikart G, Kimball JS. Testing a Generalizable Machine Learning Workflow for Aquatic Invasive Species on Rainbow Trout (Oncorhynchus mykiss) in Northwest Montana. Front Big Data 2021; 4:734990. [PMID: 34734177 PMCID: PMC8558495 DOI: 10.3389/fdata.2021.734990] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/01/2021] [Accepted: 09/17/2021] [Indexed: 11/13/2022] Open
Abstract
Biological invasions are accelerating worldwide, causing major ecological and economic impacts in aquatic ecosystems. The urgent decision-making needs of invasive species managers can be better met by the integration of biodiversity big data with large-domain models and data-driven products. Remotely sensed data products can be combined with existing invasive species occurrence data via machine learning models to provide the proactive spatial risk analysis necessary for implementing coordinated and agile management paradigms across large scales. We present a workflow that generates rapid spatial risk assessments on aquatic invasive species using occurrence data, spatially explicit environmental data, and an ensemble approach to species distribution modeling using five machine learning algorithms. For proof of concept and validation, we tested this workflow using extensive spatial and temporal hybridization and occurrence data from a well-studied, ongoing, and climate-driven species invasion in the upper Flathead River system in northwestern Montana, USA. Rainbow Trout (RBT; Oncorhynchus mykiss), an introduced species in the Flathead River basin, compete and readily hybridize with native Westslope Cutthroat Trout (WCT; O. clarkii lewisii), and the spread of RBT individuals and their alleles has been tracked for decades. We used remotely sensed and other geospatial data as key environmental predictors for projecting resultant habitat suitability to geographic space. The ensemble modeling technique yielded high accuracy predictions relative to 30-fold cross-validated datasets (87% 30-fold cross-validated accuracy score). Both top predictors and model performance relative to these predictors matched current understanding of the drivers of RBT invasion and habitat suitability, indicating that temperature is a major factor influencing the spread of invasive RBT and hybridization with native WCT. 
The congruence between more time-consuming modeling approaches and our rapid machine-learning approach suggests that this workflow could be applied more broadly to provide data-driven management information for early detection of potential invaders.
Affiliation(s)
- S Carter
  - Numerical Terradynamic Simulation Group, WA Franke College of Forestry and Conservation, University of Montana, Missoula, MT, United States
- C B van Rees
  - Flathead Lake Biological Station, Division of Biological Sciences, University of Montana, Polson, MT, United States
- B K Hand
  - Flathead Lake Biological Station, Division of Biological Sciences, University of Montana, Polson, MT, United States
- C C Muhlfeld
  - Flathead Lake Biological Station, Division of Biological Sciences, University of Montana, Polson, MT, United States
  - U.S. Geological Survey, Northern Rocky Mountain Science Center, Glacier National Park, West Glacier, MT, United States
  - Department of Ecosystem and Conservation Sciences, WA Franke College of Forestry and Conservation, University of Montana, Missoula, MT, United States
- G Luikart
  - Flathead Lake Biological Station, Division of Biological Sciences, University of Montana, Polson, MT, United States
- J S Kimball
  - Numerical Terradynamic Simulation Group, WA Franke College of Forestry and Conservation, University of Montana, Missoula, MT, United States
  - Department of Ecosystem and Conservation Sciences, WA Franke College of Forestry and Conservation, University of Montana, Missoula, MT, United States
|
33
|
Nakaoku Y, Ogata S, Murata S, Nishimori M, Ihara M, Iihara K, Takegami M, Nishimura K. AI-Assisted In-House Power Monitoring for the Detection of Cognitive Impairment in Older Adults. SENSORS 2021; 21:s21186249. [PMID: 34577455 PMCID: PMC8473035 DOI: 10.3390/s21186249] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/30/2021] [Revised: 09/13/2021] [Accepted: 09/14/2021] [Indexed: 12/23/2022]
Abstract
In-home monitoring systems have been used to detect cognitive decline in older adults by allowing continuous monitoring of routine activities. In this study, we investigated whether unobtrusive in-house power monitoring technologies could be used to predict cognitive impairment. A total of 94 older adults aged ≥65 years were enrolled in this study. Generalized linear mixed models with subject-specific random intercepts were used to evaluate differences in the usage time of home appliances between people with and without cognitive impairment. Three independent power monitoring parameters representing activity behavior were found to be associated with cognitive impairment. Representative values of mean differences between those with cognitive impairment relative to those without were −13.5 min for induction heating in the spring, −1.80 min for microwave oven in the winter, and −0.82 h for air conditioner in the winter. We developed two prediction models for cognitive impairment, one with power monitoring data and the other without, and found that the former had better predictive ability (accuracy, 0.82; sensitivity, 0.48; specificity, 0.96) compared to the latter (accuracy, 0.76; sensitivity, 0.30; specificity, 0.95). In summary, in-house power monitoring technologies can be used to detect cognitive impairment.
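The accuracy, sensitivity, and specificity figures quoted above are all derived from the four cells of a binary confusion matrix. A minimal sketch of that calculation follows; the function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative,
    so that neither denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total       # share of all cases classified correctly
    sensitivity = tp / (tp + fn)       # share of impaired cases that were detected
    specificity = tn / (tn + fp)       # share of unimpaired cases correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, sens, spec = classification_metrics(tp=40, fp=5, tn=45, fn=10)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Reporting all three values matters when the classes are imbalanced, because a model can reach high accuracy and specificity while still missing many positive cases.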
Affiliation(s)
- Yuriko Nakaoku
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita 564-8565, Japan; (Y.N.); (S.O.); (S.M.); (M.T.)
| | - Soshiro Ogata
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita 564-8565, Japan; (Y.N.); (S.O.); (S.M.); (M.T.)
| | - Shunsuke Murata
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita 564-8565, Japan; (Y.N.); (S.O.); (S.M.); (M.T.)
| | - Makoto Nishimori
- Division of Epidemiology, Kobe University Graduate School of Medicine, Kobe 650-0017, Japan;
| | - Masafumi Ihara
- Department of Neurology, National Cerebral and Cardiovascular Center, Suita 564-8565, Japan;
| | - Koji Iihara
- National Cerebral and Cardiovascular Center, Suita 564-8565, Japan;
| | - Misa Takegami
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita 564-8565, Japan; (Y.N.); (S.O.); (S.M.); (M.T.)
| | - Kunihiro Nishimura
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita 564-8565, Japan; (Y.N.); (S.O.); (S.M.); (M.T.)
- Correspondence: ; Tel.: +81-6-6170-1070
| |
|
34
|
Preethish-Kumar V, Shah A, Polavarapu K, Kumar M, Safai A, Vengalil S, Nashi S, Deepha S, Govindaraj P, Afsar M, Rajeswaran J, Nalini A, Saini J, Ingalhalikar M. Disrupted structural connectome and neurocognitive functions in Duchenne muscular dystrophy: classifying and subtyping based on Dp140 dystrophin isoform. J Neurol 2021; 269:2113-2125. [PMID: 34505932 DOI: 10.1007/s00415-021-10789-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2021] [Revised: 08/31/2021] [Accepted: 08/31/2021] [Indexed: 01/22/2023]
Abstract
OBJECTIVE: Neurocognitive disabilities in children with Duchenne muscular dystrophy (DMD) begin in early childhood, and distal DMD gene deletions involving disruption of the Dp140 isoform are more likely to manifest significant neurocognitive impairments. MRI data analysis techniques such as brain-network metrics can provide information on microstructural integrity and underlying pathophysiology. METHODS: A prospective study was conducted on 95 participants [DMD = 57, healthy controls (HC) = 38]. The muscular dystrophy functional rating scale (MDFRS) scores, neuropsychology batteries, and multiplex ligation-dependent probe amplification (MLPA) testing were used for clinical assessment, IQ estimation, and genotypic classification. Diffusion MRI and network-based statistics were used to analyze structural connectomes at various levels and correlate them with clinical markers. RESULTS: Motor and executive sub-networks were extracted and analyzed. Of the 57 children with DMD, 23 belong to the Dp140+ subgroup and 34 to the Dp140- subgroup. Motor disabilities are pronounced in the Dp140- subgroup, as reflected by lower MDFRS scores. IQ parameters are significantly low in all DMD cases; however, the Dp140- subgroup has the lowest scores. Significant differences were observed in global efficiency, transitivity, and characteristic path length between HC and DMD. Subgroup analysis demonstrates that the significance is driven mainly by participants with the Dp140- isoform rather than the Dp140+ isoform. Finally, a random forest classifier achieved an accuracy of 79% between HC and DMD and 90% between DMD subgroups. CONCLUSIONS: Current findings demonstrate structural network-based characterization of abnormalities in DMD, especially prominent in Dp140-. Our observations suggest that participants with Dp140+ have relatively intact connectivity, while those with Dp140- show widespread connectivity alterations at global, nodal, and edge levels. This study provides valuable insights supporting the genotype-phenotype correlation of brain-behavior involvement in DMD children.
Affiliation(s)
| | - Apurva Shah
- Symbiosis Centre for Medical Image Analysis, Symbiosis International University, Mulshi, Pune, Maharashtra, India
| | - Kiran Polavarapu
- Department of Neurology, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Manoj Kumar
- Department of Neuroimaging and Interventional Radiology, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Apoorva Safai
- Symbiosis Centre for Medical Image Analysis, Symbiosis International University, Mulshi, Pune, Maharashtra, India
| | - Seena Vengalil
- Department of Neurology, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Saraswati Nashi
- Department of Neurology, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Sekar Deepha
- Neuromuscular Laboratory, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Periyasamy Govindaraj
- Neuromuscular Laboratory, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Mohammad Afsar
- Department of Neuropsychology, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Jamuna Rajeswaran
- Department of Neuropsychology, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Atchayaram Nalini
- Department of Neurology, National Institute of Mental Health and Neurosciences, Bangalore, India
| | - Jitender Saini
- Department of Neuroimaging and Interventional Radiology, National Institute of Mental Health and Neurosciences, Bangalore, India.
| | - Madhura Ingalhalikar
- Symbiosis Centre for Medical Image Analysis, Symbiosis International University, Mulshi, Pune, Maharashtra, India.
| |
|
35
|
Ferrando Chacón JL, Fernández de Barrena T, García A, Sáez de Buruaga M, Badiola X, Vicente J. A Novel Machine Learning-Based Methodology for Tool Wear Prediction Using Acoustic Emission Signals. SENSORS 2021; 21:s21175984. [PMID: 34502874 PMCID: PMC8434684 DOI: 10.3390/s21175984] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/23/2021] [Revised: 08/31/2021] [Accepted: 09/01/2021] [Indexed: 11/16/2022]
Abstract
Industry increasingly seeks to know the condition of its assets in real time. In particular, tool wear is a critical aspect that requires real-time monitoring to reduce costs and scrap in machining processes. Traditionally, to predict tool wear conditions in machining, mathematical models have been developed to extract information from the signals of sensors attached to the machines. To reduce the complexity of developing physical models, which require in-depth knowledge of the system being modelled, the current trend is to use machine-learning (ML) models based on data from the tool wear. The acoustic emission (AE) technique has been widely used to capture data from and understand the real-time condition of industrial assets such as cutting tools. However, AE signal interpretation and processing are rather complex. One of the most common features extracted from AE signals to predict tool wear is the counts parameter, defined as the number of times that the amplitude of the signal exceeds a predefined threshold. A recurrent problem with this feature is defining an adequate threshold to obtain consistent wear prediction. Additionally, the AE signal bandwidth is rather wide, and the selection of the optimum frequency band for feature extraction has been pointed out as critical and complex by many authors. To overcome these problems, this paper proposes a methodology that applies multi-threshold count feature extraction at multiresolution level using the wavelet packet transform, which extracts a redundant and non-optimal feature map from the AE signal. Next, recursive feature elimination is performed to reduce and optimize the vast number of predicting features generated in the previous step, and random forest regression provides the estimated tool wear. The methodology presented was tested using data captured when turning 19NiMoCr6 steel under pre-established cutting conditions. The results obtained were compared with several ML algorithms such as k-nearest neighbors, support vector machines, artificial neural networks and decision trees. Experimental results show that the proposed method can reduce the predicted root mean squared error by 36.53%.
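The counts parameter described above can be illustrated with a short sketch. It assumes a count is registered each time the raw signal rises from at or below the threshold to above it, and it computes one count per threshold; the function names and the sample signal are hypothetical, and the wavelet packet decomposition step of the paper is not reproduced here.

```python
def count_threshold_crossings(signal, threshold):
    """Count upward crossings of `threshold` in a sampled signal."""
    count = 0
    above = False  # whether the previous sample was above the threshold
    for sample in signal:
        now_above = sample > threshold
        if now_above and not above:
            count += 1  # the signal has just risen past the threshold
        above = now_above
    return count


def multi_threshold_counts(signal, thresholds):
    """Return a {threshold: count} feature map, one entry per threshold."""
    return {t: count_threshold_crossings(signal, t) for t in thresholds}


# Hypothetical amplitude samples, for illustration only.
signal = [0.0, 0.6, 0.2, 0.9, 1.4, 0.3, 1.1, 0.1, 0.7, 0.0]
print(multi_threshold_counts(signal, [0.5, 1.0]))
```

Computing the count at several thresholds produces a deliberately redundant feature set, which is why a later selection step such as recursive feature elimination is needed to prune it.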
Affiliation(s)
- Juan Luis Ferrando Chacón
- Vicomtech Foundation, Basque Research and Technology Alliance (BRTA), Mikeletegi 57, 20009 Donostia-San Sebastian, Spain; (T.F.d.B.); (A.G.)
- Correspondence:
| | - Telmo Fernández de Barrena
- Vicomtech Foundation, Basque Research and Technology Alliance (BRTA), Mikeletegi 57, 20009 Donostia-San Sebastian, Spain; (T.F.d.B.); (A.G.)
| | - Ander García
- Vicomtech Foundation, Basque Research and Technology Alliance (BRTA), Mikeletegi 57, 20009 Donostia-San Sebastian, Spain; (T.F.d.B.); (A.G.)
| | - Mikel Sáez de Buruaga
- Faculty of Engineering, Mondragon University, 20500 Mondragon, Spain; (M.S.d.B.); (X.B.); (J.V.)
| | - Xabier Badiola
- Faculty of Engineering, Mondragon University, 20500 Mondragon, Spain; (M.S.d.B.); (X.B.); (J.V.)
| | - Javier Vicente
- Faculty of Engineering, Mondragon University, 20500 Mondragon, Spain; (M.S.d.B.); (X.B.); (J.V.)
| |
|
36
|
Bouba I, Visser B, Kemp B, Rodenburg TB, van den Brand H. Predicting hatchability of layer breeders and identifying effects of animal related and environmental factors. Poult Sci 2021; 100:101394. [PMID: 34428647 PMCID: PMC8385447 DOI: 10.1016/j.psj.2021.101394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2021] [Revised: 07/19/2021] [Accepted: 07/20/2021] [Indexed: 11/02/2022] Open
Abstract
In this study, a data-driven approach was used, applying linear regression and machine learning methods to understand animal-related and environmental factors affecting hatchability. Data were obtained from a parent stock and grand-parent stock hatchery, including 1,737 batches of eggs incubated in the years 2010-2018. Animal-related factors taken into consideration were strain (white vs. brown strain), breeder age, and egg weight uniformity at the start of incubation, whereas environmental factors considered were length of egg storage before incubation, egg weight loss during incubation, and season. Effects of these factors on hatchability were analyzed with 3 different models: a linear regression (LR) model, a random forest (RF) model, and a gradient boosting machine (GBM) model. In part 1 of the study, hatchability was predicted and the performance of the models in terms of coefficient of determination (R²) and root mean square error (RMSE) was compared. The ensemble machine learning models (RF: R² = 0.35, RMSE = 8.41; GBM: R² = 0.31, RMSE = 8.67) appeared to be superior to the LR model (R² = 0.27, RMSE = 8.92), as indicated by the higher R² and lower RMSE. In part 2 of the study, effects of these factors on hatchability were investigated in more detail. Hatchability was affected by strain, breeder age, egg weight uniformity, length of egg storage, and season, but egg weight loss did not have a significant effect on hatchability. Additionally, four 2-way interactions (breeder age × egg weight uniformity, breeder age × length of egg storage, breeder age × strain, season × strain) had significant effects on hatchability. It can be concluded that hatchability of parent stock and grand-parent stock layer breeders is affected by several animal-related and environmental factors, but the size of the predicted effects varies between the methods used. In this study, 3 models were used to predict hatchability and to analyze effects of animal-related and environmental factors on hatchability. This opens new horizons for future studies on hatchery data by taking advantage of machine learning methods, which can fit complex datasets better than LR, and by applying statistical analysis.
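The two metrics used to compare the models, the coefficient of determination and the root mean square error, can be computed as in the sketch below. The function names and the hatchability values are hypothetical and only illustrate the arithmetic; they are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS.

    Assumes the observed values are not all identical (total SS > 0).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical hatchability percentages, for illustration only.
observed = [80.0, 85.0, 90.0, 75.0]
predicted = [82.0, 84.0, 88.0, 78.0]
print(f"R2={r_squared(observed, predicted):.3f} RMSE={rmse(observed, predicted):.3f}")
```

A higher R² and a lower RMSE both indicate a closer fit, which is the basis on which the abstract ranks RF above GBM and LR.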
Affiliation(s)
- I Bouba
- Hendrix Genetics, Boxmeer, 5831 CK, Netherlands; Animals in Science and Society, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands.
| | - B Visser
- Hendrix Genetics, Boxmeer, 5831 CK, Netherlands
| | - B Kemp
- Adaptation Physiology Group, Wageningen University & Research, Wageningen, Netherlands
| | - T B Rodenburg
- Animals in Science and Society, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands; Adaptation Physiology Group, Wageningen University & Research, Wageningen, Netherlands
| | - H van den Brand
- Adaptation Physiology Group, Wageningen University & Research, Wageningen, Netherlands
| |
|
37
|
Li T, Xu Y, Yao L. Detecting urban landscape factors controlling seasonal land surface temperature: from the perspective of urban function zones. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2021; 28:41191-41206. [PMID: 33779910 DOI: 10.1007/s11356-021-13695-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/28/2021] [Accepted: 03/24/2021] [Indexed: 06/12/2023]
Abstract
Understanding the impact of urbanization on the thermal effect is of great significance for urban thermal regulation and is essential for determining the relationship between the urban heat island (UHI) effect and the complexities of urban function and landscape structure. For this purpose, we conducted case research in the metropolitan region of Beijing, China, and nearly 5000 urban blocks assigned to different urban function zones (UFZs) were identified as the basic spatial analysis units. The seasonal land surface temperature (LST) retrieved from remote sensing data was used to represent the UHI characteristics of the study area, and the surface biophysical parameters, building forms, and filtered landscape pattern metrics were selected as the urban landscape factors. Then, the effects of urban function and landscape structure on the UHI effect were examined based on the optimal results of the ordinary least squares and geographically weighted regression models. The results indicated that (1) significant spatiotemporal heterogeneity of the LST was found in the study area, and there was an obvious temperature gradient across "working-living-resting" UFZs; (2) all types of urban landscape factors contributed significantly to the seasonal LST, in the order of surface biophysical factors > building forms > landscape factors, although their contributions varied between seasons; and (3) the major contributing factors differed to some extent with the variation of urban function and landscape complexity. This study expands the understanding of the complex relationship among urban landscape, function, and thermal environment, which could benefit urban landscape planning for UHI alleviation.
Affiliation(s)
- Tong Li
- College of Geography and Environment, Shandong Normal University, Jinan, 250358, China
| | - Ying Xu
- School of Civil Engineering, Shandong Jiaotong University, Jinan, 250023, China
| | - Lei Yao
- College of Geography and Environment, Shandong Normal University, Jinan, 250358, China.
| |
|
38
|
Ogata S, Takegami M, Ozaki T, Nakashima T, Onozuka D, Murata S, Nakaoku Y, Suzuki K, Hagihara A, Noguchi T, Iihara K, Kitazume K, Morioka T, Yamazaki S, Yoshida T, Yamagata Y, Nishimura K. Heatstroke predictions by machine learning, weather information, and an all-population registry for 12-hour heatstroke alerts. Nat Commun 2021; 12:4575. [PMID: 34321480 PMCID: PMC8319225 DOI: 10.1038/s41467-021-24823-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2021] [Accepted: 07/08/2021] [Indexed: 11/09/2022] Open
Abstract
This study aims to develop and validate prediction models for the number of all heatstroke cases, and for heatstroke cases involving hospital admission and death, per city per 12 h, using multiple weather information and a population-based database of heatstroke patients in 16 Japanese cities (corresponding to a population of around 10,000,000). In the testing dataset, the mean absolute percentage errors of generalized linear models with wet bulb globe temperature as the only predictor and of the optimal models are, respectively, 43.0% and 14.8% for spikes in the number of all heatstroke cases, and 37.7% and 10.6% for spikes in the number of heatstroke cases involving hospital admission and death. The optimal models predict the spikes in the number of heatstrokes well by machine learning methods including non-linear multivariable predictors and/or under-sampling and bagging. Here, we develop prediction models whose predictive performances are high enough to be implemented in public health settings. In the context of climate change, heatstroke is expected to become an increasingly relevant public health concern. Here, the authors develop and validate prediction models for the number of all heatstroke cases in different cities in Japan.
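The mean absolute percentage error reported above averages the absolute prediction error relative to each observed count. A minimal sketch follows, assuming every observed count is non-zero; the function name and the counts are hypothetical and are not drawn from the registry.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no value in `actual` is zero, since each error is divided by it.
    """
    relative_errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative_errors) / len(actual)


# Hypothetical 12-hour case counts, for illustration only.
actual = [10, 20, 40]
predicted = [12, 18, 30]
print(f"MAPE={mape(actual, predicted):.1f}%")
```

Because each error is scaled by the observed value, the metric weights a miss on a small count as heavily as a proportionally equal miss on a large spike.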
Affiliation(s)
- Soshiro Ogata
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Misa Takegami
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Taira Ozaki
- Department of Civil, Environmental and Applied Systems Engineering, Faculty of Environmental and Urban Engineering, Kansai University, Suita, Osaka, Japan
| | - Takahiro Nakashima
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Daisuke Onozuka
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Shunsuke Murata
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Yuriko Nakaoku
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Koyu Suzuki
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Akihito Hagihara
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Teruo Noguchi
- Department of Cardiovascular Medicine, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
| | - Koji Iihara
- Director General, National Cerebral and Cardiovascular Center Hospital, Suita, Osaka, Japan
| | - Keiichi Kitazume
- Department of Civil, Environmental and Applied Systems Engineering, Faculty of Environmental and Urban Engineering, Kansai University, Suita, Osaka, Japan
| | - Tohru Morioka
- Department of Civil, Environmental and Applied Systems Engineering, Faculty of Environmental and Urban Engineering, Kansai University, Suita, Osaka, Japan
| | - Shin Yamazaki
- Health and Environmental Risk Division, National Institute for Environmental Studies, Tsukuba, Ibaraki, Japan
| | - Takahiro Yoshida
- Earth System Division, National Institute for Environmental Studies, Tsukuba, Ibaraki, Japan; Department of Urban Engineering, School of Engineering, The University of Tokyo, Tokyo, Japan
| | - Yoshiki Yamagata
- Earth System Division, National Institute for Environmental Studies, Tsukuba, Ibaraki, Japan; Graduate School of System Design and Management, Keio University, Yokohama, Kanagawa, Japan
| | - Kunihiro Nishimura
- Department of Preventive Medicine and Epidemiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan.
| |
|
39
|
Safai A, Shinde S, Jadhav M, Chougule T, Indoria A, Kumar M, Santosh V, Jabeen S, Beniwal M, Konar S, Saini J, Ingalhalikar M. Developing a Radiomics Signature for Supratentorial Extra-Ventricular Ependymoma Using Multimodal MR Imaging. Front Neurol 2021; 12:648092. [PMID: 34367044 PMCID: PMC8339322 DOI: 10.3389/fneur.2021.648092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2020] [Accepted: 06/14/2021] [Indexed: 11/25/2022] Open
Abstract
Rationale and Objectives: To build a machine learning-based diagnostic model that can accurately distinguish adult supratentorial extraventricular ependymoma (STEE) from similarly appearing high-grade gliomas (HGG) using quantitative radiomic signatures from a multi-parametric MRI framework. Materials and Methods: We computed radiomic features on the preprocessed and segmented tumor masks from a pre-operative multimodal MRI dataset [contrast-enhanced T1 (T1ce), T2, fluid-attenuated inversion recovery (FLAIR), apparent diffusion coefficient (ADC)] from STEE (n = 15), HGG-Grade IV (HGG-G4) (n = 24), and HGG-Grade III (HGG-G3) (n = 36) patients, followed by an optimum two-stage feature selection and multiclass classification. The performance of multiple classifiers was evaluated on both unimodal and multimodal feature sets, and the most discriminative radiomic features involved in classifying STEE from HGG subtypes were obtained. Results: Multimodal features demonstrated higher classification performance than unimodal feature sets in discriminating STEE and HGG subtypes, with an accuracy of 68% on test data and above 80% on cross validation, along with an overall specificity above 90%. Among unimodal feature sets, those extracted from FLAIR demonstrated high classification performance in delineating all three tumor groups. Texture-based radiomic features, particularly from FLAIR, were most important in discriminating STEE from HGG-G4, whereas first-order features from T2 and ADC consistently ranked higher in differentiating multiple tumor groups. Conclusions: This study illustrates the utility of a radiomics-based multimodal MRI framework in accurately discriminating similarly appearing adult STEE from HGG subtypes. Radiomic features from multiple MRI modalities could capture intricate and complementary information for a robust and highly accurate multiclass tumor classification.
Affiliation(s)
- Apoorva Safai
- Symbiosis Center for Medical Image Analysis, Symbiosis Institute of Technology, Symbiosis International University, Pune, India
| | - Sumeet Shinde
- Symbiosis Center for Medical Image Analysis, Symbiosis Institute of Technology, Symbiosis International University, Pune, India
| | - Manali Jadhav
- Symbiosis Center for Medical Image Analysis, Symbiosis Institute of Technology, Symbiosis International University, Pune, India
| | - Tanay Chougule
- Symbiosis Center for Medical Image Analysis, Symbiosis Institute of Technology, Symbiosis International University, Pune, India
| | - Abhilasha Indoria
- Department of Neuroimaging & Interventional Radiology, National Institute of Mental Health & Neurosciences, Bangalore, India
| | - Manoj Kumar
- Department of Neuroimaging & Interventional Radiology, National Institute of Mental Health & Neurosciences, Bangalore, India
| | - Vani Santosh
- Department of Neuropathology, National Institute of Mental Health & Neurosciences, Bangalore, India
| | - Shumyla Jabeen
- Department of Neuroimaging & Interventional Radiology, National Institute of Mental Health & Neurosciences, Bangalore, India
| | - Manish Beniwal
- Department of Neurosurgery, National Institute of Mental Health & Neurosciences, Bangalore, India
| | - Subhash Konar
- Department of Neurosurgery, National Institute of Mental Health & Neurosciences, Bangalore, India
| | - Jitender Saini
- Department of Neuroimaging & Interventional Radiology, National Institute of Mental Health & Neurosciences, Bangalore, India
| | - Madhura Ingalhalikar
- Symbiosis Center for Medical Image Analysis, Symbiosis Institute of Technology, Symbiosis International University, Pune, India
| |
|
40
|
Shi H, Pan Y, Yang F, Cao J, Tan X, Yuan B, Jiang J. Nano-SAR Modeling for Predicting the Cytotoxicity of Metal Oxide Nanoparticles to PaCa2. Molecules 2021; 26:molecules26082188. [PMID: 33920258 PMCID: PMC8069170 DOI: 10.3390/molecules26082188] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2021] [Revised: 04/03/2021] [Accepted: 04/06/2021] [Indexed: 11/16/2022] Open
Abstract
The impact of engineered nanoparticles (NPs) on human health and the environment has attracted widespread attention. It is essential to assess and predict the biological activity, toxicity, and physicochemical properties of NPs. Computation-based methods have been developed as efficient alternatives for understanding the negative effects of nanoparticles on the environment and human health. Here, a classification-based structure-activity relationship model for nanoparticles (nano-SAR) was developed to predict the cellular uptake of 109 functionalized magneto-fluorescent nanoparticles by pancreatic cancer cells (PaCa2). Norm index descriptors were employed to describe the structural characteristics of the nanoparticles involved. The random forest (RF) algorithm, combined with Recursive Feature Elimination (RFE), was employed to develop the nano-SAR model. The resulting model showed satisfactory statistical performance, with accuracies (ACC) of 0.950 on the test set and 0.966 on the training set, demonstrating that the model had a satisfactory classification effect. The model was rigorously verified and further extensively compared with models in the literature. The proposed model could be reasonably expected to predict the cellular uptake of nanoparticles and provide some guidance for the design and manufacture of safer nanomaterials.
Affiliation(s)
- Haihua Shi
- Jiangsu Key Laboratory of Hazardous Chemicals Safety and Control, College of Safety Science and Engineering, Nanjing Tech University, Nanjing 210009, China; (H.S.); (F.Y.); (J.C.); (X.T.); (B.Y.); (J.J.)
| | - Yong Pan
- Jiangsu Key Laboratory of Hazardous Chemicals Safety and Control, College of Safety Science and Engineering, Nanjing Tech University, Nanjing 210009, China; (H.S.); (F.Y.); (J.C.); (X.T.); (B.Y.); (J.J.)
- Correspondence: ; Tel.: +86-25-581-398-73
| | - Fan Yang
- Jiangsu Key Laboratory of Hazardous Chemicals Safety and Control, College of Safety Science and Engineering, Nanjing Tech University, Nanjing 210009, China; (H.S.); (F.Y.); (J.C.); (X.T.); (B.Y.); (J.J.)
| | - Jiakai Cao
- Jiangsu Key Laboratory of Hazardous Chemicals Safety and Control, College of Safety Science and Engineering, Nanjing Tech University, Nanjing 210009, China; (H.S.); (F.Y.); (J.C.); (X.T.); (B.Y.); (J.J.)
| | - Xinlong Tan
- Jiangsu Key Laboratory of Hazardous Chemicals Safety and Control, College of Safety Science and Engineering, Nanjing Tech University, Nanjing 210009, China; (H.S.); (F.Y.); (J.C.); (X.T.); (B.Y.); (J.J.)
| | - Beilei Yuan
- Jiangsu Key Laboratory of Hazardous Chemicals Safety and Control, College of Safety Science and Engineering, Nanjing Tech University, Nanjing 210009, China; (H.S.); (F.Y.); (J.C.); (X.T.); (B.Y.); (J.J.)
| | - Juncheng Jiang
- Jiangsu Key Laboratory of Hazardous Chemicals Safety and Control, College of Safety Science and Engineering, Nanjing Tech University, Nanjing 210009, China; (H.S.); (F.Y.); (J.C.); (X.T.); (B.Y.); (J.J.)
- School of Environment & Safety Engineering, Changzhou University, Changzhou 213164, China
| |
|
41
|
Ho IMK, Cheong KY, Weldon A. Predicting student satisfaction of emergency remote learning in higher education during COVID-19 using machine learning techniques. PLoS One 2021; 16:e0249423. [PMID: 33798204 PMCID: PMC8018673 DOI: 10.1371/journal.pone.0249423] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2020] [Accepted: 03/17/2021] [Indexed: 11/20/2022] Open
Abstract
Despite the wide adoption of emergency remote learning (ERL) in higher education during the COVID-19 pandemic, there is insufficient understanding of the factors predicting student satisfaction with this novel learning environment in crisis. The present study investigated important predictors of the satisfaction of undergraduate students (N = 425) from multiple departments using ERL at a self-funded university in Hong Kong, where Moodle and Microsoft Teams were the key learning tools. Comparing predictive accuracy between multiple regression and machine learning models before and after the use of random forest recursive feature elimination, all multiple regression and machine learning models showed improved accuracy, and the most accurate model was the elastic net regression with 65.2% explained variance. The overall satisfaction score on ERL was only neutral (4.11 on a 7-point Likert scale). Even though the majority of students are competent in technology and have no obvious issue accessing learning devices or Wi-Fi, face-to-face learning is preferred over ERL, and this preference is found to be the most important predictor. In addition, the level of effort made by instructors, agreement on the appropriateness of the adjusted assessment methods, and the perception of online learning being well delivered are shown to be highly important in determining the satisfaction scores. The results suggest the need to review the quality and quantity of the modified assessments accommodated for ERL and to provide structured class delivery with a suitable amount of interactive learning according to the learning culture and program nature.
Collapse
Affiliation(s)
- Indy Man Kit Ho
  - Technological and Higher Education Institute of Hong Kong (THEi), Chai Wan, Hong Kong
- Kai Yuen Cheong
  - Technological and Higher Education Institute of Hong Kong (THEi), Chai Wan, Hong Kong
- Anthony Weldon
  - Technological and Higher Education Institute of Hong Kong (THEi), Chai Wan, Hong Kong
|
42
|
Distinguishing Planting Structures of Different Complexity from UAV Multispectral Images. SENSORS 2021; 21:s21061994. [PMID: 33808967 PMCID: PMC8000794 DOI: 10.3390/s21061994] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/17/2021] [Revised: 02/21/2021] [Accepted: 02/22/2021] [Indexed: 11/16/2022]
Abstract
This study explores the classification potential of a multispectral classification model for farmland with planting structures of different complexity. Unmanned aerial vehicle (UAV) remote sensing technology is used to obtain multispectral images of three study areas with low-, medium-, and high-complexity planting structures, containing three, five, and eight types of crops, respectively. The feature subsets of the three study areas are selected by recursive feature elimination (RFE). Object-oriented random forest (OB-RF) and object-oriented support vector machine (OB-SVM) classification models are established for the three study areas. After training the models with the feature subsets, the classification results are evaluated using a confusion matrix. The OB-RF and OB-SVM models’ classification accuracies are 97.09% and 99.13%, respectively, for the low-complexity planting structure. The equivalent values are 92.61% and 99.08% for the medium-complexity planting structure and 88.99% and 97.21% for the high-complexity planting structure. As the planting structure complexity changed from low to high, both models’ overall accuracy decreased: that of the OB-RF model by 8.1%, and that of the OB-SVM model by only 1.92%. For farmland with fragmentary plots and a high-complexity planting structure, OB-SVM achieves an overall classification accuracy of 97.21% and a single-crop extraction accuracy of at least 85.65%. Therefore, UAV multispectral remote sensing can be used for classification applications in highly complex planting structures.
|
43
|
Validation of Visually Interpreted Corine Land Cover Classes with Spectral Values of Satellite Images and Machine Learning. REMOTE SENSING 2021. [DOI: 10.3390/rs13050857] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
We analyzed the Corine Land Cover 2018 (CLC2018) dataset to reveal the correspondence between land cover categories of the CLC and the spectral information of Landsat-8, Sentinel-2 and PlanetScope images. Level 1 (L1) categories of the CLC2018 were analyzed in a 25 km × 25 km study area in Hungary. Spectral data were summarized by land cover polygons, and the dataset was evaluated with statistical tests. We then performed Linear Discriminant Analysis (LDA) and Random Forest (RF) classifications to reveal whether CLC L1 categories were confirmed by spectral values. Wetlands and water bodies were the most likely to be confused with other categories. The least mixture was observed when we applied the median to quantify the pixel variance of CLC polygons. RF was more accurate than LDA, and PlanetScope’s data were the most accurate. Analysis of class-level accuracies showed that agricultural areas and wetlands had the most issues with misclassification. We tested the representativeness of the results with a repeated randomized test, and only the PlanetScope results did not seem to generalize. Results showed that CLC polygons, as basic units of land cover, can ensure overall accuracies (OAs) of 71.1–78.5% for the three satellite sensors; higher geometric resolution resulted in better accuracy. These results confirm that CLC polygons, despite being produced by visual interpretation, can hold relevant information about land cover considering the surface reflectance values of satellites. However, using CLC as ground truth data for land cover classifications can be questionable, at least in the L1 nomenclature.
|
44
|
Singh S, Agrawal A, Kodamana H, Ramteke M. Multi-objective Optimization Based Recursive Feature Elimination for Process Monitoring. Neural Process Lett 2021. [DOI: 10.1007/s11063-021-10430-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/17/2023]
|
45
|
Uncertainty and Overfitting in Fluvial Landform Classification Using Laser Scanned Data and Machine Learning: A Comparison of Pixel and Object-Based Approaches. REMOTE SENSING 2020. [DOI: 10.3390/rs12213652] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
Floodplains are valuable areas for water management and nature conservation. A better understanding of their geomorphological characteristics helps to understand the main processes involved. We performed a classification of floodplain forms in a naturally developed area in Hungary using a Digital Terrain Model (DTM) derived from aerial laser scanning. We derived 60 geomorphometric variables from the DTM and prepared a geomorphological map of 265 forms (crevasse channels, point bars, swales, levees). Random Forest classification was conducted with Recursive Feature Elimination (RFE) on the objects (mean pixel values by forms) and on the pixels of the variables. We also evaluated the classification probabilities (CP), the spatial uncertainties (SU), and the overfitting as a function of the number of variables. We found that the object-based method performed better (95%) than the pixel-based method (78%). RFE helped to identify the 13–20 most important variables, maintaining the high model performance and reducing the overfitting. However, CP and SU were not efficient measures of classification accuracy, as they were not in accordance with the class-level accuracy metric. Our results help to understand classification results and the specific limits of laser-scanned DTMs. This methodology can be useful in geomorphological mapping.
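The recursive feature elimination loop mentioned in this abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' pipeline: the study ranks variables with Random Forest, whereas this sketch swaps in a small logistic regression fitted by gradient descent and uses absolute weight as the importance score. The function names (`fit_logistic`, `rfe`), the hyperparameters, and the toy data are all assumptions made for illustration.

```python
import math


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent; return the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Prediction error (probability minus label) for every sample.
        errors = []
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            errors.append(1.0 / (1.0 + math.exp(-z)) - target)
        # Gradient step on each weight and on the bias.
        for j in range(d):
            w[j] -= lr * sum(e * row[j] for e, row in zip(errors, X)) / n
        b -= lr * sum(errors) / n
    return w


def rfe(X, y, n_keep):
    """Refit on the surviving columns and drop the weakest one until n_keep remain."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        subset = [[row[j] for j in remaining] for row in X]
        weights = fit_logistic(subset, y)
        weakest = min(range(len(remaining)), key=lambda k: abs(weights[k]))
        del remaining[weakest]
    return remaining


# Toy data (hypothetical): column 0 separates the classes, columns 1 and 2 do not.
X = [[-1, 1, 1], [-1, -1, 1], [-1, 1, -1], [-1, -1, -1],
     [1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected = rfe(X, y, n_keep=1)  # indices of the surviving columns
```

The key point is that the model is refitted after every removal, so the ranking reflects the current subset rather than a single initial fit.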
|
46
|
Building Extraction Using Orthophotos and Dense Point Cloud Derived from Visual Band Aerial Imagery Based on Machine Learning and Segmentation. REMOTE SENSING 2020. [DOI: 10.3390/rs12152397] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/01/2023]
Abstract
The increase of built-up areas related to urban sprawl requires reliable monitoring methods, and remote sensing can be an efficient technique. Aerial surveys, with high spatial resolution, provide detailed data for building monitoring, but archive images usually have only visible bands. We aimed to reveal the efficiency of visible orthophotographs and photogrammetric dense point clouds in building detection with segmentation-based machine learning (with five algorithms) using visible bands, texture information, and spectral and morphometric indices in different variable sets. Usually random forest (RF) had the best (99.8%) and partial least squares the worst overall accuracy (~60%). We found that >95% accuracy can be gained even at the class level. Recursive feature elimination (RFE) was an efficient variable selection tool: its result with six variables was similar to that obtained with all 31 available variables. Morphometric indices had 82% producer’s and 85% user’s accuracy (PA and UA, respectively), and they made the largest contribution to the improvement when combined with spectral and texture indices. However, morphometric indices are not always available; adding texture and spectral indices to red-green-blue (RGB) bands improved the PA by 12% and the UA by 6%. Building extraction from visual aerial surveys can be accurate, and archive images can be included in the time series of a monitoring program.
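For readers unfamiliar with the accuracy measures named in this abstract, the sketch below shows how overall accuracy, producer's accuracy (PA), and user's accuracy (UA) are conventionally derived from a confusion matrix. The function name `accuracy_report` and the counts are made up for illustration and are not taken from the study; rows are assumed to hold reference classes and columns mapped classes.

```python
def accuracy_report(matrix, labels):
    """matrix[i][j] = number of reference-class-i samples mapped as class j."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {"overall": correct / total, "producer": {}, "user": {}}
    for i, label in enumerate(labels):
        reference_total = sum(matrix[i])              # row sum
        mapped_total = sum(row[i] for row in matrix)  # column sum
        # PA: share of reference samples of this class that were mapped correctly.
        report["producer"][label] = matrix[i][i] / reference_total
        # UA: share of samples mapped as this class that really belong to it.
        report["user"][label] = matrix[i][i] / mapped_total
    return report


# Hypothetical counts for a two-class building map.
labels = ["building", "other"]
matrix = [[40, 10],
          [5, 45]]
report = accuracy_report(matrix, labels)
```

With these counts, 40 of 50 reference buildings are found (PA) while 40 of the 45 objects mapped as buildings are correct (UA), which is why the two measures can differ for the same class.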
|
47
|
Muñoz B, Schobel SA, Lisboa FA, Khatri V, Grey SF, Dente CJ, Kirk AD, Buchman T, Elster EA. Clinical risk factors and inflammatory biomarkers of post-traumatic acute kidney injury in combat patients. Surgery 2020; 168:662-670. [PMID: 32600883 DOI: 10.1016/j.surg.2020.04.064] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/24/2019] [Revised: 04/29/2020] [Accepted: 04/30/2020] [Indexed: 12/23/2022]
Abstract
BACKGROUND Post-traumatic acute kidney injury has occurred in every major military conflict since its initial description during World War II. To ensure the proper treatment of combat casualties, early detection is critical. This study therefore aimed to investigate combat-related post-traumatic acute kidney injury in recent military conflicts, to use machine learning algorithms to identify clinical and biomarker variables associated with the development of post-traumatic acute kidney injury, and to evaluate the effects of post-traumatic acute kidney injury on wound healing and nosocomial infection. METHODS We conducted a retrospective clinical cohort review of 73 critically injured US military service members who sustained major combat-related extremity wounds, collected injury characteristics, and assayed serum and tissue biopsy samples for the expression of protein and messenger ribonucleic acid biomarkers. Bivariate analyses and random forest recursive feature elimination classification algorithms were used to identify associated injury characteristics and biomarker variables. RESULTS The incidence of post-traumatic acute kidney injury was 20.5%. Of those patients, 86% recovered baseline renal function, and only 2 (15%) of the acute kidney injury group required renal replacement therapy. Random forest recursive feature elimination algorithms were able to estimate post-traumatic acute kidney injury with an area under the curve of 0.93, sensitivity of 0.91, and specificity of 0.91. Post-traumatic acute kidney injury was associated with injury severity score, serum epidermal growth factor, and tissue activin A type receptor 1, matrix metallopeptidase 10, and X-C motif chemokine ligand 1 expression. Patients with post-traumatic acute kidney injury exhibited poor wound healing and increased incidence of nosocomial infections. CONCLUSION The occurrence of acute kidney injury in combat casualties may be estimated using injury characteristics and serum and tissue biomarkers. External validation of these models is necessary before they can be generalized to all trauma patients.
Affiliation(s)
- Beau Muñoz
  - Department of Surgery at the Uniformed Services University of the Health Sciences and the Walter Reed National Military Medical Center, Bethesda, MD; Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD
- Seth A Schobel
  - Department of Surgery at the Uniformed Services University of the Health Sciences and the Walter Reed National Military Medical Center, Bethesda, MD; Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD; Henry Jackson Foundation for the Advancement of Military Medicine, Inc, Bethesda, MD
- Felipe A Lisboa
  - Department of Surgery at the Uniformed Services University of the Health Sciences and the Walter Reed National Military Medical Center, Bethesda, MD; Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD; Henry Jackson Foundation for the Advancement of Military Medicine, Inc, Bethesda, MD
- Vivek Khatri
  - Department of Surgery at the Uniformed Services University of the Health Sciences and the Walter Reed National Military Medical Center, Bethesda, MD; Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD; Henry Jackson Foundation for the Advancement of Military Medicine, Inc, Bethesda, MD
- Scott F Grey
  - Department of Surgery at the Uniformed Services University of the Health Sciences and the Walter Reed National Military Medical Center, Bethesda, MD; Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD; Henry Jackson Foundation for the Advancement of Military Medicine, Inc, Bethesda, MD
- Christopher J Dente
  - Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD; Emory University, Atlanta, GA
- Allan D Kirk
  - Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD; Duke University, Durham, NC
- Timothy Buchman
  - Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD; Emory University, Atlanta, GA
- Eric A Elster
  - Department of Surgery at the Uniformed Services University of the Health Sciences and the Walter Reed National Military Medical Center, Bethesda, MD; Uniformed Services University Surgical Critical Care Initiative, Bethesda, MD
|
48
|
Ma G, Hao Z, Wu X, Wang X. An Optimal Electrical Impedance Tomography Drive Pattern for Human-Computer Interaction Applications. IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS 2020; 14:402-411. [PMID: 31976903 DOI: 10.1109/tbcas.2020.2967785] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/10/2023]
Abstract
In this article, we presented an optimal Electrical Impedance Tomography (EIT) drive pattern based on feature selection and model explanation, and proposed a portable EIT system for human-computer interaction applications such as gesture recognition and contact detection, which can reduce the measurement time and realize a performance trade-off between accuracy and time response. In our experiment, eleven hand gestures were designed to verify the proposed approach and EIT system. Compared to the traditional eight-electrode method, the optimal electrode drive pattern achieved a recognition accuracy of 97.5% with seven electrodes, and the measurement time was reduced by 60%. To illustrate the universality of this method, we performed a contact detection experiment. By setting seven labels on the conductive panel and using the optimal electrode drive pattern, the detection accuracy reached 100% with seven electrodes, and the measurement time was reduced by 85%.
|
49
|
Chen Q, Meng Z, Su R. WERFE: A Gene Selection Algorithm Based on Recursive Feature Elimination and Ensemble Strategy. Front Bioeng Biotechnol 2020; 8:496. [PMID: 32548100 PMCID: PMC7270206 DOI: 10.3389/fbioe.2020.00496] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/08/2020] [Accepted: 04/28/2020] [Indexed: 12/11/2022] Open
Abstract
Gene selection algorithms for micro-array data classification aim to find a small set of genes that are most informative and distinctive. A well-performing gene selection algorithm should pick a set of genes that achieves high performance while keeping the gene set as small as possible. Many existing gene selection algorithms suffer from either low performance or large gene sets. In this study, we propose a wrapper gene selection approach, named WERFE, within a recursive feature elimination (RFE) framework to make the classification more efficient. WERFE employs an ensemble strategy: it takes advantage of a variety of gene selection methods and assembles the top genes selected by each approach into the final gene subset. By integrating multiple gene selection algorithms, the optimal gene subset is determined by prioritizing the more important genes selected by each method, so that a more discriminative and compact gene subset can be selected. Experimental results show that the proposed method can achieve state-of-the-art performance.
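The ensemble idea described in this abstract, assembling the top genes chosen by several selection methods, can be sketched as a simple ordered union. This is only an illustration of that one step under stated assumptions, not the published WERFE algorithm: the abstract does not give the prioritization rule, so the sketch simply keeps first-seen order, and the selector names, gene identifiers, and `top_k` value are hypothetical.

```python
def assemble_top_genes(rankings, top_k):
    """Union of each selector's top_k genes, keeping first-seen order."""
    selected = []
    for ranking in rankings.values():
        for gene in ranking[:top_k]:
            if gene not in selected:
                selected.append(gene)
    return selected


# Hypothetical rankings, best gene first, from three unnamed selectors.
rankings = {
    "selector_a": ["g3", "g1", "g7", "g2"],
    "selector_b": ["g1", "g5", "g3", "g9"],
    "selector_c": ["g7", "g3", "g4", "g1"],
}
subset = assemble_top_genes(rankings, top_k=2)
```

Genes ranked highly by more than one selector appear only once, so the assembled subset stays smaller than the sum of the individual top lists.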
Affiliation(s)
- Qi Chen
  - School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China; Military Transportation Command Department, Army Military Transportation University, Tianjin, China
- Zhaopeng Meng
  - School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China; Tianjin University of Traditional Chinese Medicine, Tianjin, China
- Ran Su
  - School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China; Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang University, Fuzhou, China
|
50
|
Pour SH, Wahab AKA, Shahid S. Physical-empirical models for prediction of seasonal rainfall extremes of Peninsular Malaysia. ATMOSPHERIC RESEARCH 2020; 233:104720. [DOI: 10.1016/j.atmosres.2019.104720] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 09/02/2023]
|