1
Jain S, Malhotra KPK, Patiyal S, Raghava GPS. A Highly Accurate Model for Screening Prostate Cancer Using Propensity Index Panel of Ten Genes. J Comput Biol 2023; 30:1305-1314. [PMID: 37917795 DOI: 10.1089/cmb.2023.0040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2023]
Affiliation(s)
- Shipra Jain
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Delhi, New Delhi, India
- Kawal Preet Kaur Malhotra
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Delhi, New Delhi, India
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Delhi, New Delhi, India
- Gajendra Pal Singh Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Delhi, New Delhi, India
2
Wu Y, Xiao Q, Wang S, Xu H, Fang Y. Establishment and Analysis of an Artificial Neural Network Model for Early Detection of Polycystic Ovary Syndrome Using Machine Learning Techniques. J Inflamm Res 2023; 16:5667-5676. [PMID: 38050562 PMCID: PMC10693771 DOI: 10.2147/jir.s438838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 11/10/2023] [Indexed: 12/06/2023]
Abstract
Background: This study aimed to identify novel gene combinations and to develop an early diagnostic model for polycystic ovary syndrome (PCOS) by integrating artificial neural network (ANN) and random forest (RF) methods. Methods: We retrieved and processed gene expression datasets for PCOS from the Gene Expression Omnibus (GEO) database. Differential expression analysis of genes (DEGs) within the training set was performed using the "limma" R package. Enrichment analyses of the DEGs were carried out using Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG), together with an analysis of immune cell infiltration. Critical genes were then identified from the DEGs using random forests, and new diagnostic models for PCOS were developed using artificial neural networks. Results: We identified 130 up-regulated genes and 132 down-regulated genes in PCOS compared to normal samples. Gene Ontology analysis revealed significant enrichment in myofibrils and highlighted crucial biological functions related to myofilament sliding, myofibril, and actin-binding. The types of immune cells present in PCOS samples differed from those in normal tissues. A random forest algorithm identified 10 significant genes proposed as potential PCOS-specific biomarkers. Using these genes, an artificial neural network diagnostic model accurately distinguished PCOS from normal samples. The diagnostic model was validated on an independent validation set, and the resulting area under the receiver operating characteristic curve (AUC) values were consistent with the anticipated outcomes. Conclusion: Using unique gene combinations, this research created a diagnostic model by merging random forest techniques with artificial neural networks. The AUC indicated a notably superior performance of the diagnostic model.
Affiliation(s)
- Yumi Wu
  - Institute of Acupuncture and Moxibustion of China Academy of Chinese Medical Sciences, Beijing, People’s Republic of China
- QiWei Xiao
  - Institute of Acupuncture and Moxibustion of China Academy of Chinese Medical Sciences, Beijing, People’s Republic of China
- ShouDong Wang
  - The Out-Patient Department of TCM of China Academy of Chinese Medical Sciences, Beijing, People’s Republic of China
- Huanfang Xu
  - Institute of Acupuncture and Moxibustion of China Academy of Chinese Medical Sciences, Beijing, People’s Republic of China
  - Acupuncture and Moxibustion Hospital of China Academy of Chinese Medical Sciences, Beijing, People’s Republic of China
- YiGong Fang
  - Institute of Acupuncture and Moxibustion of China Academy of Chinese Medical Sciences, Beijing, People’s Republic of China
  - Acupuncture and Moxibustion Hospital of China Academy of Chinese Medical Sciences, Beijing, People’s Republic of China
3
Towler L, Bondaronek P, Papakonstantinou T, Amlôt R, Chadborn T, Ainsworth B, Yardley L. Applying machine-learning to rapidly analyze large qualitative text datasets to inform the COVID-19 pandemic response: comparing human and machine-assisted topic analysis techniques. Front Public Health 2023; 11:1268223. [PMID: 38026376 PMCID: PMC10644111 DOI: 10.3389/fpubh.2023.1268223] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/27/2023] [Accepted: 10/16/2023] [Indexed: 12/01/2023]
Abstract
Introduction: Machine-assisted topic analysis (MATA) uses artificial intelligence methods to help qualitative researchers analyze large datasets. This helps researchers rapidly update healthcare interventions during changing healthcare contexts, such as a pandemic. We examined the potential to support healthcare interventions by comparing MATA with "human-only" thematic analysis techniques on the same dataset (1,472 user responses from a COVID-19 behavioral intervention). Methods: In MATA, an unsupervised topic-modeling approach identified latent topics in the text, from which researchers identified broad themes. In human-only codebook analysis, researchers developed an initial codebook based on previous research that was applied to the dataset by the team, who met regularly to discuss and refine the codes. Formal triangulation using a "convergence coding matrix" compared findings between methods, categorizing them as "agreement", "complementary", "dissonant", or "silent". Results: Human analysis took much longer than MATA (147.5 vs. 40 h). Both methods identified key themes about what users found helpful and unhelpful. Formal triangulation showed that both sets of findings were highly similar. All MATA codes were classified as in agreement with or complementary to the human themes. Where findings differed slightly, this was due to human researcher interpretations or nuance from human-only analysis. Discussion: Results produced by MATA were similar to human-only thematic analysis, with substantial time savings. For simple analyses that do not require an in-depth or subtle understanding of the data, MATA is a useful tool that can help qualitative researchers interpret and analyze large datasets quickly. This approach can support intervention development and implementation, such as enabling rapid optimization during public health emergencies.
Affiliation(s)
- Lauren Towler
  - School of Psychology, University of Southampton, Southampton, United Kingdom
  - School of Psychological Science, University of Bristol, Bristol, United Kingdom
- Paulina Bondaronek
  - Department of Health and Social Care, Office for Health Improvement and Disparities, London, United Kingdom
  - Institute for Health Informatics, University College London, London, United Kingdom
- Trisevgeni Papakonstantinou
  - Department of Health and Social Care, Office for Health Improvement and Disparities, London, United Kingdom
  - Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, United Kingdom
- Richard Amlôt
  - Behavioural Science and Insights Unit, UK Health Security Agency, London, United Kingdom
- Tim Chadborn
  - Department of Health and Social Care, Office for Health Improvement and Disparities, London, United Kingdom
- Ben Ainsworth
  - Department of Psychology, University of Bath, Bath, United Kingdom
  - National Institute for Health Research Biomedical Research Centre, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Lucy Yardley
  - School of Psychology, University of Southampton, Southampton, United Kingdom
  - School of Psychological Science, University of Bristol, Bristol, United Kingdom
4
Ahmad Hamdan AF, Abu Bakar A. Machine Learning Predictions on Outpatient No-Show Appointments in a Malaysia Major Tertiary Hospital. Malays J Med Sci 2023; 30:169-180. [PMID: 37928795 PMCID: PMC10624443 DOI: 10.21315/mjms2023.30.5.14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 11/12/2022] [Indexed: 11/07/2023]
Abstract
Introduction: A no-show appointment occurs when a patient does not attend a previously booked appointment. This can cause further problems, such as discontinuity of patient treatment and a waste of both human and financial resources. One of the latest approaches to address this issue is predicting no-shows using machine learning techniques. This study proposes a predictive analytical approach for developing a patient no-show appointment model in Hospital Kuala Lumpur (HKL) using machine learning algorithms. Methods: This study uses outpatient data from the HKL's Patient Management System (SPP) throughout 2019. The final data set has 246,943 appointment records with 13 attributes used for both descriptive and predictive analyses. The predictive analysis was carried out using seven machine learning algorithms, namely, logistic regression (LR), decision tree (DT), k-nearest neighbours (k-NN), Naïve Bayes (NB), random forest (RF), gradient boosting (GB) and multilayer perceptron (MLP). Results: The descriptive analysis showed that the no-show rate was 28%, and attributes such as the month of the appointment and the gender of the patient seem to influence the likelihood of a patient not showing up. Evaluation of the predictive models found that the GB model performed best, with an accuracy of 78%, an F1 score of 0.76 and an area under the curve (AUC) value of 0.65. Conclusion: The predictive model could be used to formulate intervention steps to reduce no-shows, improving patient care quality.
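The reported accuracy, F1 score and no-show rate can be put in context with a small worked example. The sketch below is not the study's code; it is a minimal illustration, with hypothetical counts, of how accuracy and F1 are derived from a confusion matrix. It shows that, with a 28% no-show rate, a trivial classifier that always predicts "show" already reaches 72% accuracy while scoring an F1 of 0 for the no-show class, which is one reason accuracy alone can be misleading on imbalanced data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    The positive class here is "no-show". All counts are hypothetical.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical sample of 1,000 appointments with a 28% no-show rate,
# scored by a baseline that always predicts "show" (never flags a no-show).
baseline = classification_metrics(tp=0, fp=0, fn=280, tn=720)
print(baseline)
```

Comparing a trained model against this kind of baseline is a common sanity check; the abstract does not say whether the authors did so.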
Affiliation(s)
- Azuraliza Abu Bakar
  - Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
5
Obagbuwa IC, Danster S, Chibaya OC. Supervised machine learning models for depression sentiment analysis. Front Artif Intell 2023; 6:1230649. [PMID: 37538396 PMCID: PMC10394518 DOI: 10.3389/frai.2023.1230649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 06/29/2023] [Indexed: 08/05/2023]
Abstract
Introduction: Globally, the prevalence of mental health problems, especially depression, is at an all-time high. The objective of this study is to use machine learning models and sentiment analysis techniques to detect the level of depression earlier from social media users' posts. Methods: The datasets used in this research were obtained from Twitter posts. Four machine learning models, namely extreme gradient boost (XGB) Classifier, Random Forest, Logistic Regression, and support vector machine (SVM), were employed for the prediction task. Results: The SVM and Logistic Regression models yielded the most accurate results on the provided datasets, with Logistic Regression slightly more accurate than SVM. Importantly, the Logistic Regression model also required less execution time. Discussion: The findings of this study highlight the potential of machine learning models and sentiment analysis techniques for early detection of depression in social media users. The effectiveness of the SVM and Logistic Regression models, with Logistic Regression being more efficient in terms of execution time, suggests their suitability for practical implementation in real-world scenarios.
Affiliation(s)
- Ibidun Christiana Obagbuwa
  - Department of Computer Science and Information Technology, School of Natural and Applied Sciences, Sol Plaatje University, Kimberley, South Africa
6
Alizargar A, Chang YL, Tan TH. Performance Comparison of Machine Learning Approaches on Hepatitis C Prediction Employing Data Mining Techniques. Bioengineering (Basel) 2023; 10:481. [PMID: 37106668 PMCID: PMC10135598 DOI: 10.3390/bioengineering10040481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 04/06/2023] [Accepted: 04/11/2023] [Indexed: 04/29/2023]
Abstract
Hepatitis C is a liver infection caused by the hepatitis C virus (HCV). Because symptoms appear late, early diagnosis of this disease is difficult. Efficient prediction can help patients before permanent liver damage occurs. The main objective of this study is to employ various machine learning techniques to predict this disease from common and affordable blood test data, so that patients can be diagnosed and treated in the early stages. In this study, six machine learning algorithms (Support Vector Machine (SVM), K-nearest Neighbors (KNN), Logistic Regression, decision tree, extreme gradient boosting (XGBoost), and artificial neural networks (ANN)) were applied to two datasets. The performance of these techniques was compared in terms of confusion matrix, precision, recall, F1 score, accuracy, receiver operating characteristics (ROC), and the area under the curve (AUC) to identify a method that is appropriate for predicting this disease. The analysis, on the NHANES and UCI datasets, revealed that SVM and XGBoost (with the highest accuracy and AUC among the tested models, >80%) can be effective tools for medical professionals using routine and affordable blood test data to predict hepatitis C.
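Since the comparison above leans on the AUC, a short illustration of what that number measures may help. The following is a minimal sketch, not the authors' pipeline: it computes the AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as half. The labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equivalent to the probability that a randomly chosen positive case is
    scored higher than a randomly chosen negative case (ties count 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities for six patients (1 = infected).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The quadratic loop is fine for illustration; for large datasets a rank-based formulation gives the same value more efficiently.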
Affiliation(s)
- Tan-Hsu Tan
  - Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Taipei University of Technology, Taipei 10608, Taiwan; (A.A.); (Y.-L.C.)
7
Banas AM, Banas K, Breese MBH. Classification of the Residues after High and Low Order Explosions Using Machine Learning Techniques on Fourier Transform Infrared (FTIR) Spectra. Molecules 2023; 28:2233. [PMID: 36903479 PMCID: PMC10004765 DOI: 10.3390/molecules28052233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 02/23/2023] [Accepted: 02/23/2023] [Indexed: 03/04/2023]
Abstract
Forensic science is a field that requires precise and reliable methods for the detection and analysis of evidence. One such method is Fourier Transform Infrared (FTIR) spectroscopy, which provides high sensitivity and selectivity in the detection of samples. In this study, the use of FTIR spectroscopy and statistical multivariate analysis to identify high explosive (HE) materials (C-4, TNT, and PETN) in the residues after high- and low-order explosions is demonstrated. Additionally, a detailed description of the data pre-treatment process and the use of various machine learning classification techniques to achieve successful identification is also provided. The best results were obtained with the hybrid LDA-PCA technique, which was implemented using the R environment, a code-driven open-source platform that promotes reproducibility and transparency.
Affiliation(s)
- Agnieszka M. Banas
  - Singapore Synchrotron Light Source, National University of Singapore, 5 Research Link, Singapore 117603, Singapore
- Krzysztof Banas
  - Singapore Synchrotron Light Source, National University of Singapore, 5 Research Link, Singapore 117603, Singapore
- Mark B. H. Breese
  - Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117542, Singapore
8
Mandal N, Adak S, Das DK, Sahoo RN, Mukherjee J, Kumar A, Chinnusamy V, Das B, Mukhopadhyay A, Rajashekara H, Gakhar S. Spectral characterization and severity assessment of rice blast disease using univariate and multivariate models. Front Plant Sci 2023; 14:1067189. [PMID: 36909416 PMCID: PMC9997726 DOI: 10.3389/fpls.2023.1067189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 02/06/2023] [Indexed: 06/18/2023]
Abstract
Rice is the staple food of more than half of the world's population, including India. One of the major constraints in rice production is the frequent occurrence of pests and diseases; among them, rice blast often causes yield losses of 10 to 30%. Conventional approaches for disease assessment are time-consuming, expensive, and not real-time; alternatively, a sensor-based approach is rapid and non-invasive and can be scaled up to large areas with minimum time and effort. In the present study, hyperspectral remote sensing was used for the characterization and severity assessment of rice blast disease. Field experiments were conducted with 20 rice genotypes, including sensitive and resistant cultivars, grown under upland and lowland conditions at Almora, Uttarakhand, India. The severity of rice blast was graded from 0 to 9 in accordance with the International Rice Research Institute (IRRI) scale. Spectral observations in the field were taken using a hand-held portable spectroradiometer in the range of 350-2500 nm, followed by spectral discrimination of the different disease severity levels using the Jeffries-Matusita (J-M) distance. Then, 26 existing spectral indices (r ≥ 0.8) were evaluated against blast severity levels, and linear regression prediction models were developed. Further, the proposed ratio blast index (RBI) and normalized difference blast index (NDBI) were developed by evaluating all possible combinations for their correlation with severity level, followed by quantification to identify the best indices. Thereafter, multivariate models, namely support vector machine regression (SVM), partial least squares (PLS), random forest (RF), and multivariate adaptive regression spline (MARS), were also used to estimate blast severity. The Jeffries-Matusita distance separated almost all severity levels, with values >1.92, except levels 4 and 5. The 26 prediction models were effective at predicting blast severity, with R² values from 0.48 to 0.85. The best developed spectral indices for rice blast were RBI (R1148, R1301) and NDBI (R1148, R1301), with R² of 0.85 and 0.86, respectively. Among the multivariate models, SVM was the best, with calibration R² = 0.99, validation R² = 0.94, RMSE = 0.7, and RPD = 4.10. The methodology developed paves the way for early detection and large-scale monitoring and mapping using satellite remote sensors at farmers' fields to develop better disease management options.
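The abstract names the two indices and their bands but does not spell out the formulas. The sketch below assumes the conventional two-band forms, a simple ratio R_a / R_b and a normalized difference (R_a - R_b) / (R_a + R_b), applied to reflectance at 1148 nm and 1301 nm. Both the formulas and the band order are assumptions, and the reflectance values are hypothetical.

```python
def ratio_index(r_a, r_b):
    """Simple two-band ratio index (assumed form of RBI): R_a / R_b."""
    if r_b == 0:
        raise ValueError("denominator band reflectance must be non-zero")
    return r_a / r_b


def normalized_difference_index(r_a, r_b):
    """Two-band normalized difference (assumed form of NDBI)."""
    if r_a + r_b == 0:
        raise ValueError("sum of band reflectances must be non-zero")
    return (r_a - r_b) / (r_a + r_b)


# Hypothetical canopy reflectance (0-1 scale) at the two bands named above.
r1148 = 0.40
r1301 = 0.25

rbi = ratio_index(r1148, r1301)
ndbi = normalized_difference_index(r1148, r1301)
print(rbi, ndbi)
```

In a band-selection search of the kind described, functions like these would be evaluated for every band pair and each result correlated with the observed severity grades.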
Affiliation(s)
- Nandita Mandal
  - Division of Agricultural Physics, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Sujan Adak
  - Division of Agricultural Physics, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Deb K. Das
  - Division of Agricultural Physics, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Rabi N. Sahoo
  - Division of Agricultural Physics, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Joydeep Mukherjee
  - Division of Agricultural Physics, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Andy Kumar
  - Division of Plant Pathology, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Viswanathan Chinnusamy
  - Division of Plant Physiology, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Bappa Das
  - Natural Resources Management, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), Goa, India
- Arkadeb Mukhopadhyay
  - Division of Agricultural Chemicals, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
- Hosahatti Rajashekara
  - Department of Plant Pathology, Directorate of Cashew Research, Indian Council of Agricultural Research (ICAR), Karnataka, India
- Shalini Gakhar
  - Division of Agricultural Physics, Indian Agricultural Research Institute, Indian Council of Agricultural Research (ICAR), New Delhi, India
9
Pande A, Patiyal S, Lathwal A, Arora C, Kaur D, Dhall A, Mishra G, Kaur H, Sharma N, Jain S, Usmani SS, Agrawal P, Kumar R, Kumar V, Raghava GPS. Pfeature: A Tool for Computing Wide Range of Protein Features and Building Prediction Models. J Comput Biol 2023; 30:204-222. [PMID: 36251780 DOI: 10.1089/cmb.2022.0241] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 01/24/2023]
Abstract
In the last three decades, a wide range of protein features have been discovered to annotate a protein. Numerous attempts have been made to integrate these features in a software package/platform so that the user may compute a wide range of features from a single source. To complement the existing methods, we developed a method, Pfeature, for computing a wide range of protein features. Pfeature can compute more than 200,000 features required for predicting the overall function of a protein, residue-level annotation of a protein, and the function of chemically modified peptides. It has six major modules, namely, composition, binary profiles, evolutionary information, structural features, patterns, and model building. The composition module computes most of the existing compositional features, plus novel features. The binary profile of amino acid sequences captures the fraction of each type of residue as well as its position. The evolutionary information module computes the evolutionary information of a protein in the form of a position-specific scoring matrix profile generated using the Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST), which is suited to the annotation of a protein and its residues. A structural module was developed for computing structural features/descriptors from the tertiary structure of a protein. These features are suitable for predicting the therapeutic potential of a protein containing non-natural or chemically modified residues. The model-building module implements various machine learning techniques for developing classification and regression models as well as feature selection. Pfeature also allows the generation of overlapping patterns and features from a protein. A user-friendly Pfeature is available as a web server, Python library, and stand-alone package.
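To make the idea of a compositional feature concrete, here is a minimal sketch of amino acid composition, the percentage of each of the 20 standard residues in a sequence. It is an illustration only and does not use or reproduce the Pfeature API; the function name and the example sequence are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the percentage of each standard amino acid in a sequence.

    Characters outside the 20 standard residues are ignored, so the
    percentages are relative to the number of recognised residues.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    recognised = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            recognised += 1
    if recognised == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * n / recognised for aa, n in counts.items()}


# Hypothetical six-residue peptide: K appears twice, so it makes up one third.
composition = amino_acid_composition("MKKLAR")
print(round(composition["K"], 2))
```

The result is a fixed-length vector of 20 values regardless of sequence length, which is what makes composition convenient as input to a classifier.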
Affiliation(s)
- Akshara Pande
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Anjali Lathwal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Chakit Arora
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Dilraj Kaur
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Anjali Dhall
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gaurav Mishra
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
  - Department of Electrical Engineering, Shiv Nadar University, Greater Noida, India
- Harpreet Kaur
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Neelam Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Shipra Jain
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Salman Sadullah Usmani
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Piyush Agrawal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Rajesh Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Vinod Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
10
Patiyal S, Dhall A, Bajaj K, Sahu H, Raghava GPS. Prediction of RNA-interacting residues in a protein using CNN and evolutionary profile. Brief Bioinform 2023; 24:6901899. [PMID: 36516298 DOI: 10.1093/bib/bbac538] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/08/2022] [Revised: 09/28/2022] [Accepted: 11/08/2022] [Indexed: 12/15/2022]
Abstract
This paper describes Pprint2, an improved version of Pprint, a method developed for predicting RNA-interacting residues in a protein. The training and independent/validation datasets used in this study comprise 545 and 161 non-redundant RNA-binding proteins, respectively. All models were trained on the training dataset and evaluated on the validation dataset. The preliminary analysis reveals that positively charged amino acids, such as H, R and K, are more prominent among the RNA-interacting residues. Initially, machine learning based models were developed using the binary profile and obtained a maximum area under the curve (AUC) of 0.68 on the validation dataset. The performance improved significantly, from AUC 0.68 to 0.76, when the evolutionary profile was used instead of the binary profile. The performance of the evolutionary profile-based model improved further, from AUC 0.76 to 0.82, when a convolutional neural network was used to develop the model. Our final model, based on a convolutional neural network using evolutionary information, achieved an AUC of 0.82 with a Matthews correlation coefficient of 0.49 on the validation dataset. Our best model outperforms existing methods when evaluated on the independent/validation dataset. User-friendly standalone software and a web-based server named 'Pprint2' have been developed for predicting RNA-interacting residues (https://webs.iiitd.edu.in/raghava/pprint2 and https://github.com/raghavagps/pprint2).
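The Matthews correlation coefficient (MCC) reported alongside the AUC is computed from the four confusion-matrix counts. The sketch below shows the standard formula; it is not taken from the Pprint2 code, and the counts used are hypothetical. MCC is often preferred when one class is much rarer than the other, because it uses all four cells of the matrix.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when the denominator is zero,
    a common convention for degenerate confusion matrices.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical residue-level counts (positive = RNA-interacting residue).
mcc = matthews_corrcoef(tp=50, fp=10, fn=20, tn=120)
print(round(mcc, 3))
```

A classifier that labels every residue as non-interacting yields a zero denominator and therefore an MCC of 0.0 under this convention, however high its accuracy.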
Affiliation(s)
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Anjali Dhall
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Khushboo Bajaj
  - Department of Computer Science and Engineering, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Harshita Sahu
  - Department of Computer Science and Engineering, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
11
Abstract
Dementia is a general term for any disorder related to human memory. Memory-related problems severely affect the human brain, so the individual has difficulty carrying out normal physical and mental activities. There are different types of dementia, but the commonly seen and fatal types are Alzheimer's disease (AD) and Parkinson's disease (PD). In this paper, several efficient machine learning techniques are selected and their behaviour is analysed in the diagnosis of AD and PD using Positron Emission Tomography (PET). The PET image dataset used in this work consists of 1050 images of AD, PD and healthy brains. The images are split in the ratio of 7:3 for training and testing, respectively. The machine learning classifiers used are Bagged Ensemble, ID3, Naive Bayes and Multiclass Support Vector Machine. AD and PD are classified, with a healthy brain as reference, by comparing the input image with the trained samples in the PET image database. In this comparison, the bagged ensemble learning classifier performed better than the other classification algorithms and yielded an accuracy of 90.3%.
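A 7:3 split of 1050 images gives 735 training and 315 test images. The sketch below shows one simple way to produce such a split over image indices; the shuffling and the fixed seed are assumptions, since the abstract does not say how images were assigned to each set, and `split_indices` is a hypothetical helper.

```python
import random


def split_indices(n_items, train_parts=7, test_parts=3, seed=0):
    """Shuffle item indices and split them in a train_parts:test_parts ratio.

    Integer arithmetic is used for the cut point to avoid floating-point
    rounding surprises. The seed makes the split reproducible.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = n_items * train_parts // (train_parts + test_parts)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(1050)
print(len(train_idx), len(test_idx))
```

A plain shuffle does not guarantee that AD, PD and healthy images keep the same proportions in both sets; a stratified split would be needed for that.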
Affiliation(s)
- R S Nancy Noella
  - School of Computer Science and Engineering, VIT University, Chennai, India
- J Priyadarshini
  - School of Computer Science and Engineering, VIT University, Chennai, India
12
Alhakami H, Khan NA, Sulaiman M, Alhakami W, Baz A. On the Computational Study of a Fully Wetted Longitudinal Porous Heat Exchanger Using a Machine Learning Approach. Entropy (Basel) 2022; 24:1280. [PMID: 36141166 PMCID: PMC9497785 DOI: 10.3390/e24091280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Revised: 09/04/2022] [Accepted: 09/07/2022] [Indexed: 06/16/2023]
Abstract
The present study concerns the modeling of the thermal behavior of a porous longitudinal fin under fully wetted conditions with linear, quadratic, and exponential thermal conductivities surrounded by environments that are convective, conductive, and radiative. Porous fins are widely used in various engineering and everyday life applications. The Darcy model was used to formulate the governing non-linear singular differential equation for the heat transfer phenomenon in the fin. The universal approximation power of multilayer perceptron artificial neural networks (ANN) was applied to establish a model of approximate solutions for the singular non-linear boundary value problem. The optimization strategy of a sports-inspired meta-heuristic paradigm, the Tiki-Taka algorithm (TTA) with sequential quadratic programming (SQP), was utilized to determine the thermal performance and the effective use of fins for diverse values of physical parameters, such as the parameter for the moist porous medium, dimensionless ambient temperature, radiation coefficient, power index, in-homogeneity index, convection coefficient, and dimensionless temperature. The results of the designed ANN-TTA-SQP algorithm were validated by comparison with state-of-the-art techniques, including the whale optimization algorithm (WOA), cuckoo search algorithm (CSA), grey wolf optimization (GWO) algorithm, particle swarm optimization (PSO) algorithm, and machine learning algorithms. The percentage of absolute errors and the mean square error in the solutions of the proposed technique were found to lie between 10⁻⁴ and 10⁻⁵ and between 10⁻⁸ and 10⁻¹⁰, respectively. A comprehensive study of graphs, statistics of the solutions, and errors demonstrated that the proposed scheme's results were accurate, stable, and reliable. It was concluded that the rate at which heat is transferred from the surface of the fin to the surrounding environment increases as the wet porosity parameter is increased. At the same time, the inverse behavior was observed for an increase in the power index. The results obtained may support the structural design of thermally effective cooling methods for various electronic consumer devices.
Affiliation(s)
- Hosam Alhakami
- Department of Computer Science, College of Computer and Information Systems, Umm Al-Qura University, Makkah 21955, Saudi Arabia
- Naveed Ahmad Khan
- Department of Mathematics, Abdul Wali Khan University, Mardan 23200, Pakistan
- Muhammad Sulaiman
- Department of Mathematics, Abdul Wali Khan University, Mardan 23200, Pakistan
- Wajdi Alhakami
- Department of Information Technology, College of Computers and Information Technology, Taif University, Taif 26571, Saudi Arabia
- Abdullah Baz
- Department of Computer Engineering, College of Computer and Information Systems, Umm Al-Qura University, Makkah 21955, Saudi Arabia
|
13
|
Barbosa Dos Santos V, Moreno Ferreira Dos Santos A, da Silva Cabral de Moraes JR, de Oliveira Vieira IC, de Souza Rolim G. Machine learning algorithms for soybean yield forecasting in the Brazilian Cerrado. J Sci Food Agric 2022; 102:3665-3672. [PMID: 34893984 DOI: 10.1002/jsfa.11713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2021] [Revised: 12/03/2021] [Accepted: 12/10/2021] [Indexed: 06/14/2023]
Abstract
BACKGROUND We evaluated different machine learning (ML) models for predicting soybean productivity up to 1 month in advance for the Matopiba agricultural frontier (States of Maranhão, Tocantins, Piauí, and Bahia). We collected meteorological data from the NASA-POWER platform and soybean yield data from the SIDRA/IBGE database for 2008 to 2017. The ML models evaluated were random forest (RF), artificial neural networks, radial basis function support vector machines (SVM_RBF), linear model and polynomial regression. Model performance was assessed by cross-validation, with precision measured by R², accuracy by root mean square error (RMSE), and trend by the mean error of the estimate (EME). RESULTS The results showed that the RF algorithm achieved the highest precision and accuracy, with R² of 0.81, RMSE of 176.93 kg ha⁻¹ and trend (EME) of 1.99 kg ha⁻¹. On the other hand, the SVM_RBF algorithm showed the lowest performance, with R² of 0.74, RMSE of 213.58 kg ha⁻¹ and EME of -15.06 kg ha⁻¹. The average yield values predicted by the models were within the expected range for the region, which has a historical average value of 2,730 kg ha⁻¹. CONCLUSION All models had acceptable precision, accuracy and trend indices, so any of the algorithms can be applied to predict soybean crop yield, provided the particularities of the region under study are taken into account. The approach is also a useful tool for agricultural planning and decision making in soy-producing regions such as the Brazilian Cerrado. © 2021 Society of Chemical Industry.
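The three indices named in this abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' code: it assumes R² is the coefficient of determination (the paper may instead use the squared correlation) and that the mean error is taken as predicted minus observed; the function name `regression_metrics` is hypothetical.

```python
import math

def regression_metrics(observed, predicted):
    """Return (R2, RMSE, mean error) for paired observed/predicted values.

    Assumptions: R2 is the coefficient of determination and the mean
    error is predicted minus observed (positive means overestimation).
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    r2 = 1.0 - ss_res / ss_tot          # precision
    rmse = math.sqrt(ss_res / n)        # accuracy, in the units of the yield
    mean_error = sum(p - o for o, p in zip(observed, predicted)) / n  # trend
    return r2, rmse, mean_error
```

In a cross-validation setting, such a function would typically be applied to the held-out fold of each split and the results averaged.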
Affiliation(s)
- Valter Barbosa Dos Santos
- Graduate Program in Agronomy (Soil Science), State University of Sao Paulo (FCAV/UNESP), Jaboticabal, Brazil
- Glauco de Souza Rolim
- School of Agricultural and Veterinarian Sciences, São Paulo State University (UNESP), Jaboticabal, Brazil
|
14
|
Sánchez-Gutiérrez ME, González-Pérez PP. Modeling and Simulation of Cell Signaling Networks for Subsequent Analytics Processes Using Big Data and Machine Learning. Bioinform Biol Insights 2022; 16:11779322221091739. [PMID: 35478994 PMCID: PMC9036331 DOI: 10.1177/11779322221091739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 03/16/2022] [Indexed: 01/06/2023]
Abstract
This work explores how much the traditional approach to modeling and simulation of biological systems, specifically cell signaling networks, can be increased and improved by integrating big data, data mining, and machine learning techniques. Specifically, we first model, simulate, validate, and calibrate the behavior of the PI3K/AKT/mTOR cancer-related signaling pathway. Subsequently, once the behavior of the simulated signaling network matches the expected behavior, the capacity of the computational simulation is increased to grow data (data farming). First, we use big data techniques to extract, collect, filter, and store large volumes of data describing all the interactions among the simulated cell signaling system components over time. Afterward, we apply data mining and machine learning techniques (specifically, exploratory data analysis, feature selection techniques, and supervised neural network models) to the resulting biological dataset to obtain new inferences and knowledge about this biological system. The results showed how the traditional approach to the simulation of biological systems could be enhanced and improved by incorporating big data, data mining, and machine learning techniques, which significantly contributed to increasing the predictive power of the simulation.
Affiliation(s)
- Pedro Pablo González-Pérez
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana, Unidad Cuajimalpa, Ciudad de México, México. Correspondence: Pedro Pablo González-Pérez, Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana, Unidad Cuajimalpa, Avenida Vasco de Quiroga 4871, Col. Santa Fe Cuajimalpa, C.P. 05348, Ciudad de México, México.
|
15
|
Liao HH, Chang CC, Wang YX, Cheewakriangkrai C. Predicting the Risk Factors of Second Primary Cancer in Patients with Hepatocellular Carcinoma. Stud Health Technol Inform 2022; 289:93-96. [PMID: 35062100 DOI: 10.3233/shti210867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Screening for cancer and improved treatments have not only improved treatment outcomes and patient survival but have also led to an increase in the number of second primary cancers (SPCs). Hepatocellular carcinoma (HCC) has been a common occurrence in Taiwan over the past decade. Its mortality rate is second only to that of lung cancer, and it also accounts for the fourth highest cancer medical expenditure. This study aimed to use machine learning to identify the risk factors for SPCs in HCC survivors. A total of 378,445 datasets, including 15,251 from patients with SPCs, were collected; 18 predictive variables were considered risk factors for SPCs based on a physician panel discussion. The machine learning techniques employed included support vector machine, C5 decision tree, and random forest. The SMOTE (Synthetic Minority Oversampling Technique) sampling method was used to address the class imbalance problem. The results showed that the top 5 risk factors for SPCs were tumor size, clinical stage, surgery, total bilirubin, and BCLC stage. The support vector machine method had the highest predictive accuracy (0.7673). The risk factors extracted from the classification models and association rules will be used to provide valuable information for HCC therapy.
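The core idea of SMOTE mentioned above is to create synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates only that idea under simplifying assumptions (numeric features, Euclidean distance, brute-force neighbour search); the function name `smote` and its parameters are hypothetical, and it is not the implementation used in the study.

```python
import math
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors (lists or tuples).
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic
```

Because each synthetic point lies on a segment between two real minority samples, oversampling is normally applied to the training split only, so that no interpolated point leaks information into the evaluation data.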
Affiliation(s)
- Hsien-Hua Liao
- Department of Surgery, Chung Shan Medical University Hospital, Taichung, Taiwan; School of Medicine, Chung Shan Medical University, Taichung, Taiwan
- Chi-Chang Chang
- School of Medical Informatics, Chung Shan Medical University & IT office, Chung Shan Medical University Hospital, Taichung 40201, Taiwan; Department of Information Management, Ming Chuan University, Taoyuan, Taiwan
- Yu-Xiang Wang
- School of Medical Informatics, Chung Shan Medical University & IT office, Chung Shan Medical University Hospital, Taichung 40201, Taiwan
- Chalong Cheewakriangkrai
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand
|
16
|
Kishore DM, Bindu S, Manjunath NK. Estimation of Yoga Postures Using Machine Learning Techniques. Int J Yoga 2022; 15:137-143. [PMID: 36329766 PMCID: PMC9623892 DOI: 10.4103/ijoy.ijoy_97_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2022] [Revised: 06/13/2022] [Accepted: 06/15/2022] [Indexed: 01/25/2023]
Abstract
Yoga is a traditional Indian way of keeping the mind and body fit, through physical postures (asanas), voluntarily regulated breathing (pranayama), meditation, and relaxation techniques. The recent pandemic has seen a huge surge in numbers of yoga practitioners, many practicing without proper guidance. This study was proposed to ease the work of such practitioners by implementing deep learning-based methods, which can estimate the correct pose performed by a practitioner. The study implemented this approach using four different deep learning architectures: EpipolarPose, OpenPose, PoseNet, and MediaPipe. These architectures were separately trained using the images obtained from S-VYASA Deemed to be University. This database had images for five commonly practiced yoga postures: tree pose, triangle pose, half-moon pose, mountain pose, and warrior pose. The use of this authentic database for training paved the way for the deployment of this model in real-time applications. The study also compared the estimation accuracy of all architectures and concluded that the MediaPipe architecture provides the best estimation accuracy.
Affiliation(s)
- D. Mohan Kishore
- Division of Yoga and Life Sciences, Swami Vivekananda Yoga Anusandhana Samsthana (S-VYASA), Bengaluru, Karnataka, India. Address for correspondence: Mr. D. Mohan Kishore, Swami Vivekananda Yoga Anusandhana Samsthana (S-VYASA), Jigani, Bengaluru – 560105, Karnataka, India.
- S. Bindu
- Department of Electronics and Communication Engineering, B N M Institute of Technology, Bengaluru, Karnataka, India
- Nandi Krishnamurthy Manjunath
- Division of Yoga and Life Sciences, Swami Vivekananda Yoga Anusandhana Samsthana (S-VYASA), Bengaluru, Karnataka, India
|
17
|
Das S, Panigrahi P, Chakrabarti S. Corpus Callosum Atrophy in Detection of Mild and Moderate Alzheimer's Disease Using Brain Magnetic Resonance Image Processing and Machine Learning Techniques. J Alzheimers Dis Rep 2021; 5:771-788. [PMID: 34870103 PMCID: PMC8609489 DOI: 10.3233/adr-210314] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/24/2021] [Indexed: 01/25/2023]
Abstract
Background: The total number of people with dementia is projected to reach 82 million in 2030 and 152 million in 2050. Early and accurate identification of the underlying causes of dementia, such as Alzheimer’s disease (AD), is of utmost importance. A large body of research has shown that imaging techniques are the most promising technologies to improve subclinical and early diagnosis of dementia. Morphological changes, especially atrophy in various structures like cingulate gyri, caudate nucleus, hippocampus, frontotemporal lobe, etc., have been established as markers for AD. Because the corpus callosum (CC) is the largest white matter structure, with a high demand for blood supply from several main arterial systems, its anatomical alterations may serve as potential indicators of neurodegenerative disease. Objective: To detect mild and moderate AD using brain magnetic resonance image (MRI) processing and machine learning techniques. Methods: We performed automatic detection and segmentation of the CC and calculated its morphological features to feed into a multivariate pattern analysis using support vector machine (SVM) learning techniques. Results: Our results, obtained with a large patient cohort, show that CC atrophy-based features are capable of distinguishing healthy individuals from mild/moderate AD patients. Our classifiers obtained more than 90% sensitivity and specificity in differentiating demented patients from healthy cohorts and, importantly, achieved more than 90% sensitivity and > 80% specificity in detecting mild AD patients. Conclusion: Results from this analysis are encouraging and advocate development of an image analysis software package to detect dementia from brain MRI using morphological alterations of the CC.
Affiliation(s)
- Subhrangshu Das
- Structural Biology and Bioinformatics Division, Council for Scientific and Industrial Research (CSIR) - Indian Institute of Chemical Biology (IICB), Kolkata, West Bengal, India
- Priyanka Panigrahi
- Structural Biology and Bioinformatics Division, Council for Scientific and Industrial Research (CSIR) - Indian Institute of Chemical Biology (IICB), Kolkata, West Bengal, India; Academy of Scientific and Innovative Research, Ghaziabad, Uttar Pradesh, India
- Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, Council for Scientific and Industrial Research (CSIR) - Indian Institute of Chemical Biology (IICB), Kolkata, West Bengal, India; Academy of Scientific and Innovative Research, Ghaziabad, Uttar Pradesh, India
|
18
|
Kumar V, Patiyal S, Dhall A, Sharma N, Raghava GPS. B3Pred: A Random-Forest-Based Method for Predicting and Designing Blood-Brain Barrier Penetrating Peptides. Pharmaceutics 2021; 13:1237. [PMID: 34452198 DOI: 10.3390/pharmaceutics13081237] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/07/2021] [Revised: 07/07/2021] [Accepted: 07/14/2021] [Indexed: 12/14/2022]
Abstract
The blood–brain barrier is a major obstacle in treating brain-related disorders, as it does not allow the delivery of drugs into the brain. We developed a method for predicting blood–brain barrier penetrating peptides to facilitate drug delivery into the brain. These blood–brain barrier penetrating peptides (B3PPs) can act as therapeutics, as well as drug delivery agents. We trained, tested, and evaluated our models on blood–brain barrier peptides obtained from the B3Pdb database. First, we computed a wide range of peptide features. Then, we selected relevant peptide features. Finally, we developed numerous machine-learning-based models for predicting blood–brain barrier peptides using the selected features. The random-forest-based model performed the best with respect to the top 80 selected features and achieved a maximal 85.08% accuracy with an AUROC of 0.93. We also developed a webserver, B3pred, that implements our best models. It has three major modules that allow users to predict/design B3PPs and scan B3PPs in a protein sequence.
|
19
|
Comoretto RI, Azzolina D, Amigoni A, Stoppa G, Todino F, Wolfler A, Gregori D. Predicting Hemodynamic Failure Development in PICU Using Machine Learning Techniques. Diagnostics (Basel) 2021; 11:diagnostics11071299. [PMID: 34359385 PMCID: PMC8303657 DOI: 10.3390/diagnostics11071299] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/31/2021] [Revised: 07/12/2021] [Accepted: 07/16/2021] [Indexed: 11/16/2022]
Abstract
The present work aims to identify the predictors of hemodynamic failure (HF) developed during pediatric intensive care unit (PICU) stay by testing a set of machine learning techniques (MLTs) and comparing their ability to predict the outcome of interest. The study involved patients admitted to PICUs between 2010 and 2020. Data were extracted from the Italian Network of Pediatric Intensive Care Units (TIPNet) registry. The algorithms considered were generalized linear model (GLM), recursive partition tree (RPART), random forest (RF), neural network models, and extreme gradient boosting (XGB). Since the outcome is rare, upsampling and downsampling algorithms were applied for imbalance control. For each approach, the main performance measures were reported. Among an overall sample of 29,494 subjects, only 399 developed HF during the PICU stay. The median age was about two years, and the male gender was the most prevalent. The XGB algorithm outperformed other MLTs in predicting HF development, with a median ROC measure of 0.780 (IQR 0.770-0.793). PIM 3, age, and base excess were found to be the strongest predictors of outcome. The present work provides insights for the prediction of HF development during PICU stay using machine-learning algorithms.
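Upsampling and downsampling, as mentioned in this abstract, can be illustrated with plain random resampling. The sketch below is an assumption about the general technique, not the registry analysis itself; the function name `rebalance` and the `mode` parameter are hypothetical, and it assumes the majority class is at least as large as the minority class.

```python
import random

def rebalance(majority, minority, mode="down", seed=0):
    """Return (majority, minority) lists of equal size.

    mode="down": randomly discard majority cases until the classes match.
    mode="up":   randomly duplicate minority cases until the classes match.
    """
    if len(majority) < len(minority):
        raise ValueError("majority must be at least as large as minority")
    rng = random.Random(seed)
    if mode == "down":
        return rng.sample(majority, len(minority)), list(minority)
    if mode == "up":
        n_extra = len(majority) - len(minority)
        extra = [rng.choice(minority) for _ in range(n_extra)]
        return list(majority), list(minority) + extra
    raise ValueError("mode must be 'down' or 'up'")
```

With an outcome as rare as the one reported here (399 events among 29,494 subjects), downsampling discards most of the data while upsampling repeats each event many times, which is one reason both are usually compared.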
Affiliation(s)
- Rosanna I. Comoretto
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Danila Azzolina
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Department of Medical Sciences, University of Ferrara, 44100 Ferrara, Italy
- Angela Amigoni
- Pediatric Intensive Care Unit, Department of Women’s and Children’s Health, University Hospital of Padua, Via Giustiniani 2, 35128 Padova, Italy
- Giorgia Stoppa
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Federica Todino
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Andrea Wolfler
- Department of Anaesthesia, Gaslini Hospital, 16147 Genova, Italy
- Dario Gregori
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Correspondence: Tel.: +39-049-8275-384; Fax: +39-02-700-445-089
|
20
|
Tezza F, Lorenzoni G, Azzolina D, Barbar S, Leone LAC, Gregori D. Predicting in-Hospital Mortality of Patients with COVID-19 Using Machine Learning Techniques. J Pers Med 2021; 11:343. [PMID: 33923332 DOI: 10.3390/jpm11050343] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/09/2021] [Revised: 04/20/2021] [Accepted: 04/21/2021] [Indexed: 12/28/2022]
Abstract
The present work aims to identify the predictors of COVID-19 in-hospital mortality by testing a set of Machine Learning Techniques (MLTs) and comparing their ability to predict the outcome of interest. The model with the best performance will be used to identify in-hospital mortality predictors and to build an in-hospital mortality prediction tool. The study involved patients with COVID-19, confirmed by PCR test, admitted to the “Ospedali Riuniti Padova Sud” COVID-19 referral center in the Veneto region, Italy. The algorithms considered were the Recursive Partition Tree (RPART), the Support Vector Machine (SVM), the Gradient Boosting Machine (GBM), and Random Forest. The resampled performances were reported for each MLT, considering the sensitivity, specificity, and the Receiver Operating Characteristic (ROC) curve measures. The study enrolled 341 patients. The median age was 74 years, and the male gender was the most prevalent. The Random Forest algorithm outperformed the other MLTs in predicting in-hospital mortality, with a ROC of 0.84 (95% C.I. 0.78–0.9). Age, together with vital signs (oxygen saturation and the quick SOFA) and lab parameters (creatinine, AST, lymphocytes, platelets, and hemoglobin), were found to be the strongest predictors of in-hospital mortality. The present work provides insights for the prediction of in-hospital mortality of COVID-19 patients using a machine-learning algorithm.
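The "ROC" figure reported above is presumably the area under the ROC curve (AUC). As a hedged illustration, the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy data in the usage note are illustrative only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes (1 = event, e.g. in-hospital death).
    scores: predicted risk for each case; higher means more likely event.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

For example, `roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` gives 0.75, because three of the four positive/negative pairs are ranked correctly. The pairwise loop is quadratic in the sample size, which is acceptable for a cohort of a few hundred patients.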
|
21
|
Zhang Y, D’Haeseleer I, Coelho J, Vanden Abeele V, Vanrumste B. Recognition of Bathroom Activities in Older Adults Using Wearable Sensors: A Systematic Review and Recommendations. Sensors (Basel) 2021; 21:s21062176. [PMID: 33804626 PMCID: PMC8003704 DOI: 10.3390/s21062176] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/01/2020] [Revised: 03/10/2021] [Accepted: 03/17/2021] [Indexed: 11/16/2022]
Abstract
This article provides a systematic review of studies on recognising bathroom activities in older adults using wearable sensors. Bathroom activities are an important part of Activities of Daily Living (ADL). The performance on ADL activities is used to predict the ability of older adults to live independently. This paper aims to provide an overview of the studied bathroom activities, the wearable sensors used, different applied methodologies and the tested activity recognition techniques. Six databases were screened up to March 2020, based on four categories of keywords: older adults, activity recognition, bathroom activities and wearable sensors. In total, 4262 unique papers were found, of which only seven met the inclusion criteria. This small number shows that few studies have been conducted in this field. Therefore, in addition, this critical review resulted in several recommendations for future studies. In particular, we recommend to (1) study complex bathroom activities, including multiple movements; (2) recruit participants, especially the target population; (3) conduct both lab and real-life experiments; (4) investigate the optimal number and positions of wearable sensors; (5) choose a suitable annotation method; (6) investigate deep learning models; (7) evaluate the generality of classifiers; and (8) investigate both detection and quality performance of an activity.
Affiliation(s)
- Yiyuan Zhang
- KU Leuven, e-Media Research Lab, 3000 Leuven, Belgium
- KU Leuven, Stadius, Department of Electrical Engineering, 3001 Leuven, Belgium
- Ine D’Haeseleer
- KU Leuven, e-Media Research Lab, 3000 Leuven, Belgium
- KU Leuven, HCI, Department of Computer Science, 3001 Leuven, Belgium
- José Coelho
- LaSIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Campo Grande, 1749-016 Lisboa, Portugal
- Vero Vanden Abeele
- KU Leuven, e-Media Research Lab, 3000 Leuven, Belgium
- KU Leuven, HCI, Department of Computer Science, 3001 Leuven, Belgium
- Bart Vanrumste
- KU Leuven, e-Media Research Lab, 3000 Leuven, Belgium
- KU Leuven, Stadius, Department of Electrical Engineering, 3001 Leuven, Belgium
|
22
|
Chen YS, Cheng CH, Chen SF, Jhuang JY. Identification of the Framingham Risk Score by an Entropy-Based Rule Model for Cardiovascular Disease. Entropy (Basel) 2020; 22:E1406. [PMID: 33322122 DOI: 10.3390/e22121406] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/24/2020] [Revised: 11/30/2020] [Accepted: 12/11/2020] [Indexed: 12/12/2022]
Abstract
Since 2001, cardiovascular disease (CVD) has had the second-highest mortality rate, about 15,700 people per year, in Taiwan. It has thus imposed a substantial burden on medical resources. This study was triggered by the following three factors. First, the CVD problem reflects an urgent issue. A high priority has been placed on long-term therapy and prevention to reduce the wastage of medical resources, particularly in developed countries. Second, from the perspective of preventive medicine, popular data-mining methods have been well learned and studied, with excellent performance in medical fields. Thus, identification of the risk factors of CVD using these popular techniques is a prime concern. Third, the Framingham risk score is a core indicator that can be used to establish an effective prediction model to accurately diagnose CVD. Thus, this study proposes an integrated predictive model to organize five notable classifiers: the rough set (RS), decision tree (DT), random forest (RF), multilayer perceptron (MLP), and support vector machine (SVM), with a novel use of the Framingham risk score for attribute selection (i.e., F-attributes first identified in this study) to determine the key features for identifying CVD. Verification experiments were conducted with three evaluation criteria (accuracy, sensitivity, and specificity) based on 1190 instances of a CVD dataset available from a Taiwan teaching hospital and 2019 examples from a public Framingham dataset. Given the empirical results, the SVM showed the best performance in terms of accuracy (99.67%), sensitivity (99.93%), and specificity (99.71%) in all F-attributes in the CVD dataset compared to the other listed classifiers. The RS showed the highest performance in terms of accuracy (85.11%), sensitivity (86.06%), and specificity (85.19%) in most of the F-attributes in the Framingham dataset.
These results provide new evidence that no classifier or model is suitable for all practical datasets of medical applications. Thus, identifying an appropriate classifier to address specific medical data is important. Significantly, this study is novel in its calculation and identification of the use of key Framingham risk attributes integrated with the DT technique to produce entropy-based decision rules of knowledge sets, which has not been undertaken in previous research. This study conclusively yielded meaningful entropy-based knowledgeable rules in tree structures and contributed to the differentiation of classifiers from the two datasets with three useful research findings and three helpful management implications for subsequent medical research. In particular, these rules provide reasonable solutions to simplify processes of preventive medicine by standardizing the formats and codes used in medical data to address CVD problems. The specificity of these rules is thus significant compared to those of past research.
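Entropy-based decision rules of the kind described above are commonly built by choosing, at each tree node, the attribute with the highest information gain. The sketch below shows that calculation under the assumption of categorical attributes and Shannon entropy in bits; it is not taken from the paper, and the attribute values in the usage note are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in label entropy obtained by splitting on an attribute.

    values: the attribute value of each instance (categorical).
    labels: the class label of each instance (e.g. CVD yes/no).
    """
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

For a balanced two-class sample, an attribute that separates the classes perfectly has a gain of 1 bit, for example `information_gain(["high", "high", "low", "low"], [1, 1, 0, 0])`, while an attribute unrelated to the class has a gain of 0.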
|
23
|
Choi BY, Wang CP, Gelfond J. Machine learning outcome regression improves doubly robust estimation of average causal effects. Pharmacoepidemiol Drug Saf 2020; 29:1120-1133. [PMID: 32716126 PMCID: PMC8098857 DOI: 10.1002/pds.5074] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/18/2020] [Revised: 06/17/2020] [Accepted: 06/18/2020] [Indexed: 11/06/2022]
Abstract
BACKGROUND Doubly robust estimation produces an unbiased estimator for the average treatment effect unless both propensity score (PS) and outcome models are incorrectly specified. Studies have shown that the doubly robust estimator is subject to more bias than the standard weighting estimator when both PS and outcome models are incorrectly specified. METHOD We evaluated whether various machine learning methods can be used for estimating conditional means of the potential outcomes to enhance the robustness of the doubly robust estimator to various degrees of model misspecification in terms of reducing bias and standard error. We considered four types of methods to predict the outcomes: least squares, tree-based methods, generalized additive models and shrinkage methods. We also considered an ensemble method called the Super Learner (SL), which is a linear combination of multiple learners. We conducted simulations considering different scenarios defined by the complexity of the PS and outcome-generating models and by ranges of treatment prevalence. RESULTS The shrinkage methods performed well, yielding doubly robust estimates that were robust in terms of bias and mean squared error across the scenarios when the models were enriched by including all 2-way interactions of the covariates. The SL performed similarly to the best method in each scenario. CONCLUSIONS Our findings indicate that machine learning methods such as the SL or the shrinkage methods using interaction models should be used for more accurate doubly robust estimators.
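A common form of the doubly robust estimator is augmented inverse probability weighting (AIPW), which combines outcome-model predictions with a PS-weighted correction of their residuals. The sketch below assumes that form; the abstract does not state which variant the authors used, and the function name `aipw_ate` and its argument names are hypothetical. The fitted propensity scores and outcome predictions are taken as given inputs, however they were estimated.

```python
def aipw_ate(y, t, ps, mu1, mu0):
    """Augmented IPW estimate of the average treatment effect.

    y:   observed outcomes
    t:   treatment indicators (1 = treated, 0 = control)
    ps:  estimated propensity scores P(T = 1 | X), strictly between 0 and 1
    mu1: predicted outcome for each subject under treatment
    mu0: predicted outcome for each subject under control
    """
    n = len(y)
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, ps, mu1, mu0):
        total += (
            m1 - m0                               # outcome-model contrast
            + ti * (yi - m1) / ei                 # weighted residual, treated
            - (1 - ti) * (yi - m0) / (1.0 - ei)   # weighted residual, control
        )
    return total / n
```

If the outcome predictions are exactly right, the residual terms vanish and the estimate reduces to the mean of `mu1 - mu0`; if the outcome model is uninformative but the propensity scores are right, the expression reduces to the usual inverse-probability-weighted contrast. This is the sense in which only one of the two models needs to be correct.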
Affiliation(s)
- Byeong Yeob Choi
- Department of Population Health Sciences, UT Health San Antonio, San Antonio, Texas, USA
- Chen-Pin Wang
- Department of Population Health Sciences, UT Health San Antonio, San Antonio, Texas, USA
- Jonathan Gelfond
- Department of Population Health Sciences, UT Health San Antonio, San Antonio, Texas, USA
|
24
|
Wang J, Bell M, Liu X, Liu G. Machine-Learning Techniques Can Enhance Dairy Cow Estrus Detection Using Location and Acceleration Data. Animals (Basel) 2020; 10:ani10071160. [PMID: 32650526 PMCID: PMC7401617 DOI: 10.3390/ani10071160] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/30/2020] [Revised: 05/25/2020] [Accepted: 07/07/2020] [Indexed: 11/16/2022]
Abstract
The aim of this study was to assess combining location, acceleration and machine learning technologies to detect estrus in dairy cows. Data were obtained from 12 cows, which were monitored continuously for 12 days. A neck mounted device collected 25,684 records for location and acceleration. Four machine-learning approaches were tested (K-nearest neighbor (KNN), back-propagation neural network (BPNN), linear discriminant analysis (LDA), and classification and regression tree (CART)) to automatically identify cows in estrus from estrus indicators determined by principal component analysis (PCA) of twelve behavioral metrics, which were: duration of standing, duration of lying, duration of walking, duration of feeding, duration of drinking, switching times between activity and lying, steps, displacement, average velocity, walking times, feeding times, and drinking times. The study showed that the neck tag had a static and dynamic positioning accuracy of 0.25 ± 0.06 m and 0.45 ± 0.15 m, respectively. In the 0.5-h, 1-h, and 1.5-h time windows, the machine learning approaches ranged from 73.3 to 99.4% for sensitivity, from 50 to 85.7% for specificity, from 77.8 to 95.8% for precision, from 55.6 to 93.7% for negative predictive value (NPV), from 72.7 to 95.4% for accuracy, and from 78.6 to 97.5% for F1 score. We found that the BPNN algorithm with 0.5-h time window was the best predictor of estrus in dairy cows. Based on these results, the integration of location, acceleration, and machine learning methods can improve dairy cow estrus detection.
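The six performance measures listed in this abstract all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the counts in the usage note are illustrative and are not data from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp: estrus events correctly detected     fp: false alarms
    tn: non-estrus correctly rejected        fn: missed estrus events
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }
```

With the illustrative counts `tp=8, fp=2, tn=6, fn=4`, this gives a sensitivity of about 0.667, a specificity of 0.75, a precision of 0.8, an NPV of 0.6, an accuracy of 0.7, and an F1 score of about 0.727. The function does not guard against empty rows or columns (for example, no positive cases), which would raise a division-by-zero error.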
Affiliation(s)
- Jun Wang
- School of Agricultural Equipment Engineering, Henan University of Science and Technology, Luoyang 471003, China;
- Correspondence:
| | - Matt Bell
- School of Biosciences, The University of Nottingham, Sutton Bonington, Loughborough LE12 5RD, UK;
| | - Xiaohang Liu
- School of Agricultural Equipment Engineering, Henan University of Science and Technology, Luoyang 471003, China;
| | - Gang Liu
- Key Laboratory for Modern Precision Agriculture System Integration Research, Ministry of Education, China Agricultural University, Beijing 100083, China;
| |
|
25
|
Serafim MSM, Kronenberger T, Oliveira PR, Poso A, Honório KM, Mota BEF, Maltarollo VG. The application of machine learning techniques to innovative antibacterial discovery and development. Expert Opin Drug Discov 2020; 15:1165-1180. [PMID: 32552005 DOI: 10.1080/17460441.2020.1776696] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/13/2022]
Abstract
INTRODUCTION: After the initial wave of antibiotic discovery, few novel classes of antibiotics have emerged, with the latest dating back to the 1980s. Furthermore, the pace of antibiotic drug discovery is unable to keep up with the increasing prevalence of antibiotic drug resistance. However, the increasing amount of available data promotes the use of machine learning techniques (MLT) in drug discovery projects (e.g. construction of regression/classification models and ranking/virtual screening of compounds). AREAS COVERED: In this review, the authors cover some of the applications of MLT in medicinal chemistry, focusing on the development of new antibiotics and the prediction of resistance and its mechanisms. The aim of this review is to illustrate the main advantages and disadvantages and the major trends from studies over the past 5 years. EXPERT OPINION: The application of MLT to antibacterial drug discovery can aid the selection of new and potent lead compounds, with desirable pharmacokinetic and toxicity profiles, for further optimization. The increasing volume of available data, along with the constant improvement in computational power and algorithms, has meant that we are experiencing a transition in the way we face modern issues such as drug resistance, where our decisions are data-driven and experiments can be focused by data-suggested hypotheses.
Affiliation(s)
- Mateus Sá Magalhães Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG) , Belo Horizonte, Brazil
| | - Thales Kronenberger
- Department of Internal Medicine VIII, University Hospital of Tübingen , Tübingen, Germany
| | | | - Antti Poso
- Department of Internal Medicine VIII, University Hospital of Tübingen , Tübingen, Germany.,School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland , Kuopio, Finland
| | - Káthia Maria Honório
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP) , São Paulo, Brazil.,Centro de Ciências Naturais e Humanas, Universidade Federal do ABC , Santo André, Brazil
| | - Bruno Eduardo Fernandes Mota
- Departamento de Análises Clínicas e Toxicológicas, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG) , Belo Horizonte, Brazil
| | - Vinícius Gonçalves Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG) , Belo Horizonte, Brazil
| |
|
26
|
Ivanenkov YA, Zhavoronkov A, Yamidanov RS, Osterman IA, Sergiev PV, Aladinskiy VA, Aladinskaya AV, Terentiev VA, Veselov MS, Ayginin AA, Kartsev VG, Skvortsov DA, Chemeris AV, Baimiev AK, Sofronova AA, Malyshev AS, Filkov GI, Bezrukov DS, Zagribelnyy BA, Putin EO, Puchinina MM, Dontsova OA. Identification of Novel Antibacterials Using Machine Learning Techniques. Front Pharmacol 2019; 10:913. [PMID: 31507413 PMCID: PMC6719509 DOI: 10.3389/fphar.2019.00913] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/15/2019] [Accepted: 07/19/2019] [Indexed: 11/19/2022] Open
Abstract
Many pharmaceutical companies are avoiding the development of novel antibacterials for a range of rational reasons and because of the high risk of failure. However, there is an urgent need for novel antibiotics, especially against resistant bacterial strains. Available in silico models suffer from many drawbacks and, therefore, are not applicable for scoring novel molecules with high structural diversity by their antibacterial potency. Considering this, the overall aim of this study was to develop an efficient in silico model able to find compounds that are likely to exhibit antibacterial activity. Based on a proprietary screening campaign, we have accumulated a representative dataset of more than 140,000 molecules with antibacterial activity against Escherichia coli assessed in the same assay and under the same conditions. This intriguing set has no analogue in the scientific literature. We applied six in silico techniques to mine these data. For external validation, we used 5,000 compounds with low similarity to the training samples. The antibacterial activity of the selected molecules against E. coli was assessed using a comprehensive biological study. Kohonen-based nonlinear mapping was used for the first time and provided the best predictive power (av. 75.5%). Several compounds showed outstanding antibacterial potency and were identified as translation machinery inhibitors in vitro and in vivo. For the best compounds, MIC and CC50 values were determined to allow us to estimate a selectivity index (SI). Many active compounds have a robust IP position.
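The abstract does not state how the selectivity index was computed. A minimal sketch in Python, assuming the commonly used definition SI = CC50 / MIC with both concentrations in the same units; the function name and the example values are hypothetical:

```python
def selectivity_index(cc50, mic):
    """Ratio of cytotoxic concentration (CC50) to inhibitory concentration (MIC).

    Assumption: SI = CC50 / MIC, with both values in the same units.
    A larger SI means the compound inhibits bacteria at concentrations
    well below those that harm host cells.
    """
    if mic <= 0:
        raise ValueError("MIC must be positive")
    return cc50 / mic


# Hypothetical values (e.g. both in µg/mL), for illustration only
si = selectivity_index(cc50=50.0, mic=2.0)
```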
Affiliation(s)
- Yan A. Ivanenkov
- Institute of Biochemistry and Genetics Russian Academy of Science (IBG RAS) Ufa Scientific Centre, Ufa, Russia
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
- Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
- Insilico Medicine, Inc. Johns Hopkins University, Rockville, MD, United States
| | - Alex Zhavoronkov
- Insilico Medicine, Inc. Johns Hopkins University, Rockville, MD, United States
| | - Renat S. Yamidanov
- Institute of Biochemistry and Genetics Russian Academy of Science (IBG RAS) Ufa Scientific Centre, Ufa, Russia
- Insilico Medicine, Inc. Johns Hopkins University, Rockville, MD, United States
| | - Ilya A. Osterman
- Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
- Skolkovo Institute of Science and Technology, Skolkovo, Russia
| | - Petr V. Sergiev
- Skolkovo Institute of Science and Technology, Skolkovo, Russia
- Department of Chemistry and A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
| | - Vladimir A. Aladinskiy
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
- Insilico Medicine, Inc. Johns Hopkins University, Rockville, MD, United States
| | - Anastasia V. Aladinskaya
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
- Insilico Medicine, Inc. Johns Hopkins University, Rockville, MD, United States
| | - Victor A. Terentiev
- Institute of Biochemistry and Genetics Russian Academy of Science (IBG RAS) Ufa Scientific Centre, Ufa, Russia
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
- Insilico Medicine, Inc. Johns Hopkins University, Rockville, MD, United States
| | - Mark S. Veselov
- Institute of Biochemistry and Genetics Russian Academy of Science (IBG RAS) Ufa Scientific Centre, Ufa, Russia
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
- Insilico Medicine, Inc. Johns Hopkins University, Rockville, MD, United States
| | - Andrey A. Ayginin
- Institute of Biochemistry and Genetics Russian Academy of Science (IBG RAS) Ufa Scientific Centre, Ufa, Russia
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
| | | | - Dmitry A. Skvortsov
- Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
- Faculty of Biology and Biotechnologies, Higher School of Economics, Moscow, Russia
| | - Alexey V. Chemeris
- Institute of Biochemistry and Genetics Russian Academy of Science (IBG RAS) Ufa Scientific Centre, Ufa, Russia
| | - Alexey Kh. Baimiev
- Institute of Biochemistry and Genetics Russian Academy of Science (IBG RAS) Ufa Scientific Centre, Ufa, Russia
| | - Alina A. Sofronova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
| | | | - Gleb I. Filkov
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
| | - Dmitry S. Bezrukov
- Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
- Skolkovo Institute of Science and Technology, Skolkovo, Russia
| | | | - Evgeny O. Putin
- Computer Technologies Lab, ITMO University, St. Petersburg, Russia
| | - Maria M. Puchinina
- Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
| | - Olga A. Dontsova
- Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
- Skolkovo Institute of Science and Technology, Skolkovo, Russia
- Department of Chemistry and A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
| |
|
27
|
Lorenzoni G, Sabato SS, Lanera C, Bottigliengo D, Minto C, Ocagli H, De Paolis P, Gregori D, Iliceto S, Pisanò F. Comparison of Machine Learning Techniques for Prediction of Hospitalization in Heart Failure Patients. J Clin Med 2019; 8:E1298. [PMID: 31450546 DOI: 10.3390/jcm8091298] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/11/2019] [Revised: 08/20/2019] [Accepted: 08/22/2019] [Indexed: 12/23/2022] Open
Abstract
The present study aims to compare the performance of eight Machine Learning Techniques (MLTs) in the prediction of hospitalization among patients with heart failure, using data from the Gestione Integrata dello Scompenso Cardiaco (GISC) study. The GISC project is an ongoing study that takes place in the region of Puglia, Southern Italy. Patients with a diagnosis of heart failure are enrolled in a long-term assistance program that includes the adoption of an online platform for data sharing between general practitioners and cardiologists working in hospitals and community health districts. Logistic regression, generalized linear model net (GLMN), classification and regression tree, random forest, adaboost, logitboost, support vector machine, and neural networks were applied to evaluate the feasibility of such techniques in predicting hospitalization of 380 patients enrolled in the GISC study, using data about demographic characteristics, medical history, and clinical characteristics of each patient. The MLTs were compared both without and with missing data imputation. Overall, models trained without missing data imputation showed higher predictive performances. The GLMN showed better performance in predicting hospitalization than the other MLTs, with an average accuracy, positive predictive value and negative predictive value of 81.2%, 87.5%, and 75%, respectively. Present findings suggest that MLTs may represent a promising opportunity to predict hospital admission of heart failure patients by exploiting health care information generated by the contact of such patients with the health care system.
|
28
|
Hernández-Del-Olmo F, Gaudioso E, Duro N, Dormido R. Machine Learning Weather Soft-Sensor for Advanced Control of Wastewater Treatment Plants. Sensors (Basel) 2019; 19:E3139. [PMID: 31319478 DOI: 10.3390/s19143139] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 05/30/2019] [Revised: 06/27/2019] [Accepted: 07/15/2019] [Indexed: 11/27/2022]
Abstract
Control of wastewater treatment plants (WWTPs) is challenging not only because of their high nonlinearity but also because of important external perturbations. One of the most relevant of these perturbations is weather. In fact, different weather conditions imply different inflow rates and substance concentrations (e.g., N-ammonia, which is among the most important). Therefore, weather has traditionally been an important signal that operators take into account to tune WWTP control systems. This signal cannot be directly measured with traditional physical sensors. Nevertheless, machine learning-based soft-sensors can be used to predict non-observable measures by means of available data. In this paper, we present novel research about a new soft-sensor that predicts the current weather signal. This weather prediction differs from traditional weather forecasting since this soft-sensor predicts the weather conditions as an operator does when controlling the WWTP. This prediction uses a model based on past WWTP influent states measured by only a few physical and widely applied sensors. The results are encouraging, as we obtained a good accuracy level for a relevant and very useful signal when applied to advanced WWTP control systems.
|
29
|
Fanizzi A, Losurdo L, Basile TMA, Bellotti R, Bottigli U, Delogu P, Diacono D, Didonna V, Fausto A, Lombardi A, Lorusso V, Massafra R, Tangaro S, La Forgia D. Fully Automated Support System for Diagnosis of Breast Cancer in Contrast-Enhanced Spectral Mammography Images. J Clin Med 2019; 8:jcm8060891. [PMID: 31234363 PMCID: PMC6616937 DOI: 10.3390/jcm8060891] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 04/30/2019] [Revised: 06/08/2019] [Accepted: 06/17/2019] [Indexed: 12/24/2022] Open
Abstract
Contrast-Enhanced Spectral Mammography (CESM) is a novel instrument for diagnosing breast cancer, but it can still be considered operator dependent. In this paper, we propose a fully automatic system as a diagnostic support tool for clinicians. For each Region Of Interest (ROI), a feature set was extracted from low-energy and recombined images by using different techniques. A Random Forest classifier was trained on a subset of significant features selected by a sequential feature selection algorithm. The proposed Computer-Automated Diagnosis system was tested on 48 ROIs extracted from 53 patients referred to Istituto Tumori “Giovanni Paolo II” of Bari (Italy) from the breast cancer screening phase between March 2017 and June 2018. The method performed well in the prediction of benign/malignant ROIs, with median values of sensitivity and specificity of 87.5% and 91.7%, respectively. The performance was high compared to the state of the art, even with a moderate/marked level of parenchymal background. Our classification model outperformed the human reader, increasing the specificity by more than 8%. Therefore, our system could represent a valid support tool for radiologists interpreting CESM images, both reducing the false positive rate and limiting biopsies and surgeries.
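The abstract does not say which variant of sequential feature selection was used or what criterion drove it. A minimal sketch in Python of the greedy forward variant, under those stated unknowns; `forward_select`, `toy_score`, and the feature names are hypothetical, and in practice the score would typically be a cross-validated classifier metric:

```python
def forward_select(features, score, max_features=None):
    """Greedy forward selection.

    Repeatedly adds the single feature that most improves score(subset)
    and stops when no remaining feature improves it.
    """
    selected = []
    best = float("-inf")
    remaining = list(features)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score each candidate subset formed by adding one remaining feature
        candidate, cand_score = max(
            ((f, score(selected + [f])) for f in remaining),
            key=lambda pair: pair[1],
        )
        if cand_score <= best:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best = cand_score
    return selected, best


# Toy scoring function: two informative features and a small size penalty
USEFUL = {"contrast": 0.5, "entropy": 0.3}


def toy_score(subset):
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.05 * len(subset)


chosen, value = forward_select(["area", "contrast", "entropy", "mean"], toy_score)
```

Here the search keeps the two informative features and stops once adding a third would lower the penalized score.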
Affiliation(s)
- Annarita Fanizzi
- Dip. di Diagnosi e Terapia per Immagini, I.R.C.C.S. Istituto Tumori "Giovanni Paolo II" di Bari, 70124 Bari, Italy.
| | - Liliana Losurdo
- Dip. di Diagnosi e Terapia per Immagini, I.R.C.C.S. Istituto Tumori "Giovanni Paolo II" di Bari, 70124 Bari, Italy.
| | - Teresa Maria A Basile
- Dip. Interateneo di Fisica "M. Merlin", Università degli Studi di Bari "A. Moro", 70125 Bari, Italy.
| | - Roberto Bellotti
- Dip. Interateneo di Fisica "M. Merlin", Università degli Studi di Bari "A. Moro", 70125 Bari, Italy.
| | - Ubaldo Bottigli
- Dip. di Scienze Fisiche, della Terra e dell'Ambiente, Università degli Studi di Siena, 53100 Siena, Italy.
| | - Pasquale Delogu
- Dip. di Scienze Fisiche, della Terra e dell'Ambiente, Università degli Studi di Siena, 53100 Siena, Italy.
| | - Domenico Diacono
- INFN-Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy.
| | - Vittorio Didonna
- Dip. di Diagnosi e Terapia per Immagini, I.R.C.C.S. Istituto Tumori "Giovanni Paolo II" di Bari, 70124 Bari, Italy.
| | - Alfonso Fausto
- Dip. di Diagnostica per Immagini, Azienda Ospedaliera Universitaria Senese, 53100 Siena, Italy.
| | - Angela Lombardi
- INFN-Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy.
| | - Vito Lorusso
- Dip. Area Medica, I.R.C.C.S. Istituto Tumori "Giovanni Paolo II" di Bari, 70124 Bari, Italy.
| | - Raffaella Massafra
- Dip. di Diagnosi e Terapia per Immagini, I.R.C.C.S. Istituto Tumori "Giovanni Paolo II" di Bari, 70124 Bari, Italy.
| | - Sabina Tangaro
- INFN-Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy.
| | - Daniele La Forgia
- Dip. di Diagnosi e Terapia per Immagini, I.R.C.C.S. Istituto Tumori "Giovanni Paolo II" di Bari, 70124 Bari, Italy.
| |
|
30
|
Bottigliengo D, Berchialla P, Lanera C, Azzolina D, Lorenzoni G, Martinato M, Giachino D, Baldi I, Gregori D. The Role of Genetic Factors in Characterizing Extra-Intestinal Manifestations in Crohn's Disease Patients: Are Bayesian Machine Learning Methods Improving Outcome Predictions? J Clin Med 2019; 8:jcm8060865. [PMID: 31212952 PMCID: PMC6617350 DOI: 10.3390/jcm8060865] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/03/2019] [Revised: 06/12/2019] [Accepted: 06/13/2019] [Indexed: 01/01/2023] Open
Abstract
(1) Background: The high heterogeneity of inflammatory bowel disease (IBD) makes the study of this condition challenging. In subjects affected by Crohn’s disease (CD), extra-intestinal manifestations (EIMs) have a remarkable potential impact on health status. Increasing numbers of patient characteristics and the small size of analyzed samples make EIMs prediction very difficult. Under such constraints, Bayesian machine learning techniques (BMLTs) have been proposed as a robust alternative to classical models for outcome prediction. This study aims to determine whether BMLT could improve EIM prediction and statistical support for the decision-making process of clinicians. (2) Methods: Three of the most popular BMLTs were employed in this study: Naïve Bayes (NB), Bayesian Network (BN) and Bayesian Additive Regression Trees (BART). They were applied to a retrospective observational Italian study of IBD genetics. (3) Results: The performance of the model is strongly affected by the features of the dataset, and BMLTs poorly classify EIM appearance. (4) Conclusions: This study shows that BMLTs perform worse than expected in classifying the presence of EIMs compared to classical statistical tools in a context where mixed genetic and clinical data are available but relevant data are also missing, as often occurs in clinical practice.
Affiliation(s)
- Daniele Bottigliengo
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy.
| | - Paola Berchialla
- Department of Clinical and Biological Sciences, University of Torino, 10126 Torino, Italy.
| | - Corrado Lanera
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy.
| | - Danila Azzolina
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy.
| | - Giulia Lorenzoni
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy.
| | - Matteo Martinato
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy.
| | - Daniela Giachino
- Department of Clinical and Biological Sciences, University of Torino, 10126 Torino, Italy.
| | - Ileana Baldi
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy.
| | - Dario Gregori
- Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy.
| |
|
31
|
Dean SN, Shriver-Lake LC, Stenger DA, Erickson JS, Golden JP, Trammell SA. Machine Learning Techniques for Chemical Identification Using Cyclic Square Wave Voltammetry. Sensors (Basel) 2019; 19:E2392. [PMID: 31130606 DOI: 10.3390/s19102392] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/25/2019] [Revised: 05/16/2019] [Accepted: 05/22/2019] [Indexed: 11/17/2022]
Abstract
Electroanalytical techniques are useful for detection and identification because the instrumentation is simple and can support a wide variety of assays. One example is cyclic square wave voltammetry (CSWV), a practical detection technique for different classes of compounds including explosives, herbicides/pesticides, industrial compounds, and heavy metals. A key barrier to the widespread application of CSWV for chemical identification is the necessity of a high performance, generalizable classification algorithm. Here, machine and deep learning models were developed for classifying samples based on voltammograms alone. The highest performing models were Long Short-Term Memory (LSTM) and Fully Convolutional Networks (FCNs), depending on the dataset against which performance was assessed. When compared to other algorithms, previously used for classification of CSWV and other similar data, our LSTM and FCN-based neural networks achieve higher sensitivity and specificity with the area under the curve values from receiver operating characteristic (ROC) analyses greater than 0.99 for several datasets. Class activation maps were paired with CSWV scans to assist in understanding the decision-making process of the networks, and their ability to utilize this information was examined. The best-performing models were then successfully applied to new or holdout experimental data. An automated method for processing CSWV data, training machine learning models, and evaluating their prediction performance is described, and the tools generated provide support for the identification of compounds using CSWV from samples in the field.
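The area under the ROC curve cited in this abstract can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. A minimal sketch in Python of that pairwise definition, for binary labels only; the function name, labels, and scores are hypothetical and do not come from the study:

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This O(n_pos * n_neg) form is meant for clarity, not for large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores, for illustration only
auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.7, 0.3, 0.6, 0.2])
```

In this toy example eight of the nine positive/negative pairs are ordered correctly, so the AUC is 8/9.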
|
32
|
Alashwal H, El Halaby M, Crouse JJ, Abdalla A, Moustafa AA. The Application of Unsupervised Clustering Methods to Alzheimer's Disease. Front Comput Neurosci 2019; 13:31. [PMID: 31178711 PMCID: PMC6543980 DOI: 10.3389/fncom.2019.00031] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 01/20/2019] [Accepted: 04/29/2019] [Indexed: 12/24/2022] Open
Abstract
Clustering is a powerful machine learning tool for detecting structures in datasets. In the medical field, clustering has been proven to be a powerful tool for discovering patterns and structure in labeled and unlabeled datasets. Unlike supervised methods, clustering is an unsupervised method that works on datasets in which there is no outcome (target) variable nor is anything known about the relationship between the observations, that is, unlabeled data. In this paper, we focus on studying and reviewing clustering methods that have been applied to datasets of neurological diseases, especially Alzheimer’s disease (AD). The aim is to provide insights into which clustering technique is more suitable for partitioning patients of AD based on their similarity. This is important as clustering algorithms can find patterns across patients that are difficult for medical practitioners to find. We further discuss the implications of the use of clustering algorithms in the treatment of AD. We found that clustering analysis can point to several features that underlie the conversion from early-stage AD to advanced AD. Furthermore, future work can apply semi-clustering algorithms on AD datasets, which will enhance clusters by including additional information.
Affiliation(s)
- Hany Alashwal
- Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al-Ain, United Arab Emirates
| | - Mohamed El Halaby
- Department of Mathematics, Faculty of Science, Cairo University, Giza, Egypt
| | - Jacob J Crouse
- Brain and Mind Centre, The University of Sydney, Sydney, NSW, Australia
| | - Areeg Abdalla
- Department of Mathematics, Faculty of Science, Cairo University, Giza, Egypt
| | - Ahmed A Moustafa
- School of Social Sciences and Psychology, Western Sydney University, Sydney, NSW, Australia
| |
|
33
|
Rajput A, Kumar M. Anti-flavi: A Web Platform to Predict Inhibitors of Flaviviruses Using QSAR and Peptidomimetic Approaches. Front Microbiol 2018; 9:3121. [PMID: 30619195 PMCID: PMC6305493 DOI: 10.3389/fmicb.2018.03121] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/12/2018] [Accepted: 12/03/2018] [Indexed: 01/27/2023] Open
Abstract
Flaviviruses are arboviruses comprising more than 70 viruses that cover broad geographic ranges and are responsible for significant mortality and morbidity globally. Because efficient inhibitors targeting flaviviruses are lacking, designing novel and efficient anti-flavi agents is an important problem. Therefore, in the current study, we developed a dedicated prediction algorithm, anti-flavi, to identify the inhibition ability of chemicals and peptides against flaviviruses through a quantitative structure–activity relationship based method. We extracted 2168 non-redundant chemicals and 117 peptides from the ChEMBL and AVPpred databases, respectively, with reported IC50 values. The regression-based models developed on training/testing datasets of 1952 chemicals and 105 peptides displayed Pearson’s correlation coefficients (PCC) of 0.87, 0.84, and 0.87, 0.83 using support vector machine and random forest techniques, respectively. We also explored the peptidomimetics approach, in which the most contributing descriptors of peptides were used to identify chemicals having anti-flavi potential. Conversely, the selected descriptors of chemicals performed well in predicting anti-flavi peptides. Moreover, the developed model proved to be highly robust when checked through various approaches such as independent validation and decoy datasets. We hope that our web server will prove a useful tool to predict and design efficient anti-flavi agents. The anti-flavi web server is freely available at http://bioinfo.imtech.res.in/manojk/antiflavi.
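The Pearson correlation coefficient used to assess these regression models measures how closely predicted values track observed ones on a linear scale. A minimal sketch in Python using only the standard library; the function name and the observed/predicted values are hypothetical and are not data from the study:

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences.

    Assumes neither sequence is constant; zero variance would make
    the denominator zero.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations (proportional to the covariance)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Hypothetical observed vs. predicted activity values, for illustration only
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.2, 5.9, 7.1, 7.8]
r = pearson_r(observed, predicted)
```

A value near 1 indicates that the predictions rise and fall with the observations, though it does not by itself show that the predictions are unbiased.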
Affiliation(s)
- Akanksha Rajput
- Virology Discovery Unit and Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific and Industrial Research (CSIR), Chandigarh, India
| | - Manoj Kumar
- Virology Discovery Unit and Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific and Industrial Research (CSIR), Chandigarh, India
| |
|
34
|
Cilla M, Pérez-Rey I, Martínez MA, Peña E, Martínez J. On the use of machine learning techniques for the mechanical characterization of soft biological tissues. Int J Numer Method Biomed Eng 2018; 34:e3121. [PMID: 29935057 DOI: 10.1002/cnm.3121] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/31/2017] [Revised: 05/24/2018] [Accepted: 06/17/2018] [Indexed: 06/08/2023]
Abstract
Motivated by the search for new strategies for fitting a material model, a new approach is explored in the present work. The use of numerical and complex algorithms based on machine learning techniques such as support vector machines for regression, bagged decision trees, and artificial neural networks is proposed for solving the parameter identification of constitutive laws for soft biological tissues. First, the mathematical tools were trained with analytical uniaxial data (circumferential and longitudinal directions) as inputs, and their corresponding material parameters of the Gasser, Ogden, and Holzapfel strain energy function as outputs. The train and test errors show great efficiency during the training process in finding correlations between inputs and outputs; besides, the correlation coefficients were very close to 1. Second, the tool was validated with unseen observations of analytical circumferential and longitudinal uniaxial data. The results show an excellent agreement between the prediction of the material parameters of the strain energy function and the analytical curves. Finally, data from real circumferential and longitudinal uniaxial tests on different cardiovascular tissues were fitted; thus, the material model of these tissues was predicted. We found that the method was able to consistently identify model parameters, and we believe that the use of these numerical tools could lead to an improvement in the characterization of soft biological tissues.
Affiliation(s)
- Myriam Cilla
- Centro Universitario de la Defensa (CUD), Academia General Militar(AGM), Zaragoza, Spain
- Aragon Institute of Engineering Research (I3A), University of Zaragoza, Zaragoza, Spain
- CIBER's Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Spain
| | - Ignacio Pérez-Rey
- Department of Natural Resources and Environmental Engineering, University of Vigo, Vigo, Spain
| | - Miguel Angel Martínez
- Aragon Institute of Engineering Research (I3A), University of Zaragoza, Zaragoza, Spain
- CIBER's Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Spain
| | - Estefania Peña
- Aragon Institute of Engineering Research (I3A), University of Zaragoza, Zaragoza, Spain
- CIBER's Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Spain
| | - Javier Martínez
- Department of Natural Resources and Environmental Engineering, University of Vigo, Vigo, Spain
- Centro Universitario de la Defensa (CUD), Escuela Naval Militar, Marín, Spain
| |
|
35
|
López-Valenciano A, Ayala F, Puerta JM, De Ste Croix M, Vera-García F, Hernández-Sánchez S, Ruiz-Pérez I, Myer G. A Preventive Model for Muscle Injuries: A Novel Approach based on Learning Algorithms. Med Sci Sports Exerc 2018; 50:915-927. [PMID: 29283933 PMCID: PMC6582363 DOI: 10.1249/mss.0000000000001535] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Indexed: 11/21/2022]
Abstract
INTRODUCTION The application of contemporary statistical approaches from Machine Learning and Data Mining to build more robust predictive models for identifying athletes at high risk of injury might support the injury prevention strategies of the future. PURPOSE The purpose was to analyze and compare the behavior of numerous machine learning methods in order to select the best-performing injury risk factor model for identifying athletes at risk of lower extremity muscle injuries (MUSINJ). METHODS A total of 132 male professional soccer and handball players underwent a preseason screening evaluation that included personal, psychological, and neuromuscular measures. Furthermore, injury surveillance was used to capture all the MUSINJ occurring in the 2013/2014 seasons. The predictive ability of several models built by applying a range of learning techniques was analyzed and compared. RESULTS There were 32 MUSINJ over the follow-up period, 21 (65.6%) of which corresponded to the hamstrings, 3 to the quadriceps (9.3%), 4 to the adductors (12.5%), and 4 to the triceps surae (12.5%). A total of 13 injuries occurred during training and 19 during competition. Three players were injured twice during the observation period, so only the first injury was used, leaving 29 MUSINJ to develop the predictive models. The model generated by the SMOTEBoost technique with a cost-sensitive ADTree as the base classifier reported the best evaluation criteria (area under the receiver operating characteristic curve score, 0.747; true positive rate, 65.9%; true negative rate, 79.1%) and hence was considered the best for predicting MUSINJ. CONCLUSIONS The prediction model showed moderate accuracy for identifying professional soccer and handball players at risk of MUSINJ. Therefore, the model developed might help in the decision-making process for injury prevention.
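The true positive and true negative rates quoted in the abstract follow the usual confusion-matrix definitions. The sketch below shows how such rates are computed; the function names and the counts in the usage lines are hypothetical and are not the study's confusion matrix, which the abstract does not report.

```python
def true_positive_rate(tp, fn):
    """Sensitivity: share of truly injured players that the model flags."""
    return tp / (tp + fn)


def true_negative_rate(tn, fp):
    """Specificity: share of uninjured players that the model clears."""
    return tn / (tn + fp)


# Hypothetical counts for illustration only (not taken from the paper).
tp, fn, tn, fp = 20, 10, 80, 20
print(f"TPR = {true_positive_rate(tp, fn):.1%}")
print(f"TNR = {true_negative_rate(tn, fp):.1%}")
```

With imbalanced data such as this (29 injuries among 132 players), reporting both rates alongside the AUC is more informative than overall accuracy alone.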
Affiliation(s)
| | - Francisco Ayala
- Sports Research Centre, Miguel Hernandez University of Elche, Alicante, Spain
| | - José Miguel Puerta
- Department of Computer Systems, University of Castilla-La Mancha, Albacete, Spain
| | - Mark De Ste Croix
- School of Physical Education, Faculty of Sport, Health and Social Care, University of Gloucestershire, Gloucester, United Kingdom
| | | | - Sergio Hernández-Sánchez
- Department of Pathology and Surgery, Physiotherapy Area, Miguel Hernandez University of Elche, Alicante, Spain
| | - Iñaki Ruiz-Pérez
- Sports Research Centre, Miguel Hernandez University of Elche, Alicante, Spain
| | - Gregory Myer
- Division of Sports Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH
| |
|
36
|
Karanasiou GS, Tripoliti EE, Papadopoulos TG, Kalatzis FG, Goletsis Y, Naka KK, Bechlioulis A, Errachid A, Fotiadis DI. Predicting adherence of patients with HF through machine learning techniques. Healthc Technol Lett 2016; 3:165-170. [PMID: 27733922 DOI: 10.1049/htl.2016.0041] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 05/13/2016] [Revised: 08/26/2016] [Accepted: 09/07/2016] [Indexed: 12/12/2022] Open
Abstract
Heart failure (HF) is a chronic disease characterised by poor quality of life, recurrent hospitalisation and high mortality. Patient adherence to the treatment suggested by the experts has been proven to be a significant deterrent of these serious consequences. However, non-adherence rates are high, a fact that highlights the importance of predicting patient adherence and enabling experts to adjust patient monitoring and management accordingly. The aim of this work is to predict the adherence of patients with HF through the application of machine learning techniques. Specifically, it aims to classify a patient not only as medication adherent or not, but also as adherent or not in terms of medication, nutrition and physical activity (global adherent). Two classification problems are addressed: (i) whether the patient is global adherent and (ii) whether the patient is medication adherent. About 11 classification algorithms are employed and combined with feature selection and resampling techniques. The classifiers are evaluated on a dataset of 90 patients. The patients are characterised as medication and global adherent based on clinician estimation. The highest detection accuracy is 82% and 91% for the first and the second classification problem, respectively.
Affiliation(s)
- Georgia Spiridon Karanasiou
- Department of Biomedical Research, Institute of Molecular Biology and Biotechnology, FORTH, GR 45110 Ioannina, Greece
| | - Evanthia Eleftherios Tripoliti
- Department of Biomedical Research, Institute of Molecular Biology and Biotechnology, FORTH, GR 45110 Ioannina, Greece
| | | | - Fanis Georgios Kalatzis
- Department of Biomedical Research, Institute of Molecular Biology and Biotechnology, FORTH, GR 45110 Ioannina, Greece
| | - Yorgos Goletsis
- Department of Economics, University of Ioannina, GR 45110 Ioannina, Greece
| | - Katerina Kyriakos Naka
- Michaelidion Cardiac Center, University of Ioannina, GR 45110 Ioannina, Greece; Department of Cardiology, University of Ioannina, GR 45110 Ioannina, Greece
| | - Aris Bechlioulis
- Michaelidion Cardiac Center, University of Ioannina, GR 45110 Ioannina, Greece; Department of Cardiology, University of Ioannina, GR 45110 Ioannina, Greece
| | - Abdelhamid Errachid
- Université de Lyon, Institut de Sciences Analytiques, ISA, FR 69100 Villeurbanne, France
| | - Dimitrios Ioannis Fotiadis
- Department of Biomedical Research, Institute of Molecular Biology and Biotechnology, FORTH, GR 45110 Ioannina, Greece; Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, GR 45110 Ioannina, Greece
| |
|
37
|
Özdemir AT. An Analysis on Sensor Locations of the Human Body for Wearable Fall Detection Devices: Principles and Practice. Sensors (Basel) 2016; 16:s16081161. [PMID: 27463719 PMCID: PMC5017327 DOI: 10.3390/s16081161] [Citation(s) in RCA: 99] [Impact Index Per Article: 12.4] [Received: 05/29/2016] [Revised: 07/03/2016] [Accepted: 07/20/2016] [Indexed: 12/03/2022]
Abstract
Wearable devices for fall detection have received attention in academia and industry because falls are very dangerous, especially for elderly people, and may result in death if immediate aid is not provided. However, some predictive devices are not easily worn by elderly people. In this work, a huge dataset, including 2520 tests, is employed to determine the best sensor placement location on the body and to reduce the number of sensor nodes for device ergonomics. During the tests, the volunteers' movements are recorded with six groups of sensors, each with triaxial accelerometer, gyroscope and magnetometer sensors, placed tightly on different parts of the body with special straps: head, chest, waist, right wrist, right thigh and right ankle. The accuracy of individual sensor groups and their locations is investigated with six machine learning techniques, namely the k-nearest neighbor (k-NN) classifier, Bayesian decision making (BDM), support vector machines (SVM), least squares method (LSM), dynamic time warping (DTW) and artificial neural networks (ANNs). Each technique is applied to single, double, triple, quadruple, quintuple and sextuple sensor configurations. These configurations create 63 different combinations, and for six machine learning techniques, a total of 63 × 6 = 378 combinations is investigated. As a result, the waist region is found to be the most suitable location for sensor placement on the body, with 99.96% fall detection sensitivity using the k-NN classifier, whereas the best sensitivity achieved by the wrist sensor is 97.37%, despite this location being highly preferred for today's wearable applications.
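The count of 63 configurations is the number of non-empty subsets of six sensor locations (2^6 − 1), and pairing each with six classifiers gives 378 experiments. A minimal sketch that enumerates them follows; the variable and function names are illustrative and not taken from the paper.

```python
from itertools import combinations

# Sensor locations and classifiers as listed in the abstract.
SENSOR_LOCATIONS = ["head", "chest", "waist",
                    "right-wrist", "right-thigh", "right-ankle"]
CLASSIFIERS = ["k-NN", "BDM", "SVM", "LSM", "DTW", "ANN"]


def sensor_configurations(locations):
    """Return every non-empty subset of locations, smallest subsets first."""
    configs = []
    for size in range(1, len(locations) + 1):
        configs.extend(combinations(locations, size))
    return configs


configs = sensor_configurations(SENSOR_LOCATIONS)
experiments = [(clf, cfg) for clf in CLASSIFIERS for cfg in configs]
print(len(configs), len(experiments))  # prints: 63 378
```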
Affiliation(s)
- Ahmet Turan Özdemir
- Department of Electrical and Electronics Engineering, Erciyes University, Kayseri 38039, Turkey.
| |
|
38
|
Yugandhar K, Gromiha MM. Feature selection and classification of protein-protein complexes based on their binding affinities using machine learning approaches. Proteins 2014; 82:2088-96. [PMID: 24648146 DOI: 10.1002/prot.24564] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 01/13/2014] [Accepted: 03/14/2014] [Indexed: 12/16/2022]
Abstract
Protein-protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein-protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein-protein complexes to their binding affinities using machine learning approaches. We set up a database of 185 protein-protein complexes for which the interacting pairs are heterodimers and experimental binding affinities are available. We then developed a set of 610 features from the sequences of the protein complexes and selected specific features with the Ranker search method, which combines an attribute evaluator with the Ranker method. We analyzed several machine learning algorithms for classifying protein-protein complexes into high and low affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with a combination of nine features using support vector machines. Further, we observed an accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein-protein interaction networks and human-pathogen interactions based on the strength of interactions.
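For readers unfamiliar with 10-fold cross-validation, the sketch below shows how a dataset can be partitioned so that each sample is held out exactly once. It is a plain illustration of the splitting step only: the function name is hypothetical, the sample count of 185 is borrowed from the database size in the abstract as an example, and the abstract does not state exactly which samples were cross-validated or whether folds were shuffled or stratified.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


folds = list(k_fold_indices(185, 10))
for train, test in folds:
    # A classifier would be fitted on `train` and scored on `test` here;
    # the cross-validation accuracy is the average of the ten test scores.
    pass
```

In practice the sample order is usually shuffled (and often stratified by class) before splitting, so that each fold has a similar mix of high and low affinity complexes.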
Affiliation(s)
- K Yugandhar
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India
| | | |
|
39
|
Ventura C, Latino DA, Martins F. Comparison of Multiple Linear Regressions and Neural Networks based QSAR models for the design of new antitubercular compounds. Eur J Med Chem 2013; 70:831-45. [PMID: 24246731 DOI: 10.1016/j.ejmech.2013.10.029] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 05/09/2012] [Revised: 07/26/2013] [Accepted: 10/11/2013] [Indexed: 01/29/2023]
Abstract
The performance of two QSAR methodologies, namely Multiple Linear Regressions (MLR) and Neural Networks (NN), in the modeling and prediction of antitubercular activity was evaluated and compared. A data set of 173 potentially active compounds belonging to the hydrazide family and represented by 96 descriptors was analyzed. Models were built with MLR, single Feed-Forward Neural Networks (FFNNs), ensembles of FFNNs and Associative Neural Networks (AsNNs) using four different data sets and different types of descriptors. The predictive ability of the different techniques was assessed and discussed on the basis of different validation criteria, and the results show, in general, a better performance of AsNNs in terms of learning ability and prediction of antitubercular behavior when compared with all other methods. MLR has, however, the advantage of pinpointing the most relevant molecular characteristics responsible for the behavior of these compounds against Mycobacterium tuberculosis. The best results for the larger data set (94 compounds in the training set and 18 in the test set) were obtained with AsNNs using seven descriptors (R² of 0.874 and RMSE of 0.437 against R² of 0.845 and RMSE of 0.472 for MLR, on the test set). Counter-Propagation Neural Networks (CPNNs) were trained with the same data sets and descriptors. From the scrutiny of the weight levels in each CPNN and the information retrieved from the MLR models, a rational design of potentially active compounds was attempted. Two new compounds were synthesized and tested against M. tuberculosis, showing an activity close to that predicted by the majority of the models.
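The two test-set statistics quoted above can be computed as in the following sketch. It assumes the coefficient-of-determination form of R² (one minus the ratio of residual to total sum of squares); QSAR studies sometimes report the squared Pearson correlation instead, and the abstract does not say which was used. The function names and sample values are illustrative, not data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted activities."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up activities for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and a higher R² on the held-out test set together indicate better predictive ability, which is the sense in which the AsNN figures are compared with the MLR figures.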
|