1
Dzięcioł J, Sas W. Perspective on the Application of Machine Learning Algorithms for Flow Parameter Estimation in Recycled Concrete Aggregate. Materials (Basel) 2023; 16:1500. [PMID: 36837130 PMCID: PMC9962052 DOI: 10.3390/ma16041500] [Received: 12/22/2022] [Revised: 02/07/2023] [Accepted: 02/07/2023]
Abstract
The constantly expanding civilization and construction industry pose new challenges for a sustainable development economy. Protecting the environment is often associated with waste management and, in turn, with reducing the number of landfills. The management of recycled concrete aggregate (RCA) from building demolition and its reuse in construction fits this trend well. Unlike natural materials, post-industrial and recycled materials usually do not have homogeneous characteristics. This leads to a search for solutions that determine material parameters as simply as possible and with as few resources as possible, while eliminating estimation risks. This task can be addressed using machine learning, whose algorithms are increasingly used and developed in many areas of life and industry. This study compares the effectiveness of the k-Nearest Neighbors (k-NN) and Artificial Neural Network (ANN) algorithms with that of a linear regression model in determining the permeability coefficient. This parameter plays an important role in the application of RCA in civil engineering, particularly in earth construction. Two RCA materials with different origins and properties were used in the study. For the filtration tests, samples were prepared using compaction energies of 0.17 and 0.59 J/cm³, as well as in a loosely packed state. Differences in the structures of the test results are presented for both materials. The lowest prediction errors were obtained for the k-NN model, which achieved a coefficient of determination (R²) of 0.947 for the training sample and 0.980 for the test sample. For the ANN, the coefficient of determination was in the range 0.877-0.936. An important part of the study was the interpretation of the obtained models with SHAP, which gives insight into which parameters influenced the predictions. This is significant and novel, considering the heterogeneity of the materials studied, and provides a rationale for further research in this area.
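To make the two quantities in this abstract concrete, the following is a minimal sketch of k-NN regression and of the coefficient of determination, written in plain Python. It is not the authors' implementation: the function names, the choice of Euclidean distance, the value of k, and the toy data are all assumptions made for illustration.

```python
import math

def knn_predict(train_X, train_y, x, k=3):
    """Predict a value for x as the mean target of its k nearest training points.

    Euclidean distance and an unweighted mean are assumptions of this sketch.
    """
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical one-feature toy data (not measurements from the study).
train_X = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 2.0, 3.0]

prediction = knn_predict(train_X, train_y, (1.0,), k=3)
perfect_fit = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only_fit = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

An R² of 1 means the predictions reproduce the targets exactly, while a model that always predicts the mean of the targets scores 0, which is why values such as 0.947 and 0.980 indicate a close fit.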
Affiliation(s)
- Justyna Dzięcioł
- Institute of Civil Engineering, Warsaw University of Life Sciences, 159 Nowoursynowska, 02-776 Warsaw, Poland
- Wojciech Sas
- Water Centre, Warsaw University of Life Sciences, 159 Nowoursynowska, 02-776 Warsaw, Poland
2
Mesquita EDM, Rodrigues FB, Rodrigues AP, Lemes TS, Andrade AO, Vieira MF. Discrimination capability of linear and nonlinear gait features in group classification. Med Eng Phys 2021; 93:59-71. [PMID: 34154776 DOI: 10.1016/j.medengphy.2021.05.017] [Received: 06/10/2020] [Revised: 02/26/2021] [Accepted: 05/25/2021]
Abstract
The variability of human movement can be defined as the normal variations occurring in motor activity, and it can be quantified using linear statistics or nonlinear methods. In the human movement field, linear and nonlinear measures of variability have been used to discriminate groups and conditions in different contexts. Indeed, some authors support the idea that these gait features provide complementary information about movement. However, it is unclear which type of gait variability measure best discriminates different groups or conditions, because the discrimination capacity of linear and nonlinear gait variability features has not been compared across groups. Therefore, the main objective of this study was to test the discrimination capacity of linear and nonlinear gait features and to determine which type of feature is the most efficient for discriminating between older and younger adults and between lower limb amputees and nonamputees using classification algorithms. Data from previously published studies were used. The classification task was performed using the k-nearest neighbors and random forest algorithms. Our results showed that using a combination of linear and nonlinear features resulted in the highest mean accuracy rates (>90%) in group classification, reinforcing the idea that these features are complementary and express different aspects of movement.
Affiliation(s)
- Eduardo de Mendonça Mesquita
- Bioengineering and Biomechanics Laboratory, Federal University of Goiás, Avenida Esperança s/n, Campus Samambaia, 74690-900 Goiânia, Goiás, Brazil
- Fábio Barbosa Rodrigues
- Bioengineering and Biomechanics Laboratory, Federal University of Goiás, Avenida Esperança s/n, Campus Samambaia, 74690-900 Goiânia, Goiás, Brazil; State University of Goiás - UnU Trindade, Trindade, Brazil
- Adriano Péricles Rodrigues
- Bioengineering and Biomechanics Laboratory, Federal University of Goiás, Avenida Esperança s/n, Campus Samambaia, 74690-900 Goiânia, Goiás, Brazil
- Thiago Santana Lemes
- Bioengineering and Biomechanics Laboratory, Federal University of Goiás, Avenida Esperança s/n, Campus Samambaia, 74690-900 Goiânia, Goiás, Brazil
- Adriano O Andrade
- Center for Innovation and Technology Assessment in Health, Federal University of Uberlândia, Uberlândia, Brazil
- Marcus Fraga Vieira
- Bioengineering and Biomechanics Laboratory, Federal University of Goiás, Avenida Esperança s/n, Campus Samambaia, 74690-900 Goiânia, Goiás, Brazil
3
Tocimáková Z, Pusztová L, Paralič J, Pella D. Case-Based Reasoning for Support of the Diagnostics of Cardiovascular Diseases. Stud Health Technol Inform 2020; 270:537-541. [PMID: 32570441 DOI: 10.3233/shti200218]
Abstract
In this paper we present a decision support system designed and implemented on case-based reasoning principles. The system is being implemented in close cooperation with a cardiologist, who represents its main future users. It enables the user to find the historical cases most similar to a new patient, suggests the most probable result of a potential coronary angiography examination, and provides various useful visualizations to the cardiologist, who is responsible for the final decision on whether to recommend coronary angiography for the new patient. The cardiologist's first response to the system is very promising.
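The retrieval step described in this abstract (find the most similar historical cases, then suggest the most probable outcome) can be sketched as follows. The abstract does not state the similarity measure, the case representation, or the number of retrieved cases, so the Euclidean distance, the dictionary layout, the value k=3, and the toy case base below are all hypothetical.

```python
import math
from collections import Counter

def most_similar_cases(case_base, new_case, k=3):
    """Return the k stored cases closest to new_case.

    Assumes each case holds a numeric feature tuple scaled to comparable ranges.
    """
    return sorted(case_base, key=lambda c: math.dist(c["features"], new_case))[:k]

def suggest_outcome(similar_cases):
    """Suggest the most frequent outcome among the retrieved cases."""
    counts = Counter(c["outcome"] for c in similar_cases)
    return counts.most_common(1)[0][0]

# Hypothetical case base with two normalized features per patient.
case_base = [
    {"features": (0.1, 0.2), "outcome": "negative"},
    {"features": (0.2, 0.1), "outcome": "negative"},
    {"features": (0.9, 0.8), "outcome": "positive"},
    {"features": (0.8, 0.9), "outcome": "positive"},
    {"features": (0.7, 0.7), "outcome": "positive"},
]

suggestion_a = suggest_outcome(most_similar_cases(case_base, (0.75, 0.8)))
suggestion_b = suggest_outcome(most_similar_cases(case_base, (0.15, 0.15)))
```

Returning the retrieved cases alongside the suggestion, rather than only a label, matches the decision-support framing in the abstract: the cardiologist can inspect the similar cases and remains responsible for the final decision.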
Affiliation(s)
- Zuzana Tocimáková
- Department of Cybernetic and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, Letná 9, 042 00 Košice, Slovakia
- L'udmila Pusztová
- Department of Cybernetic and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, Letná 9, 042 00 Košice, Slovakia
- Ján Paralič
- Department of Cybernetic and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, Letná 9, 042 00 Košice, Slovakia
- Dominik Pella
- 1st Department of Cardiology, East Slovak Institute of Cardiovascular Diseases, Ondavská 8, 040 11 Košice, Slovakia
4
Carafini A, Sacco IC, Vieira MF. Pelvic floor pressure distribution profile in urinary incontinence: a classification study with feature selection. PeerJ 2019; 7:e8207. [PMID: 31844587 PMCID: PMC6907092 DOI: 10.7717/peerj.8207] [Received: 07/09/2019] [Accepted: 11/13/2019]
Abstract
BACKGROUND: Pelvic floor pressure distribution profiles, obtained with a novel instrumented non-deformable probe, were used as the input to a feature extraction, selection, and classification approach to test their potential for an automatic diagnostic system for objective assessment of female urinary incontinence. We tested the performance of different feature selection approaches and different classifiers, and sought to establish the group of features that provides the greatest discrimination capability between continent and incontinent women. METHODS: The data available for evaluation consisted of intravaginal spatiotemporal pressure profiles acquired from 24 continent and 24 incontinent women while performing four pelvic floor maneuvers: the maximum contraction maneuver, Valsalva maneuver, endurance maneuver, and wave maneuver. Feature extraction was guided by previous studies on the characterization of pressure profiles in the vaginal canal, in which the extracted features were tested for repeatability. Feature selection was achieved through a combination of a ranking method and a complete non-exhaustive subset search algorithm: branch and bound and recursive feature elimination. Three classifiers were tested: k-nearest neighbors (k-NN), support vector machine (SVM), and logistic regression. RESULTS: No classifier outperformed the others; however, k-NN was statistically inferior in one of the maneuvers. The best result was obtained by applying recursive feature elimination to the features extracted from all the maneuvers, resulting in 77.1% test accuracy, 74.1% precision, and 83.3% recall, using SVM. Moreover, the best feature subset, obtained by observing the selection frequency of every single feature during the application of branch and bound, was directly employed in the classification, reaching 95.8% accuracy. Although not at the level required by an automatic system, the results show the potential use of pelvic floor pressure distribution profile data and provide insights into the aspects of pelvic floor functioning that contribute to urinary incontinence.
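As a worked illustration of how accuracy, precision, and recall relate, the sketch below computes all three from confusion-matrix counts. The counts are hypothetical and are not taken from the paper: they assume the metrics were computed over all 48 participants with incontinent women as the positive class, an evaluation setup the abstract does not state. Under that assumption they happen to reproduce the reported 77.1% accuracy, 74.1% precision, and 83.3% recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

# Hypothetical counts (assumption: 24 positive and 24 negative cases in total).
# tp + fn = 24 incontinent women, fp + tn = 24 continent women.
metrics = classification_metrics(tp=20, fp=7, fn=4, tn=17)

# Express each metric as a percentage rounded to one decimal place.
percentages = {name: round(value * 100, 1) for name, value in metrics.items()}
```

Here accuracy is 37/48, precision is 20/27, and recall is 20/24. Under the same 48-case assumption, an accuracy of 95.8% would correspond to 46 correct classifications out of 48.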
Affiliation(s)
- Adriano Carafini
- Bioengineering and Biomechanics Laboratory, Universidade Federal de Goiás, Goiânia, Goiás, Brazil
- Isabel C.N. Sacco
- Physical Therapy, Speech and Occupational Therapy Department, School of Medicine, Universidade de São Paulo, São Paulo, São Paulo, Brazil
- Marcus Fraga Vieira
- Bioengineering and Biomechanics Laboratory, Universidade Federal de Goiás, Goiânia, Goiás, Brazil
5
Abstract
Statistical classification is a critical component of utilizing metabolomics data for examining the molecular determinants of phenotypes. Despite this, a comprehensive and rigorous evaluation of the accuracy of classification techniques for phenotype discrimination given metabolomics data has not been conducted. We conducted such an evaluation using both simulated and real metabolomics datasets, comparing Partial Least Squares-Discriminant Analysis (PLS-DA), Sparse PLS-DA (sPLS-DA), Random Forests, Support Vector Machines (SVM), Artificial Neural Networks, k-Nearest Neighbors (k-NN), and Naïve Bayes classification techniques for discrimination. We evaluated the techniques on simulated data generated to mimic global untargeted metabolomics data by incorporating realistic block-wise correlation and partial correlation structures that mimic the correlations and metabolite clustering generated by biological processes. Over the simulation studies, covariance structures, means, and effect sizes were stochastically varied to provide consistent estimates of classifier performance over a wide range of possible scenarios. The effects of non-normal error distributions, the introduction of biological and technical outliers, unbalanced phenotype allocation, missing values due to abundances below a limit of detection, and prior-significance filtering (dimension reduction) were evaluated via simulation. In each simulation, classifier parameters, such as the number of hidden nodes in a Neural Network, were optimized by cross-validation to minimize the probability of detecting spurious results due to poorly tuned classifiers. Classifier performance was then evaluated using real metabolomics datasets of varying sample medium, sample size, and experimental design. We report that in the most realistic simulation studies, which incorporated non-normal error distributions, unbalanced phenotype allocation, outliers, missing values, and dimension reduction, classifier performance (least to greatest error) was ranked as follows: SVM, Random Forest, Naïve Bayes, sPLS-DA, Neural Networks, PLS-DA, and k-NN classifiers. When non-normal error distributions were introduced, the performance of PLS-DA and k-NN classifiers deteriorated further relative to the remaining techniques. Over the real datasets, SVM and Random Forest classifiers tended to perform better.
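The abstract notes that classifier parameters were optimized by cross-validation. A minimal sketch of that idea, tuning the number of neighbors k for a k-NN classifier, is shown below. It is not the authors' procedure: the interleaved fold assignment, the candidate values of k, the helper names, and the toy two-cluster data are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_classify(train, x, k):
    """Label x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_error(data, k, folds=4):
    """Misclassification rate of k-NN, pooled over interleaved folds."""
    errors = 0
    for f in range(folds):
        test = data[f::folds]  # every folds-th item, starting at index f
        train = [item for i, item in enumerate(data) if i % folds != f]
        errors += sum(knn_classify(train, x, k) != y for x, y in test)
    return errors / len(data)

def tune_k(data, candidates=(1, 3, 5)):
    """Pick the candidate k with the lowest cross-validated error."""
    return min(candidates, key=lambda k: cross_val_error(data, k))

# Hypothetical data: two well-separated clusters of four points each,
# interleaved so that every fold holds out points from a single class.
data = [
    ((0, 0), "a"), ((5, 5), "b"), ((0, 1), "a"), ((5, 6), "b"),
    ((1, 0), "a"), ((6, 5), "b"), ((1, 1), "a"), ((6, 6), "b"),
]

best_k = tune_k(data)
```

On this toy data, k=5 misclassifies every held-out point, because each training fold keeps only two points of the held-out class against four of the other. The example shows how a poorly tuned parameter can dominate the measured performance of a classifier, which is the effect the tuning step is meant to rule out.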
Affiliation(s)
- Patrick J Trainor
- Division of Cardiovascular Medicine, Department of Medicine, University of Louisville, 580 S. Preston St., Louisville, KY 40202, USA
- Andrew P DeFilippis
- Division of Cardiovascular Medicine, Department of Medicine, University of Louisville, 580 S. Preston St., Louisville, KY 40202, USA
- Shesh N Rai
- Department of Bioinformatics and Biostatistics, University of Louisville, 505 S. Hancock St., Louisville, KY 40202, USA
6
Cern A, Barenholz Y, Tropsha A, Goldblum A. Computer-aided design of liposomal drugs: In silico prediction and experimental validation of drug candidates for liposomal remote loading. J Control Release 2013; 173:125-31. [PMID: 24184343 DOI: 10.1016/j.jconrel.2013.10.029] [Received: 05/23/2013] [Revised: 10/17/2013] [Accepted: 10/22/2013]
Abstract
Previously we have developed and statistically validated Quantitative Structure Property Relationship (QSPR) models that correlate drugs' structural, physical and chemical properties as well as experimental conditions with the relative efficiency of remote loading of drugs into liposomes (Cern et al., J. Control. Release 160 (2012) 147-157). Herein, these models have been used to virtually screen a large drug database to identify novel candidate molecules for liposomal drug delivery. Computational hits were considered for experimental validation based on their predicted remote loading efficiency as well as additional considerations such as availability, recommended dose and relevance to the disease. Three compounds were selected for experimental testing which were confirmed to be correctly classified by our previously reported QSPR models developed with Iterative Stochastic Elimination (ISE) and k-Nearest Neighbors (kNN) approaches. In addition, 10 new molecules with known liposome remote loading efficiency that were not used by us in QSPR model development were identified in the published literature and employed as an additional model validation set. The external accuracy of the models was found to be as high as 82% or 92%, depending on the model. This study presents the first successful application of QSPR models for the computer-model-driven design of liposomal drugs.
Affiliation(s)
- Ahuva Cern
- Laboratory of Membrane and Liposome Research, Department of Biochemistry, IMRIC, The Hebrew University - Hadassah Medical School, Jerusalem, Israel; Molecular Modeling and Drug Design Laboratory, The Institute for Drug Research, The Hebrew University of Jerusalem, Jerusalem, Israel
- Yechezkel Barenholz
- Laboratory of Membrane and Liposome Research, Department of Biochemistry, IMRIC, The Hebrew University - Hadassah Medical School, Jerusalem, Israel
- Alexander Tropsha
- The Laboratory for Molecular Modeling, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA
- Amiram Goldblum
- Molecular Modeling and Drug Design Laboratory, The Institute for Drug Research, The Hebrew University of Jerusalem, Jerusalem, Israel
7
Abstract
Machine learning techniques have been widely applied to the problem of predicting protein secondary structure from the amino acid sequence and have achieved substantial success in this research area. Many methods have been used, including k-Nearest Neighbors (k-NNs), Hidden Markov Models (HMMs), Artificial Neural Networks (ANNs), and Support Vector Machines (SVMs), which have attracted attention recently. Today, the main goal remains to improve the prediction quality of the secondary structure elements. Prediction accuracy has improved continuously over the years, especially through hybrid or ensemble methods and the incorporation of evolutionary information in the form of profiles extracted from alignments of multiple homologous sequences. In this paper, we investigate how best to combine k-NNs, ANNs, and Multi-class SVMs (M-SVMs) to improve secondary structure prediction of globular proteins. An ensemble method that combines the outputs of two feed-forward ANNs, a k-NN, and three M-SVM classifiers has been applied. Ensemble members are combined using two variants of the majority voting rule. A heuristic-based filter has also been applied to refine the prediction. To investigate how much improvement the general ensemble method can give over the individual classifiers that make up the ensemble, we evaluated the proposed system on two widely used benchmark datasets, RS126 and CB513, using cross-validation tests with PSI-BLAST position-specific scoring matrix (PSSM) profiles as inputs. The experimental results reveal that the proposed system yields significant performance gains when compared with the best individual classifier.
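The majority voting rule mentioned in this abstract can be sketched as a position-wise vote over the label sequences produced by each ensemble member. The abstract does not specify the two voting variants used, so the sketch below shows only plain majority voting with a hypothetical tie-break (the earliest classifier in the list wins). The three-state labels H, E, and C (helix, strand, coil) and the example predictions are illustrative assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label sequences by position-wise majority vote.

    predictions: list of equal-length strings, one per classifier.
    Ties go to the label from the earliest classifier in the list;
    this tie-break is an assumption of the sketch.
    """
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # Scan in classifier order so that ties resolve deterministically.
        combined.append(next(label for label in labels if counts[label] == top))
    return "".join(combined)

# Hypothetical outputs of three classifiers for a six-residue segment.
predictions = ["HHHECC", "HHEECC", "HCHECC"]
consensus = majority_vote(predictions)

# A three-way tie at the first position falls back to the first classifier.
tie_case = majority_vote(["HE", "EE", "CC"])
```

Voting can correct isolated errors made by single members, as at the second and third positions above, provided the members do not all make the same mistakes.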
Affiliation(s)
- Hafida Bouziane
- Department of Computer Science, USTO-MB University, BP 1505 El Mnaouer, Oran, Algeria