201
|
Zhang Z, Zhu Q, Zhu F, Li J, Cheng D, Liu Y, Luo J. Density decay graph-based density peak clustering. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107075] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
|
202
|
Waterlogging Resistance Evaluation Index and Photosynthesis Characteristics Selection: Using Machine Learning Methods to Judge Poplar’s Waterlogging Resistance. MATHEMATICS 2021. [DOI: 10.3390/math9131542] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/25/2023]
Abstract
Flooding is a major natural disaster that affects the growth of agricultural and forestry crops. Because poplar grows rapidly and resists waterlogging well, many studies have explained its waterlogging resistance mechanism from different perspectives. However, there is no accurate method for defining an evaluation index of waterlogging resistance, and research on predicting the waterlogging resistance of poplars is also lacking. In this study, the evaluation index of poplar waterlogging resistance was determined from changes in poplar biomass and seedling height, and photosynthesis characteristics were used to predict the waterlogging resistance of poplars. First, four methods (hierarchical clustering, lasso, stepwise regression, and all-subsets regression) were used to extract the photosynthesis characteristics. Next, a support vector regression (SVR) model of poplar waterlogging resistance was established using the characteristic parameters of photosynthesis. The results show that the SVR models based on stepwise regression and lasso have high precision: on the test set, the coefficient of determination (R²) was 0.8581 and 0.8492, the mean square error (MSE) was 0.0104 and 0.0341, and the mean relative error (MRE) was 9.78% and 9.85%, respectively. Therefore, using the characteristic parameters of photosynthesis to predict the waterlogging resistance of poplars is feasible.
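The three test-set metrics quoted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not define MRE, so the sketch assumes the common form (mean of |observed − predicted| / |observed|), and the function name `regression_metrics` and the sample values are hypothetical.

```python
def regression_metrics(y_true, y_pred):
    """Return (R^2, MSE, MRE) for paired observed and predicted values.

    Assumption: MRE is the mean of |y - y_hat| / |y|; the abstract does
    not spell out its definition.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    mse = ss_res / n
    mre = sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / n
    return r2, mse, mre


# Hypothetical observed and predicted values, for illustration only
observed = [1.0, 2.0, 4.0, 5.0]
predicted = [1.1, 1.8, 4.4, 5.0]
r2, mse, mre = regression_metrics(observed, predicted)
print(f"R2={r2:.4f}  MSE={mse:.4f}  MRE={mre:.2%}")
```

Note that R² and MRE are scale-free while MSE depends on the units of the response, which is why two models with similar R² can show quite different MSE values.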
|
203
|
Enhanced Day-Ahead PV Power Forecast: Dataset Clustering for an Effective Artificial Neural Network Training. THE 7TH INTERNATIONAL CONFERENCE ON TIME SERIES AND FORECASTING 2021. [DOI: 10.3390/engproc2021005016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
|
204
|
A Search Method for Optimal Band Combination of Hyperspectral Imagery Based on Two Layers Selection Strategy. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021; 2021:5592323. [PMID: 34239549 PMCID: PMC8241513 DOI: 10.1155/2021/5592323] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/10/2021] [Revised: 04/28/2021] [Accepted: 05/28/2021] [Indexed: 11/17/2022]
Abstract
This paper proposes a band selection method based on a two layers selection (TLS) strategy, which forms an optimal subset of the full band set to reconstitute the original hyperspectral imagery (HSI) and aims to achieve better performance with fewer bands. As its name implies, TLS selects bands with low correlation and a large amount of information into the target set in two phases to achieve dimensionality reduction for HSI. Specifically, the fast density peaks clustering (FDPC) algorithm is first used to select the most representative node in each cluster to build a candidate set. During the implementation, we normalize the local density and relative distance and use a dynamic cutoff distance to weaken the influence of density, so that selection is more likely to be carried out in scattered clusters than in high-density ones. After that, we conduct a further selection in the candidate set using the mRMR strategy and a comprehensive measurement of information (CMI), and the eventual winners are selected into the target set. Compared with six other state-of-the-art unsupervised algorithms on three real-world HSI data sets, the results show that TLS can group bands with lower correlation and richer information and has clear advantages in overall accuracy (OA), average accuracy (AA), and the Kappa coefficient.
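The first phase described above (normalized local density and relative distance, then picking representatives) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the TLS implementation: it uses a fixed cutoff distance instead of the paper's dynamic one, generic 2-D points instead of spectral bands, and ranks points by the product of normalized density and distance, a common density-peaks heuristic. All function names are hypothetical.

```python
import math


def density_peaks(points, cutoff):
    """Local density (rho) and relative distance (delta) for each point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the (fixed, assumed) cutoff
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        # delta: distance to the nearest point of strictly higher density;
        # the densest point gets its largest distance to any other point
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def normalize(values):
    """Min-max scale values to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def select_representatives(points, cutoff, k):
    """Indices of the k points with the largest normalized rho * delta."""
    rho, delta = density_peaks(points, cutoff)
    score = [r * d for r, d in zip(normalize(rho), normalize(delta))]
    return sorted(range(len(points)), key=lambda i: score[i], reverse=True)[:k]


# Two well-separated toy groups; indices 4 and 8 are the group centres
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (10.5, 10.5)]
representatives = select_representatives(points, cutoff=0.8, k=2)
print(representatives)
```

Normalizing both quantities before combining them keeps either one from dominating the score purely because of its scale.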
|
205
|
van Allen Z, Bacon SL, Bernard P, Brown H, Desroches S, Kastner M, Lavoie K, Marques M, McCleary N, Straus S, Taljaard M, Thavorn K, Tomasone JR, Presseau J. Clustering of Unhealthy Behaviors: Protocol for a Multiple Behavior Analysis of Data From the Canadian Longitudinal Study on Aging. JMIR Res Protoc 2021; 10:e24887. [PMID: 34114962 PMCID: PMC8235290 DOI: 10.2196/24887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2020] [Revised: 03/08/2021] [Accepted: 04/19/2021] [Indexed: 11/17/2022]
Abstract
BACKGROUND Health behaviors such as physical inactivity, unhealthy eating, smoking tobacco, and alcohol use are leading risk factors for noncommunicable chronic diseases and play a central role in limiting health and life satisfaction. To date, however, health behaviors tend to be considered separately from one another, resulting in guidelines and interventions for healthy aging siloed by specific behaviors and often focused only on a given health behavior without considering the co-occurrence of family, social, work, and other behaviors of everyday life. OBJECTIVE Understanding how behaviors cluster and how such clusters are associated with physical and mental health, life satisfaction, and health care utilization may provide opportunities to leverage this co-occurrence to develop and evaluate interventions to promote multiple health behavior changes. METHODS Using cross-sectional baseline data from the Canadian Longitudinal Study on Aging, we will perform a predefined set of exploratory and hypothesis-generating analyses to examine the co-occurrence of health and everyday life behaviors. We will use agglomerative hierarchical cluster analysis to cluster individuals based on their behavioral tendencies. Multinomial logistic regression will then be used to model the relationships between clusters and demographic indicators, health care utilization, and general health and life satisfaction, and assess whether sex and age moderate these relationships. In addition, we will conduct network community detection analysis using the clique percolation algorithm to detect overlapping communities of behaviors based on the strength of relationships between variables. RESULTS Baseline data for the Canadian Longitudinal Study on Aging were collected from 51,338 participants aged between 45 and 85 years. Data were collected between 2010 and 2015. Secondary data analysis for this project was approved by the Ottawa Health Science Network Research Ethics Board (protocol ID #20190506-01H). CONCLUSIONS This study will help to inform the development of interventions tailored to subpopulations of adults (eg, physically inactive smokers) defined by the multiple behaviors that describe their everyday life experiences. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/24887.
Affiliation(s)
- Zack van Allen
  - School of Psychology, University of Ottawa, Ottawa, ON, Canada
  - Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada
- Simon L Bacon
  - Department of Health, Kinesiology & Applied Physiology, Concordia University, Montreal, QC, Canada
  - Montreal Behavioural Medicine Centre, Le Centre intégré universitaire de santé et de services sociaux du Nord-de-l'Île-de-Montréal, Montreal, QC, Canada
- Paquito Bernard
  - Department of Physical Activity Sciences, University of Quebec in Montreal, Montreal, QC, Canada
  - Research Center of the Montreal Mental Health University Institute, Montreal, QC, Canada
- Heather Brown
  - Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Sophie Desroches
  - Department of Food and Nutrition Sciences, Laval University, Quebec City, QC, Canada
- Monika Kastner
  - Department of Medicine, University of Toronto, Toronto, ON, Canada
  - North York General Hospital, Toronto, ON, Canada
- Kim Lavoie
  - Montreal Behavioural Medicine Centre, Le Centre intégré universitaire de santé et de services sociaux du Nord-de-l'Île-de-Montréal, Montreal, QC, Canada
  - Department of Psychology, University of Quebec in Montreal, Montreal, QC, Canada
- Marta Marques
  - ADAPT Science Foundation Ireland Research Centre, Trinity College Dublin, Dublin, Ireland
  - Comprehensive Health Research Centre, NOVA Medical School, Lisbon, Portugal
- Nicola McCleary
  - School of Psychology, University of Ottawa, Ottawa, ON, Canada
  - Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada
- Sharon Straus
  - Department of Medicine, University of Toronto, Toronto, ON, Canada
- Monica Taljaard
  - School of Psychology, University of Ottawa, Ottawa, ON, Canada
  - Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada
- Kednapa Thavorn
  - School of Psychology, University of Ottawa, Ottawa, ON, Canada
  - Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada
- Justin Presseau
  - School of Psychology, University of Ottawa, Ottawa, ON, Canada
  - Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada
|
206
|
Kim SJ, Kim JG. Location-Based Resource Allocation in Ultra-Dense Network with Clustering. SENSORS 2021; 21:s21124022. [PMID: 34200890 PMCID: PMC8230466 DOI: 10.3390/s21124022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2021] [Revised: 05/20/2021] [Accepted: 06/08/2021] [Indexed: 11/16/2022]
Abstract
With the rapid deployment of present-day mobile communication systems, user traffic requirements have increased tremendously. An ultra-dense network is a configuration in which the density of small base stations is greater than or equal to that of the user equipment. Ultra-dense networks are considered as the key technology for 5th generation networks as they can improve the link quality and increase the system capacity. However, in an ultra-dense network, small base stations are densely positioned, so one user equipment may receive signals from two or more small base stations. This may cause a severe inter-cell interference problem. In this study, we considered a coordinated multi-point scenario, a cooperative technology between base stations to alleviate the interference. In addition, to suppress the occurrence of severe interference at the cell edges, link formation was carried out by considering the degree of cell load for each cluster. After the formation of links between all the base stations and user equipment, a subcarrier allocation procedure was performed. The subcarrier allocation method used in this study was based on the location of base stations with clustering to improve the data rate and reduce the interference between the clusters. Power allocation was based on the channel gain between the base station and user equipment. Simulation results showed that the proposed scheme delivered a higher sum rate than the other resource allocation methods reported previously for various types of user equipment.
|
207
|
Zhu Y, Deng Q, Huang D, Jing B, Zhang B. Clustering based on Kolmogorov–Smirnov statistic with application to bank card transaction data. J R Stat Soc Ser C Appl Stat 2021. [DOI: 10.1111/rssc.12471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Qiong Deng
  - Renmin University of China Beijing China
- Bingyi Jing
  - The Hong Kong University of Science and Technology Hong Kong China
- Bo Zhang
  - Renmin University of China Beijing China
|
208
|
Hybrid classification of Android malware based on fuzzy clustering and the gradient boosting machine. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05450-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
209
|
A Novel Method for Effective Cell Segmentation and Tracking in Phase Contrast Microscopic Images. SENSORS 2021; 21:s21103516. [PMID: 34070081 PMCID: PMC8158140 DOI: 10.3390/s21103516] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/31/2021] [Revised: 05/12/2021] [Accepted: 05/14/2021] [Indexed: 11/16/2022]
Abstract
Cell migration plays an important role in the identification of various diseases and physiological phenomena in living organisms, such as cancer metastasis, nerve development, immune function, wound healing, and embryo formulation and development. The study of cell migration with a real-time microscope generally takes several hours and involves analysis of the movement characteristics by tracking the positions of cells at each time interval in the images of the observed cells. Morphological analysis considers the shapes of the cells, and a phase contrast microscope is used to observe the shape clearly. Therefore, we developed a segmentation and tracking method to perform a kinetic analysis by considering the morphological transformation of cells. The main features of the algorithm are noise reduction using a block-matching 3D filtering method, k-means clustering to mitigate the halo signal that interferes with cell segmentation, and the detection of cell boundaries via active contours. The reliability of the algorithm developed in this study was verified by comparison with manual tracking results. In addition, the segmentation results of our method were compared with those of unsupervised state-of-the-art methods to verify the proposed segmentation process. As a result of the study, the proposed method had a lower error of less than 40% compared to the conventional active contour method.
|
210
|
Adams H, Moy M. Topology Applied to Machine Learning: From Global to Local. Front Artif Intell 2021; 4:668302. [PMID: 34056580 PMCID: PMC8160457 DOI: 10.3389/frai.2021.668302] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/16/2021] [Accepted: 04/15/2021] [Indexed: 11/24/2022]
Abstract
Through the use of examples, we explain one way in which applied topology has evolved since the birth of persistent homology in the early 2000s. The first applications of topology to data emphasized the global shape of a dataset, such as the three-circle model for 3 × 3 pixel patches from natural images, or the configuration space of the cyclo-octane molecule, which is a sphere with a Klein bottle attached via two circles of singularity. In these studies of global shape, short persistent homology bars are disregarded as sampling noise. More recently, however, persistent homology has been used to address questions about the local geometry of data. For instance, how can local geometry be vectorized for use in machine learning problems? Persistent homology and its vectorization methods, including persistence landscapes and persistence images, provide popular techniques for incorporating both local geometry and global topology into machine learning. Our meta-hypothesis is that the short bars are as important as the long bars for many machine learning tasks. In defense of this claim, we survey applications of persistent homology to shape recognition, agent-based modeling, materials science, archaeology, and biology. Additionally, we survey work connecting persistent homology to geometric features of spaces, including curvature and fractal dimension, and various methods that have been used to incorporate persistent homology into machine learning.
Affiliation(s)
- Henry Adams
  - Department of Mathematics, Colorado State University, Fort Collins, CO, United States
- Michael Moy
  - Department of Mathematics, Colorado State University, Fort Collins, CO, United States
|
211
|
Clustering based semi-supervised machine learning for DDoS attack classification. JOURNAL OF KING SAUD UNIVERSITY - COMPUTER AND INFORMATION SCIENCES 2021. [DOI: 10.1016/j.jksuci.2019.02.003] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Indexed: 11/24/2022]
|
212
|
Modified semi-supervised affinity propagation clustering with fuzzy density fruit fly optimization. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05431-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/23/2022]
|
213
|
Gupta A, Datta S, Das S. Fuzzy Clustering to Identify Clusters at Different Levels of Fuzziness: An Evolutionary Multiobjective Optimization Approach. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:2601-2611. [PMID: 30998486 DOI: 10.1109/tcyb.2019.2907002] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/09/2023]
Abstract
Fuzzy clustering methods identify naturally occurring clusters in a dataset, where the extent to which different clusters are overlapped can differ. Most methods have a parameter to fix the level of fuzziness. However, the appropriate level of fuzziness depends on the application at hand. This paper presents entropy c-means (ECM), a method of fuzzy clustering that simultaneously optimizes two contradictory objective functions, resulting in the creation of fuzzy clusters with different levels of fuzziness. This allows ECM to identify clusters with different degrees of overlap. ECM optimizes the two objective functions using two multiobjective optimization methods, nondominated sorting genetic algorithm II (NSGA-II) and multiobjective evolutionary algorithm based on decomposition (MOEA/D). We also propose a method to select a suitable tradeoff clustering from the Pareto front. Experiments on challenging synthetic datasets as well as real-world datasets show that ECM leads to better cluster detection compared to the conventional fuzzy clustering methods as well as previously used multiobjective methods for fuzzy clustering.
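To illustrate what "a parameter to fix the level of fuzziness" means in conventional methods, the sketch below computes standard fuzzy c-means memberships for a single 1-D point under two values of the fuzzifier m. It shows the textbook FCM membership formula only, not the ECM method of the paper; the function name and the sample values are hypothetical.

```python
def fcm_memberships(point, centers, m):
    """Standard fuzzy c-means memberships of a 1-D point in each cluster.

    m > 1 is the fuzzifier: values near 1 give almost crisp assignments,
    larger values spread membership more evenly across clusters.
    """
    if m <= 1:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [abs(point - c) for c in centers]
    # A point lying exactly on one or more centers belongs only to them
    zeros = dists.count(0.0)
    if zeros:
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
            for d_i in dists]


# Same point and centers, two fuzziness levels (illustrative values)
sharper = fcm_memberships(1.0, [0.0, 4.0], m=2.0)
fuzzier = fcm_memberships(1.0, [0.0, 4.0], m=5.0)
print(sharper, fuzzier)
```

With a single fixed m, every cluster is treated as equally fuzzy, which is the limitation the abstract sets out to address.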
|
214
|
Liu N, Xu Z, Zeng XJ, Ren P. An agglomerative hierarchical clustering algorithm for linear ordinal rankings. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.12.056] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/25/2022]
|
215
|
Mandal P, Samanta S, Pal M. Large-scale group decision-making based on Pythagorean linguistic preference relations using experts clustering and consensus measure with non-cooperative behavior analysis of clusters. COMPLEX INTELL SYST 2021. [DOI: 10.1007/s40747-021-00369-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]
Abstract
Linguistic preference relations (LPRs) are a powerful tool for representing the qualitative aspects of uncertain and imprecise information, allowing experts to express their opinions in group decision-making (GDM) through linguistic variables (LVs). However, an LV generally implies a membership degree of one, so the experts' non-membership and hesitation degrees cannot be expressed. Pythagorean linguistic numbers/values (PLNs/PLVs) are a novel choice to address this issue. This paper proposes a consensus model for a GDM problem involving a large number of experts, called large-scale GDM (LSGDM), based on Pythagorean linguistic preference relations (PLPRs). Sometimes, experts do not modify their opinions to achieve consensus. Therefore, proper management of the experts' opinions and their non-cooperative behaviors (NCBs) is necessary to establish a consensus model. At the same time, it is essential to ensure the proper adjustment of the credibility information. The proposed model uses a grey clustering method to divide experts with similar evaluations into subgroups. Then, we aggregate the experts' evaluations in each cluster. A cluster consensus index (CCI) and a group consensus index (GCI) are presented to measure the consensus level among the clusters. Then, we provide a mechanism for managing the NCBs of the clusters, which contains two parts: (1) an NCB degree is defined using the CCI and GCI to identify the NCBs of the clusters; (2) a weight punishment mechanism is applied to the NCB clusters to improve consensus. Finally, an example is offered to illustrate the usefulness of the proposed approach.
|
216
|
Regazzoni F, Palmieri P, Smailbegovic F, Cammarota R, Polian I. Protecting artificial intelligence IPs: a survey of watermarking and fingerprinting for machine learning. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY 2021. [DOI: 10.1049/cit2.12029] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]
Affiliation(s)
- Francesco Regazzoni
  - University of Amsterdam Amsterdam The Netherlands
  - ALaRI – USI Lugano Switzerland
|
217
|
Farrahi V, Kangas M, Kiviniemi A, Puukka K, Korpelainen R, Jämsä T. Accumulation patterns of sedentary time and breaks and their association with cardiometabolic health markers in adults. Scand J Med Sci Sports 2021; 31:1489-1507. [PMID: 33811393 DOI: 10.1111/sms.13958] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/07/2021] [Revised: 03/15/2021] [Accepted: 03/18/2021] [Indexed: 01/20/2023]
Abstract
Breaking up sedentary time with physical activity (PA) could modify the detrimental cardiometabolic health effects of sedentary time. Our aim was to identify profiles according to distinct accumulation patterns of sedentary time and breaks in adults, and to investigate how these profiles are associated with cardiometabolic outcomes. Participants (n = 4439) of the Northern Finland Birth Cohort 1966 at age 46 years wore a hip-worn accelerometer for 7 consecutive days during waking hours. Uninterrupted ≥1-min sedentary bouts were identified, and non-sedentary bouts in between two consecutive sedentary bouts were considered as sedentary breaks. K-means clustering was performed with 65 variables characterizing how sedentary time was accumulated and interrupted. Linear regression was used to determine the association of accumulation patterns with cardiometabolic health markers. Four distinct groups were formed as follows: "Couch potatoes" (n = 1222), "Prolonged sitters" (n = 1179), "Shortened sitters" (n = 1529), and "Breakers" (n = 509). Couch potatoes had the highest level of sedentariness and the shortest sedentary breaks. Prolonged sitters, accumulating sedentary time in bouts of ≥15-30 min, had no differences in cardiometabolic outcomes compared with Couch potatoes. Shortened sitters accumulated sedentary time in bouts lasting <15 min and performed more light-intensity PA in their sedentary breaks, and Breakers performed more light-intensity and moderate-to-vigorous PA. These latter two profiles had lower levels of adiposity, blood lipids, and insulin sensitivity, compared with Couch potatoes (1.1-25.0% lower values depending on the cardiometabolic health outcome, group, and adjustments for potential confounders). Avoiding uninterrupted sedentary time with any active behavior from light-intensity upwards could be beneficial for cardiometabolic health in adults.
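The bout and break definitions in this abstract can be sketched as follows: an uninterrupted run of sedentary minutes is a bout, and a non-sedentary run lying between two consecutive sedentary bouts is a break. This is a minimal illustration that assumes the accelerometer signal has already been reduced to one sedentary/non-sedentary flag per minute; the function name and the sample sequence are hypothetical, and the study's actual processing pipeline is not described here.

```python
from itertools import groupby


def bouts_and_breaks(sedentary_minutes):
    """Split a per-minute sedentary flag sequence into bouts and breaks.

    Returns two lists of run lengths in minutes: sedentary bouts
    (>= 1 min) and sedentary breaks (non-sedentary runs enclosed by
    two sedentary bouts).
    """
    flags = [bool(f) for f in sedentary_minutes]
    # Run-length encode the sequence; consecutive runs alternate in flag
    runs = [(flag, len(list(group))) for flag, group in groupby(flags)]
    bouts = [length for flag, length in runs if flag]
    # Non-sedentary runs at the very start or end are not breaks, since
    # they are not enclosed by sedentary bouts on both sides
    breaks = [length for i, (flag, length) in enumerate(runs)
              if not flag and 0 < i < len(runs) - 1]
    return bouts, breaks


# Hypothetical 11-minute record: 1 = sedentary minute, 0 = non-sedentary
record = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0]
bouts, breaks = bouts_and_breaks(record)
print(bouts, breaks)
```

Summary variables such as the number of bouts in a given length range could then be derived from these two lists before clustering.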
Affiliation(s)
- Vahid Farrahi
  - Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland
- Maarit Kangas
  - Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland
  - Medical Research Center, Oulu University Hospital, University of Oulu, Oulu, Finland
- Antti Kiviniemi
  - Medical Research Center, Oulu University Hospital, University of Oulu, Oulu, Finland
  - Research Unit of Internal Medicine, University of Oulu, Oulu, Finland
- Katri Puukka
  - Department of Clinical Chemistry, NordLab Oulu, Medical Research Center Oulu, Oulu University Hospital, University of Oulu, Oulu, Finland
- Raija Korpelainen
  - Medical Research Center, Oulu University Hospital, University of Oulu, Oulu, Finland
  - Center for Life Course Health Research, University of Oulu, Oulu, Finland
  - Department of Sports and Exercise Medicine, Oulu Deaconess Institute Foundation sr, Oulu, Finland
- Timo Jämsä
  - Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland
  - Medical Research Center, Oulu University Hospital, University of Oulu, Oulu, Finland
  - Diagnostic Radiology, Oulu University Hospital, Oulu, Finland
|
218
|
Ouyang T, Pedrycz W, Reyes-Galaviz OF, Pizzi NJ. Granular Description of Data Structures: A Two-Phase Design. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:1902-1912. [PMID: 30605118 DOI: 10.1109/tcyb.2018.2887115] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/09/2023]
Abstract
The study is concerned with describing large numeric data by building a limited collection of representative information granules that capture the structure of the original data. The proposed development scheme consists of two phases. First, a clustering algorithm characterized by high flexibility in coping with the diverse geometry of the data structure and by low computational overhead is invoked. In the second phase, a clustering algorithm is applied to the clusters already formed during the first phase, yielding a collection of numeric prototypes, which are then generalized into granular prototypes. The quality of the granular prototypes is quantified, while their build-up is supported by mechanisms of granular computing such as the principle of justifiable granularity. In this paper, the DBSCAN and fuzzy C-means clustering algorithms were used in the successive phases of the proposed approach. Experimental studies on synthetic data and publicly available data are reported, and the performance of the developed approach is assessed along with a comparative analysis.
|
219
|
Malafeev A, Hertig-Godeschalk A, Schreier DR, Skorucak J, Mathis J, Achermann P. Automatic Detection of Microsleep Episodes With Deep Learning. Front Neurosci 2021; 15:564098. [PMID: 33841068 PMCID: PMC8024556 DOI: 10.3389/fnins.2021.564098] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/20/2020] [Accepted: 03/03/2021] [Indexed: 11/16/2022]
Abstract
Brief fragments of sleep shorter than 15 s are defined as microsleep episodes (MSEs), often subjectively perceived as sleepiness. Their main characteristic is a slowing in frequency in the electroencephalogram (EEG), similar to stage N1 sleep according to standard criteria. The maintenance of wakefulness test (MWT) is often used in a clinical setting to assess vigilance. Scoring of the MWT in most sleep-wake centers is limited to the classical definition of sleep (30 s epochs), and MSEs are mostly not considered, both because established scoring criteria defining MSEs are lacking and because scoring them is laborious. We aimed for automatic detection of MSEs with machine learning, i.e., with deep learning based on raw EEG and EOG data as input. We analyzed MWT data of 76 patients. Experts visually scored wakefulness and, according to recently developed scoring criteria, MSEs, microsleep episode candidates (MSEc), and episodes of drowsiness (ED). We implemented segmentation algorithms based on convolutional neural networks (CNNs) and a combination of a CNN with a long short-term memory (LSTM) network. An LSTM network is a type of recurrent neural network that has a memory for past events and takes them into account. Data of 53 patients were used for training of the classifiers, 12 for validation and 11 for testing. Our algorithms showed a good performance close to human experts. The detection was very good for wakefulness and MSEs and poor for MSEc and ED, similar to the low inter-expert reliability for these borderline segments. We visualized the internal representation of the data by the best-performing artificial neuronal network using t-distributed stochastic neighbor embedding (t-SNE). Visualization revealed that MSEs and wakefulness were mostly separable, though not entirely, and MSEc and ED largely intersected with the two main classes. We provide a proof of principle that it is feasible to reliably detect MSEs with deep neuronal networks based on raw EEG and EOG data with a performance close to that of human experts. The code of the algorithms (https://github.com/alexander-malafeev/microsleep-detection) and data (https://zenodo.org/record/3251716) are available.
Affiliation(s)
- Alexander Malafeev
  - Institute of Pharmacology and Toxicology, University of Zurich, Zurich, Switzerland
  - Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Anneke Hertig-Godeschalk
  - Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- David R. Schreier
  - Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Jelena Skorucak
  - Institute of Pharmacology and Toxicology, University of Zurich, Zurich, Switzerland
  - Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Johannes Mathis
  - Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Peter Achermann
  - Institute of Pharmacology and Toxicology, University of Zurich, Zurich, Switzerland
  - Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
  - Department of Psychiatry, Psychotherapy and Psychosomatics, The KEY Institute for Brain-Mind Research, University Hospital of Psychiatry, Zurich, Switzerland
  - Sleep and Health, University of Zurich, Zurich, Switzerland
|
220
|
Luo Z, Zeng LL, Qin J, Hou C, Shen H, Hu D. Functional Parcellation of Human Brain Precuneus Using Density-Based Clustering. Cereb Cortex 2021; 30:269-282. [PMID: 31044223 DOI: 10.1093/cercor/bhz086] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 12/07/2018] [Revised: 03/12/2019] [Accepted: 03/29/2019] [Indexed: 12/22/2022]
Abstract
The human precuneus is involved in many high-level cognitive functions, which strongly suggests the existence of biologically meaningful subdivisions. However, the functional parcellation of the precuneus remains largely uninvestigated. In this study, we developed an eigen clustering (EIC) approach for the parcellation using precuneus-cortical functional connectivity from fMRI data of the Human Connectome Project. The EIC approach is robust to noise and can automatically determine the cluster number. It is consistently demonstrated that the human precuneus can be subdivided into six symmetrical and connected parcels. The anterior and posterior precuneus participate in sensorimotor and visual functions, respectively. The central precuneus, with four subregions, appears to play a mediating role in the interaction of the default mode, dorsal attention, and frontoparietal control networks. The EIC-based functional parcellation is free of the spatial distance constraint and is more functionally coherent than parcellation using typical clustering algorithms. The precuneus subregions showed high accordance with cortical morphology and revealed good functional segregation and integration characteristics in functional task-evoked activations. This study may shed new light on human precuneus function at a fine-grained level and offer an alternative scheme for human brain parcellation.
Affiliation(s)
- Zhiguo Luo
- College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan, China
| | - Ling-Li Zeng
- College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan, China
| | - Jian Qin
- College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan, China
| | - Chenping Hou
- College of Science, National University of Defense Technology, Changsha, Hunan, China
| | - Hui Shen
- College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan, China
| | - Dewen Hu
- College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan, China
|
221
|
Universal image segmentation for optical identification of 2D materials. Sci Rep 2021; 11:5808. [PMID: 33707609 PMCID: PMC7970966 DOI: 10.1038/s41598-021-85159-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2020] [Accepted: 02/19/2021] [Indexed: 11/21/2022] Open
Abstract
Machine learning methods are changing the way data is analyzed. One of the most powerful and widespread applications of these techniques is in image segmentation wherein disparate objects of a digital image are partitioned and classified. Here we present an image segmentation program incorporating a series of unsupervised clustering algorithms for the automatic thickness identification of two-dimensional materials from digital optical microscopy images. The program identifies mono- and few-layer flakes of a variety of materials on both opaque and transparent substrates with a pixel accuracy of roughly 95%. Contrasting with previous attempts, application generality is achieved through preservation and analysis of all three digital color channels and Gaussian mixture model fits to arbitrarily shaped data clusters. Our results provide a facile implementation of data clustering for the universal, automatic identification of two-dimensional materials exfoliated onto any substrate.
|
222
|
Xu N, Finkelman RB, Dai S, Xu C, Peng M. Average Linkage Hierarchical Clustering Algorithm for Determining the Relationships between Elements in Coal. ACS OMEGA 2021; 6:6206-6217. [PMID: 33718711 PMCID: PMC7948219 DOI: 10.1021/acsomega.0c05758] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/26/2020] [Accepted: 02/09/2021] [Indexed: 05/24/2023]
Abstract
The modes of occurrence of elements in coal are important not only because they can provide insights into the sources of mineral matter in coal but also because they are vital in determining the environmental and human health impacts of those elements. Besides a number of physical and chemical analyses for determining the modes of occurrence in coal, some statistical methods have been commonly adopted to investigate elements in coal. Among these statistical methods, hierarchical clustering is the most common for deducing modes of occurrence of elements in coal. However, different hierarchical clustering algorithms with different similarity measures sometimes yield different modes of occurrence, and in some cases such results can be confusing. Therefore, which algorithm is more effective in determining the modes of occurrence in coal deserves to be investigated. In this paper, the data sets of coals from the Adaohai coal mine in Inner Mongolia, China, are used for this performance evaluation. From the analytical results with the average linkage hierarchical clustering algorithm on Adaohai coal samples, many instructive and surprising insights can be drawn. For example, Se, Be, and Tl do not appear to be in agreement with geochemical principles, that is, substituting for P, associated with rare earth elements, and occurring in Fe-sulfides, respectively. In conclusion, the average linkage hierarchical clustering algorithm with correlation similarity is much better in the analysis of the geological processes than the statistical method previously used for Adaohai coal samples, that is, the centroid linkage hierarchical clustering algorithm with Pearson correlation similarity.
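The approach named in this abstract, average linkage hierarchical clustering with a correlation-based similarity, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of `1 - r` (Pearson) as the distance, and the toy concentration profiles `A` to `D` are all assumptions made for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering of named profiles using average linkage.

    profiles: dict mapping a name to its values across samples (hypothetical data).
    Distance between two profiles is assumed to be 1 - Pearson correlation.
    """
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average linkage: mean of all pairwise distances between the two clusters.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

# Toy profiles: A and B rise together, C and D fall together.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [8.0, 6.0, 4.0, 2.5],
}
groups = average_linkage(toy, n_clusters=2)
```

With these toy values, the positively correlated profiles end up in the same group, which is the kind of association the abstract interprets as a shared mode of occurrence.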
Affiliation(s)
- Na Xu
- College of Geoscience and Survey Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
| | - Robert B. Finkelman
- College of Geoscience and Survey Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
- University of Texas at Dallas, Richardson, Texas 75080, United States
| | - Shifeng Dai
- College of Geoscience and Survey Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
| | - Chuanpeng Xu
- College of Geoscience and Survey Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
| | - Mengmeng Peng
- College of Geoscience and Survey Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
|
223
|
High-throughput image segmentation and machine learning approaches in the plant sciences across multiple scales. Emerg Top Life Sci 2021; 5:239-248. [DOI: 10.1042/etls20200273] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Revised: 02/09/2021] [Accepted: 02/11/2021] [Indexed: 01/12/2023]
Abstract
Agriculture has benefited greatly from the rise of big data and high-performance computing. The acquisition and analysis of data across biological scales have resulted in strategies modeling interactions between plant genotype and environment, models of root architecture that provide insight into resource utilization, and the elucidation of cell-to-cell communication mechanisms that are instrumental in plant development. Image segmentation and machine learning approaches for interpreting plant image data are among the many computational methodologies that have evolved to address challenging agricultural and biological problems. These approaches have led to contributions such as the accelerated identification of genes that modulate stress responses in plants and automated high-throughput phenotyping for early detection of plant diseases. The continued acquisition of high-throughput imaging across multiple biological scales provides opportunities to push the boundaries of our understanding faster than ever before. In this review, we explore the current state-of-the-art methodologies in plant image segmentation and machine learning at the agricultural, organ, and cellular scales in plants. We show how the methodologies for segmentation and classification differ due to the diversity of physical characteristics found at these different scales. We also discuss the hardware technologies most commonly used at these different scales, the types of quantitative metrics that can be extracted from these images, and how the biological mechanisms by which plants respond to abiotic/biotic stresses or genotypic modifications can be extracted from these approaches.
|
224
|
Jesus J, Canuto A, Araújo D. An exploratory analysis of data noisy scenarios in a Pareto-front based dynamic feature selection method. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2020.106951] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
225
|
Shi L, Laramee RS, Chen G. Integral Curve Clustering and Simplification for Flow Visualization: A Comparative Evaluation. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1967-1985. [PMID: 31514143 DOI: 10.1109/tvcg.2019.2940935] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Unsupervised clustering techniques have been widely applied to flow simulation data to alleviate clutter and occlusion in the resulting visualization. However, there is an absence of systematic guidelines for users to evaluate (both quantitatively and visually) the appropriate clustering technique and similarity measures for streamline and pathline curves. In this work, we provide an overview of a number of prevailing curve clustering techniques. We then perform a comprehensive experimental study to qualitatively and quantitatively compare these clustering techniques coupled with popular similarity measures used in the flow visualization literature. Based on our experimental results, we derive empirical guidelines for selecting the appropriate clustering technique and similarity measure given the requirements of the visualization task. We believe our work will inform the task of generating meaningful reduced representations for large-scale flow data and inspire the continuous investigation of a more refined guidance on clustering technique selection.
|
226
|
Ramos Emmendorfer L, de Paula Canuto AM. A generalized average linkage criterion for Hierarchical Agglomerative Clustering. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2020.106990] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
227
|
Lazaris A, Prasanna VK. An LSTM Framework for Software-Defined Measurement. IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT 2021. [DOI: 10.1109/tnsm.2020.3040157] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
228
|
Ziaei-Halimejani H, Zarghami R, Mansouri SS, Mostoufi N. Data-Driven Fault Diagnosis of Chemical Processes Based on Recurrence Plots. Ind Eng Chem Res 2021. [DOI: 10.1021/acs.iecr.0c06307] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Hooman Ziaei-Halimejani
- Multiphase Systems Research Lab, School of Chemical Engineering, College of Engineering, University of Tehran, Tehran 11155/4563, Iran
| | - Reza Zarghami
- Multiphase Systems Research Lab, School of Chemical Engineering, College of Engineering, University of Tehran, Tehran 11155/4563, Iran
| | - Seyed Soheil Mansouri
- Department of Chemical and Biochemical Engineering, Technical University of Denmark, Søltofts Plads, Building 228A, 2800 Kongens Lyngby, Denmark
| | - Navid Mostoufi
- Multiphase Systems Research Lab, School of Chemical Engineering, College of Engineering, University of Tehran, Tehran 11155/4563, Iran
|
229
|
A Survey on Machine Learning-Based Performance Improvement of Wireless Networks: PHY, MAC and Network Layer. ELECTRONICS 2021. [DOI: 10.3390/electronics10030318] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
This paper presents a systematic and comprehensive survey that reviews the latest research efforts focused on machine learning (ML) based performance improvement of wireless networks, while considering all layers of the protocol stack: PHY, MAC and network. First, the related work and paper contributions are discussed, followed by providing the necessary background on data-driven approaches and machine learning to help non-machine learning experts understand all discussed techniques. Then, a comprehensive review is presented on works employing ML-based approaches to optimize the wireless communication parameters settings to achieve improved network quality-of-service (QoS) and quality-of-experience (QoE). We first categorize these works into: radio analysis, MAC analysis and network prediction approaches, followed by subcategories within each. Finally, open challenges and broader perspectives are discussed.
|
230
|
Estimating Forest Structure from UAV-Mounted LiDAR Point Cloud Using Machine Learning. REMOTE SENSING 2021. [DOI: 10.3390/rs13030352] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]
Abstract
Monitoring the structure of forest stands is of high importance for forest managers to help them in maintaining ecosystem services. For that purpose, Unmanned Aerial Vehicles (UAVs) open new prospects, especially in combination with Light Detection and Ranging (LiDAR) technology. Indeed, the shorter distance from the Earth’s surface significantly increases the point density beneath the canopy, thus offering new possibilities for the extraction of the underlying semantics. For example, tree stems can now be captured with sufficient detail, which is a gateway to accurately locating trees and directly retrieving metrics, e.g., the Diameter at Breast Height (DBH). Current practices usually require numerous site-specific parameters, which may preclude their use beyond their initial application context. To overcome this shortcoming, the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) clustering algorithm was further improved and implemented to segment tree stems. Afterwards, Principal Component Analysis (PCA) was applied to extract tree stem orientation for subsequent DBH estimation. This workflow was then validated using LiDAR point clouds collected in a temperate deciduous closed-canopy forest stand during the leaf-on and leaf-off seasons, along with multiple scanning angle ranges. The results show that the proposed methodology can correctly detect up to 82% of tree stems (with a precision of 98%) during the leaf-off season with a Maximum Scanning Angle Range (MSAR) of 75 degrees, without having to set up any site-specific parameters for the segmentation procedure. In the future, our method could minimize the omission and commission errors when initially detecting trees, along with assisting further tree metrics retrieval.
Finally, this research shows that, under the study conditions, the point density within an approximately 1.3-meter height above the ground remains low within closed-canopy forest stands even during the leaf-off season, thus restricting the accurate estimation of the DBH. As a result, autonomous UAVs that can both fly above and under the canopy provide a clear opportunity to achieve this purpose.
|
231
|
Identifying Fake News on Social Networks Based on Natural Language Processing: Trends and Challenges. INFORMATION 2021. [DOI: 10.3390/info12010038] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
The epidemic spread of fake news is a side effect of the expansion of social networks to circulate news, in contrast to traditional mass media such as newspapers, magazines, radio, and television. Human inefficiency in distinguishing between true and false facts makes fake news a threat to logical truth, democracy, journalism, and credibility in government institutions. In this paper, we survey methods for preprocessing data in natural language, vectorization, dimensionality reduction, machine learning, and quality assessment of information retrieval. We also contextualize the identification of fake news, and we discuss research initiatives and opportunities.
|
232
|
Kułacz Ł, Kliks A. Brain-Inspired Data Transmission in Dense Wireless Network. SENSORS 2021; 21:s21020576. [PMID: 33467437 PMCID: PMC7830927 DOI: 10.3390/s21020576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/08/2020] [Revised: 01/08/2021] [Accepted: 01/12/2021] [Indexed: 11/16/2022]
Abstract
In this paper, the authors investigate the innovative concept of a dense wireless network supported by additional functionalities inspired by the human nervous system. The nervous system controls the entire human body thanks to reliable and energy-efficient signal transmission. From the structure and modes of operation of such an ultra-dense network of neurons and glial cells, the authors selected those most worthwhile when planning a dense wireless network. These ideas were captured and modeled in the context of wireless data transmission. The performance of such an approach has been analyzed in two ways. First, the theoretical limits of the approach have been derived based on stochastic geometry, in particular on percolation theory. Additionally, computer experiments have been carried out to verify the performance of the proposed transmission schemes in four simulation scenarios. The results showed a prospective improvement in the reliability of wireless networks when applying the proposed bio-inspired solutions while keeping the transmission extremely simple.
|
233
|
Bach MM, Daffertshofer A, Dominici N. The development of mature gait patterns in children during walking and running. Eur J Appl Physiol 2021; 121:1073-1085. [PMID: 33439307 PMCID: PMC7966230 DOI: 10.1007/s00421-020-04592-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2020] [Accepted: 12/17/2020] [Indexed: 12/11/2022]
Abstract
PURPOSE We sought to identify the developing maturity of walking and running in young children. We assessed gait patterns for the presence of flight and double support phases complemented by mechanical energetics. The corresponding classification outcomes were contrasted via a shotgun approach involving several potentially informative gait characteristics. A subsequent clustering turned out very effective to classify the degree of gait maturity. METHODS Participants (22 typically developing children aged 2-9 years and 7 young, healthy adults) walked/ran on a treadmill at comfortable speeds. We determined double support and flight phases and the relationship between potential and kinetic energy oscillations of the center-of-mass. Based on the literature, we further incorporated a total of 93 gait characteristics (including the above-mentioned ones) and employed multivariate statistics comprising principal component analysis for data compression and hierarchical clustering for classification. RESULTS While the ability to run including a flight phase increased with age, the flight phase did not reach 20% of the gait cycle. It seems that children use a walk-run-strategy when learning to run. Yet, the correlation strength between potential and kinetic energies saturated and so did the amount of recovered mechanical energy. Clustering the set of gait characteristics allowed for classifying gait in more detail. This defines a metric for maturity in terms of deviations from adult gait, which disagrees with chronological age. CONCLUSIONS The degree of gait maturity estimated statistically using various gait characteristics does not always relate directly to the chronological age of the child.
Affiliation(s)
- Margit M Bach
- Department of Human Movement Sciences, Faculty of Behavioural and Movement Sciences, Amsterdam Movement Sciences & Institute of Brain and Behavior Amsterdam, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Andreas Daffertshofer
- Department of Human Movement Sciences, Faculty of Behavioural and Movement Sciences, Amsterdam Movement Sciences & Institute of Brain and Behavior Amsterdam, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Nadia Dominici
- Department of Human Movement Sciences, Faculty of Behavioural and Movement Sciences, Amsterdam Movement Sciences & Institute of Brain and Behavior Amsterdam, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.
|
234
|
Kabir S, Farrokhvar L, Russell MW, Forman A, Kamali B. Regional socioeconomic factors and length of hospital stay: a case study in Appalachia. J Public Health (Oxf) 2021. [DOI: 10.1007/s10389-020-01418-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022] Open
|
235
|
Prasad KR, Reddy BE, Mohammed M. An effective assessment of cluster tendency through sampling based multi-viewpoints visual method. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING 2021:1-14. [PMID: 33425056 PMCID: PMC7779163 DOI: 10.1007/s12652-020-02710-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/15/2020] [Accepted: 11/18/2020] [Indexed: 06/12/2023]
Abstract
Social networks are rich sources for people to share knowledge on health-related issues. Nowadays, Twitter is one of the most significant social platforms for discussing topics. Analyzing the clusters of tweets with respect to their terms is a complex process due to the sparsity problem. Topic models are useful for avoiding this problem through the derivation of topic clusters. Finding the pre-cluster tendency is the major problem in many clustering methods. Existing methods, such as visual assessment of tendency (VAT), cosine-based VAT (cVAT), and multi-viewpoints-based cosine similarity VAT (MVS-VAT), are mainly used to assess prior information about the cluster tendency problem. The solution of cluster tendency indicates the tractable number of clusters. MVS-VAT assesses the cluster tendency of tweet documents more effectively than other visual methods. However, it uses a higher number of viewpoints, thus requiring more computational time for the clustering of tweet data. Therefore, sampling-based visual methods are proposed to overcome the computational problem. Several standard health keywords are used for the extraction of health tweets to illustrate the effectiveness of the proposed work in the experimental study.
Affiliation(s)
- K. Rajendra Prasad
- Department of CSE, Rajeev Gandhi Memorial College of Engineering and Technology, Nandyal, Andhra Pradesh, India
| | - B. Eswara Reddy
- Department of CSE, JNTUA College of Engineering, Anantapur, Andhra Pradesh, India
| | - Moulana Mohammed
- Department of CSE, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India
|
236
|
|
237
|
Guérin J, Thiery S, Nyiri E, Gibaru O, Boots B. Combining pretrained CNN feature extractors to enhance clustering of complex natural images. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.10.068] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
238
|
Scott J, Niemetz A, Preiner M, Nejati S, Ganesh V. MachSMT: A Machine Learning-based Algorithm Selector for SMT Solvers. TOOLS AND ALGORITHMS FOR THE CONSTRUCTION AND ANALYSIS OF SYSTEMS 2021. [PMCID: PMC7984560 DOI: 10.1007/978-3-030-72013-1_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Abstract
In this paper, we present MachSMT, an algorithm selection tool for Satisfiability Modulo Theories (SMT) solvers. MachSMT supports the entirety of the SMT-LIB language. It employs machine learning (ML) methods to construct both empirical hardness models (EHMs) and pairwise ranking comparators (PWCs) over state-of-the-art SMT solvers. Given an SMT formula $\mathcal{I}$ as input, MachSMT leverages these learnt models to output a ranking of solvers based on predicted run time on the formula $\mathcal{I}$. We evaluate MachSMT on the solvers, benchmarks, and data obtained from SMT-COMP 2019 and 2020. We observe MachSMT frequently improves on competition winners, winning 54 divisions outright and up to a 198.4% improvement in PAR-2 score, notably in logics that have broad applications (e.g., BV, LIA, NRA, etc.) in verification, program analysis, and software engineering. The MachSMT tool is designed to be easily tuned and extended to any suitable solver application by users. MachSMT is not a replacement for SMT solvers by any means. Instead, it is a tool that enables users to leverage the collective strength of the diverse set of algorithms implemented as part of these sophisticated solvers.
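For readers unfamiliar with the PAR-2 score mentioned in this abstract, the following minimal sketch shows the common convention: each solved benchmark contributes its run time, and each unsolved or timed-out benchmark contributes twice the time limit. The function names, the use of `None` for unsolved instances, and the solver names are assumptions for illustration; this is not MachSMT's code or API.

```python
def par2_score(runtimes, timeout):
    """Sum of run times, charging 2 * timeout for each unsolved benchmark.

    runtimes: run times in seconds, with None marking an unsolved benchmark
    (an assumed encoding). Lower scores are better.
    """
    total = 0.0
    for t in runtimes:
        solved = t is not None and t <= timeout
        total += t if solved else 2.0 * timeout
    return total

def select_solver(predicted_runtimes):
    """Pick the solver with the smallest predicted run time.

    predicted_runtimes: dict of hypothetical solver name -> predicted seconds,
    standing in for the output of a learnt run-time model.
    """
    return min(predicted_runtimes, key=predicted_runtimes.get)

score = par2_score([10.0, None, 30.0], timeout=100.0)
choice = select_solver({"solver_a": 12.0, "solver_b": 3.5})
```

Here the unsolved benchmark is charged 200 seconds, so the total is 240 seconds, and the selector simply returns the solver predicted to be fastest.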
|
239
|
|
240
|
The Identification of Diabetes Mellitus Subtypes Applying Cluster Analysis Techniques: A Systematic Review. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:ijerph17249523. [PMID: 33353219 PMCID: PMC7766625 DOI: 10.3390/ijerph17249523] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/26/2020] [Revised: 12/16/2020] [Accepted: 12/17/2020] [Indexed: 12/23/2022]
Abstract
Diabetes Mellitus is a chronic and lifelong disease that places a huge burden on healthcare systems. Its prevalence is on the rise worldwide. Diabetes is more complex than the classification into Type 1 and Type 2 may suggest. The purpose of this systematic review was to identify the research studies that tried to find new sub-groups of diabetes patients by using unsupervised learning methods. The search was conducted on the PubMed and Medline databases by two independent researchers. Publications from all years on cluster analysis of diabetes patients were selected and analysed. Among the fourteen studies that were included in the final review, five studies found five identical clusters: Severe Autoimmune Diabetes; Severe Insulin-Deficient Diabetes; Severe Insulin-Resistant Diabetes; Mild Obesity-Related Diabetes; and Mild Age-Related Diabetes. In addition, two studies found the same clusters, except for the Severe Autoimmune Diabetes cluster. Results of the other studies differed from one another and were less consistent. Cluster analysis enabled finding non-classic heterogeneity in diabetes, but there is still a need to explore and validate the capabilities of cluster analysis in more diverse and wider populations.
|
241
|
Gheyas I, Parkinson S, Khan S. OCEAN: A Non-Conventional Parameter Free Clustering Algorithm Using Relative Densities of Categories. INT J PATTERN RECOGN 2020. [DOI: 10.1142/s0218001421500178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
In this paper, we propose a fully autonomous density-based clustering algorithm named ‘Ocean’, which is inspired by the oceanic landscape and phenomena that occur in it. Ocean is an improvement over conventional algorithms regarding both distance metric and the clustering mechanism. Ocean defines the distance between two categories as the difference in the relative densities of categories. Unlike existing approaches, Ocean neither assigns the same distance to all pairs of categories, nor assigns arbitrary weights to matches and mismatches between categories that can lead to clustering errors. Ocean uses density ratios of adjacent regions in multidimensional space to detect the edges of the clusters. Ocean is robust against clusters of identical patterns. Unlike conventional approaches, Ocean neither makes any assumption regarding the data distribution within clusters, nor requires tuning of free parameters. Empirical evaluations demonstrate improved performance of Ocean over existing approaches.
Affiliation(s)
- Iffat Gheyas
- Secure Societies Institute, School of Human and Health Sciences, University of Huddersfield, Queensgate, Huddersfield HD1 3DH, UK
| | - Simon Parkinson
- Department of Computer Science, School of Computing and Engineering, University of Huddersfield, Queensgate, Huddersfield HD1 3DH, UK
| | - Saad Khan
- Department of Computer Science, School of Computing and Engineering, University of Huddersfield, Queensgate, Huddersfield HD1 3DH, UK
|
242
|
Fast Searching Density Peak Clustering Algorithm Based on Shared Nearest Neighbor and Adaptive Clustering Center. Symmetry (Basel) 2020. [DOI: 10.3390/sym12122014] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
The clustering analysis algorithm is used to reveal the internal relationships among the data without prior knowledge and to further gather some data with common attributes into a group. In order to solve the problem that the existing algorithms always need prior knowledge, we proposed a fast searching density peak clustering algorithm based on the shared nearest neighbor and adaptive clustering center (DPC-SNNACC) algorithm. It can automatically ascertain the number of knee points in the decision graph according to the characteristics of different datasets, and further determine the number of clustering centers without human intervention. First, an improved calculation method of local density based on the symmetric distance matrix was proposed. Then, the position of knee point was obtained by calculating the change in the difference between decision values. Finally, the experimental and comparative evaluation of several datasets from diverse domains established the viability of the DPC-SNNACC algorithm.
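The knee-point step in this abstract can be illustrated with a generic heuristic. The sketch below is not the DPC-SNNACC procedure itself, whose exact formula is not given here. It assumes that each point already has a decision value (in density peak clustering this is often the product of local density and distance to the nearest denser point), sorts those values in descending order, and places the knee where the gap between consecutive values shrinks the most. The function name and the toy values are hypothetical.

```python
def estimate_center_count(decision_values):
    """Estimate the number of cluster centers from decision values.

    Assumed heuristic: sort descending, take first differences, then find the
    position where the difference drops the most (largest change in difference).
    Points before that position are treated as cluster centers.
    """
    gamma = sorted(decision_values, reverse=True)
    if len(gamma) < 3:
        return len(gamma)  # too few points to locate a knee
    diffs = [gamma[i] - gamma[i + 1] for i in range(len(gamma) - 1)]
    changes = [diffs[i] - diffs[i + 1] for i in range(len(diffs) - 1)]
    knee = max(range(len(changes)), key=lambda i: changes[i])
    return knee + 1

# Three values stand clearly above the rest, so three centers are expected.
n_centers = estimate_center_count([0.9, 9.0, 1.0, 8.5, 0.8, 8.0])
```

A heuristic like this removes the manual step of picking centers from the decision graph, but it can misjudge datasets where the decision values decay smoothly, so it should be read as an illustration of the idea only.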
|
243
|
Abstract
In this paper, we suggest a new technique for soft clustering of multidimensional data. It is based on a new convex voting model, where each voter chooses a party with a certain probability depending on the divergence between his/her preferences and the position of the party. The parties can react to the results of polls by changing their positions. We prove that under some natural assumptions this system has a unique fixed point, providing a unique solution for soft clustering. The solution of our model can be found either by imitation of the sequential elections, or by direct minimization of a convex potential function. In both cases, the methods converge linearly to the solution. We provide our methods with worst-case complexity bounds. To the best of our knowledge, these are the first polynomial-time complexity results in this field.
|
244
|
Rangaprakash D, Odemuyiwa T, Narayana Dutt D, Deshpande G. Density-based clustering of static and dynamic functional MRI connectivity features obtained from subjects with cognitive impairment. Brain Inform 2020; 7:19. [PMID: 33242116 PMCID: PMC7691406 DOI: 10.1186/s40708-020-00120-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/16/2020] [Accepted: 10/29/2020] [Indexed: 11/29/2022] Open
Abstract
Various machine-learning classification techniques have been employed previously to classify brain states in healthy and disease populations using functional magnetic resonance imaging (fMRI). These methods generally use supervised classifiers that are sensitive to outliers and require labeling of training data to generate a predictive model. Density-based clustering, which overcomes these issues, is a popular unsupervised learning approach whose utility for high-dimensional neuroimaging data has not been previously evaluated. Its advantages include insensitivity to outliers and the ability to work with unlabeled data. Unlike the popular k-means clustering, it does not require the number of clusters to be specified. In this study, we compare the performance of two popular density-based clustering methods, DBSCAN and OPTICS, in accurately identifying individuals with three stages of cognitive impairment, including Alzheimer’s disease. We used static and dynamic functional connectivity features for clustering, which capture the strength and temporal variation of brain connectivity, respectively. To assess the robustness of clustering to noise/outliers, we propose a novel method called recursive-clustering using additive-noise (R-CLAN). Results demonstrated that both clustering algorithms were effective, although OPTICS with dynamic connectivity features performed best in terms of cluster purity (95.46%) and robustness to noise/outliers. This study demonstrates that density-based clustering can accurately and robustly identify diagnostic classes in an unsupervised way using brain connectivity.
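The two properties highlighted in this abstract, that the number of clusters is not specified in advance and that outliers are set aside rather than forced into a cluster, can be seen in a minimal DBSCAN sketch. This is a textbook-style implementation written for illustration, not the code used in the study; the parameter names `eps` and `min_pts` and the toy data are assumptions.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    n = len(points)
    # Neighborhoods include the point itself, so min_pts counts it too.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                frontier.extend(neighbors[j])  # only core points expand the cluster
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 20)]  # the last point is an isolated outlier
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

OPTICS, the second method compared in the study, orders points by reachability distance instead of fixing a single `eps`, so it is not covered by this sketch.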
Affiliation(s)
- D Rangaprakash
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA; Department of Radiology, Harvard Medical School, Boston, MA, USA; Division of Health Sciences and Technology, Harvard University and Massachusetts Institute of Technology, Cambridge, MA, USA
- Toluwanimi Odemuyiwa
- Division of Engineering Science, Faculty of Applied Science & Engineering, University of Toronto, Toronto, ON, Canada
- D Narayana Dutt
- Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, India
- Gopikrishna Deshpande
- AU MRI Research Center, Department of Electrical and Computer Engineering, Auburn University, 560 Devall Dr, Suite 266D, Auburn, AL, 36849, USA; Department of Psychological Sciences, Auburn University, Auburn, AL, USA; Alabama Advanced Imaging Consortium, University of Alabama Birmingham, Alabama, USA; Center for Health Ecology and Equity Research, Auburn University, Auburn, AL, USA; Center for Neuroscience, Auburn University, Auburn, AL, USA; School of Psychology, Capital Normal University, Beijing, China; Key Laboratory for Learning and Cognition, Capital Normal University, Beijing, China; Department of Psychiatry, National Institute of Mental Health and Neurosciences, Bangalore, India
|
245
|
Improving K-Nearest Neighbor Approaches for Density-Based Pixel Clustering in Hyperspectral Remote Sensing Images. REMOTE SENSING 2020. [DOI: 10.3390/rs12223745] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022]
Abstract
We investigated nearest-neighbor density-based clustering for hyperspectral image analysis. Four existing techniques were considered that rely on a K-nearest neighbor (KNN) graph to estimate local density and to propagate labels through algorithm-specific labeling decisions. We first improved two of these techniques, a KNN variant of the density peaks clustering method dpc and a weighted-mode variant of knnclust, so that the four methods use the same input KNN graph and differ only in their labeling rules. We propose two regularization schemes for hyperspectral image analysis: (i) a graph regularization based on mutual nearest neighbors (MNN) prior to clustering to improve cluster discovery in high dimensions; (ii) a spatial regularization to account for correlation between neighboring pixels. We demonstrate the relevance of the proposed methods on synthetic data and hyperspectral images, and show that they achieve superior overall performance in most cases, outperforming the state-of-the-art methods by up to 20% in kappa index on real hyperspectral images.
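The mutual-nearest-neighbor graph regularization mentioned in (i) can be sketched in a few lines. This is a generic illustration of pruning a KNN graph to its mutual edges, assuming Euclidean distance and brute-force neighbor search; it is not the authors' implementation, and the function names are hypothetical.

```python
import math


def knn_graph(points, k):
    """For each point, list the indices of its k nearest neighbors (brute force)."""
    graph = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph.append([j for _, j in ranked[:k]])
    return graph


def mutual_knn(graph):
    """Keep an edge i -> j only if i is also among the neighbors of j."""
    return [[j for j in nbrs if i in graph[j]] for i, nbrs in enumerate(graph)]


# Three nearby points and one distant point (illustrative 1-D data).
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
graph = knn_graph(points, k=2)
pruned = mutual_knn(graph)
print(pruned)
```

In this toy example the distant point keeps no edges after pruning, because none of its nearest neighbors lists it in return. That asymmetry is what lets a mutual-neighbor filter separate sparse points from dense groups before labels are propagated.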
|
246
|
Lamsal R, Katiyar S. cs-means: Determining optimal number of clusters based on a level-of-similarity. SN APPLIED SCIENCES 2020. [DOI: 10.1007/s42452-020-03582-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022] Open
|
247
|
Improving Ant Collaborative Filtering on Sparsity via Dimension Reduction. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10207245] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
Recommender systems should be able to handle highly sparse training data that continues to change over time. Among the many solutions, Ant Colony Optimization, an optimization algorithm modeled on the actions of an ant colony, enjoys the favorable characteristic of being optimal, which has not been easily achieved by other kinds of algorithms. A recent work adopting genetic optimization proposes a collaborative filtering scheme, Ant Collaborative Filtering (ACF), which models the pheromone of ants for a recommender system in two ways: (1) it uses the pheromone exchange to model the ratings given by users with respect to items; (2) it uses the evaporation of existing pheromone to model the change in users’ preferences over time. This mechanism helps to identify the most closely related users and items, even in the case of sparsity, and can capture the drift of user preferences over time. However, many users turn out to share the same preferences over items, which means it is not necessary to initialize each user with a unique type of pheromone, as was done in ACF. To address the sparsity problem, this work goes one step further and improves the performance of Ant Collaborative Filtering by adding a clustering step in the initialization phase to reduce the dimension of the rating matrix, so that K << #users, where K is the number of clusters and stands for the maximum number of types of pheromone carried by all users. We call this revised version the Improved Ant Collaborative Filtering (IACF). Experiments are conducted on larger datasets than in the previous work, based on three typical recommender systems: (1) movie recommendations, (2) music recommendations, and (3) book recommendations. For movie recommendation, a larger dataset, MovieLens 10M, was used instead of MovieLens 1M. For book recommendation and music recommendation, we used a new dataset with a much larger number of samples from Douban and NetEase. The results illustrate that our IACF algorithm can better deal with practical recommendation scenarios that involve sparse datasets.
|
248
|
Ezugwu AE, Shukla AK, Agbaje MB, Oyelade ON, José-García A, Agushaka JO. Automatic clustering algorithms: a systematic review and bibliometric analysis of relevant literature. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-05395-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 12/22/2022]
|
249
|
|
250
|
|