1
Kritchanchai D, Srinon R, Kietdumrongwong P, Jansuwan J, Phanuphak N, Chanpuypetch W. Enhancing home delivery of emergency medicine and medical supplies through clustering and simulation techniques: A case study of COVID-19 home isolation in Bangkok. Heliyon 2024; 10:e33177. [PMID: 39005897 PMCID: PMC11239690 DOI: 10.1016/j.heliyon.2024.e33177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 05/09/2024] [Accepted: 06/14/2024] [Indexed: 07/16/2024]
Abstract
This study investigates how to enhance the home delivery distribution network for COVID-19 Home Isolation (HI) kits during the Delta variant outbreak of the SARS-CoV-2 virus in the Bangkok Metropolitan Area, Thailand. It addresses challenges related to limited resources and delays in delivering HI kits, which can exacerbate symptoms and increase mortality rates. A k-means clustering approach is used to optimize the assignment of service areas within the COVID-19 HI program, while discrete event simulation (DES) evaluates potential changes in the home delivery logistics network. Real-world data from the peak of the outbreak are used to determine the optimal allocation of resources and to propose a new logistics network based on proximity to patients' residences. Experimental results demonstrate a 44.29% improvement in overall performance and a 40.80% decrease in maximum service time. The findings offer theoretical and managerial implications for effective HI management, supporting practitioners and policymakers in mitigating the impact of future outbreaks.
Affiliation(s)
- Duangpun Kritchanchai: Department of Industrial Engineering, Mahidol University, Nakhon Pathom, 73170, Thailand
- Rawinkhan Srinon: The Cluster of Logistics and Rail Engineering, Mahidol University, Nakhon Pathom, 73170, Thailand
- Jirawan Jansuwan: Faculty of Business Administration, Rajamangala University of Technology Srivijaya, Songkhla, 90000, Thailand
2
Majcherek D, Hegerty SW, Kowalski AM, Lewandowska MS, Dikova D. Opportunities for healthcare digitalization in Europe: Comparative analysis of inequalities in access to medical services. Health Policy 2024; 139:104950. [PMID: 38061175 DOI: 10.1016/j.healthpol.2023.104950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 10/18/2023] [Accepted: 11/27/2023] [Indexed: 12/31/2023]
Abstract
Digitalization of healthcare systems is a great opportunity to address inequalities in access to healthcare in the European Union. There is an urgent need to build on what we learned from the COVID-19 pandemic, where digital health technologies were integrated swiftly to limit challenges in healthcare delivery. We created a database for the 27 European Union countries from the European Health Interview Survey (EHIS), the Digital Economy and Society Index (DESI), and other Eurostat databases. We performed k-means cluster analysis to group EU countries along two dimensions: inequalities in access to medical services and level of digitalization. We identified five distinct clusters: two clusters with high, two clusters with moderate, and one cluster with low unmet need for healthcare. Regarding digitalization, only one cluster, comprising the Nordic countries, Spain, and Cyprus, exhibits high digital readiness. A cluster comprising the most developed countries in Western Europe represents moderate levels of both unmet need for healthcare and digitalization. For most EU countries, there is still a need to build digital infrastructure for the healthcare industry, which in the long term may increase the number of digital solutions used by both patients and healthcare professionals. Policy makers across the EU need to consider investing in initiatives that would support digital health solutions as an effective means of healthcare provision and healthcare management.
3
Huang M, Long C, Ma J. AAFL: automatic association feature learning for gene signature identification of cancer subtypes in single-cell RNA-seq data. Brief Funct Genomics 2023; 22:420-427. [PMID: 37122141 DOI: 10.1093/bfgp/elac047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 05/02/2023]
Abstract
Single-cell RNA-sequencing (scRNA-seq) technologies have enabled the study of human cancers in individual cells, making it possible to explore the cellular heterogeneity and the genotypic status of tumors. Gene signature identification plays an important role in the precise classification of cancer subtypes. However, most existing gene selection methods select the same informative genes for every subtype. In this study, we propose a novel gene selection method, automatic association feature learning (AAFL), which automatically and simultaneously identifies different gene signatures for different cell subpopulations (cancer subtypes). The proposed AAFL method combines a residual network with a low-rank network to select the genes that are most associated with the corresponding cell subpopulations. Moreover, differentially expressed genes are obtained before gene selection to filter out redundant genes. We apply the proposed feature learning method to real cancer scRNA-seq data sets (melanoma) to identify cancer subtypes and detect gene signatures of the identified subtypes. The experimental results demonstrate that the proposed method can automatically identify different gene signatures for the identified cancer subtypes. Gene ontology enrichment analysis shows that the identified gene signatures of different subtypes reveal the key biological processes and pathways. These gene signatures are expected to have important implications for understanding cellular heterogeneity and the complex ecosystem of tumors.
Affiliation(s)
- Meng Huang: Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Changzhou Long: Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Jiangtao Ma: Department of Automation, Xiamen University, Xiamen, 361005, China; School of Engineering, Dali University, Dali, 671000, China
4
Cuppens T, Kaur M, Kumar AA, Shatto J, Ng ACH, Leclercq M, Reformat MZ, Droit A, Dunham I, Bolduc FV. Developing a cluster-based approach for deciphering complexity in individuals with neurodevelopmental differences. Front Pediatr 2023; 11:1171920. [PMID: 37790694 PMCID: PMC10543689 DOI: 10.3389/fped.2023.1171920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 09/01/2023] [Indexed: 10/05/2023]
Abstract
Objective: Individuals with neurodevelopmental disorders such as global developmental delay (GDD) present both genotypic and phenotypic heterogeneity. This diversity has hampered the development of targeted interventions, given the relative rarity of each individual genetic etiology. Novel approaches to clinical trials in which distinct but related diseases can be treated by a common drug, known as basket trials, have shown benefits in oncology but have yet to be used in GDD. Moreover, it remains unclear how individuals with GDD could be clustered. Here, we assess two different approaches: agglomerative and divisive clustering.
Methods: Using the largest cohort of individuals with GDD characterized using a systematic approach, the Deciphering Developmental Disorders (DDD) cohort, we extracted genotypic and phenotypic information from 6,588 individuals with GDD. We then used k-means clustering (divisive) and hierarchical agglomerative clustering (HAC) to identify subgroups of individuals. Next, we extracted gene network and molecular function information for the clusters identified by each approach.
Results: HAC based on phenotypes identified in individuals with GDD revealed 16 clusters, each presenting one dominant phenotype displayed by most individuals in the cluster, along with other minor phenotypes. Among the most common phenotypes reported were delayed speech, absent speech, and seizure. Interestingly, each phenotypic cluster included, at the molecular level, several (3-12) gene sub-networks of more closely related genes with diverse molecular functions. k-means clustering also segregated individuals harboring those phenotypes, but the genetic pathways identified were different from the ones identified by HAC.
Conclusion: Our study illustrates how divisive (k-means) and agglomerative clustering can be used to group individuals with GDD for future basket trials. Moreover, the results of our analysis suggest that phenotypic clusters should be subdivided into molecular sub-networks for an increased likelihood of successful treatment. Finally, a combination of both agglomerative and divisive clustering may be required to develop a comprehensive treatment.
Affiliation(s)
- Tania Cuppens: Département de Médecine Moléculaire de L'Université Laval, Centre de Recherche du CHU de Québec-Université Laval, Québec, QC, Canada
- Manpreet Kaur: Department of Pediatric Neurology, University of Alberta, Edmonton, AB, Canada
- Ajay A. Kumar: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- Julie Shatto: Department of Pediatric Neurology, University of Alberta, Edmonton, AB, Canada
- Andy Cheuk-Him Ng: Department of Pediatric Neurology, University of Alberta, Edmonton, AB, Canada
- Mickael Leclercq: Département de Médecine Moléculaire de L'Université Laval, Centre de Recherche du CHU de Québec-Université Laval, Québec, QC, Canada
- Marek Z. Reformat: Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada
- Arnaud Droit: Département de Médecine Moléculaire de L'Université Laval, Centre de Recherche du CHU de Québec-Université Laval, Québec, QC, Canada
- Ian Dunham: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- François V. Bolduc: Department of Pediatric Neurology, University of Alberta, Edmonton, AB, Canada; Department of Medical Genetics, University of Alberta, Edmonton, AB, Canada; Neuroscience and Mental Health Institute, University of Alberta, Edmonton, AB, Canada
5
Stratified multi-density spectral clustering using Gaussian mixture model. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.03.067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2023]
6
Ariunzaya G, Baasanmunkh S, Choi HJ, Kavalan JCL, Chung S. A Multi-Considered Seed Coat Pattern Classification of Allium L. Using Unsupervised Machine Learning. Plants (Basel, Switzerland) 2022; 11:3097. [PMID: 36432826 PMCID: PMC9692843 DOI: 10.3390/plants11223097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 11/10/2022] [Accepted: 11/10/2022] [Indexed: 06/16/2023]
Abstract
The seed coat sculpture is one of the most important taxonomic distinguishing features. The objective of this study is to classify coat patterns of Allium L. seeds into new groups using scanning electron microscopy and unsupervised machine learning. Selected images of seed coat patterns from more than 100 Allium species described in the literature, together with data from our samples, were classified into seven types of anticlinal walls (irregular curved, irregular curved to nearly straight, straight, S, U, U to Ω, and Ω) and five types of periclinal walls (granule, small verrucae, large verrucae, marginal verrucae, and verrucate verrucae). We used five unsupervised machine learning approaches: K-means, K-means++, Minibatch K-means, Spectral, and Birch. The elbow and silhouette approaches were then used to determine the number of clusters required. Thereafter, we compared human- and machine-based results and proposed a new clustering. We then separated the data into six target clusters: SI, SS, SM, NS, PS, and PD. The proposed strongly identical grouping is distinct from the other groups in that the results are exactly the same, but PD is unrelated to the others. Thus, unsupervised machine learning has been shown to support the development of new groups in the Allium seed coat pattern.
Affiliation(s)
- Gantulga Ariunzaya: Department of Computer Engineering, Changwon National University, Changwon 51140, Republic of Korea
- Shukherdorj Baasanmunkh: Department of Biology and Chemistry, Changwon National University, Changwon 51140, Republic of Korea
- Hyeok Jae Choi: Department of Biology and Chemistry, Changwon National University, Changwon 51140, Republic of Korea
- Jonathan C. L. Kavalan: Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611, USA
- Sungwook Chung: Department of Computer Engineering, Changwon National University, Changwon 51140, Republic of Korea
7
Zhang D, Ma G, Deng Z, Wang Q, Zhang G, Zhou W. A self-adaptive gradient-based particle swarm optimization algorithm with dynamic population topology. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
8
Jain G, Mahara T, Sharma SC, Verma OP, Sharma T. Clustering-Based Recommendation System for Preliminary Disease Detection. International Journal of E-Health and Medical Communications 2022. [DOI: 10.4018/ijehmc.313191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
The catastrophic COVID-19 outbreak has threatened society and placed severe stress on healthcare systems worldwide. Different segments of society are contributing their best efforts to curb the spread of COVID-19. As a part of this contribution, this research proposes a clustering-based recommender system for early detection of COVID-19 based on the symptoms of an individual. For this, the suspected patient's symptoms are compared with those of patients who have already contracted COVID-19 by computing the similarity between symptoms. Based on this, the suspected person is classified into one of three risk categories: high, medium, or low. This is not a confirmed test but only a mechanism to alert the suspected patient. The accuracy of the algorithm is more than 85%.
Affiliation(s)
- Gourav Jain: Indian Institute of Technology, Roorkee, India
- Om Prakash Verma: Dr. B. R. Ambedkar National Institute of Technology, Jalandhar, India
9
10
A State-of-the-Art Vegetation Map for Jordan: A New Tool for Conservation in a Biodiverse Country. Conservation 2022. [DOI: 10.3390/conservation2010012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
In many countries, including Jordan, the updating of vegetation maps is required to aid in formulating development and management plans for agriculture, forest, and rangeland sectors. Remote sensing data contributes widely to vegetation mapping at different scales by providing multispectral information that can separate and identify different vegetation groups at reasonable accuracy and low cost. Here, we implemented state-of-the-art approaches to develop a vegetation map for Jordan, as an example of how such maps can be produced in regions of high vegetation complexity. Specifically, we used a reciprocal illumination technique that combines extensive ground data (640 vegetation inventory plots) and Sentinel-2 satellite images to produce a categorical vegetation map (scale 1:50,000). Supervised classification was used to translate the spectral characteristics into vegetation types, which were first delimited by the clustering analyses of species composition data from the plots. From the satellite image interpretation, two maps were created: an unsupervised land cover/land use map and a supervised map of present-day vegetation types, both consisting of 18 categories. These new maps should inform ecosystem management and conservation planning decisions in Jordan over the coming years.
11
Alfaro C, Gomez J, Moguerza JM, Castillo J, Martinez JI. Toward Accelerated Training of Parallel Support Vector Machines Based on Voronoi Diagrams. Entropy 2021; 23:e23121605. [PMID: 34945911 PMCID: PMC8700103 DOI: 10.3390/e23121605] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/23/2021] [Revised: 11/19/2021] [Accepted: 11/25/2021] [Indexed: 12/05/2022]
Abstract
Typical applications of wireless sensor networks (WSN), such as in Industry 4.0 and smart cities, involve acquiring and processing large amounts of data in federated systems. Important challenges arise for machine learning algorithms in this scenario, such as reducing energy consumption and minimizing data exchange between devices in different zones. This paper introduces a novel method for accelerated training of parallel Support Vector Machines (pSVMs), based on ensembles, tailored to these kinds of problems. To achieve this, the training set is split into several Voronoi regions. These regions are small enough to permit faster parallel training of SVMs, reducing computational payload. Results from experiments comparing the proposed method with a single SVM and a standard ensemble of SVMs demonstrate that this approach can provide comparable performance while limiting the number of regions required to solve classification tasks. These advantages facilitate the development of energy-efficient policies in WSN.
12
Clustering and Classification Based on Distributed Automatic Feature Engineering for Customer Segmentation. Symmetry (Basel) 2021. [DOI: 10.3390/sym13091557] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/16/2022]
Abstract
To beat the competition and obtain valuable information, decision-makers must conduct in-depth machine learning or data mining for data analytics. Traditionally, clustering and classification are two common methods used in data mining. In clustering, data are divided into groups according to similarity or common features. Classification, on the other hand, refers to building a model from given training data and then predicting the target class or label for the test data. In recent years, many researchers have focused on hybrids of clustering and classification. These techniques have achieved admirable results, but there is still room to improve performance, for example through distributed processing. Therefore, in this paper we propose clustering and classification based on distributed automatic feature engineering (AFE) for customer segmentation. In the proposed algorithm, AFE uses artificial bee colony (ABC) to select valuable features of the input data, and then RFM provides the basic data analytics. AFE first initializes the number of clusters k. The clustering methods k-means, Ward's method, and fuzzy c-means (FCM) are then applied to cluster the examples into different groups. Finally, an improved fuzzy decision tree classifies the target data and generates decision rules for explaining the detailed situations. AFE also determines the value of the split number in the improved fuzzy decision tree to increase classification accuracy. The proposed clustering and classification based on automatic feature engineering is distributed and runs on the Apache Spark platform. This paper addresses the problem of clustering and classification for machine learning. According to the results, the classification accuracy of the proposed approach outperforms that of other approaches. Moreover, we also provide useful strategies and decision rules from data analytics for decision-makers.
13
Muñoz-Rivas M, Bellot A, Montorio I, Ronzón-Tirado R, Redondo N. Profiles of Emotion Regulation and Post-Traumatic Stress Severity among Female Victims of Intimate Partner Violence. International Journal of Environmental Research and Public Health 2021; 18:ijerph18136865. [PMID: 34206787 PMCID: PMC8297086 DOI: 10.3390/ijerph18136865] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/27/2021] [Revised: 06/23/2021] [Accepted: 06/24/2021] [Indexed: 11/26/2022]
Abstract
Emotional dysregulation is a construct that has drawn substantial attention as a transdiagnostic contributing factor to the loss of health. Intimate partner violence (IPV) is a term used to describe physical, psychological, or sexual assault of a spouse or sexual partner. The aim of the study was to determine the variability of emotional dysregulation among women with different types of IPV revictimization and post-traumatic stress. The cross-sectional survey included 120 women attended by the Integrated Monitoring System of Gender Violence of Madrid, Spain, due to a gender violence complaint. The presence of post-traumatic stress disorder (DSM 5 criteria), emotional dysregulation (Emotional Processing Scale (EPS)), childhood trauma, and type of revictimization were evaluated. Cluster analysis found three profiles of emotional regulation: Emotionally Regulated, Avoidance/Non-Impoverished, and Emotional Overwhelm. The results showed that the Emotional Overwhelm group was characterized by a general dysregulation of emotional experiences and a greater intensity of post-traumatic stress symptoms. In addition, women who had suffered several episodes of IPV by different partners showed a pattern of emotional regulation that differed from that of the other victims and entailed greater psychopathology. The findings confirm that emotional dysregulation is a critical pathway to the decline in health among IPV victims.
14
Cluster Analysis and Model Comparison Using Smart Meter Data. Sensors 2021; 21:s21093157. [PMID: 34063197 PMCID: PMC8124309 DOI: 10.3390/s21093157] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/17/2021] [Revised: 04/22/2021] [Accepted: 04/27/2021] [Indexed: 11/17/2022]
Abstract
Load forecasting plays a crucial role in the world of smart grids. It governs many aspects of the smart grid and smart meter, such as demand response, asset management, investment, and future direction. This paper proposes time-series forecasting for short-term load prediction to unveil the load forecast benefits through different statistical and mathematical models, such as artificial neural networks, auto-regression, and ARIMA. It targets the problem of excessive computational load when dealing with time-series data. It also presents a business case that is used to analyze different clusters to find underlying factors of load consumption and predict the behavior of customers based on different parameters. On evaluating the accuracy of the prediction models, it is observed that ARIMA models with (P, D, Q) values of (1, 1, 1) were more accurate than those with other values.
15
Xu X, Ding S, Wang Y, Wang L, Jia W. A fast density peaks clustering algorithm with sparse search. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.11.050] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 11/24/2022]
16
He Y, Wu Y, Qin H, Huang JZ, Jin Y. Improved I-nice clustering algorithm based on density peaks mechanism. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.09.068] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/16/2022]
17
Calmon W, Albi M. Estimating the number of clusters in a ranking data context. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.09.056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
18
Gherbaoui R, Ouali M, Benamrane N. Generation of Gaussian sets for clustering methods assessment. Data Knowl Eng 2021. [DOI: 10.1016/j.datak.2021.101876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
19
Prasad M, Tripathi S, Dahal K. Unsupervised feature selection and cluster center initialization based arbitrary shaped clusters for intrusion detection. Comput Secur 2020. [DOI: 10.1016/j.cose.2020.102062] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 10/23/2022]
20
Xu X, Ding S, Wang L, Wang Y. A robust density peaks clustering algorithm with density-sensitive similarity. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106028] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 12/01/2022]
21
Boundary Matching and Interior Connectivity-Based Cluster Validity Analysis. Applied Sciences-Basel 2020. [DOI: 10.3390/app10041337] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
The evaluation of clustering results plays an important role in clustering analysis. However, the existing validity indices are limited in practice to a specific clustering algorithm, clustering parameter, and assumption. In this paper, we propose a novel validity index to solve the above problems based on two complementary measures: boundary points matching and interior points connectivity. Firstly, when any clustering algorithm is performed on a dataset, we extract all boundary points for the dataset and its partitioned clusters using a nonparametric metric, and the measure of boundary points matching is computed. Secondly, the interior points connectivity of both the dataset and all the partitioned clusters is measured. The proposed validity index can evaluate different clustering results on the dataset obtained from different clustering algorithms, which cannot be evaluated by the existing validity indices at all. Experimental results demonstrate that the proposed validity index can evaluate clustering results obtained by using an arbitrary clustering algorithm and find the optimal clustering parameters.
22
Azhar M, Huang JZ, Masud MA, Li MJ, Cui L. A hierarchical Gamma Mixture Model-based method for estimating the number of clusters in complex data. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2019.105891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
23
A Fast Method for Estimating the Number of Clusters Based on Score and the Minimum Distance of the Center Point. Information 2019. [DOI: 10.3390/info11010016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/16/2022]
Abstract
Clustering is widely used as an unsupervised learning technique. However, it is often necessary to enter the number of clusters manually, and the number of clusters has a great impact on the clustering result. Researchers have proposed some algorithms to determine the number of clusters, but these do not perform well on data sets with complex and scattered shapes. To solve these problems, this paper proposes using a Gaussian kernel density estimation function to determine the maximum number of clusters, using the change in the center point score to obtain the candidate set of center points, and then using the change in the minimum distance between center points to obtain the number of clusters. The experiments show the validity and practicability of the proposed algorithm.
24
Aruna Kumar S, Harish B, Mahanand B, Sundararajan N. An efficient Meta-cognitive Fuzzy C-Means clustering approach. Appl Soft Comput 2019. [DOI: 10.1016/j.asoc.2019.105838] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
25