1
Gao CX, Dwyer D, Zhu Y, Smith CL, Du L, Filia KM, Bayer J, Menssink JM, Wang T, Bergmeir C, Wood S, Cotton SM. An overview of clustering methods with guidelines for application in mental health research. Psychiatry Res 2023; 327:115265. [PMID: 37348404 DOI: 10.1016/j.psychres.2023.115265] [Received: 12/15/2022] [Revised: 05/20/2023] [Accepted: 05/21/2023] [Indexed: 06/24/2023]
Abstract
Cluster analyses have been widely used in mental health research to decompose inter-individual heterogeneity by identifying more homogeneous subgroups of individuals. However, despite advances in new algorithms and their increasing popularity, there is little guidance on model choice, analytical framework and reporting requirements. In this paper, we aimed to address this gap by introducing the philosophy, design, advantages/disadvantages and implementation of major algorithms that are particularly relevant in mental health research. Extensions of basic models, such as kernel methods, deep learning, semi-supervised clustering, and clustering ensembles, are subsequently introduced. We then discuss how to choose algorithms to address common issues, as well as methods for pre-clustering data processing, clustering evaluation and validation. Importantly, we also provide general guidance on clustering workflow and reporting requirements. To facilitate the implementation of different algorithms, we provide information on R functions and libraries.
Affiliation(s)
- Caroline X Gao
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia; Department of Epidemiology and Preventative Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Dominic Dwyer
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Ye Zhu
  - School of Information Technology, Deakin University, Geelong, VIC, Australia
- Catherine L Smith
  - Department of Epidemiology and Preventative Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Lan Du
  - Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Kate M Filia
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Johanna Bayer
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Jana M Menssink
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Teresa Wang
  - Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Christoph Bergmeir
  - Faculty of Information Technology, Monash University, Clayton, VIC, Australia; Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
- Stephen Wood
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Sue M Cotton
  - Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
2
Abstract
Cognitive diagnosis models (CDMs) have increasingly been applied in education and other fields. This article provides an overview of a widely used CDM, namely the G-DINA model, and demonstrates a hands-on example of using multiple R packages for a series of CDM analyses. The overview gives a step-by-step illustration and explanation of Q-matrix evaluation, CDM calibration, model fit evaluation, item diagnosticity investigation, classification reliability examination, and result presentation and visualization. Some limitations of conducting CDM analysis in R are also discussed.
3
Abstract
Sparse estimation through regularization is gaining popularity in psychological research. Such techniques penalize the complexity of the model and can perform variable/path selection automatically, and thus are particularly useful in models that have large parameter-to-sample-size ratios. This paper gives a detailed tutorial on the R package regsem, which implements regularization for structural equation models. Example R code is also provided to highlight the key arguments for implementing regularized structural equation models in this package. The tutorial ends by discussing remedies for some known drawbacks of a popular type of regularization, computational methods supported by the package that can improve the selection result, and some other practical issues such as dealing with missing data and categorical variables.