1
Sonomoto K, Fujino Y, Tanaka H, Nagayasu A, Nakayamada S, Tanaka Y. A Machine Learning Approach for Prediction of CDAI Remission with TNF Inhibitors: A Concept of Precision Medicine from the FIRST Registry. Rheumatol Ther 2024:10.1007/s40744-024-00668-z. [PMID: 38637465 DOI: 10.1007/s40744-024-00668-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Accepted: 03/18/2024] [Indexed: 04/20/2024] Open
Abstract
INTRODUCTION This study aimed to develop low-cost machine learning models that predict the achievement of Clinical Disease Activity Index (CDAI) remission 6 months after initiation of tumor necrosis factor inhibitors (TNFi) as primary biologic/targeted synthetic disease-modifying antirheumatic drugs (b/tsDMARDs) for rheumatoid arthritis (RA). METHODS Data of patients with RA initiating TNFi as first b/tsDMARD after unsuccessful methotrexate treatment were collected from the FIRST registry (August 2003 to October 2022). Baseline characteristics and 6-month CDAI were collected. The analysis used various machine learning approaches including logistic regression with stepwise variable selection, decision tree, support vector machine, and lasso logistic regression (Lasso), with 48 factors accessible in routine clinical practice for the prediction model. Robustness was ensured by k-fold cross validation. RESULTS Among the approaches tested, Lasso showed advantages in predicting CDAI remission, with a mean area under the curve of 0.704, sensitivity of 61.7%, and specificity of 69.9%. Predicted TNFi responders achieved CDAI remission at an average rate of 53.2%, while only 26.4% of predicted TNFi non-responders achieved remission. Encouragingly, the models generated relied solely on patient-reported outcomes and quantitative parameters, excluding subjective physician input. CONCLUSIONS While external cohort validation is warranted for broader applicability, this study highlights the potential for a low-cost predictive model to predict CDAI remission following TNFi treatment. The study's approach, which uses only baseline data and 6-month CDAI measures, suggests the feasibility of establishing regional cohorts to generate low-cost models tailored to specific regions or institutions. This may facilitate the application of regional/in-house precision medicine strategies in RA management.
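The evaluation metrics named in this abstract (area under the curve, sensitivity, specificity) can be illustrated with a minimal sketch. The function names, the toy labels and scores, and the 0.5 decision threshold are illustrative assumptions; none of them come from the study or the FIRST registry.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = remission at 6 months, 0 = no remission.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(sens, spec, area)
```

In a k-fold setting, these quantities would be computed on each held-out fold and then averaged, which is presumably what the reported mean values reflect.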
Affiliation(s)
- Koshiro Sonomoto
- Department of Clinical Nursing, School of Health Sciences, University of Occupational and Environmental Health, Japan, 1-1, Iseigaoka, Yahatanishi-ku, Kitakyushu, 807-8555, Japan
- The First Department of Internal Medicine, School of Medicine, University of Occupational and Environmental Health, Japan, 1-1, Iseigaoka, Yahatanishi-ku, Kitakyushu, 807-8555, Japan
- Yoshihisa Fujino
- Department of Environmental Epidemiology, Institute of Industrial Ecological Sciences, University of Occupational and Environmental Health, Japan, 1-1, Iseigaoka, Yahatanishi-ku, Kitakyushu, 807-8555, Japan
- Hiroaki Tanaka
- The First Department of Internal Medicine, School of Medicine, University of Occupational and Environmental Health, Japan, 1-1, Iseigaoka, Yahatanishi-ku, Kitakyushu, 807-8555, Japan
- Atsushi Nagayasu
- The First Department of Internal Medicine, School of Medicine, University of Occupational and Environmental Health, Japan, 1-1, Iseigaoka, Yahatanishi-ku, Kitakyushu, 807-8555, Japan
- Shingo Nakayamada
- The First Department of Internal Medicine, School of Medicine, University of Occupational and Environmental Health, Japan, 1-1, Iseigaoka, Yahatanishi-ku, Kitakyushu, 807-8555, Japan
- Yoshiya Tanaka
- The First Department of Internal Medicine, School of Medicine, University of Occupational and Environmental Health, Japan, 1-1, Iseigaoka, Yahatanishi-ku, Kitakyushu, 807-8555, Japan.
2
Struber L, Baumont M, Barraud PA, Nougier V, Cignetti F. Brain oscillatory correlates of visuomotor adaptive learning. Neuroimage 2021; 245:118645. [PMID: 34687861 DOI: 10.1016/j.neuroimage.2021.118645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2021] [Revised: 10/06/2021] [Accepted: 10/10/2021] [Indexed: 11/24/2022] Open
Abstract
Sensorimotor adaptation involves the recalibration of the mapping between motor command and sensory feedback in response to movement errors. Although adaptation operates within individual movements on a trial-to-trial basis, it can also undergo learning when adaptive responses improve over the course of many trials. Brain oscillatory activities related to these "adaptation" and "learning" processes remain unclear. The main reason for this is that previous studies principally focused on the beta band, which confined the outcome message to trial-to-trial adaptation. To provide a wider understanding of adaptive learning, we decoded visuomotor tasks with constant, random or no perturbation from EEG recordings in different bandwidths and brain regions using a multiple kernel learning approach. These different experimental tasks were intended to separate trial-to-trial adaptation from the formation of the new visuomotor mapping across trials. We found changes in EEG power in the post-movement period during the course of the visuomotor-constant rotation task, in particular an increased (i) theta power in prefrontal region, (ii) beta power in supplementary motor area, and (iii) gamma power in motor regions. Classifying the visuomotor task with constant rotation versus those with random or no rotation, we were able to relate power changes in beta band mainly to trial-to-trial adaptation to error while changes in theta band would relate rather to the learning of the new mapping. Altogether, this suggested that there is a tight relationship between modulation of the synchronization of low (theta) and higher (essentially beta) frequency oscillations in prefrontal and sensorimotor regions, respectively, and adaptive learning.
Affiliation(s)
- Lucas Struber
- Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France.
- Marie Baumont
- Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
- Pierre-Alain Barraud
- Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
- Vincent Nougier
- Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
- Fabien Cignetti
- Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
3
Amano Y, Honda H, Sawada R, Nukada Y, Yamane M, Ikeda N, Morita O, Yamanishi Y. In silico systems for predicting chemical-induced side effects using known and potential chemical protein interactions, enabling mechanism estimation. J Toxicol Sci 2020; 45:137-149. [PMID: 32147637 DOI: 10.2131/jts.45.137] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/02/2022]
Abstract
In silico models for predicting chemical-induced side effects have become increasingly important for the development of pharmaceuticals and functional food products. However, existing predictive models have difficulty in estimating the mechanisms of side effects in terms of molecular targets, or they do not cover the wide range of pharmacological targets. In the present study, we constructed novel in silico models to predict chemical-induced side effects and estimate the underlying mechanisms with high general versatility by integrating the comprehensive prediction of potential chemical-protein interactions (CPIs) with machine learning. First, the potential CPIs were comprehensively estimated by chemometrics based on the known CPI data (1,179,848 interactions involving 3,905 proteins and 824,143 chemicals). Second, the predictive models for 61 side effects in the cardiovascular system (CVS), gastrointestinal system (GIS), and central nervous system (CNS) were constructed by sparsity-induced classifiers based on the known and potential CPI data. The cross validation experiments showed that the proposed CPI-based models performed comparably to or better than the traditional chemical structure-based models. Moreover, our enrichment analysis indicated that the highly weighted proteins derived from predictive models could be involved in the corresponding functions of the side effects. For example, in CVS, the carcinogenesis-related pathways (e.g., prostate cancer, PI3K-Akt signal pathway), which were recently reported to be involved in cardiovascular side effects, were enriched. Therefore, our predictive models are biologically valid and would be useful for predicting side effects and novel potential underlying mechanisms of chemical-induced side effects.
Affiliation(s)
- Yuto Amano
- R&D Safety Science Research, Kao Corporation
- Ryusuke Sawada
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology
- Yuko Nukada
- R&D Safety Science Research, Kao Corporation
4
Kanda Y, Fujii H, Oguchi T. Sparse modeling of chemical bonding in binary compounds. Sci Technol Adv Mater 2019; 20:1178-1188. [PMID: 32082439 PMCID: PMC7006824 DOI: 10.1080/14686996.2019.1697858] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/25/2019] [Revised: 10/24/2019] [Accepted: 11/23/2019] [Indexed: 06/10/2023]
Abstract
A sparse model for quantifying energy difference between zinc-blende and rock-salt crystal structures in octet elemental and binary materials is constructed by using the linearly independent descriptor-generation method and exhaustive search, following the previous work by Ghiringhelli et al. [Phys Rev Lett. 2015;114:105503]. The obtained simplest model includes only atomic radius information of constituent atoms and its physical meaning is interpreted in relation to van Arkel-Ketelaar's triangle for classifying chemical bonding in binary compounds.
Affiliation(s)
- Yosuke Kanda
- Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan
- Hitoshi Fujii
- MaDIS-CMI2, National Institute for Materials Science, Tsukuba, Japan
- Tamio Oguchi
- Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan
- MaDIS-CMI2, National Institute for Materials Science, Tsukuba, Japan
5
Tajimi T, Wakui N, Yanagisawa K, Yoshikawa Y, Ohue M, Akiyama Y. Computational prediction of plasma protein binding of cyclic peptides from small molecule experimental data using sparse modeling techniques. BMC Bioinformatics 2018; 19:527. [PMID: 30598072 PMCID: PMC6311893 DOI: 10.1186/s12859-018-2529-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Cyclic peptide-based drug discovery is attracting increasing interest owing to its potential to avoid target protein depletion. In drug discovery, it is important to maintain the biostability of a drug within the proper range. Plasma protein binding (PPB) is the most important index of biostability, and developing a computational method to predict PPB of drug candidate compounds contributes to the acceleration of drug discovery research. PPB prediction of small molecule drug compounds using machine learning has been conducted thus far; however, no study has investigated cyclic peptides because experimental information on cyclic peptides is scarce. RESULTS First, we adopted sparse modeling and small molecule information to construct a PPB prediction model for cyclic peptides. As cyclic peptide data are limited, applying multidimensional nonlinear models involves concerns regarding overfitting. However, models constructed by sparse modeling can avoid overfitting, offering high generalization performance and interpretability. More than 1000 PPB data of small molecules are available, and we used them to construct prediction models with two enumeration methods: enumerating lasso solutions (ELS) and forward beam search (FBS). The accuracies of the prediction models constructed by ELS and FBS were equal to or better than those of conventional non-linear models (MAE = 0.167-0.174) on cross-validation of a small molecule compound dataset. Moreover, we showed that the prediction accuracies for cyclic peptides were close to those for small molecule compounds (MAE = 0.194-0.288). Such high accuracy could not be obtained by a simple method of learning from cyclic peptide data directly by lasso regression (MAE = 0.286-0.671) or ridge regression (MAE = 0.244-0.354). CONCLUSION In this study, we proposed a machine learning technique that uses low-dimensional sparse modeling to predict the PPB value of cyclic peptides computationally. The low-dimensional sparse model not only exhibits excellent generalization performance but also improves interpretation of the prediction model. This can provide common and noteworthy knowledge for future cyclic peptide drug discovery studies.
Affiliation(s)
- Takashi Tajimi
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, 2-12-1 W8-76 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
- Naoki Wakui
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, 2-12-1 W8-76 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan.,Middle Molecule IT-based Drug Discovery Laboratory (MIDL), Tokyo Institute of Technology, RGBT2-A-1C 3-25-10 Tonomachi, Kawasaki-ku, Kawasaki city, Kanagawa, 210-0821, Japan
- Keisuke Yanagisawa
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, 2-12-1 W8-76 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
- Yasushi Yoshikawa
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, 2-12-1 W8-76 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan.,Middle Molecule IT-based Drug Discovery Laboratory (MIDL), Tokyo Institute of Technology, RGBT2-A-1C 3-25-10 Tonomachi, Kawasaki-ku, Kawasaki city, Kanagawa, 210-0821, Japan
- Masahito Ohue
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, 2-12-1 W8-76 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan.,Middle Molecule IT-based Drug Discovery Laboratory (MIDL), Tokyo Institute of Technology, RGBT2-A-1C 3-25-10 Tonomachi, Kawasaki-ku, Kawasaki city, Kanagawa, 210-0821, Japan
- Yutaka Akiyama
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, 2-12-1 W8-76 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan.,Middle Molecule IT-based Drug Discovery Laboratory (MIDL), Tokyo Institute of Technology, RGBT2-A-1C 3-25-10 Tonomachi, Kawasaki-ku, Kawasaki city, Kanagawa, 210-0821, Japan.,Molecular Profiling Research Center for Drug Discovery (molprof), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan.
6
Abstract
Elucidating neural dynamics is one of the important subjects in neuroscience. To elucidate the nonlinear dynamics of single neurons, it is important to extract nonlinear membrane currents from many types of membrane current candidates. In this study, we propose a sparse modeling method for estimating a conductance-based neuron model from observed data, by extracting necessary membrane currents from multiple candidates. We show using simulated data that our proposed sparse modeling approach, with different sparsity levels for distinct membrane currents, extracts only the necessary membrane currents from the candidates more accurately than the least-squares method and a sparse method with a uniform sparsity level.
Affiliation(s)
- Shinya Otsuka
- Department of Electrical and Electronic Engineering, Graduate School of Engineering, Kobe University, Japan
- Toshiaki Omori
- Department of Electrical and Electronic Engineering, Graduate School of Engineering, Kobe University, Japan.
7
Abstract
Identification of drug-target interactions is a crucial process in drug discovery. In this chapter, we present protocols for recent advancements in machine learning methods for predicting drug-target interactions from heterogeneous biological data in a chemogenomic framework, in which prediction is based on the chemical structure data of drug candidate compounds and translated genomic sequence data of target candidate proteins. Most existing methods are based on either linear modeling or kernel modeling. To illustrate linear modeling, we introduce sparsity-induced binary classifiers and sparse canonical correlation analysis. To illustrate kernel modeling, we introduce pairwise kernel-based support vector machines and kernel-based distance learning. Workflows for using these techniques are presented. We also discuss the characteristics of each method and suggest some directions for future research.
Affiliation(s)
- Yoshihiro Yamanishi
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan.
- PRESTO, Japan Science and Technology Agency, Kawaguchi, Saitama, Japan.
8
Abstract
Most drugs produce their phenotypic effects by interacting with target proteins, and understanding the molecular features that underpin drug-target interactions is crucial when designing a novel drug. In this chapter, we introduce the protocols that have driven recent advances in sparse modeling methods for analyzing drug-target interaction networks within a chemogenomic framework. In this approach, the chemical structures of candidate drug compounds are correlated with the genomic sequences of the candidate target proteins. We demonstrate the use of sparse canonical correspondence analysis and sparsity-induced binary classifiers to extract the underlying molecular features that are most strongly involved in drug-target interactions. We focus on drug chemical substructures and protein domains. Workflows for applying these methods are presented, and an application is described in detail. We consider the characteristics of each method and suggest possible directions for future research.
9
Abstract
Identifying disease-associated taxa is an important task in metagenomics. To date, many methods have been proposed for feature selection and prediction. However, these methods either use univariate (generalized) regression approaches to obtain the corresponding P-values without considering the interactions among taxa, or use lasso or L0 type sparse modeling approaches to identify the taxa with the best predictions without providing P-values. To the best of our knowledge, there are no available methods that consider taxon interactions and also generate P-values. In this paper, we propose a treatment-effect model for identifying taxa (STEMIT) and performing statistical inference with high-dimensional metagenomic data. STEMIT will provide a P-value for a taxon through a two-step treatment-effect maximization. It will provide causal inference if the study is a clinical trial. We first identify taxa associated with the treatment-effect variable and the targeting feature with sparse modeling, and then estimate the P-value of the targeting gene with ordinary least squares (OLS) regression. We demonstrate that the proposed method is efficient and can identify biologically important taxa with a real metagenomic data set. The software for L0 sparse modeling can be downloaded at https://cran.r-project.org/web/packages/l0ara/ .
Affiliation(s)
- Zhenqiu Liu
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Shili Lin
- Department of Statistics, The Ohio State University, Columbus, OH, USA
10
Liu Z, Sun F, McGovern DP. Sparse generalized linear model with L0 approximation for feature selection and prediction with big omics data. BioData Min 2017; 10:39. [PMID: 29270229 PMCID: PMC5735537 DOI: 10.1186/s13040-017-0159-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 04/13/2017] [Accepted: 12/04/2017] [Indexed: 11/10/2022] Open
Abstract
Background Feature selection and prediction are the most important tasks for big data mining. The common strategies for feature selection in big data mining are L1, SCAD and MC+. However, none of the existing algorithms optimizes L0, which penalizes the number of nonzero features directly. Results In this paper, we develop a novel sparse generalized linear model (GLM) with L0 approximation for feature selection and prediction with big omics data. The proposed approach approximates the L0 optimization directly. Even though the original L0 problem is non-convex, the problem is approximated by sequential convex optimizations with the proposed algorithm. The proposed method is easy to implement with only several lines of code. Novel adaptive ridge algorithms (L0ADRIDGE) for L0 penalized GLM with ultra high dimensional big data are developed. The proposed approach outperforms the other cutting-edge regularization methods including SCAD and MC+ in simulations. When it is applied to integrated analysis of mRNA, microRNA, and methylation data from TCGA ovarian cancer, multilevel gene signatures associated with suboptimal debulking are identified simultaneously. The biological significance and potential clinical importance of those genes are further explored. Conclusions The developed software L0ADRIDGE in MATLAB is available at https://github.com/liuzqx/L0adridge. Electronic supplementary material The online version of this article (doi:10.1186/s13040-017-0159-z) contains supplementary material, which is available to authorized users.
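The idea of approximating a non-convex L0 penalty by a sequence of convex (ridge) problems can be sketched for the Gaussian linear case. This is a generic iteratively reweighted ridge scheme written under stated assumptions, not the authors' L0ADRIDGE code: the function name, the weight update `1 / (beta**2 + delta**2)`, the defaults for `lam` and `delta`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def l0_adaptive_ridge(X, y, lam=1.0, delta=1e-5, n_iter=100, tol=1e-8):
    """Approximate L0-penalized least squares by reweighted ridge steps.

    Each iteration solves a convex ridge problem whose per-coefficient
    weights make w_j * beta_j**2 behave like an indicator of beta_j != 0.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    weights = np.ones(X.shape[1])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Convex subproblem: weighted ridge regression in closed form.
        beta_new = np.linalg.solve(XtX + lam * np.diag(weights), Xty)
        # Small coefficients receive large weights and shrink further.
        weights = 1.0 / (beta_new ** 2 + delta ** 2)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    beta[np.abs(beta) < 1e-6] = 0.0  # report numerically tiny values as zero
    return beta


# Simulated example (assumed setup): only features 0 and 3 are relevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=100)

beta_hat = l0_adaptive_ridge(X, y)
print(np.flatnonzero(beta_hat))
```

Extending the sketch to other GLMs would replace the closed-form ridge solve with a penalized iteratively reweighted least-squares step; that extension is not shown here.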
Affiliation(s)
- Zhenqiu Liu
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, 90048 CA USA
- Fengzhu Sun
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, 90089 CA USA
- Dermot P McGovern
- Foundation Inflammatory Bowel & Immunobiology Research Institute, Cedars-Sinai Medical Center, Los Angeles, 90048 CA USA
11
Abstract
The appearance of massive data has become increasingly common in contemporary scientific research. When sample size n is huge, classical learning methods become computationally costly for the regression purpose. Recently, the orthogonal greedy algorithm (OGA) has been revitalized as an efficient alternative in the context of kernel-based statistical learning. In a learning problem, accurate and fast prediction is often of interest. This makes an appropriate termination crucial for OGA. In this paper, we propose a new termination rule for OGA via investigating its predictive performance. The proposed rule is conceptually simple and convenient for implementation, which suggests an [Formula: see text] number of essential updates in an OGA process. It therefore provides an appealing route to conduct efficient learning for massive data. With a sample dependent kernel dictionary, we show that the proposed method is strongly consistent with an [Formula: see text] convergence rate to the oracle prediction. The promising performance of the method is supported by both simulation and real data examples.
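For orientation, a basic orthogonal greedy algorithm can be sketched as follows. The termination rule proposed in the paper is not recoverable from this abstract (its formula is elided), so the sketch simply stops after a fixed, user-supplied number of steps; `n_steps`, the function name, and the random dictionary are assumptions for illustration only.

```python
import numpy as np


def orthogonal_greedy(D, y, n_steps):
    """Select columns of the dictionary D greedily, refitting at each step.

    Returns the selected column indices (in order of selection) and the
    least-squares coefficients on those columns.
    """
    norms = np.linalg.norm(D, axis=0)
    residual = y.astype(float).copy()
    selected = []
    coef = np.zeros(0)
    for _ in range(n_steps):
        # Greedy step: column most correlated with the current residual.
        scores = np.abs(D.T @ residual) / norms
        if selected:
            scores[selected] = -np.inf  # never pick a column twice
        selected.append(int(np.argmax(scores)))
        # Orthogonal step: least-squares refit on all selected columns.
        coef, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)
        residual = y - D[:, selected] @ coef
    return selected, coef


# Hypothetical noiseless example: y depends on columns 4 and 11 only.
rng = np.random.default_rng(1)
D = rng.normal(size=(200, 20))
y = 2.0 * D[:, 4] - 1.5 * D[:, 11]

selected, coef = orthogonal_greedy(D, y, n_steps=2)
fitted = dict(zip(selected, coef))
print(sorted(fitted))
```

Because every step refits all selected columns, the residual stays orthogonal to the chosen set, which is what distinguishes this scheme from a pure (non-orthogonal) greedy update.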
Affiliation(s)
- Chen Xu
- The Pennsylvania State University
- Runze Li
- The Pennsylvania State University
12
Lin D, Cao H, Calhoun VD, Wang YP. Sparse models for correlative and integrative analysis of imaging and genetic data. J Neurosci Methods 2014; 237:69-78. [PMID: 25218561 DOI: 10.1016/j.jneumeth.2014.09.001] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 05/05/2014] [Revised: 08/27/2014] [Accepted: 09/01/2014] [Indexed: 11/29/2022]
Abstract
The development of advanced medical imaging technologies and high-throughput genomic measurements has enhanced our ability to understand their interplay as well as their relationship with human behavior by integrating these two types of datasets. However, the high dimensionality and heterogeneity of these datasets present a challenge to conventional statistical methods; there is a high demand for the development of both correlative and integrative analysis approaches. Here, we review our recent work on developing sparse representation based approaches to address this challenge. We show how sparse models are applied to the correlation and integration of imaging and genetic data for biomarker identification. We present examples of how these approaches are used for the detection of risk genes and classification of complex diseases such as schizophrenia. Finally, we discuss future directions on the integration of multiple imaging and genomic datasets including their interactions such as epistasis.
Affiliation(s)
- Dongdong Lin
- Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA; Center of Genomics and Bioinformatics, Tulane University, New Orleans, LA, 70112, USA.
- Hongbao Cao
- Unit on Statistical Genomics, Intramural Program of Research, National Institute of Mental Health, NIH, Bethesda 20852, USA.
- Vince D Calhoun
- The Mind Research Network & LBERI, Albuquerque, NM 87106, USA; Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131, USA.
- Yu-Ping Wang
- Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA; Center of Genomics and Bioinformatics, Tulane University, New Orleans, LA, 70112, USA.
13
Malsiner-Walli G, Frühwirth-Schnatter S, Grün B. Model-based clustering based on sparse finite Gaussian mixtures. Stat Comput 2014; 26:303-324. [PMID: 26900266 PMCID: PMC4750551 DOI: 10.1007/s11222-014-9500-2] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 06/26/2013] [Accepted: 07/30/2014] [Indexed: 06/05/2023]
Abstract
In the framework of Bayesian model-based clustering based on a finite mixture of Gaussian distributions, we present a joint approach to estimate the number of mixture components and identify cluster-relevant variables simultaneously as well as to obtain an identified model. Our approach consists in specifying sparse hierarchical priors on the mixture weights and component means. In a deliberately overfitting mixture model the sparse prior on the weights empties superfluous components during MCMC. A straightforward estimator for the true number of components is given by the most frequent number of non-empty components visited during MCMC sampling. Specifying a shrinkage prior, namely the normal gamma prior, on the component means leads to improved parameter estimates as well as identification of cluster-relevant variables. After estimating the mixture model using MCMC methods based on data augmentation and Gibbs sampling, an identified model is obtained by relabeling the MCMC output in the point process representation of the draws. This is performed using [Formula: see text]-centroids cluster analysis based on the Mahalanobis distance. We evaluate our proposed strategy in a simulation setup with artificial data and by applying it to benchmark data sets.
Affiliation(s)
- Bettina Grün
- Institut für Angewandte Statistik, Johannes Kepler Universität Linz, Linz, Austria