Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Cited by Other Article(s)

Liou JW, Liou M, Cheng PE. Modeling Categorical Variables by Mutual Information Decomposition. Entropy (Basel, Switzerland) 2023;25:e25050750. [PMID: 37238505 DOI: 10.3390/e25050750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 04/24/2023] [Accepted: 04/26/2023] [Indexed: 05/28/2023]

Di X, Yin Y, Fu Y, Mo Z, Lo SH, DiGuiseppi C, Eby DW, Hill L, Mielenz TJ, Strogatz D, Kim M, Li G. Detecting mild cognitive impairment and dementia in older adults using naturalistic driving data and interaction-based classification from influence score. Artif Intell Med 2023;138:102510. [PMID: 36990588 DOI: 10.1016/j.artmed.2023.102510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Revised: 02/04/2023] [Accepted: 02/09/2023] [Indexed: 02/22/2023]

Abstract

Several recent studies indicate that atypical changes in driving behaviors appear to be early signs of mild cognitive impairment (MCI) and dementia. These studies, however, are limited by small sample sizes and short follow-up duration. This study aims to develop an interaction-based classification method building on a statistic named Influence Score (i.e., I-score) for prediction of MCI and dementia using naturalistic driving data collected from the Longitudinal Research on Aging Drivers (LongROAD) project. Naturalistic driving trajectories were collected through in-vehicle recording devices for up to 44 months from 2977 participants who were cognitively intact at the time of enrollment. These data were further processed and aggregated to generate 31 time-series driving variables. Because of high dimensional time-series features for driving variables, we used I-score for variable selection. I-score is a measure to evaluate variables' ability to predict and is proven to be effective in differentiating between noisy and predictive variables in big data. It is introduced here to select influential variable modules or groups that account for compound interactions among explanatory variables. It is explainable regarding to what extent variables and their interactions contribute to the predictiveness of a classifier. In addition, I-score boosts the performance of classifiers over imbalanced datasets due to its association with the F1 score. Using predictive variables selected by I-score, interaction-based residual blocks are constructed over top I-score modules to generate predictors and ensemble learning aggregates these predictors to boost the prediction of the overall classifier. Experiments using naturalistic driving data show that our proposed classification method achieves the best accuracy (96%) for predicting MCI and dementia, followed by random forest (93%) and logistic regression (88%). 
In terms of F1 score and AUC, our proposed classifier achieves 98% and 87%, respectively, followed by random forest (with an F1 score of 96% and an AUC of 79%) and logistic regression (with an F1 score of 92% and an AUC of 77%). The results indicate that incorporating I-score into machine learning algorithms could considerably improve the model performance for predicting MCI and dementia in older drivers. We also performed the feature importance analysis and found that the right to left turn ratio and the number of hard braking events are the most important driving variables to predict MCI and dementia.
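
As a rough illustration of how an influence score can separate a predictive interaction from its uninformative marginals, the sketch below uses the partition-based form commonly given in the I-score literature: the sum, over cells formed by the joint values of a variable subset, of n_j^2 (mean_j - overall mean)^2. The abstract does not state the formula, so the formula, the division by n, the function name `i_score`, and the toy XOR data are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def i_score(rows, y, subset):
    """Partition-based influence score for the discrete variables in `subset`.

    rows   : list of tuples of discrete feature values
    y      : list of numeric responses, same length as rows
    subset : indices of the features whose joint values define the cells
    The division by n is an assumed normalization; published variants differ.
    """
    n = len(y)
    y_bar = sum(y) / n
    cells = defaultdict(list)
    for row, yi in zip(rows, y):
        cells[tuple(row[k] for k in subset)].append(yi)
    total = sum(len(v) ** 2 * (sum(v) / len(v) - y_bar) ** 2 for v in cells.values())
    return total / n


# Toy data (hypothetical): the response is the XOR of two binary variables,
# so neither variable is informative alone but the pair is fully predictive.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
y = [a ^ b for a, b in rows]

print(i_score(rows, y, (0,)))    # 0.0
print(i_score(rows, y, (1,)))    # 0.0
print(i_score(rows, y, (0, 1)))  # 1.25
```

Each single variable scores zero because its cell means equal the overall mean, while the pair scores high, which is the kind of compound interaction the abstract says I-score is meant to surface.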

Chu X, Jiang M, Liu ZJ. Biomarker interaction selection and disease detection based on multivariate gain ratio. BMC Bioinformatics 2022;23:176. [PMID: 35550010 PMCID: PMC9103137 DOI: 10.1186/s12859-022-04699-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Accepted: 04/14/2022] [Indexed: 11/30/2022]

Abstract

Background

Disease detection is an important aspect of biotherapy. With the development of biotechnology and computer technology, there are many methods to detect disease based on single biomarker. However, biomarker does not influence disease alone in some cases. It’s the interaction between biomarkers that determines disease status. The existing influence measure I-score is used to evaluate the importance of interaction in determining disease status, but there is a deviation about the number of variables in interaction when applying I-score. To solve the problem, we propose a new influence measure Multivariate Gain Ratio (MGR) based on Gain Ratio (GR) of single-variate, which provides us with multivariate combination called interaction.
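
For orientation, the single-variable Gain Ratio that MGR builds on is, in its standard form, information gain divided by split information. The sketch below applies that standard definition to the joint values of a variable subset, which is one naive way to score an interaction. The abstract does not define MGR, so this is an assumption and not the authors' measure; the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of hashable labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(rows, y, subset):
    """Standard gain ratio, computed on the joint values of `subset`.

    Treating the joint value as a single categorical variable is an
    illustrative assumption, not the MGR definition from the paper.
    """
    keys = [tuple(row[k] for k in subset) for row in rows]
    n = len(y)
    conditional = 0.0
    for key, count in Counter(keys).items():
        ys = [yi for k, yi in zip(keys, y) if k == key]
        conditional += count / n * entropy(ys)
    info_gain = entropy(y) - conditional
    split_info = entropy(keys)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): disease status is the XOR of two binary biomarkers.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
y = [a ^ b for a, b in rows]

print(gain_ratio(rows, y, (0,)))    # 0.0
print(gain_ratio(rows, y, (0, 1)))  # 0.5
```

Dividing by split information penalizes partitions with many cells, which is why the pair scores 0.5 here even though it predicts the label perfectly.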

Results

We propose a preprocessing verification algorithm based on partial predictor variables to select an appropriate preprocessing method. In this paper, an algorithm for selecting key interactions of biomarkers and applying key interactions to construct a disease detection model is provided. MGR is more credible than I-score in the case of interaction containing small number of variables. Our method behaves better with average accuracy 93.13% than I-score of 91.73% in Breast Cancer Wisconsin (Diagnostic) Dataset. Compared to the classification results 89.80% based on all predictor variables, MGR identifies the true main biomarkers and realizes the dimension reduction. In Leukemia Dataset, the experiment results show the effectiveness of MGR with the accuracy of 97.32% compared to I-score with accuracy 89.11%. The results can be explained by the nature of MGR and I-score mentioned above because every key interaction contains a small number of variables in Leukemia Dataset.

Conclusions

MGR is effective for selecting important biomarkers and biomarker interactions even in high-dimension feature space in which the interaction could contain more than two biomarkers. The prediction ability of interactions selected by MGR is better than I-score in the case of interaction containing small number of variables. MGR is generally applicable to various types of biomarker datasets including cell nuclei, gene, SNPs and protein datasets.

Epistasis Detection via the Joint Cumulant. Statistics in Biosciences 2022. [DOI: 10.1007/s12561-022-09336-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]

Language Semantics Interpretation with an Interaction-Based Recurrent Neural Network. Machine Learning and Knowledge Extraction 2021. [DOI: 10.3390/make3040046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

An Interaction-Based Convolutional Neural Network (ICNN) Toward a Better Understanding of COVID-19 X-ray Images. Algorithms 2021. [DOI: 10.3390/a14110337] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022]

Abstract

The field of explainable artificial intelligence (XAI) aims to build explainable and interpretable machine learning (or deep learning) methods without sacrificing prediction performance. Convolutional neural networks (CNNs) have been successful in making predictions, especially in image classification. These popular and well-documented successes use extremely deep CNNs such as VGG16, DenseNet121, and Xception. However, these well-known deep learning models use tens of millions of parameters based on a large number of pretrained filters that have been repurposed from previous data sets. Among these identified filters, a large portion contain no information yet remain as input features. Thus far, there is no effective method to omit these noisy features from a data set, and their existence negatively impacts prediction performance. In this paper, a novel interaction-based convolutional neural network (ICNN) is introduced that does not make assumptions about the relevance of local information. Instead, a model-free influence score (I-score) is proposed to directly extract the influential information from images to form important variable modules. This innovative technique replaces all pretrained filters found by trial-and-error with explainable, influential, and predictive variable sets (modules) determined by the I-score. In other words, future researchers need not rely on pretrained filters; the suggested algorithm identifies only the variables or pixels with high I-score values that are extremely predictive and important. The proposed method and algorithm were tested on real-world data set and a state-of-the-art prediction performance of 99.8% was achieved without sacrificing the explanatory power of the model. This proposed design can efficiently screen patients infected by COVID-19 before human diagnosis and can be a benchmark for addressing future XAI problems in large-scale data sets.

Hung H, Huang SY. Sufficient dimension reduction via random-partitions for the large-p-small-n problem. Biometrics 2018;75:245-255. [PMID: 30052272 DOI: 10.1111/biom.12926] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/01/2017] [Revised: 05/01/2018] [Accepted: 05/01/2018] [Indexed: 11/30/2022]

Crawford L, Zeng P, Mukherjee S, Zhou X. Detecting epistasis with the marginal epistasis test in genetic mapping studies of quantitative traits. PLoS Genet 2017;13:e1006869. [PMID: 28746338 PMCID: PMC5550000 DOI: 10.1371/journal.pgen.1006869] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 08/01/2016] [Revised: 08/09/2017] [Accepted: 06/15/2017] [Indexed: 12/13/2022]

Framework for making better predictions by directly estimating variables' predictivity. Proc Natl Acad Sci U S A 2016;113:14277-14282. [PMID: 27911830 DOI: 10.1073/pnas.1616647113] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022]

Wang MH, Sun R, Guo J, Weng H, Lee J, Hu I, Sham PC, Zee BCY. A fast and powerful W-test for pairwise epistasis testing. Nucleic Acids Res 2016;44:e115. [PMID: 27112568 PMCID: PMC4937324 DOI: 10.1093/nar/gkw347] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 11/08/2015] [Revised: 04/14/2016] [Accepted: 04/15/2016] [Indexed: 01/08/2023]

Why significant variables aren't automatically good predictors. Proc Natl Acad Sci U S A 2015;112:13892-7. [PMID: 26504198 DOI: 10.1073/pnas.1518285112] [Citation(s) in RCA: 142] [Impact Index Per Article: 15.8] [Indexed: 02/05/2023]

Satten GA, Biswas S, Papachristou C, Turkmen A, König IR. Population-based association and gene by environment interactions in Genetic Analysis Workshop 18. Genet Epidemiol 2014;38 Suppl 1:S49-56. [PMID: 25112188 DOI: 10.1002/gepi.21825] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/09/2023]

Liu Y, Huang C, Hu I, Lo SH, Zheng T. A dual-clustering framework for association screening with whole genome sequencing data and longitudinal traits. BMC Proc 2014;8:S47. [PMID: 25519328 PMCID: PMC4143709 DOI: 10.1186/1753-6561-8-s1-s47] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]

Agne M, Huang CH, Hu I, Wang H, Zheng T, Lo SH. Considering interactive effects in the identification of influential regions with extremely rare variants via fixed bin approach. BMC Proc 2014;8:S7. [PMID: 25519400 PMCID: PMC4143804 DOI: 10.1186/1753-6561-8-s1-s7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]

Wang MH, Huang CH, Zheng T, Lo SH, Hu I. Discovering pure gene-environment interactions in blood pressure genome-wide association studies data: a two-step approach incorporating new statistics. BMC Proc 2014;8:S62. [PMID: 25519396 PMCID: PMC4143689 DOI: 10.1186/1753-6561-8-s1-s62] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]

Fan R, Huang CH, Hu I, Wang H, Zheng T, Lo SH. A partition-based approach to identify gene-environment interactions in genome wide association studies. BMC Proc 2014;8:S60. [PMID: 25519395 PMCID: PMC4143762 DOI: 10.1186/1753-6561-8-s1-s60] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]

Hwang JS, Hu TH. A stepwise regression algorithm for high-dimensional variable selection. J Stat Comput Sim 2014. [DOI: 10.1080/00949655.2014.902460] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]

Fan R, Lo SH. A robust model-free approach for rare variants association studies incorporating gene-gene and gene-environmental interactions. PLoS One 2013;8:e83057. [PMID: 24358248 PMCID: PMC3866272 DOI: 10.1371/journal.pone.0083057] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 05/24/2013] [Accepted: 10/30/2013] [Indexed: 11/19/2022]

Wang H, Lo SH, Zheng T, Hu I. Interaction-based feature selection and classification for high-dimensional biological data. Bioinformatics 2012;28:2834-42. [PMID: 22945786 PMCID: PMC3577111 DOI: 10.1093/bioinformatics/bts531] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 05/01/2012] [Revised: 08/20/2012] [Accepted: 08/22/2012] [Indexed: 11/13/2022]

Papathomas M, Molitor J, Hoggart C, Hastie D, Richardson S. Exploring data from genetic association studies using Bayesian variable selection and the Dirichlet process: application to searching for gene × gene patterns. Genet Epidemiol 2012;36:663-74. [PMID: 22851500 DOI: 10.1002/gepi.21661] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 01/12/2012] [Revised: 05/16/2012] [Accepted: 06/08/2012] [Indexed: 11/09/2022]

Variable Selection for Classification and Regression in Large p, Small n Problems. ACTA ACUST UNITED AC 2011. [DOI: 10.1007/978-1-4614-1966-2_10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2]

Liu Y, Huang CH, Hu I, Lo SH, Zheng T. Association screening for genes with multiple potentially rare variants: an inverse-probability weighted clustering approach. BMC Proc 2011;5 Suppl 9:S106. [PMID: 22373536 PMCID: PMC3287829 DOI: 10.1186/1753-6561-5-s9-s106] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]

Stepwise Paring down Variation for Identifying Influential Multi-factor Interactions Related to a Continuous Response Variable. Statistics in Biosciences 2011. [DOI: 10.1007/s12561-011-9045-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/15/2022]

Bailey-Wilson JE, Brennan JS, Bull SB, Culverhouse R, Kim Y, Jiang Y, Jung J, Li Q, Lamina C, Liu Y, Mägi R, Niu YS, Simpson CL, Wang L, Yilmaz YE, Zhang H, Zhang Z. Regression and data mining methods for analyses of multiple rare variants in the Genetic Analysis Workshop 17 mini-exome data. Genet Epidemiol 2011;35 Suppl 1:S92-100. [PMID: 22128066 PMCID: PMC3360949 DOI: 10.1002/gepi.20657] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]

Zhang Y, Jiang B, Zhu J, Liu JS. Bayesian models for detecting epistatic interactions from genetic data. Ann Hum Genet 2010;75:183-93. [PMID: 21091453 DOI: 10.1111/j.1469-1809.2010.00621.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 01/02/2023]

Chernoff H, Lo SH, Zheng T. Discovering influential variables: A method of partitions. Ann Appl Stat 2009. [DOI: 10.1214/09-aoas265] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/19/2022]