1
|
Liang W, Zhang K, Cao P, Liu X, Yang J, Zaiane OR. Exploiting task relationships for Alzheimer's disease cognitive score prediction via multi-task learning. Comput Biol Med 2023; 152:106367. [PMID: 36516575 DOI: 10.1016/j.compbiomed.2022.106367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2022] [Revised: 11/16/2022] [Accepted: 11/25/2022] [Indexed: 12/13/2022]
Abstract
Alzheimer's disease (AD) is highly prevalent and a significant cause of dementia and death in elderly individuals. Motivated by breakthroughs in multi-task learning (MTL), efforts have been made to extend MTL to improve Alzheimer's disease cognitive score prediction by exploiting structure correlation. Although the problem is important and well studied, three key aspects have yet to be fully handled in a unified framework: (i) appropriately modeling the inherent task relationship; (ii) fully exploiting the task relatedness by considering the underlying feature structure; and (iii) automatically determining the weight of each task. To this end, we present the Bi-Graph guided self-Paced Multi-Task Feature Learning (BGP-MTFL) framework for exploring the relationship among multiple tasks to improve the overall learning performance of cognitive score prediction. The framework consists of two correlation regularizers for features and tasks, ℓ2,1 regularization, and a self-paced learning scheme. Moreover, we design an efficient optimization method to solve the non-smooth objective function of our approach based on the Alternating Direction Method of Multipliers (ADMM) combined with accelerated proximal gradient (APG). The proposed model is comprehensively evaluated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) datasets. Overall, the proposed algorithm achieves an nMSE (normalized Mean Squared Error) of 3.923 and a wR (weighted R-value) of 0.416 for predicting eighteen cognitive scores. The empirical study demonstrates that the proposed BGP-MTFL model outperforms the state-of-the-art AD prediction approaches and enables identifying more stable biomarkers.
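The ℓ2,1 regularization named in this abstract is commonly handled inside proximal methods such as APG through row-wise group soft-thresholding. The following is a minimal sketch of that standard proximal operator, not the BGP-MTFL implementation; the function name `prox_l21`, the layout (rows as features, columns as tasks), and the example values are assumptions made for illustration.

```python
import math

def prox_l21(W, tau):
    """Proximal operator of tau * ||W||_{2,1} (row-wise group soft-thresholding).

    W is a list of rows; here each row is assumed to hold one feature's
    weights across all tasks (an illustrative layout, not taken from the paper).
    """
    out = []
    for row in W:
        norm = math.sqrt(sum(v * v for v in row))
        # Rows whose l2 norm is at most tau are set to zero;
        # all other rows are shrunk toward zero by a common factor.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out.append([scale * v for v in row])
    return out

# Hypothetical weight matrix: 2 features x 2 tasks.
W = [[3.0, 4.0],   # row norm 5.0: kept, shrunk
     [0.3, 0.4]]   # row norm 0.5: removed for every task at once
print(prox_l21(W, tau=1.0))
```

Zeroing a whole row at once is what lets the penalty select features jointly across tasks rather than task by task.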
Affiliation(s)
- Wei Liang
  - Computer Science and Engineering, Northeastern University, Shenyang, China
- Kai Zhang
  - Computer Science and Engineering, Northeastern University, Shenyang, China
- Peng Cao
  - Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China
- Xiaoli Liu
  - DAMO Academy, Alibaba Group, Hangzhou, China
- Jinzhu Yang
  - Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Shenyang, China
- Osmar R Zaiane
  - Alberta Machine Intelligence Institute, University of Alberta, Edmonton, Alberta, Canada
|
2
|
Sevilla-Salcedo C, Imani V, M Olmos P, Gómez-Verdejo V, Tohka J. Multi-task longitudinal forecasting with missing values on Alzheimer's disease. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 226:107056. [PMID: 36191353 DOI: 10.1016/j.cmpb.2022.107056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2022] [Revised: 06/16/2022] [Accepted: 08/01/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Machine learning techniques typically used in dementia assessment cannot learn multiple tasks jointly or deal with time-dependent heterogeneous data containing missing values. In this paper, we reformulate SSHIBA, a recently introduced Bayesian multi-view latent variable model, for jointly learning diagnosis, ventricle volume, and ADAS score in dementia on longitudinal data with missing values. METHODS We propose a novel Bayesian variational inference framework capable of simultaneously imputing missing values and combining information from several views. This way, we can combine different data views from different time-points in a common latent space and learn the relationships between each time-point, using the semi-supervised formulation to fully exploit the temporal structure of the data and handle missing values. In turn, the model can combine all the available information to simultaneously model and predict multiple output variables. RESULTS We applied the proposed model to jointly predict diagnosis, ventricle volume, and ADAS score in dementia. The comparison of imputation strategies demonstrated the superior performance of the semi-supervised formulation of the model, which improved on the best baseline methods. Moreover, simultaneous prediction of diagnosis, ventricle volume, and ADAS score outperformed the best baseline method. CONCLUSIONS The results demonstrate that the proposed SSHIBA framework can learn an excellent imputation of the missing values and outperform the baselines while simultaneously predicting three different tasks.
Affiliation(s)
- Carlos Sevilla-Salcedo
  - Signal Theory and Communications Department, University Carlos III of Madrid, Leganés 28911 Spain
- Vandad Imani
  - A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Pablo M Olmos
  - Signal Theory and Communications Department, University Carlos III of Madrid, Leganés 28911 Spain
- Vanessa Gómez-Verdejo
  - Signal Theory and Communications Department, University Carlos III of Madrid, Leganés 28911 Spain
- Jussi Tohka
  - A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
|
3
|
Abdulaal MJ, Mehedi IM, Aljohani AJ, Milyani AH, Mahmoud M, Abusorrah AM, Jannat R. Separation of Different Blogs from Skin Disease Data using Artificial Intelligence. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:7538643. [PMID: 36052051 PMCID: PMC9427218 DOI: 10.1155/2022/7538643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Revised: 07/20/2022] [Accepted: 07/25/2022] [Indexed: 11/23/2022]
Abstract
A combination of environmental conditions can cause skin illness anywhere on Earth, and skin disease is among the most dangerous conditions that can develop as a result. A major goal of feature selection is to predict skin disease instances from the variables that influence them. Because sensors are now widely used, the amount of data collected in the health industry is disproportionately large compared with other sectors. Researchers have previously used a variety of machine learning algorithms to determine the relationship between illnesses and other disorders. Forecasting involves many steps, the most important of which are preprocessing and the selection of forecasting features. A major difficulty in the health industry is the lack of data availability, which is particularly problematic when data is provided in an unstructured format. Filling in missing values and converting between data types takes somewhat more than 70% of the total time. Missing data in machine learning applications may be handled with the mean or median, as well as the stand mechanism. Previous research has shown that the features chosen for a model may influence its overall performance. One of the primary goals of this study is to develop an intelligent algorithm that identifies relevant features while eliminating nonsignificant attributes that affect model performance. To present a full view of the data, artificial intelligence techniques such as SVM, decision tree, and logistic regression models were used in conjunction with three separate feature combination methodologies, each developed independently. As a consequence, accuracy, F-measure, and precision are all raised by a factor of ten. The result is a list of the most important features, together with the weight allocated to each of them.
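The mean and median imputation mentioned in this abstract can be sketched in a few lines. This is a generic illustration using only the Python standard library, not the authors' pipeline; the function name `impute`, the use of `None` to mark a missing entry, and the sample column are assumptions.

```python
from statistics import mean, median

def impute(column, strategy="mean"):
    """Fill missing entries (marked as None) with the mean or median
    of the observed values in the same column."""
    if strategy not in ("mean", "median"):
        raise ValueError("strategy must be 'mean' or 'median'")
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in column]

# Hypothetical feature column with two missing entries and one outlier (90).
ages = [34, None, 51, 29, None, 90]
print(impute(ages, "mean"))
print(impute(ages, "median"))
```

The median is less sensitive to outliers such as the value 90 above, which is one common reason to prefer it for skewed clinical measurements.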
Affiliation(s)
- Mohammed J. Abdulaal
  - Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
  - Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University, Jeddah, Saudi Arabia
- Ibrahim M. Mehedi
  - Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
  - Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University, Jeddah, Saudi Arabia
- Abdulah Jeza Aljohani
  - Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
  - Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University, Jeddah, Saudi Arabia
- Ahmad H. Milyani
  - Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
- Mohamed Mahmoud
  - Electrical and Engineering Department, Tennessee Technological University, Cookeville, TN, USA
- Abdullah M. Abusorrah
  - Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
- Rahtul Jannat
  - Department of Electrical and Electronic Engineering, BRAC University, Dhaka, Bangladesh
|
4
|
A novel method for prediction of skin disease through supervised classification techniques. Soft Comput 2022. [DOI: 10.1007/s00500-022-07435-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
|
5
|
Chen Z, Liu Y, Zhang Y, Jin R, Tao J, Chen L. Low-rank sparse feature selection with incomplete labels for Alzheimer's disease progression prediction. Comput Biol Med 2022; 147:105705. [PMID: 35717935 DOI: 10.1016/j.compbiomed.2022.105705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 05/16/2022] [Accepted: 06/04/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND Predicting cognitive performance in Alzheimer's disease (AD) and identifying the informative neuroimaging markers are essential for timely treatment and possible delay of the disease. However, incompletely labeled samples and noise in neuroimaging data pose challenges to building reliable and robust prediction models. In this paper, we present a model named Low-rank Sparse Feature Selection with Incomplete Labels (LSFSIL) for predicting cognitive performance and identifying informative neuroimaging markers with MRI data and incomplete cognitive scores. METHOD We propose a sparse matrix decomposition method to decompose the incomplete cognitive score matrix into two parts for recovering missing scores and utilizing incompletely labeled data. The former is the recovered cognitive score matrix without missing values. To make the recovered scores close to the real ones, a manifold regularizer is devised to fit the label distribution for capturing the label correlations locally. The latter is an ℓ1-norm regularized matrix which represents the associated errors. Next, a low-rank regression model that regards the recovered matrix as the target is developed to increase the robustness to noise and outliers. In addition, the ℓ2,1-norm is introduced into the objective function as a sparse regularization to identify the important features. RESULTS Experimental results demonstrate that LSFSIL achieves higher performance and outperforms several state-of-the-art feature selection approaches. Moreover, the neuroimaging markers selected by LSFSIL are consistent with previous AD studies. CONCLUSIONS LSFSIL is effective in informative neuroimaging marker identification for cognitive performance prediction with incompletely labeled data.
Affiliation(s)
- Zhi Chen
  - Knowledge and Data Engineering Laboratory of Chinese Medicine, School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yongguo Liu
  - Knowledge and Data Engineering Laboratory of Chinese Medicine, School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yun Zhang
  - Knowledge and Data Engineering Laboratory of Chinese Medicine, School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Rongjiang Jin
  - College of Health Preservation and Rehabilitation, Chengdu University of Traditional Chinese Medicine, Chengdu, 610075, China
- Jing Tao
  - College of Rehabilitation Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, 350122, China
- Lidian Chen
  - College of Rehabilitation Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, 350122, China
|
6
|
Kumar S, Oh I, Schindler S, Lai AM, Payne PRO, Gupta A. Machine learning for modeling the progression of Alzheimer disease dementia using clinical data: a systematic literature review. JAMIA Open 2021; 4:ooab052. [PMID: 34350389 PMCID: PMC8327375 DOI: 10.1093/jamiaopen/ooab052] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 04/14/2021] [Revised: 06/21/2021] [Accepted: 06/30/2021] [Indexed: 11/17/2022] Open
Abstract
OBJECTIVE Alzheimer disease (AD) is the most common cause of dementia, a syndrome characterized by cognitive impairment severe enough to interfere with activities of daily life. We aimed to conduct a systematic literature review (SLR) of studies that applied machine learning (ML) methods to clinical data derived from electronic health records in order to model risk for progression of AD dementia. MATERIALS AND METHODS We searched for articles published between January 1, 2010, and May 31, 2020, in PubMed, Scopus, ScienceDirect, IEEE Xplore Digital Library, Association for Computing Machinery Digital Library, and arXiv. We used predefined criteria to select relevant articles and summarized them according to key components of ML analysis such as data characteristics, computational algorithms, and research focus. RESULTS There has been a considerable rise over the past 5 years in the number of research papers using ML-based analysis for AD dementia modeling. We reviewed 64 relevant articles in our SLR. The results suggest that the majority of existing research has focused on predicting progression of AD dementia using publicly available datasets containing both neuroimaging and clinical data (neurobehavioral status exam scores, patient demographics, neuroimaging data, and laboratory test values). DISCUSSION Identifying individuals at risk for progression of AD dementia could potentially help to personalize disease management to plan future care. Clinical data consisting of both structured data tables and clinical notes can be effectively used in ML-based approaches to model risk for AD dementia progression. Data sharing and reproducibility of results can enhance the impact, adaptation, and generalizability of this research.
Affiliation(s)
- Sayantan Kumar
  - Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
- Inez Oh
  - Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
- Suzanne Schindler
  - Department of Neurology, Washington University School of Medicine, St. Louis, Missouri, USA
- Albert M Lai
  - Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
- Philip R O Payne
  - Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
- Aditi Gupta
  - Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
|
7
|
Li J, Bian C, Chen D, Meng X, Luo H, Liang H, Shen L. Effect of APOE ε4 on multimodal brain connectomic traits: a persistent homology study. BMC Bioinformatics 2020; 21:535. [PMID: 33371873 PMCID: PMC7768655 DOI: 10.1186/s12859-020-03877-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/10/2020] [Accepted: 11/13/2020] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND Although genetic risk factors and network-level neuroimaging abnormalities have shown effects on cognitive performance and brain atrophy in Alzheimer's disease (AD), little is understood about how the apolipoprotein E (APOE) ε4 allele, the best-known genetic risk factor for AD, affects brain connectivity before the onset of symptomatic AD. This study aims to investigate APOE ε4 effects on brain connectivity from the perspective of the multimodal connectome. RESULTS Here, we propose a novel multimodal brain network modeling framework and a network quantification method based on persistent homology for identifying APOE ε4-related network differences. Specifically, we employ sparse representation to integrate multimodal brain network information derived from both the resting state functional magnetic resonance imaging (rs-fMRI) data and the diffusion-weighted magnetic resonance imaging (dw-MRI) data. Moreover, persistent homology is proposed to avoid the ad hoc selection of a specific regularization parameter and to capture valuable brain connectivity patterns from the topological perspective. The experimental results demonstrate that our method outperforms the competing methods, and reasonably yields connectomic patterns specific to APOE ε4 carriers and non-carriers. CONCLUSIONS We have proposed a multimodal framework that integrates structural and functional connectivity information for constructing a fused brain network with greater discriminative power. Using persistent homology to extract topological features from the fused brain network, our method can effectively identify APOE ε4-related brain connectomic biomarkers.
Affiliation(s)
- Jin Li
  - College of Automation, Harbin Engineering University, 145 Nantong Street, Harbin, 150001, Heilongjiang, China
- Chenyuan Bian
  - College of Automation, Harbin Engineering University, 145 Nantong Street, Harbin, 150001, Heilongjiang, China
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, B306 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA
- Dandan Chen
  - College of Automation, Harbin Engineering University, 145 Nantong Street, Harbin, 150001, Heilongjiang, China
- Xianglian Meng
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou, 213032, China
- Haoran Luo
  - College of Automation, Harbin Engineering University, 145 Nantong Street, Harbin, 150001, Heilongjiang, China
- Hong Liang
  - College of Automation, Harbin Engineering University, 145 Nantong Street, Harbin, 150001, Heilongjiang, China
- Li Shen
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, B306 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA
|
8
|
Lombardi A, Amoroso N, Diacono D, Monaco A, Logroscino G, De Blasi R, Bellotti R, Tangaro S. Association between Structural Connectivity and Generalized Cognitive Spectrum in Alzheimer's Disease. Brain Sci 2020; 10:E879. [PMID: 33233622 PMCID: PMC7699729 DOI: 10.3390/brainsci10110879] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/08/2020] [Revised: 11/10/2020] [Accepted: 11/17/2020] [Indexed: 01/10/2023] Open
Abstract
Modeling disease progression through the cognitive scores has become an attractive challenge in the field of computational neuroscience due to its importance for early diagnosis of Alzheimer's disease (AD). Several scores such as Alzheimer's Disease Assessment Scale cognitive total score, Mini Mental State Exam score and Rey Auditory Verbal Learning Test provide a quantitative assessment of the cognitive conditions of the patients and are commonly used as objective criteria for clinical diagnosis of dementia and mild cognitive impairment (MCI). On the other hand, connectivity patterns extracted from diffusion tensor imaging (DTI) have been successfully used to classify AD and MCI subjects with machine learning algorithms proving their potential application in the clinical setting. In this work, we carried out a pilot study to investigate the strength of association between DTI structural connectivity of a mixed ADNI cohort and cognitive spectrum in AD. We developed a machine learning framework to find a generalized cognitive score that summarizes the different functional domains reflected by each cognitive clinical index and to identify the connectivity biomarkers more significantly associated with the score. The results indicate that the efficiency and the centrality of some regions can effectively track cognitive impairment in AD showing a significant correlation with the generalized cognitive score (R = 0.7).
Affiliation(s)
- Angela Lombardi
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Nicola Amoroso
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
  - Dipartimento di Farmacia–Scienze del Farmaco, Università degli Studi di Bari, 70125 Bari, Italy
- Domenico Diacono
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Alfonso Monaco
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Giancarlo Logroscino
  - Center for Neurodegenerative Diseases and the Aging Brain, Università degli Studi di Bari at Pia Fondazione “Card. G. Panico”, 73039 Tricase, Italy
  - Department of Basic Medicine Neuroscience and Sense Organs, Università degli Studi di Bari, 70124 Bari, Italy
- Roberto Bellotti
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
  - Dipartimento Interateneo di Fisica, Università degli Studi di Bari, 70126 Bari, Italy
- Sabina Tangaro
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari, 70126 Bari, Italy
|
9
|
|
10
|
Robitzsch A. Regularized Latent Class Analysis for Polytomous Item Responses: An Application to SPM-LS Data. J Intell 2020; 8:E30. [PMID: 32823949 PMCID: PMC7555561 DOI: 10.3390/jintelligence8030030] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/09/2020] [Revised: 07/26/2020] [Accepted: 08/10/2020] [Indexed: 11/28/2022] Open
Abstract
The last series of Raven's standard progressive matrices (SPM-LS) test was studied with respect to its psychometric properties in a series of recent papers. In this paper, the SPM-LS dataset is analyzed with regularized latent class models (RLCMs). For dichotomous item response data, an alternative estimation approach based on fused regularization for RLCMs is proposed. For polytomous item responses, different alternative fused regularization penalties are presented. The usefulness of the proposed methods is demonstrated in a simulated data illustration and for the SPM-LS dataset. For the SPM-LS dataset, it turned out the regularized latent class model resulted in five partially ordered latent classes. In total, three out of five latent classes are ordered for all items. For the remaining two classes, violations for two and three items were found, respectively, which can be interpreted as a kind of latent differential item functioning.
Affiliation(s)
- Alexander Robitzsch
  - IPN—Leibniz Institute for Science and Mathematics Education, D-24098 Kiel, Germany
  - Centre for International Student Assessment (ZIB), D-24098 Kiel, Germany
|
11
|
Li Y, Mark B, Raskutti G, Willett R, Song H, Neiman D. Graph-based regularization for regression problems with alignment and highly-correlated designs. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE 2020; 2:480-504. [PMID: 32968717 PMCID: PMC7508309 DOI: 10.1137/19m1287365] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
Sparse models for high-dimensional linear regression and machine learning have received substantial attention over the past two decades. Model selection, or determining which features or covariates are the best explanatory variables, is critical to the interpretability of a learned model. Much of the current literature assumes that covariates are only mildly correlated. However, in many modern applications covariates are highly correlated and do not exhibit key properties (such as the restricted eigenvalue condition, restricted isometry property, or other related assumptions). This work considers a high-dimensional regression setting in which a graph governs both correlations among the covariates and the similarity among regression coefficients - meaning there is alignment between the covariates and regression coefficients. Using side information about the strength of correlations among features, we form a graph with edge weights corresponding to pairwise covariances. This graph is used to define a graph total variation regularizer that promotes similar weights for correlated features. This work shows how the proposed graph-based regularization yields mean-squared error guarantees for a broad range of covariance graph structures. These guarantees are optimal for many specific covariance graphs, including block and lattice graphs. Our proposed approach outperforms other methods for highly-correlated design in a variety of experiments on synthetic data and real biochemistry data.
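The idea of a graph total variation regularizer can be illustrated with a simplified weighted form, the sum over edges of w_ij · |β_i − β_j|. This sketch is not the exact penalty analyzed in the paper; the function name `graph_tv`, the edge-list representation, and the example weights are assumptions chosen for illustration.

```python
def graph_tv(beta, edges):
    """Simplified weighted graph total variation of a coefficient vector.

    edges is a list of (i, j, w) triples, where w is a non-negative edge
    weight (for example, derived from the covariance of features i and j).
    """
    return sum(w * abs(beta[i] - beta[j]) for i, j, w in edges)

# Hypothetical chain graph over three strongly correlated features.
edges = [(0, 1, 0.9), (1, 2, 0.8)]

aligned = [1.0, 1.0, 1.0]      # similar coefficients on connected features
misaligned = [1.0, -1.0, 1.0]  # coefficients disagree across every edge

print(graph_tv(aligned, edges))     # no penalty
print(graph_tv(misaligned, edges))  # large penalty
```

Adding such a term to a regression loss pushes coefficients of strongly connected features toward each other, which matches the alignment assumption described in the abstract.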
Affiliation(s)
- Yuan Li
  - Department of Statistics, University of Wisconsin-Madison
- Benjamin Mark
  - Department of Mathematics, University of Wisconsin-Madison
- Rebecca Willett
  - Departments of Statistics and Computer Science, University of Chicago
- Hyebin Song
  - Department of Statistics, University of Wisconsin-Madison
- David Neiman
  - Department of Statistics, University of Wisconsin-Madison
|
12
|
Group Guided Fused Laplacian Sparse Group Lasso for Modeling Alzheimer's Disease Progression. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2020; 2020:4036560. [PMID: 32104201 PMCID: PMC7033952 DOI: 10.1155/2020/4036560] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2018] [Revised: 04/27/2019] [Accepted: 06/18/2019] [Indexed: 11/18/2022]
Abstract
As the leading cause of dementia, Alzheimer's disease (AD) has brought serious burdens to patients and their families, mostly in the financial, psychological, and emotional aspects. In order to assess the progression of AD and develop new treatment methods for the disease, it is essential to infer the trajectories of patients' cognitive performance over time to identify biomarkers that connect the patterns of brain atrophy and AD progression. In this article, a structured regularized regression approach termed group guided fused Laplacian sparse group Lasso (GFL-SGL) is proposed to infer disease progression by considering multiple predictions of the same cognitive scores at different time points (longitudinal analysis). The proposed GFL-SGL simultaneously exploits the interrelated structures within the MRI features and among the tasks with the sparse group Lasso (SGL) norm and presents a novel group guided fused Laplacian (GFL) regularization. This combination effectively incorporates both the relatedness among multiple longitudinal time points, through a general weighted (undirected) dependency graph, and the useful inherent group structure in features. Furthermore, an alternating direction method of multipliers (ADMM)-based algorithm is derived to optimize the nonsmooth objective function of the proposed approach. Experiments on the dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI) show that the proposed GFL-SGL outperformed some other state-of-the-art algorithms and effectively fused the multimodality data. The compact sets of cognition-relevant imaging biomarkers identified by our approach are consistent with the results of clinical studies.
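The sparse group Lasso norm referred to above combines an ℓ1 term with a sum of per-group ℓ2 norms. The sketch below shows the textbook form of that penalty on a single weight vector; it is an assumption-laden illustration rather than the GFL-SGL objective, and the names `sparse_group_lasso`, `lam1`, `lam2`, and the grouping are hypothetical. Per-group size weights, which some formulations include, are omitted.

```python
import math

def sparse_group_lasso(w, groups, lam1, lam2):
    """Textbook sparse group Lasso penalty:
    lam1 * ||w||_1 + lam2 * sum over groups g of ||w_g||_2.

    groups is a list of index lists that partition the entries of w.
    """
    l1 = sum(abs(v) for v in w)
    group_l2 = sum(math.sqrt(sum(w[i] ** 2 for i in g)) for g in groups)
    return lam1 * l1 + lam2 * group_l2

# Hypothetical weights: the first feature group is active, the second is not.
w = [3.0, 4.0, 0.0, 0.0]
groups = [[0, 1], [2, 3]]
print(sparse_group_lasso(w, groups, lam1=0.1, lam2=1.0))
```

The ℓ1 term encourages sparsity within a group, while the group term can switch off an entire group of related features, such as measurements from one brain region.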
|
13
|
Persistent Feature Analysis of Multimodal Brain Networks Using Generalized Fused Lasso for EMCI Identification. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2020; 12267:44-52. [PMID: 34766172 DOI: 10.1007/978-3-030-59728-3_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Abstract
Early Mild Cognitive Impairment (EMCI) involves very subtle changes in brain pathological processes, which makes its identification challenging. Interest has recently grown in multimodal fusion, which jointly analyzes cross-information among different neuroimaging data to better understand clinical measurements with respect to both structural and functional connectivity. In this paper, we propose a novel multimodal brain network modeling method for EMCI identification. Specifically, we employ the structural connectivity based on diffusion tensor imaging (DTI) as a constraint to guide the regression of BOLD time series from resting state functional magnetic resonance imaging (rs-fMRI). In addition, we introduce multiscale persistent homology features to avoid the uncertainty of regularization parameter selection. An empirical study on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database demonstrates that the proposed method effectively improves classification performance compared with several competing approaches, and reasonably yields connectivity patterns specific to different diagnostic groups.
|
14
|
Shi Y, Suk HI, Gao Y, Lee SW, Shen D. Leveraging Coupled Interaction for Multimodal Alzheimer's Disease Diagnosis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:186-200. [PMID: 30908241 DOI: 10.1109/tnnls.2019.2900077] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 06/09/2023]
Abstract
As the population becomes older worldwide, accurate computer-aided diagnosis for Alzheimer's disease (AD) in the early stage has been regarded as a crucial step for neurodegeneration care in recent years. Since it extracts the low-level features from the neuroimaging data, previous methods regarded this computer-aided diagnosis as a classification problem that ignored latent featurewise relation. However, it is known that multiple brain regions in the human brain are anatomically and functionally interlinked according to the current neuroscience perspective. Thus, it is reasonable to assume that the extracted features from different brain regions are related to each other to some extent. Also, the complementary information between different neuroimaging modalities could benefit multimodal fusion. To this end, we consider leveraging the coupled interactions in the feature level and modality level for diagnosis in this paper. First, we propose capturing the feature-level coupled interaction using a coupled feature representation. Then, to model the modality-level coupled interaction, we present two novel methods: 1) the coupled boosting (CB) that models the correlation of pairwise coupled-diversity on both inconsistently and incorrectly classified samples between different modalities and 2) the coupled metric ensemble (CME) that learns an informative feature projection from different modalities by integrating the intrarelation and interrelation of training samples. We systematically evaluated our methods with the AD neuroimaging initiative data set. By comparison with the baseline learning-based methods and the state-of-the-art methods that are specially developed for AD/MCI (mild cognitive impairment) diagnosis, our methods achieved the best performance with accuracy of 95.0% and 80.7% (CB), 94.9% and 79.9% (CME) for AD/NC (normal control), and MCI/NC identification, respectively.
|
15
|
Zhang X, Zhang Q, Wang X, Ma S, Fang K. Structured sparse logistic regression with application to lung cancer prediction using breath volatile biomarkers. Stat Med 2019; 39:955-967. [PMID: 31880351 DOI: 10.1002/sim.8454] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/27/2019] [Revised: 09/24/2019] [Accepted: 11/21/2019] [Indexed: 11/10/2022]
Abstract
This article is motivated by a study of lung cancer prediction using breath volatile organic compound (VOC) biomarkers, where the challenge is that the predictors include not only high-dimensional time-dependent or functional VOC features but also time-independent clinical variables. We consider a high-dimensional logistic regression and propose two penalties, a group spline penalty and a group smooth penalty, to handle the group structure of the time-dependent variables in the model. Compared with existing methods such as the group lasso and the group bridge, the new methods are advantageous when the model coefficients are sparse but change smoothly within a group. Our methods are easy to implement because, after certain transformations, they reduce to a group minimax concave penalty problem. We show that our fitting algorithm possesses the descent property and has attractive convergence properties. Simulation studies and the lung cancer application demonstrate the accuracy and stability of the proposed approaches.
Affiliation(s)
- Xiaochen Zhang: Department of Statistics, School of Economics, Xiamen University, China
- Qingzhao Zhang: Department of Statistics, School of Economics, Xiamen University, China; The Wang Yanan Institute for Studies in Economics, Xiamen University, China
- Xiaofeng Wang: Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, Ohio
- Shuangge Ma: Department of Biostatistics, Yale University, New Haven, Connecticut
- Kuangnan Fang: Department of Statistics, School of Economics, Xiamen University, China
|
16
|
Jiang P, Wang X, Li Q, Jin L, Li S. Correlation-Aware Sparse and Low-Rank Constrained Multi-Task Learning for Longitudinal Analysis of Alzheimer's Disease. IEEE J Biomed Health Inform 2019; 23:1450-1456. [DOI: 10.1109/jbhi.2018.2885331] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 11/09/2022]
|