1
Ma Y, Shi Y, Chen X, Zhang B, Wu H, Gao J. NFMCLDA: Predicting miRNA-based lncRNA-disease associations by network fusion and matrix completion. Comput Biol Med 2024; 174:108403. [PMID: 38582002 DOI: 10.1016/j.compbiomed.2024.108403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 03/28/2024] [Accepted: 04/01/2024] [Indexed: 04/08/2024]
Abstract
In recent years, emerging evidence has revealed a strong association between the dysregulation of long non-coding RNAs (lncRNAs) and complex human diseases. Biological experiments can identify such associations, but they are costly and time-consuming. Developing high-quality computational methods is therefore a challenging and urgent task in bioinformatics. This paper proposes a new lncRNA-disease association inference approach, NFMCLDA (Network Fusion and Matrix Completion lncRNA-Disease Association), which can effectively integrate multi-source association data. In this approach, miRNA information is used as the transition path, and an unbalanced random walk on a three-layer heterogeneous network is adopted in the preprocessing. As a result, more effective information between networks can be mined and the sparsity problem of the association matrix can be addressed. Finally, the matrix completion method accurately predicts associations. The results show that NFMCLDA can provide more accurate lncRNA-disease associations than state-of-the-art methods. The areas under the receiver operating characteristic curves are 0.9648 and 0.9713 under 5-fold and 10-fold cross-validation, respectively. Published case-study data on four diseases (lung cancer, osteosarcoma, cervical cancer, and colon cancer) confirm the reliable predictive potential of the NFMCLDA model.
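The abstract reports performance as the area under the ROC curve. As a rough illustration of how such a figure can be computed from predicted association scores, the sketch below uses the Mann-Whitney rank formulation. The function name `roc_auc` and the toy labels are assumptions made for this example; nothing here is taken from the NFMCLDA code.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (pos.size * neg.size)

# Toy example: known associations (1) versus unknown pairs (0).
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In a k-fold setting, one would typically hide the known associations of one fold, score all pairs, and evaluate the hidden positives against the unknown pairs.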
Affiliation(s)
- Yibing Ma
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Yongle Shi
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Xiang Chen
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Bai Zhang
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Hanwen Wu
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Jie Gao
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China.
2
Sleem OM, Ashour M, Aybat NS, Lagoa CM. Lp Quasi-norm Minimization: Algorithm and Applications. Res Sq 2023:rs.3.rs-3632062. [PMID: 38076799 PMCID: PMC10705602 DOI: 10.21203/rs.3.rs-3632062/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Sparsity finds applications in diverse areas such as statistics, machine learning, and signal processing. Computations over sparse structures are less complex compared to their dense counterparts and need less storage. This paper proposes a heuristic method for retrieving sparse approximate solutions of optimization problems via minimizing the ℓp quasi-norm, where 0 < p < 1. An iterative two-block algorithm for minimizing the ℓp quasi-norm subject to convex constraints is proposed. The proposed algorithm requires solving for the roots of a scalar degree polynomial as opposed to applying a soft thresholding operator in the case of ℓ1 norm minimization. The algorithm's merit relies on its ability to solve the ℓp quasi-norm minimization subject to any convex constraints set. For the specific case of constraints defined by differentiable functions with Lipschitz continuous gradient, a second, faster algorithm is proposed. Using a proximal gradient step, we mitigate the convex projection step and hence enhance the algorithm's speed while proving its convergence. We present various applications where the proposed algorithm excels, namely, sparse signal reconstruction, system identification, and matrix completion. The results demonstrate the significant gains obtained by the proposed algorithm compared to other ℓp quasi-norm based methods presented in previous literature.
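For readers unfamiliar with the two objects the abstract contrasts, the sketch below shows the standard ℓ1 soft-thresholding operator and the ℓp penalty term for 0 < p < 1. It does not reproduce the paper's polynomial root-finding step; the function names are assumptions chosen for this example.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1, applied elementwise."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def lp_penalty(x, p):
    """Return sum_i |x_i|^p, the p-th power of the lp quasi-norm, 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return float(np.sum(np.abs(np.asarray(x, dtype=float)) ** p))

# Entries with magnitude below lam are set to zero; the rest shrink by lam.
print(soft_threshold([3.0, -0.5, -2.0], 1.0))
# For p = 0.5 the penalty of (4, 0, 9) is sqrt(4) + 0 + sqrt(9).
print(lp_penalty([4.0, 0.0, 9.0], 0.5))
```

Unlike the ℓ1 norm, the ℓp penalty with p < 1 is non-convex, which is why a closed-form thresholding rule is generally not available and a scalar root-finding problem appears instead.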
Affiliation(s)
- Omar M. Sleem
  - Department of Electrical Engineering, Pennsylvania State University, State College, PA, 16802 USA
- M.E. Ashour
  - Wireless R&D Department, Qualcomm Technologies, Inc, San Diego, CA, 92121, USA
- N. S. Aybat
  - Department of Industrial and Manufacturing Engineering, Pennsylvania State University, State College, PA, 16802 USA
- Constantino M. Lagoa
  - Department of Electrical Engineering, Pennsylvania State University, State College, PA, 16802 USA
3
O'Reilly T, Börnert P, Liu H, Webb A, Koolstra K. 3D magnetic resonance fingerprinting on a low-field 50 mT point-of-care system prototype: evaluation of muscle and lipid relaxation time mapping and comparison with standard techniques. MAGMA 2023:10.1007/s10334-023-01092-0. [PMID: 37202655 PMCID: PMC10386962 DOI: 10.1007/s10334-023-01092-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 04/11/2023] [Accepted: 04/17/2023] [Indexed: 05/20/2023]
Abstract
OBJECTIVE To implement magnetic resonance fingerprinting (MRF) on a permanent magnet 50 mT low-field system deployable as a future point-of-care (POC) unit and explore the quality of the parameter maps. MATERIALS AND METHODS 3D MRF was implemented on a custom-built Halbach array using a slab-selective spoiled steady-state free precession sequence with 3D Cartesian readout. Undersampled scans were acquired with different MRF flip angle patterns, reconstructed using matrix completion, and matched to the simulated dictionary, taking excitation profile and coil ringing into account. MRF relaxation times were compared with those from inversion recovery (IR) and multi-echo spin echo (MESE) experiments in phantom and in vivo. Furthermore, B0 inhomogeneities were encoded in the MRF sequence using an alternating TE pattern, and the estimated map was used to correct for image distortions in the MRF images using a model-based reconstruction. RESULTS Phantom relaxation times measured with an optimized MRF sequence for low field were in better agreement with reference techniques than for a standard MRF sequence. In vivo muscle relaxation times measured with MRF were longer than those obtained with an IR sequence (T1: 182 ± 21.5 vs 168 ± 9.89 ms) and with an MESE sequence (T2: 69.8 ± 19.7 vs 46.1 ± 9.65 ms). In vivo lipid MRF relaxation times were also longer compared with IR (T1: 165 ± 15.1 ms vs 127 ± 8.28 ms) and with MESE (T2: 160 ± 15.0 ms vs 124 ± 4.27 ms). Integrated ΔB0 estimation and correction resulted in parameter maps with reduced distortions. DISCUSSION It is possible to measure volumetric relaxation times with MRF at 2.5 × 2.5 × 3.0 mm³ resolution in a 13 min scan time on a 50 mT permanent magnet system. The measured MRF relaxation times are longer compared to those measured with reference techniques, especially for T2. This discrepancy can potentially be addressed by hardware, reconstruction and sequence design, but long-term reproducibility needs to be further improved.
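The dictionary-matching step mentioned in the methods can be pictured with a small sketch: each measured signal evolution is compared with every simulated dictionary entry, and the parameters of the best-correlated entry are reported. This is a generic inner-product matcher, not the authors' pipeline; the array layout and the example (T1, T2) values are assumptions.

```python
import numpy as np

def match_to_dictionary(signals, dictionary, params):
    """Return the parameters of the best-matching dictionary atom per voxel.

    signals:    (n_voxels, n_timepoints) measured signal evolutions
    dictionary: (n_atoms, n_timepoints) simulated signal evolutions
    params:     (n_atoms, n_params), e.g. hypothetical (T1, T2) pairs
    """
    # Normalize so the match depends on signal shape, not on scaling.
    d = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    s = signals / np.linalg.norm(signals, axis=1, keepdims=True)
    corr = np.abs(s @ d.conj().T)
    best = np.argmax(corr, axis=1)
    return params[best], best

dictionary = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
params = np.array([[100.0, 10.0], [200.0, 20.0], [300.0, 30.0]])
signals = np.array([[3.0, 3.0, 3.0], [0.5, 0.0, 0.0]])
matched, idx = match_to_dictionary(signals, dictionary, params)
print(idx, matched)
```

Effects such as the excitation profile or coil ringing would enter through the simulation that generates `dictionary`, which is outside the scope of this sketch.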
Affiliation(s)
- Thomas O'Reilly
  - Radiology, C.J. Gorter Center for MRI, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Peter Börnert
  - Radiology, C.J. Gorter Center for MRI, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
  - Philips Research, Röntgenstraße 24-26, 22335, Hamburg, Germany
- Hongyan Liu
  - Computational Imaging Group for MR Diagnostics & Therapy, Center for Imaging Sciences, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
- Andrew Webb
  - Radiology, C.J. Gorter Center for MRI, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Kirsten Koolstra
  - Radiology, Division of Image Processing, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands.
4
Gnecco G, Landi S, Riccaboni M. The emergence of social soft skill needs in the post COVID-19 era. Qual Quant 2023:1-34. [PMID: 37359962 PMCID: PMC10107589 DOI: 10.1007/s11135-023-01659-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/23/2023] [Indexed: 06/28/2023]
Abstract
Social soft skills are crucial for workers to perform their tasks, yet it is hard to train people on them and to readapt their skill set when needed. In the present work, we analyze the possible effects of the COVID-19 pandemic on social soft skills in the context of Italian occupations related to 88 economic sectors and 14 age groups. We leverage detailed information coming from ICP (i.e. the Italian equivalent of O*Net), provided by the Italian National Institute for the Analysis of Public Policy, from the microdata for research on the continuous detection of labor force, provided by the Italian National Institute of Statistics (ISTAT), and from ISTAT data on the Italian population. Based on these data, we simulate the impact of COVID-19 on workplace characteristics and working styles that were more severely affected by the lockdown measures and the sanitary dispositions during the pandemic (e.g. physical proximity, face-to-face discussions, working remotely). We then apply matrix completion (a machine-learning technique often used in the context of recommender systems) to predict the average variation in the social soft skills importance levels required for each occupation when working conditions change, as some changes might be persistent in the near future. Professions, sectors, and age groups showing negative average variations are exposed to a deficit in their social soft-skills endowment, which might ultimately lead to lower productivity.
Affiliation(s)
- Giorgio Gnecco
  - Scuola IMT Alti Studi Lucca, Piazza S. Francesco, 19, Lucca, Italy
- Sara Landi
  - LUISS University, Viale Romania, 32, Rome, Italy
5
Nethery RC, Katz-Christy N, Kioumourtzoglou MA, Parks RM, Schumacher A, Anderson GB. Integrated causal-predictive machine learning models for tropical cyclone epidemiology. Biostatistics 2023; 24:449-464. [PMID: 34962265 PMCID: PMC10102905 DOI: 10.1093/biostatistics/kxab047] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/14/2021] [Revised: 11/04/2021] [Accepted: 11/24/2021] [Indexed: 11/13/2022] Open
Abstract
Strategic preparedness reduces the adverse health impacts of hurricanes and tropical storms, referred to collectively as tropical cyclones (TCs), but its protective impact could be enhanced by a more comprehensive and rigorous characterization of TC epidemiology. To generate the insights and tools necessary for high-precision TC preparedness, we introduce a machine learning approach that standardizes estimation of historic TC health impacts, discovers common patterns and sources of heterogeneity in those health impacts, and enables identification of communities at highest health risk for future TCs. The model integrates (i) a causal inference component to quantify the immediate health impacts of recent historic TCs at high spatial resolution and (ii) a predictive component that captures how TC meteorological features and socioeconomic/demographic characteristics of impacted communities are associated with health impacts. We apply it to a rich data platform containing detailed historic TC exposure information and records of all-cause mortality and cardiovascular- and respiratory-related hospitalization among Medicare recipients. We report a high degree of heterogeneity in the acute health impacts of historic TCs, both within and across TCs, and, on average, substantial TC-attributable increases in respiratory hospitalizations. TC-sustained windspeeds are found to be the primary driver of mortality and respiratory risks.
Affiliation(s)
- Rachel C Nethery
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, 655 Huntington Ave, Boston, MA, USA
- Nina Katz-Christy
  - Department of Statistics, Harvard University, 1 Oxford St, Cambridge, MA, USA
- Marianthi-Anna Kioumourtzoglou
  - Department of Environmental Health Sciences, Columbia Mailman School of Public Health, 722 W. 168th Street, New York City, NY, USA
- Robbie M Parks
  - Department of Environmental Health Sciences, Columbia Mailman School of Public Health, 722 W. 168th Street, New York City, NY, USA
- Andrea Schumacher
  - Cooperative Institute for Research in the Atmosphere, Colorado State University, 3925A West Laporte Ave, Fort Collins, CO, USA
- G Brooke Anderson
  - Department of Environmental & Radiological Health Sciences, Colorado State University, 122A Environmental Health Building, Fort Collins, CO, USA
6
Zhang GZ, Gao YL. BRWMC: Predicting lncRNA-disease associations based on bi-random walk and matrix completion on disease and lncRNA networks. Comput Biol Chem 2023; 103:107833. [PMID: 36812824 DOI: 10.1016/j.compbiolchem.2023.107833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2022] [Revised: 12/29/2022] [Accepted: 02/15/2023] [Indexed: 02/19/2023]
Abstract
Many experiments have shown that long non-coding RNAs (lncRNAs) in humans are implicated in disease development. The prediction of lncRNA-disease associations is essential in promoting disease treatment and drug development. It is time-consuming and laborious to explore the relationship between lncRNAs and diseases in the laboratory. The computation-based approach has clear advantages and has become a promising research direction. This paper proposes a new lncRNA-disease association prediction algorithm, BRWMC. First, BRWMC constructs several lncRNA (disease) similarity networks from different measurement perspectives and fuses them into an integrated similarity network by similarity network fusion (SNF). In addition, the random walk method is used to preprocess the known lncRNA-disease association matrix and calculate the estimated scores of potential lncRNA-disease associations. Finally, the matrix completion method accurately predicts the potential lncRNA-disease associations. Under the framework of leave-one-out cross-validation and 5-fold cross-validation, the AUC values obtained by BRWMC are 0.9610 and 0.9739, respectively. In addition, case studies of three common diseases show that BRWMC is a reliable method for prediction.
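To make the random-walk preprocessing step more concrete, the following sketch implements one common textbook form of a bi-random walk, in which scores are propagated over the lncRNA similarity network from the left and over the disease similarity network from the right. The update rule, the symmetric normalization, and the parameters `alpha` and `steps` are assumptions for illustration and may differ from what BRWMC actually uses.

```python
import numpy as np

def sym_normalize(S):
    """Return D^{-1/2} S D^{-1/2}, where D holds the row sums of S."""
    d = S.sum(axis=1)
    inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(np.where(d > 0, d, 1.0)), 0.0)
    return S * inv_sqrt[:, None] * inv_sqrt[None, :]

def bi_random_walk(A, S_l, S_d, alpha=0.8, steps=4):
    """Propagate association scores A (lncRNAs x diseases) on both networks."""
    A = np.asarray(A, dtype=float)
    L, D = sym_normalize(S_l), sym_normalize(S_d)
    R = A.copy()
    for _ in range(steps):
        left = alpha * (L @ R) + (1.0 - alpha) * A   # walk on lncRNA network
        right = alpha * (R @ D) + (1.0 - alpha) * A  # walk on disease network
        R = 0.5 * (left + right)
    return R

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
S_l = np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]])
S_d = np.array([[1.0, 0.2], [0.2, 1.0]])
print(bi_random_walk(A, S_l, S_d))
```

The output is a dense score matrix that can replace the sparse binary matrix as input to a later matrix completion step.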
Affiliation(s)
- Guo-Zheng Zhang
  - School of Computer Science, Qufu Normal University, Rizhao, China
- Ying-Lian Gao
  - Qufu Normal University Library, Qufu Normal University, Rizhao, China.
7
Shen Y, Liu JX, Yin MM, Zheng CH, Gao YL. BMPMDA: Prediction of MiRNA-Disease Associations Using a Space Projection Model Based on Block Matrix. Interdiscip Sci 2023; 15:88-99. [PMID: 36335274 DOI: 10.1007/s12539-022-00542-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2022] [Revised: 10/13/2022] [Accepted: 10/14/2022] [Indexed: 11/07/2022]
Abstract
With the rapid development of bioinformatics technology, miRNA-disease associations (MDAs) are gradually being uncovered. Convenient and efficient prediction methods, which avoid the resource consumption of traditional wet-lab experiments, still need to be developed. In this study, a space projection model based on block matrix is presented for predicting MDAs (BMPMDA). Specifically, two block matrices are first composed of the known association matrix and similarity to increase comprehensiveness. For the integrity of information in the heterogeneous network, matrix completion (MC) is utilized to mine potential MDAs. Considering the neighborhood information of data points, linear neighborhood similarity (LNS) is regarded as a measure of similarity. Next, LNS is projected onto the corresponding completed association matrix to derive the projection score. Finally, the AUC and AUPR values for BMPMDA reach 0.9691 and 0.6231, respectively. Additionally, the majority of novel MDAs in three disease cases are identified in existing databases and literature. This suggests that BMPMDA can serve as a reliable prediction model for biological research.
Affiliation(s)
- Yi Shen
  - Qufu Normal University, Rizhao, 276800, China
- Chun-Hou Zheng
  - Co-Innovation Center for Information Supply and Assurance Technology, Anhui University, Hefei, 230000, China
- Ying-Lian Gao
  - Library of Qufu Normal University, Qufu Normal University, Rizhao, 276800, China.
8
Mankad J, Natarajan B, Srinivasan B. Integrated approach for optimal sensor placement and state estimation: A case study on water distribution networks. ISA Trans 2022; 123:272-285. [PMID: 34130860 DOI: 10.1016/j.isatra.2021.06.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/18/2020] [Revised: 06/03/2021] [Accepted: 06/03/2021] [Indexed: 06/12/2023]
Abstract
The objective of the design and operation of any water distribution network (WDN) includes meeting the desired demand at sufficient pressure at all nodes. However, this requires situational awareness; in other words, the knowledge of system state variables such as pressure and flow throughout the network. In this work, a hybrid approach is developed for sensor placement (SP) and state estimation (SE) that exploits the underlying correlation structure in the data, along with the principles governing the flow through circular pipes. The problem of SP in WDN is addressed since measuring the state variables throughout the network is not practical. The problem of SE that maps to a matrix completion problem under certain physical and logical constraints is solved later. The completed matrix represents the state of WDN at any given time. Benchmark networks used in literature were used to evaluate the proposed approach. The mean absolute percentage error (MAPE) of less than 5% was obtained while estimating the head available at nodes. The knowledge of the states in the entire network could help operate the network adaptively.
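The accuracy figure quoted above is a mean absolute percentage error. A minimal sketch of that metric follows; the function name and the sample head values are invented for illustration and are not data from the paper.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if np.any(actual == 0):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(100.0 * np.mean(np.abs(actual - predicted) / np.abs(actual)))

# Hypothetical nodal heads (m): true values versus estimated values.
print(mape([100.0, 200.0], [95.0, 210.0]))
```

Because each error is divided by the true value, nodes with small heads weigh more heavily than they would under an absolute error metric.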
Affiliation(s)
- Jaivik Mankad
  - Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat, India
9
Abstract
BACKGROUND Circular RNAs (circRNAs) are a class of single-stranded RNA molecules with a closed-loop structure. A growing body of research has shown that circRNAs are closely related to the development of diseases. Because biological experiments to verify circRNA-disease associations are time-consuming and resource-intensive, a reliable computational method is needed to predict candidate circRNA-disease associations and make those experiments more efficient. RESULTS In this paper, we propose a double matrix completion method (DMCCDA) for predicting potential circRNA-disease associations. First, we constructed a similarity matrix of circRNA and disease according to circRNA sequence information and semantic disease information. We also built a Gaussian interaction profile similarity matrix for circRNA and disease based on experimentally verified circRNA-disease associations. Then, the corresponding circRNA sequence similarity and semantic similarity of disease are used to update the association matrix from the perspective of circRNA and disease, respectively, by matrix multiplication. Finally, from the perspective of circRNA and disease, matrix completion is used to update the matrix block, which is formed by splicing the association matrix obtained in the previous step with the corresponding Gaussian similarity matrix. Compared with other approaches, the model of DMCCDA has a relatively good result in leave-one-out cross-validation and five-fold cross-validation. Additionally, the results of the case studies illustrate the effectiveness of the DMCCDA model. CONCLUSION The results show that our method works well for recommending potential circRNAs for a disease as candidates for biological experiments.
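The Gaussian interaction profile similarity mentioned above is usually defined from the rows (or columns) of the binary association matrix. The sketch below follows the widely used formulation with a bandwidth normalized by the mean squared profile norm; the parameter `gamma_prime` and its default value are assumptions, and the paper may use different settings.

```python
import numpy as np

def gip_kernel(A, gamma_prime=1.0):
    """Gaussian interaction profile similarity between the rows of A.

    K[i, j] = exp(-gamma * ||A[i] - A[j]||^2), where gamma is gamma_prime
    divided by the mean squared norm of the interaction profiles.
    """
    A = np.asarray(A, dtype=float)
    sq_norms = (A ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_sq
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (A @ A.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off
    return np.exp(-gamma * sq_dists)

A = np.array([[1, 0], [1, 0], [0, 1]])  # rows: circRNAs, columns: diseases
print(gip_kernel(A))       # circRNA similarity
print(gip_kernel(A.T))     # disease similarity
```

Calling the function on the transpose gives the disease-side similarity, which is how one matrix of known associations yields both kernels.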
Affiliation(s)
- Zong-Lan Zuo
  - Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, China
- Rui-Fen Cao
  - Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, China
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, Fujian, China
- Pi-Jing Wei
  - Institute of Physical Science and Information Technology, Anhui University, Hefei, China
- Jun-Feng Xia
  - Institute of Physical Science and Information Technology, Anhui University, Hefei, China
- Chun-Hou Zheng
  - Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, China.
10
Abstract
BACKGROUND Matrix factorization methods are linear models, with limited capability to model complex relations. In our work, we use tropical semiring to introduce non-linearity into matrix factorization models. We propose a method called Sparse Tropical Matrix Factorization (STMF) for the estimation of missing (unknown) values in sparse data. RESULTS We evaluate the efficiency of the STMF method on both synthetic data and biological data in the form of gene expression measurements downloaded from The Cancer Genome Atlas (TCGA) database. Tests on unique synthetic data showed that STMF approximation achieves a higher correlation than non-negative matrix factorization (NMF), which is unable to recover patterns effectively. On real data, STMF outperforms NMF on six out of nine gene expression datasets. While NMF assumes normal distribution and tends toward the mean value, STMF can better fit to extreme values and distributions. CONCLUSION STMF is the first work that uses tropical semiring on sparse data. We show that in certain cases semirings are useful because they consider the structure, which is different and simpler to understand than it is with standard linear algebra.
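To show where the non-linearity comes from, the sketch below computes a matrix product over the tropical semiring, assuming the max-plus convention (addition becomes max, multiplication becomes +). It illustrates only the product that replaces the ordinary `U @ V` in a factorization; it is not the STMF fitting procedure, and the function name is an assumption.

```python
import numpy as np

def maxplus_matmul(U, V):
    """Tropical (max-plus) product: out[i, j] = max_k (U[i, k] + V[k, j])."""
    U = np.asarray(U, dtype=float)
    V = np.asarray(V, dtype=float)
    if U.shape[1] != V.shape[0]:
        raise ValueError("inner dimensions must agree")
    # Broadcast to (rows, inner, cols), then reduce over the inner axis.
    return np.max(U[:, :, None] + V[None, :, :], axis=1)

U = np.array([[1.0, 2.0], [0.0, 3.0]])
V = np.array([[0.0, 1.0], [2.0, 0.0]])
print(maxplus_matmul(U, V))
```

Each output entry is determined by a single dominant factor rather than a sum over all factors, which is one way to read the abstract's remark about fitting extreme values.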
Affiliation(s)
- Amra Omanović
  - Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, 1000 Ljubljana, Slovenia
- Hilal Kazan
  - Department of Computer Engineering, Antalya Bilim University, Çıplaklı, Akdeniz Blv. No:290/A, 07190 Antalya, Turkey
- Polona Oblak
  - Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, 1000 Ljubljana, Slovenia
- Tomaž Curk
  - Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, 1000 Ljubljana, Slovenia
11
Wu TR, Yin MM, Jiao CN, Gao YL, Kong XZ, Liu JX. MCCMF: collaborative matrix factorization based on matrix completion for predicting miRNA-disease associations. BMC Bioinformatics 2020; 21:454. [PMID: 33054708 PMCID: PMC7556955 DOI: 10.1186/s12859-020-03799-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/18/2020] [Accepted: 10/02/2020] [Indexed: 02/06/2023] Open
Abstract
Background MicroRNAs (miRNAs) are non-coding RNAs with regulatory functions. Many studies have shown that miRNAs are closely associated with human diseases. Among the methods to explore the relationship between miRNAs and diseases, traditional methods are time-consuming and their accuracy needs to be improved. In view of the shortcomings of previous models, a method called collaborative matrix factorization based on matrix completion (MCCMF) is proposed to predict unknown miRNA-disease associations. Results The complete matrix of the miRNA and the disease is obtained by matrix completion. Moreover, the Gaussian Interaction Profile kernel is added to the miRNA functional similarity matrix and the disease semantic similarity matrix. Then the Weighted K Nearest Known Neighbors method is used to preprocess the association matrix, so that the model is closer to reality. Finally, the collaborative matrix factorization method is applied to obtain the prediction results. MCCMF obtains a satisfactory result in the fivefold cross-validation, with an AUC of 0.9569 (0.0005). Conclusions The AUC value of MCCMF is higher than that of other advanced methods in the fivefold cross-validation experiment. In order to comprehensively evaluate the performance of MCCMF, accuracy, precision, recall and f-measure are also reported. The final experimental results demonstrate that MCCMF outperforms other methods in predicting miRNA-disease associations. In the end, the effectiveness and practicability of MCCMF are further verified by researching three specific diseases.
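The additional metrics named in the conclusions (accuracy, precision, recall and f-measure) can be derived from a confusion matrix of thresholded predictions. The sketch below is a plain-Python illustration with invented labels; the function name and the convention of returning 0.0 for undefined ratios are assumptions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-measure for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }

# Hypothetical labels: 1 = association, 0 = no known association.
print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Unlike AUC, these metrics depend on the score threshold used to turn predictions into 0/1 labels, so they are usually reported alongside it.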
Affiliation(s)
- Tian-Ru Wu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, China
- Meng-Meng Yin
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, China
- Cui-Na Jiao
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, China
- Ying-Lian Gao
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, China
- Xiang-Zhen Kong
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, China.
12
Wu X, Lan W, Chen Q, Dong Y, Liu J, Peng W. Inferring LncRNA-disease associations based on graph autoencoder matrix completion. Comput Biol Chem 2020; 87:107282. [PMID: 32502934 DOI: 10.1016/j.compbiolchem.2020.107282] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/11/2019] [Revised: 04/01/2020] [Accepted: 05/09/2020] [Indexed: 02/09/2023]
Abstract
Accumulating studies have indicated that long non-coding RNAs (lncRNAs) play crucial roles in a large number of biological processes. Predicting lncRNA-disease associations can help biologists understand the molecular mechanisms of human disease and benefit disease diagnosis, treatment and prevention. In this paper, we introduce a computational framework based on graph autoencoder matrix completion (GAMCLDA) to identify lncRNA-disease associations. In our method, the graph convolutional network is utilized to encode local graph structure and features of nodes for learning latent factor vectors of lncRNA and disease. Further, the inner product of the lncRNA factor vector and the disease factor vector is used as the decoder to reconstruct the lncRNA-disease association matrix. In addition, a cost-sensitive neural network is utilized to deal with the imbalance between positive and negative samples. The experimental results show that GAMCLDA outperforms other state-of-the-art methods in prediction performance, as evaluated by AUC value, AUPR value, PPV and F1-score. Moreover, the case study shows that our method is an effective tool for potential lncRNA-disease prediction.
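The decoder and the cost-sensitive idea described above can be sketched in a few lines. The code below assumes that the latent factors have already been learned, passes the inner products through a sigmoid (an assumption, since the abstract only mentions the inner product), and uses a positively weighted cross-entropy as one simple stand-in for a cost-sensitive loss. The names `decode`, `weighted_bce` and `pos_weight` are hypothetical.

```python
import numpy as np

def decode(Z_lnc, Z_dis):
    """Reconstruct association probabilities from latent factor vectors."""
    logits = Z_lnc @ Z_dis.T          # inner product of every lncRNA-disease pair
    return 1.0 / (1.0 + np.exp(-logits))

def weighted_bce(y, p, pos_weight=1.0, eps=1e-12):
    """Binary cross-entropy with a larger cost on the rare positive class."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    loss = -(pos_weight * y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return float(loss.mean())

Z_lnc = np.array([[1.0, 0.0], [0.0, 0.0]])   # two lncRNAs, two latent dims
Z_dis = np.array([[1.0, 0.0]])               # one disease
probs = decode(Z_lnc, Z_dis)
print(probs)
print(weighted_bce([1, 0], [0.5, 0.5], pos_weight=3.0))
```

Raising `pos_weight` makes a missed known association more expensive than a false alarm, which counteracts the scarcity of positive samples.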
Affiliation(s)
- Ximin Wu
  - School of Computer, Electronic and Information, Guangxi University, Nanning, China.
- Wei Lan
  - School of Computer, Electronic and Information, Guangxi University, Nanning, China; Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China.
- Qingfeng Chen
  - School of Computer, Electronic and Information, Guangxi University, Nanning, China.
- Yi Dong
  - School of Computer, Electronic and Information, Guangxi University, Nanning, China.
- Jin Liu
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China.
- Wei Peng
  - The Network Center, Kunming University of Science and Technology, Kunming, China.
13
Ren F, Wen R. A new method based on the manifold-alternative approximating for low-rank matrix completion. J Inequal Appl 2018; 2018:340. [PMID: 30839894 PMCID: PMC6290672 DOI: 10.1186/s13660-018-1931-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/18/2018] [Accepted: 12/03/2018] [Indexed: 06/09/2023]
Abstract
In this paper, a new method is proposed for low-rank matrix completion which is based on the least squares approximating to the known elements in the manifold formed by the singular vectors of the partial singular value decomposition alternatively. The method can achieve a reduction of the rank of the manifold by gradually reducing the number of the singular value of the thresholding and get the optimal low-rank matrix. It is proven that the manifold-alternative approximating method is convergent under some conditions. Furthermore, compared with the augmented Lagrange multiplier and the orthogonal rank-one matrix pursuit algorithms by random experiments, it is more effective as regards the CPU time and the low-rank property.
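As background for the low-rank completion problem itself, the sketch below shows a generic iterative truncated-SVD imputation: missing entries are filled from the current rank-r approximation while observed entries are kept fixed. This is a simpler, well-known baseline and not the manifold-alternative approximating method of the paper; the function name, the fixed rank, and the iteration count are assumptions.

```python
import numpy as np

def svd_impute(M, mask, rank, iters=200):
    """Fill unobserved entries of M using repeated rank-`rank` SVD truncation.

    M:    matrix whose entries are trusted only where mask is True
    mask: boolean array, True for observed entries
    """
    M = np.asarray(M, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    X = np.where(mask, M, 0.0)                       # start missing entries at 0
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # best rank-r approximation
        X = np.where(mask, M, low)                   # keep observed entries fixed
    return X

u = np.array([1.0, 2.0, 3.0])
truth = np.outer(u, u)                 # exactly rank 1
mask = np.ones_like(truth, dtype=bool)
mask[0, 0] = False                     # hide one entry
observed = np.where(mask, truth, 0.0)
print(svd_impute(observed, mask, rank=1))
```

On this small rank-1 example the hidden entry is recovered closely; with many missing entries or an unknown rank, such simple schemes can stall, which is part of what more refined methods aim to improve.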
Affiliation(s)
- Fujiao Ren
  - Department of Mathematics, Taiyuan Normal University, Shanxi, P.R. China
- Ruiping Wen
  - Key Laboratory of Engineering & Computing Science, Shanxi Provincial Department of Education/Department of Mathematics, Taiyuan Normal University, Shanxi, P.R. China
14
Lu J, Sun J, Wang X, Kranzler H, Gelernter J, Bi J. Inferring phenotypes from substance use via collaborative matrix completion. BMC Syst Biol 2018; 12:104. [PMID: 30463556 PMCID: PMC6249733 DOI: 10.1186/s12918-018-0623-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
BACKGROUND Although substance use disorders (SUDs) are heritable, few genetic risk factors for them have been identified, in part due to the small sample sizes of study populations. To address this limitation, researchers have aggregated subjects from multiple existing genetic studies, but these subjects can have missing phenotypic information, including diagnostic criteria for certain substances that were not originally a focus of study. Recent advances in addiction neurobiology have shown that comorbid SUDs (e.g., the abuse of multiple substances) have similar genetic determinants, which makes it possible to infer missing SUD diagnostic criteria using criteria from another SUD and patient genotypes through statistical modeling. RESULTS We propose a new approach based on matrix completion techniques to integrate features of comorbid health conditions and individual's genotypes to infer unreported diagnostic criteria for a disorder. This approach optimizes a bi-linear model that uses the interactions between known disease correlations and candidate genes to impute missing criteria. An efficient stochastic and parallel algorithm was developed to optimize the model with a speed 20 times greater than the classic sequential algorithm. It was tested on 3441 subjects who had both cocaine and opioid use disorders and successfully inferred missing diagnostic criteria with consistently better accuracy than other recent statistical methods. CONCLUSIONS The proposed matrix completion imputation method is a promising tool to impute unreported or unobserved symptoms or criteria for disease diagnosis. Integrating data at multiple scales or from heterogeneous sources may help improve the accuracy of phenotype imputation.
Affiliation(s)
- Jin Lu: Department of Computer Science and Engineering, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, CT, USA
- Jiangwen Sun: Department of Computer Science and Engineering, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, CT, USA
- Xinyu Wang: Department of Computer Science and Engineering, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, CT, USA
- Henry Kranzler: Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, 3535 Market Street, Suite 500 and Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
- Joel Gelernter: Departments of Psychiatry, Genetics, and Neurobiology, Yale University School of Medicine, 333 Cedar St, New Haven, CT, USA
- Jinbo Bi: Department of Computer Science and Engineering, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, CT, USA
|
15
|
Abstract
Background While there are a large number of bioinformatics datasets for clustering, many of them are incomplete, i.e., some data samples are missing attribute values needed by clustering algorithms. A variety of clustering algorithms have been proposed in past years, but they are usually limited to clustering complete datasets. Besides, conventional clustering algorithms cannot obtain a trade-off between accuracy and efficiency of the clustering process since many essential parameters are determined by the human user’s experience. Results The paper proposes a Multiple Kernel Density Clustering algorithm for Incomplete datasets called MKDCI. The MKDCI algorithm consists of recovering missing attribute values of input data samples, learning an optimally combined kernel for clustering the input dataset, reducing dimensionality with the optimal kernel based on multiple basis kernels, detecting cluster centroids with the Isolation Forests method, assigning clusters with arbitrary shape and visualizing the results. Conclusions Extensive experiments on several well-known clustering datasets in the bioinformatics field demonstrate the effectiveness of the proposed MKDCI algorithm. Compared with existing density clustering algorithms and parameter-free clustering algorithms, the proposed MKDCI algorithm tends to automatically produce clusters of better quality on incomplete bioinformatics datasets.
Affiliation(s)
- Longlong Liao: College of Computer, National University of Defense Technology, Sanyi Road, Changsha, China; State Key Laboratory of High Performance Computing, Sanyi Road, Changsha, China
- Kenli Li: College of Information Science and Engineering, Hunan University, Lushan Road, Changsha, China
- Keqin Li: Department of Computer Science, State University of New York, Road, New Paltz, USA
- Canqun Yang: College of Computer, National University of Defense Technology, Sanyi Road, Changsha, China; State Key Laboratory of High Performance Computing, Sanyi Road, Changsha, China
- Qi Tian: Department of Computer Science, University of Texas at San Antonio, Road, San Antonio, USA
|
16
|
Abstract
BACKGROUND The Human Microbiome Project reveals the significant mutualistic influence between the human body and the microbes living in it. Such an influence leads to an interesting phenomenon: many noninfectious diseases are closely associated with diverse microbes. However, the identification of microbe-noninfectious disease associations (MDAs) is still a challenging task, because of both the high cost and the limitations of microbe cultivation. Thus, there is a need to develop fast approaches to screen potential MDAs. The growing number of validated MDAs makes it possible to meet this demand from a new angle. Computational approaches, especially machine learning, are promising for predicting MDA candidates rapidly among a large number of microbe-disease pairs, with the advantage that they are not limited by microbe cultivation. Nevertheless, only a few computational efforts at predicting MDAs have been made so far. RESULTS In this paper, grouping a set of MDAs into a binary MDA matrix, we propose a novel predictive approach (BMCMDA) based on Binary Matrix Completion to predict potential MDAs. The proposed BMCMDA assumes that the incomplete observed MDA matrix is the summation of a latent parameterizing matrix and a noising matrix. It also assumes that the independently occurring subscripts of observed entries in the MDA matrix follow a binomial model. Adopting a standard mean-zero Gaussian distribution for the noising matrix, we model the relationship between the parameterizing matrix and the MDA matrix under the observed microbe-disease pairs as a probit regression. With the recovered parameterizing matrix, BMCMDA deduces how likely a microbe would be associated with a particular disease. In the experiment under leave-one-out cross-validation, it exhibits inspiring performance (AUC = 0.906, AUPR = 0.526) and demonstrates its superiority by ~7% and ~5% improvements in terms of AUC and AUPR, respectively, in comparison with the pioneering approach KATZHMDA.
CONCLUSIONS Our BMCMDA provides an effective approach for predicting MDAs and can also be extended to other similar tasks of predicting binary relationships (e.g. protein-protein interaction, drug-target interaction).
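The probit link described in this abstract can be illustrated with a small sketch. It is not the BMCMDA algorithm: the low-rank factorization, the plain gradient ascent, and every name and parameter below (`fit_probit_factorization`, `rank`, `lr`, `reg`) are assumptions chosen for illustration, and the toy matrix is random. It only shows how a latent real-valued matrix can be mapped to association probabilities through the standard normal CDF and fitted on observed entries alone.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def probit(x):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))

def fit_probit_factorization(Y, mask, rank=2, lr=0.05, reg=0.1, n_iter=300, seed=0):
    """Fit a latent matrix M = U V^T so that P(Y_ij = 1) = probit(M_ij).

    Only entries where mask is True contribute to the likelihood.
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    S = 2.0 * Y - 1.0  # map labels {0, 1} to signs {-1, +1}
    for _ in range(n_iter):
        M = U @ V.T
        pdf = np.exp(-0.5 * M ** 2) / math.sqrt(2.0 * math.pi)
        cdf = np.clip(probit(S * M), 1e-12, 1.0)
        # Gradient of the probit log-likelihood with respect to M.
        G = np.where(mask, S * pdf / cdf, 0.0)
        U, V = U + lr * (G @ V - reg * U), V + lr * (G.T @ U - reg * V)
    return probit(U @ V.T)

rng = np.random.default_rng(0)
Y = (rng.random((12, 8)) < 0.3).astype(float)  # toy binary association matrix
mask = rng.random(Y.shape) < 0.8               # True where a pair is observed
scores = fit_probit_factorization(Y, mask)
print(scores.shape)
```

The returned `scores` can be read as estimated association probabilities for every pair, including the unobserved ones.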
Affiliation(s)
- Jian-Yu Shi: School of Life Sciences, Northwestern Polytechnical University, Xi’an, 70072 China
- Hua Huang: School of Software and Microelectronics, Northwestern Polytechnical University, Xi’an, 70072 China
- Yan-Ning Zhang: School of Computer Science, Northwestern Polytechnical University, Xi’an, 70072 China
- Jiang-Bo Cao: School of Life Sciences, Northwestern Polytechnical University, Xi’an, 70072 China
- Siu-Ming Yiu: Department of Computer Science, The University of Hong Kong, Hong Kong, 999077 China
|
17
|
Thung KH, Yap PT, Adeli E, Lee SW, Shen D. Conversion and time-to-conversion predictions of mild cognitive impairment using low-rank affinity pursuit denoising and matrix completion. Med Image Anal 2018; 45:68-82. [PMID: 29414437 PMCID: PMC6892173 DOI: 10.1016/j.media.2018.01.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 08/04/2016] [Revised: 12/12/2017] [Accepted: 01/12/2018] [Indexed: 10/18/2022]
Abstract
In this paper, we aim to predict conversion and time-to-conversion of mild cognitive impairment (MCI) patients using multi-modal neuroimaging data and clinical data, via cross-sectional and longitudinal studies. However, such data are often heterogeneous, high-dimensional, noisy, and incomplete. We thus propose a framework that includes sparse feature selection, low-rank affinity pursuit denoising (LRAD), and low-rank matrix completion (LRMC) in this study. Specifically, we first use sparse linear regressions to remove unrelated features. Then, considering the heterogeneity of the MCI data, which can be assumed as a union of multiple subspaces, we propose to use a low rank subspace method (i.e., LRAD) to denoise the data. Finally, we employ LRMC algorithm with three data fitting terms and one inequality constraint for joint conversion and time-to-conversion predictions. Our framework aims to answer a very important but yet rarely explored question in AD study, i.e., when will the MCI convert to AD? This is different from survival analysis, which provides the probabilities of conversion at different time points that are mainly used for global analysis, while our time-to-conversion prediction is for each individual subject. Evaluations using the ADNI dataset indicate that our method outperforms conventional LRMC and other state-of-the-art methods. Our method achieves a maximal pMCI classification accuracy of 84% and time prediction correlation of 0.665.
Affiliation(s)
- Kim-Han Thung: Department of Radiology and BRIC, University of North Carolina, Chapel Hill 27599, USA
- Pew-Thian Yap: Department of Radiology and BRIC, University of North Carolina, Chapel Hill 27599, USA
- Ehsan Adeli: Department of Radiology and BRIC, University of North Carolina, Chapel Hill 27599, USA
- Seong-Whan Lee: Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
- Dinggang Shen: Department of Radiology and BRIC, University of North Carolina, Chapel Hill 27599, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
|
18
|
Biswas AK, Kim D, Kang M, Ding C, Gao JX. Stable solution to l2,1-based robust inductive matrix completion and its application in linking long noncoding RNAs to human diseases. BMC Med Genomics 2017; 10:77. [PMID: 29297358 PMCID: PMC5751820 DOI: 10.1186/s12920-017-0310-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Background A large number of long intergenic non-coding RNAs (lincRNAs) are linked to a broad spectrum of human diseases. The disease associations of many other lincRNAs still remain a puzzle. Validation of such links between the two entities through biological experiments is expensive. However, a plethora of lincRNA data is now available, thanks to High Throughput Sequencing (HTS) platforms, Genome Wide Association Studies (GWAS), etc., which opens the opportunity for cutting-edge machine learning and data mining approaches to extract meaningful relationships among lincRNAs and diseases. Yet there are only a few in silico lincRNA-disease association inference tools available to date, and none of them utilizes side information of both entities simultaneously in a single framework. Methods The recently developed Inductive Matrix Completion (IMC) technique provides a recommendation platform between two entities that considers their respective side information. However, the formulation of IMC is incapable of handling noise and outliers that may be present in the datasets, and data sparsity is another issue with the standard IMC method. Thus, a robust version of IMC is needed that can solve both issues. As a remedy, in this paper, we propose Stable Robust Inductive Matrix Completion (SRIMC), which utilizes l2,1 norm based regularization to optimize the objective function with a unique 2-step stable solution approach. Results We applied SRIMC to the available association data between human lincRNAs and OMIM disease phenotypes as well as a diverse set of side information about the lincRNAs and the diseases. The method performs better than the state-of-the-art methods in terms of precision@k and recall@k at the top-k disease prioritization for the subject lincRNAs.
We also demonstrate that SRIMC is equally effective for querying about novel lincRNAs, as well as predicting the rank of a newly known disease for a set of well-characterized lincRNAs. Conclusions With the experimental results and computational evaluation, we show that SRIMC is robust in handling datasets with noise and outliers as well as dealing with novel lincRNAs and disease phenotypes.
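For readers unfamiliar with the l2,1 norm mentioned in this abstract, the sketch below computes it under the common row-wise convention (the sum of the Euclidean norms of the rows). The abstract does not say whether rows or columns are grouped, so the convention and the function name `l21_norm` are assumptions. Because each row enters through its unsquared length, a few large outlier rows are penalized less heavily than under the squared Frobenius norm.

```python
import numpy as np

def l21_norm(M):
    """Sum of the Euclidean (l2) norms of the rows of M."""
    return float(np.sum(np.linalg.norm(M, axis=1)))

E = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.0, 1.0]])

# Row norms are 5, 0 and 1, so the l2,1 norm is 6.
print(l21_norm(E))
# For comparison, the squared Frobenius norm is 9 + 16 + 1 = 26.
print(float(np.sum(E ** 2)))
```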
Affiliation(s)
- Ashis Kumer Biswas: Department of Computer Science and Engineering, University of Colorado Denver, Denver, 80204, Colorado, USA
- Dongchul Kim: Department of Computer Science, University of Rio Grande Valley, Edinburg, 78541, Texas, USA
- Mingon Kang: Department of Computer Science, Kennesaw State University, Marietta, 30060, Georgia, USA
- Chris Ding: Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, 76019, Texas, USA
- Jean X Gao: Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, 76019, Texas, USA
|
19
|
Gutierrez-Barragan F, Ithapu VK, Hinrichs C, Maumet C, Johnson SC, Nichols TE, Singh V; Alzheimer's Disease Neuroimaging Initiative. Accelerating permutation testing in voxel-wise analysis through subspace tracking: A new plugin for SnPM. Neuroimage 2017; 159:79-98. [PMID: 28720551 DOI: 10.1016/j.neuroimage.2017.07.025] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/02/2017] [Revised: 07/11/2017] [Accepted: 07/12/2017] [Indexed: 10/19/2022]
Abstract
Permutation testing is a non-parametric method for obtaining the max null distribution used to compute corrected p-values that provide strong control of false positives. In neuroimaging, however, the computational burden of running such an algorithm can be significant. We find that by viewing the permutation testing procedure as the construction of a very large permutation testing matrix, T, one can exploit structural properties derived from the data and the test statistics to reduce the runtime under certain conditions. In particular, we see that T is low-rank plus a low-variance residual. This makes T a good candidate for low-rank matrix completion, where only a very small number of entries of T (∼0.35% of all entries in our experiments) have to be computed to obtain a good estimate. Based on this observation, we present RapidPT, an algorithm that efficiently recovers the max null distribution commonly obtained through regular permutation testing in voxel-wise analysis. We present an extensive validation on a synthetic dataset and four datasets of varying size against two baselines: Statistical NonParametric Mapping (SnPM13) and a standard permutation testing implementation (referred to as NaivePT). We find that RapidPT achieves its best runtime performance on medium-sized datasets (50≤n≤200), with speedups of 1.5× - 38× (vs. SnPM13) and 20× - 1000× (vs. NaivePT). For larger datasets (n≥200), RapidPT outperforms NaivePT (6× - 200×) on all datasets, and provides large speedups over SnPM13 (2× - 15×) when more than 10000 permutations are needed. The implementation is available as a standalone toolbox and is also integrated within SnPM13; it can leverage multi-core architectures when available.
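As a point of reference, the naive procedure that this abstract sets out to accelerate can be sketched in a few lines. The code below is a plain permutation test on synthetic data, not RapidPT and not SnPM: the Welch-style two-sample statistic, the function names, and the data sizes are all illustrative assumptions. Each permutation produces one row of the matrix T described above, and only the row maximum is kept to form the max null distribution.

```python
import numpy as np

def two_sample_t(data, labels):
    """Welch-style t-statistic per voxel (column) for groups labelled 1 and 0."""
    a, b = data[labels == 1], data[labels == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se

def max_null_distribution(data, labels, n_perm=200, seed=0):
    """Maximum statistic over voxels for each random relabelling of the subjects."""
    rng = np.random.default_rng(seed)
    return np.array([two_sample_t(data, rng.permutation(labels)).max()
                     for _ in range(n_perm)])

rng = np.random.default_rng(1)
data = rng.standard_normal((40, 50))   # 40 subjects, 50 voxels (synthetic)
labels = np.repeat([0, 1], 20)
data[labels == 1, 0] += 2.0            # plant a group difference in voxel 0

t_obs = two_sample_t(data, labels)
max_null = max_null_distribution(data, labels)
# Corrected one-sided p-value: how often the permutation maximum reaches t_obs.
p_corrected = (1 + (max_null[:, None] >= t_obs[None, :]).sum(axis=0)) / (1 + len(max_null))
print(p_corrected[:5])
```

The cost is one full statistic vector per permutation, which is the burden that computing only a small subset of the entries of T is meant to avoid.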
|
20
|
Chen L, Zhang H, Thung KH, Liu L, Lu J, Wu J, Wang Q, Shen D. Multi-label Inductive Matrix Completion for Joint MGMT and IDH1 Status Prediction for Glioma Patients. Med Image Comput Comput Assist Interv 2017; 10434:450-458. [PMID: 29770368 DOI: 10.1007/978-3-319-66185-8_51] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 12/23/2022]
Abstract
MGMT promoter methylation and IDH1 mutation in high-grade gliomas (HGG) have proven to be two important molecular indicators associated with better prognosis. Traditionally, the statuses of MGMT and IDH1 are obtained via surgical biopsy, which is laborious, invasive and time-consuming. Accurate presurgical prediction of their statuses based on preoperative imaging data is of great clinical value for better treatment planning. In this paper, we propose a novel Multi-label Inductive Matrix Completion (MIMC) model, highlighted by the online inductive learning strategy, to jointly predict both MGMT and IDH1 statuses. Our MIMC model not only uses the training subjects with possibly missing MGMT/IDH1 labels, but also leverages the unlabeled testing subjects as a supplement to the limited training dataset. More importantly, we learn inductive labels, instead of directly using transductive labels, as the prediction results for the testing subjects, to alleviate the overfitting issue in small-sample-size studies. Furthermore, we design an optimization algorithm with guaranteed convergence based on the block coordinate descent method to solve the multivariate non-smooth MIMC model. Finally, by using a precious single-center multi-modality presurgical brain imaging and genetic dataset of primary HGG, we demonstrate that our method can produce accurate prediction results, outperforming the previous widely-used single- or multi-task machine learning methods. This study shows the promise of utilizing imaging-derived brain connectome phenotypes for prognosis of HGG in a non-invasive manner.
Affiliation(s)
- Lei Chen: Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing, China; Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
- Han Zhang: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
- Kim-Han Thung: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
- Luyan Liu: School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China
- Junfeng Lu: Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai, China
- Jinsong Wu: Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China; Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai, China
- Qian Wang: School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China
- Dinggang Shen: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
|
21
|
Adeli E, Shi F, An L, Wee CY, Wu G, Wang T, Shen D. Joint feature-sample selection and robust diagnosis of Parkinson's disease from MRI data. Neuroimage 2016; 141:206-219. [PMID: 27296013 DOI: 10.1016/j.neuroimage.2016.05.054] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 08/01/2015] [Revised: 03/31/2016] [Accepted: 05/22/2016] [Indexed: 01/27/2023]
Abstract
Parkinson's disease (PD) is an overwhelming neurodegenerative disorder caused by deterioration of a neurotransmitter, known as dopamine. Lack of this chemical messenger impairs several brain regions and yields various motor and non-motor symptoms. Incidence of PD is predicted to double in the next two decades, which urges more research to focus on its early diagnosis and treatment. In this paper, we propose an approach to diagnose PD using magnetic resonance imaging (MRI) data. Specifically, we first introduce a joint feature-sample selection (JFSS) method for selecting an optimal subset of samples and features, to learn a reliable diagnosis model. The proposed JFSS model effectively discards poor samples and irrelevant features. As a result, the selected features play an important role in PD characterization, which will help identify the most relevant and critical imaging biomarkers for PD. Then, a robust classification framework is proposed to simultaneously de-noise the selected subset of features and samples, and learn a classification model. Our model can also de-noise testing samples based on the cleaned training data. Unlike many previous works that perform de-noising in an unsupervised manner, we perform supervised de-noising for both training and testing data, thus boosting the diagnostic accuracy. Experimental results on both synthetic and publicly available PD datasets show promising results. To evaluate the proposed method, we use the popular Parkinson's progression markers initiative (PPMI) database. Our results indicate that the proposed method can differentiate between PD and normal control (NC), and outperforms the competing methods by a relatively large margin. It is noteworthy to mention that our proposed framework can also be used for diagnosis of other brain disorders. To show this, we have also conducted experiments on the widely-used ADNI database. 
The obtained results indicate that our proposed method can identify the imaging biomarkers and diagnose the disease with favorable accuracies compared to the baseline methods.
Affiliation(s)
- Ehsan Adeli: Department of Radiology and BRIC, University of North Carolina-Chapel Hill, NC 27599, USA
- Feng Shi: Department of Radiology and BRIC, University of North Carolina-Chapel Hill, NC 27599, USA
- Le An: Department of Radiology and BRIC, University of North Carolina-Chapel Hill, NC 27599, USA
- Chong-Yaw Wee: Department of Radiology and BRIC, University of North Carolina-Chapel Hill, NC 27599, USA; Department of Biomedical Engineering, National University of Singapore, Singapore
- Guorong Wu: Department of Radiology and BRIC, University of North Carolina-Chapel Hill, NC 27599, USA
- Tao Wang: Department of Radiology and BRIC, University of North Carolina-Chapel Hill, NC 27599, USA; Department of Geriatric Psychiatry, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Alzheimer's Disease and Related Disorders Center, Shanghai Jiao Tong University, Shanghai, China
- Dinggang Shen: Department of Radiology and BRIC, University of North Carolina-Chapel Hill, NC 27599, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
|
22
|
Abstract
It has become routine to collect data that are structured as multiway arrays (tensors). There is an enormous literature on low rank and sparse matrix factorizations, but limited consideration of extensions to the tensor case in statistics. The most common low rank tensor factorization relies on parallel factor analysis (PARAFAC), which expresses a rank k tensor as a sum of rank one tensors. When observations are only available for a tiny subset of the cells of a big tensor, the low rank assumption is not sufficient and PARAFAC has poor performance. We induce an additional layer of dimension reduction by allowing the effective rank to vary across dimensions of the table. For concreteness, we focus on a contingency table application. Taking a Bayesian approach, we place priors on terms in the factorization and develop an efficient Gibbs sampler for posterior computation. Theory is provided showing posterior concentration rates in high-dimensional settings, and the methods are shown to have excellent performance in simulations and several real data applications.
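The PARAFAC representation mentioned in this abstract, a rank k tensor written as a sum of k rank-one tensors, can be made concrete with a short sketch. The code is not the Bayesian model of the abstract: the function name, the three-way shape, and the choice of Dirichlet-distributed factors (used only so that the result is a valid contingency-table probability tensor) are assumptions for illustration.

```python
import numpy as np

def parafac_reconstruct(weights, A, B, C):
    """Sum over r of weights[r] * outer(A[:, r], B[:, r], C[:, r])."""
    return np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

rng = np.random.default_rng(0)
k = 3                                        # number of rank-one components
weights = rng.dirichlet(np.ones(k))          # component weights, sum to 1
A = rng.dirichlet(np.ones(4), size=k).T      # 4 x k, each column sums to 1
B = rng.dirichlet(np.ones(3), size=k).T      # 3 x k
C = rng.dirichlet(np.ones(2), size=k).T      # 2 x k

P = parafac_reconstruct(weights, A, B, C)    # 4 x 3 x 2 cell probabilities

# The same tensor built explicitly as a sum of rank-one outer products.
P_loop = sum(weights[r] * np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
             for r in range(k))
print(P.shape, np.allclose(P, P_loop))
```

With probability-vector factors and weights that sum to one, the cells of `P` are non-negative and sum to one, which matches the contingency-table setting the abstract focuses on.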
Affiliation(s)
- Jing Zhou: Department of Biostatistics, The University of North Carolina at Chapel Hill
- Amy Herring: Department of Biostatistics and Carolina Population Center, The University of North Carolina at Chapel Hill
- David Dunson: Department of Statistical Science, Duke University
|
24
|
Sanroma G, Wu G, Gao Y, Thung KH, Guo Y, Shen D. A transversal approach for patch-based label fusion via matrix completion. Med Image Anal 2015; 24:135-148. [PMID: 26160394 PMCID: PMC4701198 DOI: 10.1016/j.media.2015.06.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 10/22/2014] [Revised: 04/10/2015] [Accepted: 06/11/2015] [Indexed: 11/22/2022]
Abstract
Recently, multi-atlas patch-based label fusion has received an increasing interest in the medical image segmentation field. After warping the anatomical labels from the atlas images to the target image by registration, label fusion is the key step to determine the latent label for each target image point. Two popular types of patch-based label fusion approaches are (1) reconstruction-based approaches that compute the target labels as a weighted average of atlas labels, where the weights are derived by reconstructing the target image patch using the atlas image patches; and (2) classification-based approaches that determine the target label as a mapping of the target image patch, where the mapping function is often learned using the atlas image patches and their corresponding labels. Both approaches have their advantages and limitations. In this paper, we propose a novel patch-based label fusion method to combine the above two types of approaches via matrix completion (and hence, we call it transversal). As we will show, our method overcomes the individual limitations of both reconstruction-based and classification-based approaches. Since the labeling confidences may vary across the target image points, we further propose a sequential labeling framework that first labels the highly confident points and then gradually labels more challenging points in an iterative manner, guided by the label information determined in the previous iterations. We demonstrate the performance of our novel label fusion method in segmenting the hippocampus in the ADNI dataset, subcortical and limbic structures in the LONI dataset, and mid-brain structures in the SATA dataset. We achieve more accurate segmentation results than both reconstruction-based and classification-based approaches. Our label fusion method is also ranked 1st in the online SATA Multi-Atlas Segmentation Challenge.
Affiliation(s)
- Gerard Sanroma: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
- Guorong Wu: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
- Yaozong Gao: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
- Kim-Han Thung: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
- Yanrong Guo: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA
- Dinggang Shen: Department of Radiology and BRIC, University of North Carolina, Chapel Hill, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
|
25
|
Thung KH, Wee CY, Yap PT, Shen D. Neurodegenerative disease diagnosis using incomplete multi-modality data via matrix shrinkage and completion. Neuroimage 2014; 91:386-400. [PMID: 24480301 PMCID: PMC4096013 DOI: 10.1016/j.neuroimage.2014.01.033] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 07/14/2013] [Revised: 01/13/2014] [Accepted: 01/18/2014] [Indexed: 12/17/2022]
Abstract
In this work, we are interested in predicting the diagnostic statuses of potentially neurodegenerated patients using feature values derived from multi-modality neuroimaging data and biological data, which might be incomplete. Collecting the feature values into a matrix, with each row containing a feature vector of a sample, we propose a framework to predict the corresponding associated multiple target outputs (e.g., diagnosis label and clinical scores) from this feature matrix by performing matrix shrinkage followed by matrix completion. Specifically, we first combine the feature and target output matrices into a large matrix and then partition this large incomplete matrix into smaller submatrices, each consisting of samples with complete feature values (corresponding to a certain combination of modalities) and target outputs. Treating each target output as the outcome of a prediction task, we apply a 2-step multi-task learning algorithm to select the most discriminative features and samples in each submatrix. Features and samples that are not selected in any of the submatrices are discarded, resulting in a shrunk version of the original large matrix. The missing feature values and unknown target outputs of the shrunk matrix are then completed simultaneously. Experimental results using the ADNI dataset indicate that our proposed framework achieves higher classification accuracy at a greater speed when compared with conventional imputation-based classification methods and also yields competitive performance when compared with the state-of-the-art methods.
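The idea of stacking features and target outputs into one matrix and completing the missing entries together can be sketched as follows. This is a generic soft-impute style iteration (repeated SVD with singular-value soft-thresholding), swapped in for illustration; it is not the shrinkage or completion algorithm of the paper, and the function name, `lam`, `n_iter`, and the synthetic rank-2 data are all assumptions.

```python
import numpy as np

def soft_impute(X, mask, lam=0.05, n_iter=300):
    """Fill entries where mask is False via iterative SVD soft-thresholding."""
    Z = np.where(mask, X, 0.0)
    for _ in range(n_iter):
        filled = np.where(mask, X, Z)            # keep observed values, use estimates elsewhere
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        Z = (U * np.maximum(s - lam, 0.0)) @ Vt  # shrink singular values toward zero
    return Z

rng = np.random.default_rng(0)
features = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 6))  # rank-2 features
targets = features @ rng.standard_normal((6, 1))                       # linear target output
stacked = np.hstack([features, targets])                               # samples x (features + target)

mask = rng.random(stacked.shape) < 0.8   # True where an entry is observed
mask[:5, -1] = False                     # targets of five "test" samples are unknown

completed = soft_impute(stacked, mask)
hidden = ~mask
rel_err = np.linalg.norm((completed - stacked)[hidden]) / np.linalg.norm(stacked[hidden])
print(f"relative error on hidden entries: {rel_err:.3f}")
```

The last column of `completed` then holds the predicted target outputs for the samples whose targets were hidden, alongside the imputed feature values.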
Affiliation(s)
- Kim-Han Thung: Biomedical Research Imaging Center (BRIC) and Department of Radiology, University of North Carolina at Chapel Hill, USA
- Chong-Yaw Wee: Biomedical Research Imaging Center (BRIC) and Department of Radiology, University of North Carolina at Chapel Hill, USA
- Pew-Thian Yap: Biomedical Research Imaging Center (BRIC) and Department of Radiology, University of North Carolina at Chapel Hill, USA
- Dinggang Shen: Biomedical Research Imaging Center (BRIC) and Department of Radiology, University of North Carolina at Chapel Hill, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
|