1
Hayakawa T, Nagashima T, Akimoto H, Minagawa K, Takahashi Y, Asai S. Benzodiazepine-related dementia risks and protopathic biases revealed by multiple-kernel learning with electronic medical records. Digit Health 2023; 9:20552076231178577. [PMID: 37312937 PMCID: PMC10259140 DOI: 10.1177/20552076231178577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 05/06/2023] [Indexed: 06/15/2023] Open
Abstract
Objectives: To simultaneously estimate how the risk of incident dementia nonlinearly varies with the administration period and cumulative dose of benzodiazepines, the duration of disorders with an indication for benzodiazepines, and other potential confounders, with the goal of settling the controversy over the role of benzodiazepines in the development of dementia.
Methods: The classical hazard model was extended using the techniques of multiple-kernel learning. Regularised maximum-likelihood estimation, including determination of hyperparameter values with 10-fold cross-validation, bootstrap goodness-of-fit test, and bootstrap estimation of confidence intervals, was applied to cohorts retrospectively extracted from electronic medical records of our university hospitals between 1 November 2004 and 31 July 2020. The analysis was mainly focused on 8160 patients aged 40 or older with new onset of insomnia, affective disorders, or anxiety disorders, who were followed up for 4.10 ± 3.47 years.
Results: Besides previously reported risk associations, we detected significant nonlinear risk variations over 2-4 years attributable to the duration of insomnia and anxiety disorders, and to the administration period of short-acting benzodiazepines. After nonlinear adjustment for potential confounders, we observed no significant risk associations with long-term use of benzodiazepines.
Conclusions: The pattern of the detected nonlinear risk variations suggested reverse causation and confounding. Their putative bias effects over 2-4 years suggested similar biases in previously reported results. These results, together with the lack of significant risk associations with long-term use of benzodiazepines, suggested the need to reconsider previous results and methods for future analysis.
Affiliation(s)
- Takashi Hayakawa
  - Division of Pharmacology, Department of Biomedical Sciences, Nihon University School of Medicine, Tokyo, Japan
  - Division of Genomic Epidemiology and Clinical Trials, Clinical Trials Research Center, Nihon University School of Medicine, Tokyo, Japan
- Takuya Nagashima
  - Division of Pharmacology, Department of Biomedical Sciences, Nihon University School of Medicine, Tokyo, Japan
  - Division of Genomic Epidemiology and Clinical Trials, Clinical Trials Research Center, Nihon University School of Medicine, Tokyo, Japan
- Hayato Akimoto
  - Division of Pharmacology, Department of Biomedical Sciences, Nihon University School of Medicine, Tokyo, Japan
  - Division of Genomic Epidemiology and Clinical Trials, Clinical Trials Research Center, Nihon University School of Medicine, Tokyo, Japan
- Kimino Minagawa
  - Division of Genomic Epidemiology and Clinical Trials, Clinical Trials Research Center, Nihon University School of Medicine, Tokyo, Japan
- Yasuo Takahashi
  - Division of Genomic Epidemiology and Clinical Trials, Clinical Trials Research Center, Nihon University School of Medicine, Tokyo, Japan
- Satoshi Asai
  - Division of Pharmacology, Department of Biomedical Sciences, Nihon University School of Medicine, Tokyo, Japan
  - Division of Genomic Epidemiology and Clinical Trials, Clinical Trials Research Center, Nihon University School of Medicine, Tokyo, Japan
2
Gliozzo J, Mesiti M, Notaro M, Petrini A, Patak A, Puertas-Gallardo A, Paccanaro A, Valentini G, Casiraghi E. Heterogeneous data integration methods for patient similarity networks. Brief Bioinform 2022; 23:6604996. [PMID: 35679533 PMCID: PMC9294435 DOI: 10.1093/bib/bbac207] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/07/2021] [Revised: 04/14/2022] [Accepted: 05/04/2022] [Indexed: 12/29/2022] Open
Abstract
Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.
Affiliation(s)
- Jessica Gliozzo
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - European Commission, Joint Research Centre (JRC), Ispra (VA), Italy
  - CINI, Infolife National Laboratory, Roma, Italy
- Marco Mesiti
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
- Marco Notaro
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
- Alessandro Petrini
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
- Alex Patak
  - European Commission, Joint Research Centre (JRC), Ispra (VA), Italy
- Alberto Paccanaro
  - Department of Computer Science, Royal Holloway, University of London, Egham, TW20 0EX, UK
  - School of Applied Mathematics (EMAp), Fundação Getúlio Vargas, Rio de Janeiro, Brazil
- Giorgio Valentini
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
  - DSRC UNIMI, Data Science Research Center, Milano, 20135, Italy
  - ELLIS, European Laboratory for Learning and Intelligent Systems, Berlin, Germany
- Elena Casiraghi
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
3
Di Credico A, Perpetuini D, Izzicupo P, Gaggi G, Cardone D, Filippini C, Merla A, Ghinassi B, Di Baldassarre A. Estimation of Heart Rate Variability Parameters by Machine Learning Approaches Applied to Facial Infrared Thermal Imaging. Front Cardiovasc Med 2022; 9:893374. [PMID: 35656402 PMCID: PMC9152459 DOI: 10.3389/fcvm.2022.893374] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/10/2022] [Accepted: 04/04/2022] [Indexed: 01/18/2023] Open
Abstract
Heart rate variability (HRV) is a reliable tool for the evaluation of several physiological factors modulating the heart rate (HR). Importantly, variations of HRV parameters may be indicative of cardiac diseases and altered psychophysiological conditions. Recently, several studies focused on procedures for contactless HR measurements from facial videos. However, the performances of these methods decrease when illumination is poor. Infrared thermography (IRT) could be useful to overcome this limitation. In fact, IRT can measure the infrared radiations emitted by the skin, working properly even in no visible light illumination conditions. This study investigated the capability of facial IRT to estimate HRV parameters through a face tracking algorithm and a cross-validated machine learning approach, employing photoplethysmography (PPG) as the gold standard for the HR evaluation. The results demonstrated a good capability of facial IRT in estimating HRV parameters. Particularly, strong correlations between the estimated and measured HR (r = 0.7), RR intervals (r = 0.67), TINN (r = 0.71), and pNN50 (%) (r = 0.70) were found, whereas moderate correlations for RMSSD (r = 0.58), SDNN (r = 0.44), and LF/HF (r = 0.48) were discovered. The proposed procedure allows for a contactless estimation of the HRV that could be beneficial for evaluating both cardiac and general health status in subjects or conditions where contact probe sensors cannot be used.
Affiliation(s)
- Andrea Di Credico
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
  - Reprogramming and Cell Differentiation Lab, Center for Advanced Studies and Technology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- David Perpetuini
  - Department of Neurosciences, Imaging and Clinical Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Pascal Izzicupo
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Giulia Gaggi
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
  - Reprogramming and Cell Differentiation Lab, Center for Advanced Studies and Technology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Daniela Cardone
  - Department of Neurosciences, Imaging and Clinical Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Chiara Filippini
  - Department of Neurosciences, Imaging and Clinical Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Arcangelo Merla
  - Department of Engineering and Geology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Barbara Ghinassi
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
  - Reprogramming and Cell Differentiation Lab, Center for Advanced Studies and Technology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Angela Di Baldassarre
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
  - Reprogramming and Cell Differentiation Lab, Center for Advanced Studies and Technology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
4
Abstract
Clinical decision-making in healthcare is already being influenced by predictions or recommendations made by data-driven machines. Numerous machine learning applications have appeared in the latest clinical literature, especially for outcome prediction models, with outcomes ranging from mortality and cardiac arrest to acute kidney injury and arrhythmia. In this review article, we summarize the state-of-the-art in related works covering data processing, inference, and model evaluation, in the context of outcome prediction models developed using data extracted from electronic health records. We also discuss limitations of prominent modeling assumptions and highlight opportunities for future research.
5
Kalantari A, Kamsin A, Shamshirband S, Gani A, Alinejad-Rokny H, Chronopoulos AT. Computational intelligence approaches for classification of medical data: State-of-the-art, future challenges and research directions. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.01.126] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Indexed: 01/13/2023]
6
Pölsterl S, Gupta P, Wang L, Conjeti S, Katouzian A, Navab N. Heterogeneous ensembles for predicting survival of metastatic, castrate-resistant prostate cancer patients. F1000Res 2016; 5:2676. [PMID: 28713544 PMCID: PMC5500862 DOI: 10.12688/f1000research.8231.1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Accepted: 07/06/2017] [Indexed: 11/26/2023] Open
Abstract
Ensemble methods have been successfully applied in a wide range of scenarios, including survival analysis. However, most ensemble models for survival analysis consist of models that all optimize the same loss function and do not fully utilize the diversity in available models. We propose heterogeneous survival ensembles that combine several survival models, each optimizing a different loss during training. We evaluated our proposed technique in the context of the Prostate Cancer DREAM Challenge, where the objective was to predict survival of patients with metastatic, castrate-resistant prostate cancer from patient records of four phase III clinical trials. Results demonstrate that a diverse set of survival models were preferred over a single model and that our heterogeneous ensemble of survival models outperformed all competing methods with respect to predicting the exact time of death in the Prostate Cancer DREAM Challenge.
Affiliation(s)
- Sebastian Pölsterl
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Pankaj Gupta
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Lichao Wang
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Sailesh Conjeti
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Amin Katouzian
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Nassir Navab
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
  - Johns Hopkins University, Baltimore, USA
7
Pölsterl S, Gupta P, Wang L, Conjeti S, Katouzian A, Navab N. Heterogeneous ensembles for predicting survival of metastatic, castrate-resistant prostate cancer patients. F1000Res 2016; 5:2676. [PMID: 28713544 PMCID: PMC5500862 DOI: 10.12688/f1000research.8231.3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Accepted: 07/06/2017] [Indexed: 11/20/2022] Open
Abstract
Ensemble methods have been successfully applied in a wide range of scenarios, including survival analysis. However, most ensemble models for survival analysis consist of models that all optimize the same loss function and do not fully utilize the diversity in available models. We propose heterogeneous survival ensembles that combine several survival models, each optimizing a different loss during training. We evaluated our proposed technique in the context of the Prostate Cancer DREAM Challenge, where the objective was to predict survival of patients with metastatic, castrate-resistant prostate cancer from patient records of four phase III clinical trials. Results demonstrate that a diverse set of survival models were preferred over a single model and that our heterogeneous ensemble of survival models outperformed all competing methods with respect to predicting the exact time of death in the Prostate Cancer DREAM Challenge.
Affiliation(s)
- Sebastian Pölsterl
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Pankaj Gupta
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Lichao Wang
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Sailesh Conjeti
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Amin Katouzian
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
- Nassir Navab
  - Computer Aided Medical Procedures, Technical University of Munich, Munich, Germany
  - Johns Hopkins University, Baltimore, USA
8
Haase L, May AC, Falahpour M, Isakovic S, Simmons AN, Hickman SD, Liu TT, Paulus MP. A pilot study investigating changes in neural processing after mindfulness training in elite athletes. Front Behav Neurosci 2015; 9:229. [PMID: 26379521 PMCID: PMC4550788 DOI: 10.3389/fnbeh.2015.00229] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 04/15/2015] [Accepted: 08/11/2015] [Indexed: 12/02/2022] Open
Abstract
The ability to pay close attention to the present moment can be a crucial factor for performing well in a competitive situation. Training mindfulness is one approach to potentially improve elite athletes’ ability to focus their attention on the present moment. However, virtually nothing is known about whether these types of interventions alter neural systems that are important for optimal performance. This pilot study examined whether an intervention aimed at improving mindfulness [Mindful Performance Enhancement, Awareness and Knowledge (mPEAK)] changes neural activation patterns during an interoceptive challenge. Participants completed a task involving anticipation and experience of loaded breathing during functional magnetic resonance imaging recording. There were five main results following mPEAK training: (1) elite athletes self-reported higher levels of interoceptive awareness and mindfulness and lower levels of alexithymia; (2) greater insula and anterior cingulate cortex (ACC) activation during anticipation and post-breathing load conditions; (3) increased ACC activation during the anticipation condition was associated with increased scores on the describing subscale of the Five Facet Mindfulness Questionnaire; (4) increased insula activation during the post-load condition was associated with decreases in the Toronto Alexithymia Scale identifying feelings subscale; (5) decreased resting state functional connectivity between the PCC and the right medial frontal cortex and the ACC. Taken together, this pilot study suggests that mPEAK training may lead to increased attention to bodily signals and greater neural processing during the anticipation and recovery from interoceptive perturbations. This association between attention to and processing of interoceptive afferents may result in greater adaptation during stressful situations in elite athletes.
Affiliation(s)
- Lori Haase
  - Department of Psychiatry, University of California, San Diego, San Diego, CA, USA
  - Veteran's Affairs San Diego Healthcare System, San Diego, CA, USA
- April C May
  - Department of Psychiatry, University of California, San Diego, San Diego, CA, USA
- Maryam Falahpour
  - Center for Functional MRI, Department of Radiology, University of California, San Diego, San Diego, CA, USA
- Sara Isakovic
  - Department of Psychiatry, University of California, San Diego, San Diego, CA, USA
- Alan N Simmons
  - Department of Psychiatry, University of California, San Diego, San Diego, CA, USA
  - Veteran's Affairs San Diego Healthcare System, San Diego, CA, USA
- Steven D Hickman
  - Department of Psychiatry, University of California, San Diego, San Diego, CA, USA
- Thomas T Liu
  - Center for Functional MRI, Department of Radiology, University of California, San Diego, San Diego, CA, USA
- Martin P Paulus
  - Department of Psychiatry, University of California, San Diego, San Diego, CA, USA
  - Laureate Institute for Brain Research, Tulsa, OK, USA
9
Missing data imputation on the 5-year survival prediction of breast cancer patients with unknown discrete values. Comput Biol Med 2015; 59:125-133. [PMID: 25725446 DOI: 10.1016/j.compbiomed.2015.02.006] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 10/20/2014] [Revised: 02/07/2015] [Accepted: 02/09/2015] [Indexed: 11/22/2022]
Abstract
Breast cancer is the most frequently diagnosed cancer in women. Using historical patient information stored in clinical datasets, data mining and machine learning approaches can be applied to predict the survival of breast cancer patients. A common drawback is the absence of information, i.e., missing data, in certain clinical trials. However, most standard prediction methods are not able to handle incomplete samples and, then, missing data imputation is a widely applied approach for solving this inconvenience. Therefore, and taking into account the characteristics of each breast cancer dataset, it is required to perform a detailed analysis to determine the most appropriate imputation and prediction methods in each clinical environment. This research work analyzes a real breast cancer dataset from Institute Portuguese of Oncology of Porto with a high percentage of unknown categorical information (most clinical data of the patients are incomplete), which is a challenge in terms of complexity. Four scenarios are evaluated: (I) 5-year survival prediction without imputation and 5-year survival prediction from cleaned dataset with (II) Mode imputation, (III) Expectation-Maximization imputation and (IV) K-Nearest Neighbors imputation. Prediction models for breast cancer survivability are constructed using four different methods: K-Nearest Neighbors, Classification Trees, Logistic Regression and Support Vector Machines. Experiments are performed in a nested ten-fold cross-validation procedure and, according to the obtained results, the best results are provided by the K-Nearest Neighbors algorithm: more than 81% of accuracy and more than 0.78 of area under the Receiver Operator Characteristic curve, which constitutes very good results in this complex scenario.
10
Thomas M, De Brabanter K, Suykens JAK, De Moor B. Predicting breast cancer using an expression values weighted clinical classifier. BMC Bioinformatics 2014; 15:411. [PMID: 25551433 PMCID: PMC4308909 DOI: 10.1186/s12859-014-0411-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 06/10/2013] [Accepted: 12/05/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Clinical data, such as patient history, laboratory analysis, and ultrasound parameters (which are the basis of day-to-day clinical decision support), are often used to guide the clinical management of cancer in the presence of microarray data. Several data fusion techniques are available to integrate genomics or proteomics data, but only a few studies have created a single prediction model using both gene expression and clinical data. These studies often remain inconclusive regarding an obtained improvement in prediction performance. To improve clinical management, these data should be fully exploited. This requires efficient algorithms to integrate these data sets and design a final classifier. LS-SVM classifiers and generalized eigenvalue/singular value decompositions are successfully used in many bioinformatics applications for prediction tasks. While bringing up the benefits of these two techniques, we propose a machine learning approach, a weighted LS-SVM classifier to integrate two data sources: microarray and clinical parameters.
RESULTS: We compared and evaluated the proposed methods on five breast cancer case studies. Compared to LS-SVM classifier on individual data sets, generalized eigenvalue decomposition (GEVD) and kernel GEVD, the proposed weighted LS-SVM classifier offers good prediction performance, in terms of test area under ROC Curve (AUC), on all breast cancer case studies.
CONCLUSIONS: Thus a clinical classifier weighted with microarray data set results in significantly improved diagnosis, prognosis and prediction responses to therapy. The proposed model has been shown as a promising mathematical framework in both data fusion and non-linear classification problems.
Affiliation(s)
- Minta Thomas
  - KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics/iMinds Future Health Department, Kasteelpark Arenberg 10, Leuven, 3001, Belgium
- Kris De Brabanter
  - Department of Statistics & Computer Science, Iowa State University, Ames, IA, USA
- Johan A K Suykens
  - KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics/iMinds Future Health Department, Kasteelpark Arenberg 10, Leuven, 3001, Belgium
- Bart De Moor
  - KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics/iMinds Future Health Department, Kasteelpark Arenberg 10, Leuven, 3001, Belgium
11
Thomas M, Daemen A, De Moor B. Maximum Likelihood Estimation of GEVD: Applications in Bioinformatics. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014; 11:673-680. [PMID: 26356338 DOI: 10.1109/tcbb.2014.2304292] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
We propose a method, maximum likelihood estimation of generalized eigenvalue decomposition (MLGEVD), that employs a well-known technique relying on the generalization of singular value decomposition (SVD). The main aim of the work is to show the tight equivalence between MLGEVD and generalized ridge regression. This relationship reveals an important mathematical property of GEVD in which the second argument acts as prior information in the model. Thus we show that MLGEVD allows the incorporation of external knowledge about the quantities of interest into the estimation problem. We illustrate the importance of prior knowledge in clinical decision making/identifying differentially expressed genes with case studies for which microarray data sets with corresponding clinical/literature information are available. On all three of these case studies, MLGEVD outperformed GEVD on prediction in terms of test area under the ROC curve (test AUC). MLGEVD results in significantly improved diagnosis, prognosis and prediction of therapy response.