51
|
De Ramón Fernández A, Ruiz Fernández D, Prieto Sánchez MT. A decision support system for predicting the treatment of ectopic pregnancies. Int J Med Inform 2019; 129:198-204. [DOI: 10.1016/j.ijmedinf.2019.06.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/26/2018] [Revised: 04/22/2019] [Accepted: 06/03/2019] [Indexed: 10/26/2022]
|
52
|
Analysis of Factors Affecting Real-Time Ridesharing Vehicle Crash Severity. Sustainability 2019. [DOI: 10.3390/su11123334] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
The popular real-time ridesharing service has promoted social and environmental sustainability in various ways. Meanwhile, it also brings some traffic safety concerns. This paper aims to analyze factors affecting real-time ridesharing vehicle crash severity based on the classification and regression tree (CART) model. Chicago police-reported crash data from January to December 2018 were collected. Crash severity in the original dataset is highly imbalanced: only 60 out of 2624 crashes are severe injury crashes. To address the data imbalance problem, a hybrid data preprocessing approach that combines over- and under-sampling is applied. Model results indicate that, after resampling the crash data, the number of successfully predicted severe crashes increases from 0 to 40. In addition, the G-mean increases from 0% to 73%, and the AUC (area under the receiver operating characteristic curve) increases from 0.73 to 0.82. The classification tree reveals that the following variables are the primary indicators of real-time ridesharing vehicle crash severity: pedestrian/pedalcyclist involvement, number of passengers, weather condition, trafficway type, vehicle manufacture year, traffic control device, driver gender, lighting condition, vehicle type, driver age and crash time. The current study could provide some valuable insights for the sustainable development of real-time ridesharing services and urban transportation.
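The G-mean reported in this abstract is commonly defined as the geometric mean of sensitivity and specificity. The sketch below, which assumes that definition and is not taken from the paper, shows why a tree that detects none of the 60 severe crashes can still look accurate while its G-mean is 0. The function names and the `baseline` confusion-matrix counts are illustrative assumptions; only the class sizes (60 of 2624) come from the abstract.

```python
import math


def sensitivity(tp, fn):
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Share of actual negatives that were predicted negative."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity(tp, fn) * specificity(tn, fp))


def accuracy(tp, fn, tn, fp):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Class sizes from the abstract: 60 severe crashes out of 2624.
severe, total = 60, 2624
non_severe = total - severe

# Assumed baseline: a model that labels every crash as non-severe,
# which matches "0 successfully predicted severe crashes".
baseline = dict(tp=0, fn=severe, tn=non_severe, fp=0)

print(round(accuracy(**baseline), 3))  # 0.977
print(g_mean(**baseline))              # 0.0
```

Accuracy rewards the majority class here, whereas the G-mean collapses to zero as soon as one class is never detected, which is presumably why it is reported alongside AUC.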
|
53
|
Khasha R, Sepehri MM, Mahdaviani SA. An ensemble learning method for asthma control level detection with leveraging medical knowledge-based classifier and supervised learning. J Med Syst 2019; 43:158. [PMID: 31028489 DOI: 10.1007/s10916-019-1259-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/29/2018] [Accepted: 03/27/2019] [Indexed: 12/25/2022]
Abstract
Approximately 300 million people are afflicted with asthma around the world, with an estimated death rate of 250,000 cases, indicating the significance of this disease. If not treated, it can turn into a serious public health problem. The best method to treat asthma is to control it. Physicians recommend continuous monitoring of asthma symptoms and offering preventive treatment plans based on the patient's control level. Therefore, successful detection of the disease control level plays a critical role in presenting treatment plans. In view of this objective, we collected the data of 96 asthma patients within a 9-month period from a specialized hospital for pulmonary diseases in Tehran. A new ensemble learning algorithm that combines physicians' knowledge, in the form of a rule-based classifier, with supervised learning algorithms is proposed to detect asthma control level in a multivariate dataset with a multiclass response variable. After balancing operations and feature selection on the data, the model yielded an accuracy of 91.66%. Our proposed model combines medical knowledge with machine learning algorithms to classify asthma control level more accurately. This model can be applied in electronic self-care systems to support real-time decisions and personalized warnings on possible deterioration of asthma control level. Such tools can shift asthma treatment from the current reactive care models to a preventive approach in which the physician's therapeutic actions would be based on control level.
Affiliation(s)
- Roghaye Khasha
- Group of Information Technology, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, 1411713116, Iran
| | - Mohammad Mehdi Sepehri
- Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, 1411713116, Iran.
| | - Seyed Alireza Mahdaviani
- Pediatric Respiratory Diseases Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
| |
|
54
|
Choice-predictive activity in parietal cortex during source memory decisions. Neuroimage 2019; 189:589-600. [DOI: 10.1016/j.neuroimage.2019.01.071] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/27/2018] [Revised: 01/16/2019] [Accepted: 01/28/2019] [Indexed: 10/27/2022]
|
55
|
Wahab N, Khan A, Lee YS. Transfer learning based deep CNN for segmentation and detection of mitoses in breast cancer histopathological images. Microscopy (Oxf) 2019; 68:216-233. [DOI: 10.1093/jmicro/dfz002] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 09/16/2018] [Revised: 12/21/2018] [Accepted: 01/11/2019] [Indexed: 01/17/2023]
Affiliation(s)
- Noorul Wahab
- Pattern Recognition Lab, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore, Islamabad
| | - Asifullah Khan
- Pattern Recognition Lab, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore, Islamabad
- Deep Learning Lab, Centre for Mathematical Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore, Islamabad
| | - Yeon Soo Lee
- Department of Biomedical Engineering, College of Medical Science, Catholic University of Daegu, Gyoungsangbuk-do, Republic of Korea
| |
|
56
|
Mirza B, Wang W, Wang J, Choi H, Chung NC, Ping P. Machine Learning and Integrative Analysis of Biomedical Big Data. Genes (Basel) 2019; 10:E87. [PMID: 30696086 PMCID: PMC6410075 DOI: 10.3390/genes10020087] [Citation(s) in RCA: 163] [Impact Index Per Article: 27.2] [Received: 12/02/2018] [Revised: 01/08/2019] [Accepted: 01/21/2019] [Indexed: 12/11/2022]
Abstract
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Affiliation(s)
- Bilal Mirza
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
| | - Wei Wang
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Scalable Analytics Institute (ScAi), University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
| | - Jie Wang
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
| | - Howard Choi
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
| | - Neo Christopher Chung
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland.
| | - Peipei Ping
- NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Scalable Analytics Institute (ScAi), University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Department of Medicine (Cardiology), University of California Los Angeles, Los Angeles, CA 90095, USA.
| |
|
57
|
Banerjee P, Dehnbostel FO, Preissner R. Prediction Is a Balancing Act: Importance of Sampling Methods to Balance Sensitivity and Specificity of Predictive Models Based on Imbalanced Chemical Data Sets. Front Chem 2018; 6:362. [PMID: 30271769 PMCID: PMC6149243 DOI: 10.3389/fchem.2018.00362] [Citation(s) in RCA: 87] [Impact Index Per Article: 12.4] [Received: 04/20/2018] [Accepted: 07/30/2018] [Indexed: 12/24/2022]
Abstract
The increase in the number of new chemicals synthesized in past decades has resulted in constant growth in the development and application of computational models for prediction of activity as well as safety profiles of the chemicals. Most of the time, such computational models and their applications must deal with imbalanced chemical data. It is indeed a challenge to construct a classifier using an imbalanced data set. In this study, we analyzed and validated the importance of different sampling methods over the non-sampling method, to achieve a well-balanced sensitivity and specificity of a machine learning model trained on imbalanced chemical data. Additionally, this study has achieved an accuracy of 93.00%, an AUC of 0.94, an F1 measure of 0.90, a sensitivity of 96.00% and a specificity of 91.00% using SMOTE sampling and a Random Forest classifier for the prediction of Drug Induced Liver Injury (DILI). Our results suggest that, irrespective of the data set used, sampling methods can have a major influence on reducing the gap between sensitivity and specificity of a model. This study demonstrates the efficacy of different sampling methods for the class imbalance problem using binary chemical data sets.
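SMOTE, named in this abstract, creates synthetic minority-class samples by interpolating between a minority point and one of its nearest minority-class neighbours. The following is a minimal standard-library sketch of that idea, not the authors' pipeline: the function name `smote`, the parameter defaults, and the four two-dimensional `minority` points are all illustrative assumptions.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric tuples (at least two points).
    k: number of nearest minority neighbours considered for each base point.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical minority-class feature vectors (not from the paper).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
print(len(new_points))  # 6
```

Because every synthetic point lies on a segment between two real minority points, the oversampled class stays inside the region the minority data already occupies. In practice a maintained implementation would normally be preferred over a hand-written one.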
Affiliation(s)
- Priyanka Banerjee
- Structural Bioinformatics Group, Institute for Physiology, Charité - University Medicine Berlin, Berlin, Germany
| | - Frederic O Dehnbostel
- Structural Bioinformatics Group, Institute for Physiology, Charité - University Medicine Berlin, Berlin, Germany
| | - Robert Preissner
- Structural Bioinformatics Group, Institute for Physiology, Charité - University Medicine Berlin, Berlin, Germany
| |
|
58
|
|
59
|
Lavagnino L, Mwangi B, Cao B, Shott ME, Soares JC, Frank GK. Cortical thickness patterns as state biomarker of anorexia nervosa. Int J Eat Disord 2018; 51:241-249. [PMID: 29412456 PMCID: PMC5843530 DOI: 10.1002/eat.22828] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 07/26/2017] [Revised: 01/08/2018] [Accepted: 01/08/2018] [Indexed: 12/30/2022]
Abstract
OBJECTIVE Only a few studies have investigated cortical thickness in anorexia nervosa (AN), and it is unclear whether patterns of altered cortical thickness can be identified as biomarkers for AN. METHOD Cortical thickness was measured in 19 adult women with restricting-type AN, 24 individuals recovered from restricting-type AN (REC-AN) and 24 healthy controls. Individuals with current AN or recovered from AN had previously shown altered regional cortical volumes across orbitofrontal cortex and insula. A linear relevance vector machine-learning algorithm estimated patterns of regional thickness across 24 subdivisions of those regions. RESULTS Region-based analysis showed higher cortical thickness in AN and REC-AN, compared to controls, in the right medial orbital (olfactory) sulcus, and greater cortical thickness for short insular gyri in REC-AN versus controls bilaterally. The machine-learning algorithm identified a pattern of relatively higher right orbital, right insular and left middle frontal cortical thickness, but lower left orbital, right middle and inferior frontal, and bilateral superior frontal cortical thickness specific to AN versus controls (74% specificity and 74% sensitivity, χ² p < .004); predicted probabilities differed significantly between AN and controls (p < .023). No pattern significantly distinguished the REC-AN group from controls. CONCLUSIONS Higher cortical thickness in medial orbitofrontal cortex and insula probably contributes to higher gray matter volume in AN in those regions. The machine-learning algorithm identified a mixed pattern of mostly higher orbital and insular, but relatively lower superior frontal cortical thickness in individuals with current AN. These novel results suggest that regional cortical thickness patterns could be state markers for AN.
Affiliation(s)
- Luca Lavagnino
- University of Texas Health Sciences Center at Houston, Department of Psychiatry and Behavioral Sciences, Houston, Texas, USA
| | - Benson Mwangi
- University of Texas Health Sciences Center at Houston, Department of Psychiatry and Behavioral Sciences, Houston, Texas, USA
| | - Bo Cao
- University of Texas Health Sciences Center at Houston, Department of Psychiatry and Behavioral Sciences, Houston, Texas, USA
| | - Megan E. Shott
- Departments of Psychiatry and Neuroscience, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Jair C. Soares
- University of Texas Health Sciences Center at Houston, Department of Psychiatry and Behavioral Sciences, Houston, Texas, USA
| | - Guido K.W. Frank
- Departments of Psychiatry and Neuroscience, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| |
|
60
|
Davoudi A, Ebadi A, Rashidi P, Ozrazgat-Baslanti T, Bihorac A, Bursian AC. Delirium Prediction using Machine Learning Models on Preoperative Electronic Health Records Data. Proceedings. IEEE International Symposium on Bioinformatics and Bioengineering 2018; 2017:568-573. [PMID: 30393788 DOI: 10.1109/bibe.2017.00014] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 12/26/2022]
Abstract
Electronic Health Records (EHR) are mainly designed to record relevant patient information during their stay in the hospital for administrative purposes. They additionally provide an efficient and inexpensive source of data for medical research, such as patient outcome prediction. In this study, we used preoperative Electronic Health Records to predict postoperative delirium. We compared the performance of seven machine learning models on delirium prediction: linear models, generalized additive models, random forests, support vector machine, neural networks, and extreme gradient boosting. Among the models evaluated in this study, random forests and generalized additive model outperformed the other models in terms of the overall performance metrics for prediction of delirium, particularly with respect to sensitivity. We found that age, alcohol or drug abuse, socioeconomic status, underlying medical issue, severity of medical problem, and attending surgeon can affect the risk of delirium.
Affiliation(s)
- Anis Davoudi
- Department of Biomedical Engineering, University of Florida, Gainesville, USA
| | - Ashkan Ebadi
- Department of Biomedical Engineering, University of Florida, Gainesville, USA
| | - Parisa Rashidi
- Department of Biomedical Engineering, University of Florida, Gainesville, USA
| | | | - Azra Bihorac
- Department of Medicine, University of Florida, Gainesville, USA
| | - Alberto C Bursian
- Department of Anesthesiology, University of Florida, Gainesville, USA
| |
|
61
|
Hotzy F, Theodoridou A, Hoff P, Schneeberger AR, Seifritz E, Olbrich S, Jäger M. Machine Learning: An Approach in Identifying Risk Factors for Coercion Compared to Binary Logistic Regression. Front Psychiatry 2018; 9:258. [PMID: 29946273 PMCID: PMC6005877 DOI: 10.3389/fpsyt.2018.00258] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/08/2017] [Accepted: 05/24/2018] [Indexed: 12/05/2022]
Abstract
Introduction: Although knowledge about negative effects of coercive measures in psychiatry exists, their prevalence is still high in clinical routine. This study aimed to define risk factors and test machine learning algorithms for their accuracy in the prediction of the risk of being subjected to coercive measures. Methods: In a sample of involuntarily hospitalized patients (n = 393) at the University Hospital of Psychiatry Zurich, we analyzed risk factors for the experience of coercion (n = 170 patients) using chi-square tests and Mann-Whitney U tests. We trained machine learning algorithms [logistic regression, Support Vector Machine (SVM), and decision trees] with these risk factors and tested the obtained models for their accuracy via five-fold cross validation. To verify the results we compared them to binary logistic regression. Results: In a model with 8 risk factors that were available at admission, the SVM algorithm identified 102 out of 170 patients who had experienced coercion and 174 out of 223 patients without coercion (69% accuracy with 60% sensitivity and 78% specificity, AUC 0.74). In a model with 18 risk factors, available after discharge, the logistic regression algorithm identified 121 out of 170 with and 176 out of 223 without coercion (75% accuracy, 71% sensitivity, and 79% specificity, AUC 0.82). Discussion: Incorporating both clinical and demographic variables can help to estimate the risk of experiencing coercion for psychiatric patients. This study could show that trained machine learning algorithms are comparable to binary logistic regression and can reach a good or even excellent area under the curve (AUC) in the prediction of the outcome coercion/no coercion when cross validation is used. Due to its better generalizability, machine learning is a promising approach for further studies, especially when more variables are analyzed. More detailed knowledge about individual risk factors may help to prevent the occurrence of situations involving coercion.
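The sensitivity and specificity figures in this abstract can be checked against the patient counts it reports. The short sketch below does that arithmetic; the helper name `summarize` is an assumption, and pooling the counts into a single accuracy figure is a simplification that may not match how the authors aggregated their five cross-validation folds.

```python
def summarize(tp, positives, tn, negatives):
    """Return (sensitivity, specificity, accuracy) from pooled counts."""
    sens = tp / positives
    spec = tn / negatives
    acc = (tp + tn) / (positives + negatives)
    return sens, spec, acc


# Counts as reported in the abstract (170 with coercion, 223 without).
svm = summarize(tp=102, positives=170, tn=174, negatives=223)
logit = summarize(tp=121, positives=170, tn=176, negatives=223)

for name, (sens, spec, acc) in (("SVM, 8 factors", svm), ("LogReg, 18 factors", logit)):
    print(f"{name}: sensitivity {sens:.0%}, specificity {spec:.0%}, accuracy {acc:.0%}")
```

The pooled sensitivities and specificities round to the reported 60%/78% and 71%/79%. The pooled accuracies come out near 70% and 76%, about one percentage point above the reported 69% and 75%; the abstract does not say whether that gap stems from rounding or from averaging across folds.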
Affiliation(s)
- Florian Hotzy
- Department for Psychiatry, Psychotherapy and Psychosomatics, University Hospital of Psychiatry Zurich, Zurich, Switzerland
| | - Anastasia Theodoridou
- Department for Psychiatry, Psychotherapy and Psychosomatics, University Hospital of Psychiatry Zurich, Zurich, Switzerland
| | - Paul Hoff
- Department for Psychiatry, Psychotherapy and Psychosomatics, University Hospital of Psychiatry Zurich, Zurich, Switzerland
| | - Andres R Schneeberger
- Psychiatrische Dienste Graubuenden, Chur, Switzerland; Universitaere Psychiatrische Kliniken Basel, Universitaet Basel, Basel, Switzerland; Department of Psychiatry and Behavioral Sciences, Albert Einstein College of Medicine, New York, NY, United States
| | - Erich Seifritz
- Department for Psychiatry, Psychotherapy and Psychosomatics, University Hospital of Psychiatry Zurich, Zurich, Switzerland
| | - Sebastian Olbrich
- Department for Psychiatry, Psychotherapy and Psychosomatics, University Hospital of Psychiatry Zurich, Zurich, Switzerland
| | - Matthias Jäger
- Department for Psychiatry, Psychotherapy and Psychosomatics, University Hospital of Psychiatry Zurich, Zurich, Switzerland
| |
|
62
|
Wang Q, Guo L, Thompson PM, Jack CR, Dodge H, Zhan L, Zhou J. The Added Value of Diffusion-Weighted MRI-Derived Structural Connectome in Evaluating Mild Cognitive Impairment: A Multi-Cohort Validation. J Alzheimers Dis 2018; 64:149-169. [PMID: 29865049 PMCID: PMC6272125 DOI: 10.3233/jad-171048] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]
Abstract
T1-weighted MRI has been extensively used to extract imaging biomarkers and build classification models for differentiating Alzheimer's disease (AD) patients from healthy controls, but only recently have brain connectome networks derived from diffusion-weighted MRI been used to model AD progression and various stages of disease such as mild cognitive impairment (MCI). MCI, as a possible prodromal stage of AD, has gained intense interest recently, since it may be used to assess risk factors for AD. Little work has been done to combine information from both white matter and gray matter, and it is unknown how much classification power the diffusion-weighted MRI-derived structural connectome could provide beyond information available from T1-weighted MRI. In this paper, we focused on investigating whether the diffusion-weighted MRI-derived structural connectome can improve the differentiation of healthy control subjects from those with MCI. Specifically, we proposed a novel feature-ranking method to build classification models using the most highly ranked feature variables to distinguish MCI from healthy controls. We verified our method on two independent cohorts: the second stage of the Alzheimer's Disease Neuroimaging Initiative (ADNI2) database and the National Alzheimer's Coordinating Center (NACC) database. Our results indicated that 1) the diffusion-weighted MRI-derived structural connectome can complement T1-weighted MRI in the classification task; 2) the feature-ranking method is effective, because it identified consistent T1-weighted MRI and network feature variables on ADNI2 and NACC. Furthermore, by comparing the top-ranked feature variables from ADNI2, NACC, and the combined dataset, we concluded that cross-validation using independent cohorts is necessary and highly recommended.
Affiliation(s)
- Qi Wang
- Computer Science and Engineering, Michigan State University, East Lansing, MI
| | - Lei Guo
- Mathematics, Statistics & Computer Science Department, University of Wisconsin-Stout, Menomonie, WI
| | - Paul M. Thompson
- Imaging Genetics Center, University of Southern California, Marina del Rey, CA
| | | | - Hiroko Dodge
- Michigan Alzheimer's Disease Center and Department of Neurology, University of Michigan, Ann Arbor, MI
- Layton Aging and Alzheimer's Disease Center and Department of Neurology, Oregon Health & Science University, Portland, OR
| | - Liang Zhan
- Computer Engineering Program, University of Wisconsin-Stout, Menomonie, WI
| | - Jiayu Zhou
- Computer Science and Engineering, Michigan State University, East Lansing, MI
| | | |
|
63
|
Cao P, Liu X, Yang J, Zhao D, Huang M, Zhang J, Zaiane O. Nonlinearity-aware based dimensionality reduction and over-sampling for AD/MCI classification from MRI measures. Comput Biol Med 2017; 91:21-37. [DOI: 10.1016/j.compbiomed.2017.10.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 07/01/2017] [Revised: 10/03/2017] [Accepted: 10/03/2017] [Indexed: 12/26/2022]
|
64
|
Guan H, Liu T, Jiang J, Tao D, Zhang J, Niu H, Zhu W, Wang Y, Cheng J, Kochan NA, Brodaty H, Sachdev P, Wen W. Classifying MCI Subtypes in Community-Dwelling Elderly Using Cross-Sectional and Longitudinal MRI-Based Biomarkers. Front Aging Neurosci 2017; 9:309. [PMID: 29085292 PMCID: PMC5649145 DOI: 10.3389/fnagi.2017.00309] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 07/20/2017] [Accepted: 09/12/2017] [Indexed: 01/18/2023]
Abstract
Amnestic MCI (aMCI) and non-amnestic MCI (naMCI) are considered to differ in etiology and outcome. Accurately classifying MCI into meaningful subtypes would enable early intervention with targeted treatment. In this study, we employed structural magnetic resonance imaging (MRI) for MCI subtype classification. This was carried out in a sample of 184 community-dwelling individuals (aged 73-85 years). Cortical surface based measurements were computed from longitudinal and cross-sectional scans. By introducing a feature selection algorithm, we identified a set of discriminative features, and further investigated the temporal patterns of these features. A voting classifier was trained and evaluated via 10 iterations of cross-validation. The best classification accuracies achieved were: 77% (naMCI vs. aMCI), 81% (aMCI vs. cognitively normal (CN)) and 70% (naMCI vs. CN). The best results for differentiating aMCI from naMCI were achieved with baseline features. Hippocampus, amygdala and frontal pole were found to be most discriminative for classifying MCI subtypes. Additionally, we observed the dynamics of classification of several MRI biomarkers. Learning the dynamics of atrophy may aid in the development of better biomarkers, as it may track the progression of cognitive impairment.
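The abstract mentions a voting classifier without saying how votes are combined. As one plausible reading, the sketch below assumes hard majority voting, where each base classifier casts one label per subject and the most frequent label wins. The function `majority_vote` and the three prediction lists are hypothetical and are not data from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions: list of label lists, one per classifier, all the same length.
    Ties are resolved in favour of the label seen first among the voters.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three base classifiers for four subjects.
clf_a = ["aMCI", "naMCI", "aMCI", "naMCI"]
clf_b = ["aMCI", "aMCI", "naMCI", "naMCI"]
clf_c = ["naMCI", "aMCI", "aMCI", "naMCI"]

print(majority_vote([clf_a, clf_b, clf_c]))
# ['aMCI', 'aMCI', 'aMCI', 'naMCI']
```

Using an odd number of voters on a two-class problem avoids ties altogether; a soft-voting variant would average predicted probabilities instead of counting labels.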
Affiliation(s)
- Hao Guan
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
| | - Tao Liu
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beijing, China
- Beijing Advanced Innovation Center for Biomedical Engineering, Beijing, China
| | - Jiyang Jiang
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, NSW, Australia
- Neuropsychiatric Institute, Prince of Wales Hospital, Sydney, NSW, Australia
| | - Dacheng Tao
- UBTech Sydney Artificial Intelligence Institute, Faculty of Engineering and Information Technologies, University of Sydney, Darlington, NSW, Australia
- The School of Information Technologies, Faculty of Engineering and Information Technologies, University of Sydney, Darlington, NSW, Australia
| | - Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beijing, China
| | - Haijun Niu
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Beijing Advanced Innovation Center for Biomedical Engineering, Beijing, China
| | - Wanlin Zhu
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, NSW, Australia
- Beijing Tiantan Hospital, Capital Medical University, Beijing, China
| | - Yilong Wang
- Beijing Tiantan Hospital, Capital Medical University, Beijing, China
| | - Jian Cheng
- NIBIB, NICHD, National Institutes of Health, Bethesda, MD, United States
| | - Nicole A. Kochan
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, NSW, Australia
- Neuropsychiatric Institute, Prince of Wales Hospital, Sydney, NSW, Australia
| | - Henry Brodaty
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, NSW, Australia
- Dementia Collaborative Research Centre, University of New South Wales, Sydney, NSW, Australia
| | - Perminder Sachdev
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, NSW, Australia
- Neuropsychiatric Institute, Prince of Wales Hospital, Sydney, NSW, Australia
| | - Wei Wen
- Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, NSW, Australia
- Neuropsychiatric Institute, Prince of Wales Hospital, Sydney, NSW, Australia
| |
|
65
|
Richardson AM, Lidbury BA. Enhancement of hepatitis virus immunoassay outcome predictions in imbalanced routine pathology data by data balancing and feature selection before the application of support vector machines. BMC Med Inform Decis Mak 2017; 17:121. [PMID: 28806936 PMCID: PMC5557531 DOI: 10.1186/s12911-017-0522-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 11/30/2016] [Accepted: 08/07/2017] [Indexed: 02/07/2023]
Abstract
Background Data mining techniques such as support vector machines (SVMs) have been successfully used to predict outcomes for complex problems, including for human health. Much health data is imbalanced, with many more controls than positive cases. Methods The impact of three balancing methods and one feature selection method is explored, to assess the ability of SVMs to classify imbalanced diagnostic pathology data associated with the laboratory diagnosis of hepatitis B (HBV) and hepatitis C (HCV) infections. Random forests (RFs) for predictor variable selection, and data reshaping to overcome a large imbalance of negative to positive test results in relation to HBV and HCV immunoassay results, are examined. The methodology is illustrated using data from ACT Pathology (Canberra, Australia), consisting of laboratory test records from 18,625 individuals who underwent hepatitis virus testing over the decade from 1997 to 2007. Results Overall, the prediction of HCV test results by immunoassay was more accurate than for HBV immunoassay results associated with identical routine pathology predictor variable data. HBV and HCV negative results were vastly in excess of positive results, so three approaches to handling the negative/positive data imbalance were compared. Generating datasets by the Synthetic Minority Oversampling Technique (SMOTE) resulted in significantly more accurate prediction than single downsizing or multiple downsizing (MDS) of the dataset. For downsized data sets, applying a RF for predictor variable selection had a small effect on the performance, which varied depending on the virus. For SMOTE, a RF had a negative effect on performance. An analysis of variance of the performance across settings supports these findings. Finally, age and assay results for alanine aminotransferase (ALT), sodium for HBV and urea for HCV were found to have a significant impact upon laboratory diagnosis of HBV or HCV infection using an optimised SVM model. 
Conclusions Laboratories looking to include machine learning via SVM as part of their decision support need to be aware that the balancing method, predictor variable selection and the virus type interact to affect the laboratory diagnosis of hepatitis virus infection with routine pathology laboratory variables in different ways depending on which combination is being studied. This awareness should lead to careful use of existing machine learning methods, thus improving the quality of laboratory diagnosis.
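The two balancing ideas compared in this abstract, downsizing the majority class and SMOTE-style oversampling of the minority class, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `downsample` and `smote_like`, the neighbour count `k`, and the toy data are all assumptions made for illustration, and the interpolation shown is a simplified version of SMOTE using only the standard library.

```python
import random


def downsample(majority, n_keep, seed=0):
    """Randomly keep n_keep majority-class rows (a single downsizing pass)."""
    return random.Random(seed).sample(majority, n_keep)


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    randomly chosen minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        mate = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic


# Hypothetical toy data: 4 positive and 20 negative records with two features.
positives = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
negatives = [[float(i), 5.0] for i in range(20)]

oversampled = positives + smote_like(positives, n_new=16)  # 20 positives
downsized = downsample(negatives, n_keep=len(positives))   # 4 negatives
```

Either balanced set would then be passed to the classifier; multiple downsizing would repeat `downsample` with different seeds and combine the resulting models.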
Affiliation(s)
- Alice M Richardson
- Present address: National Centre for Epidemiology & Population Health, Australian National University, Canberra, ACT 2601, Australia; Pattern Recognition & Pathology, Department of Genome Sciences, The John Curtin School of Medical Research, Australian National University, Canberra, ACT 2601, Australia.
| | - Brett A Lidbury
- Present address: National Centre for Epidemiology & Population Health, Australian National University, Canberra, ACT 2601, Australia; Pattern Recognition & Pathology, Department of Genome Sciences, The John Curtin School of Medical Research, Australian National University, Canberra, ACT 2601, Australia.
|
66
|
Hettige NC, Nguyen TB, Yuan C, Rajakulendran T, Baddour J, Bhagwat N, Bani-Fatemi A, Voineskos AN, Mallar Chakravarty M, De Luca V. Classification of suicide attempters in schizophrenia using sociocultural and clinical features: A machine learning approach. Gen Hosp Psychiatry 2017; 47:20-28. [PMID: 28807134 DOI: 10.1016/j.genhosppsych.2017.03.001] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 09/28/2016] [Revised: 03/01/2017] [Accepted: 03/03/2017] [Indexed: 12/16/2022]
Abstract
OBJECTIVE Suicide is a major concern for those afflicted by schizophrenia. Identifying patients at the highest risk for future suicide attempts remains a complex problem for psychiatric interventions. Machine learning models allow for the integration of many risk factors in order to build an algorithm that predicts which patients are likely to attempt suicide. Currently it is unclear how to integrate previously identified risk factors into a clinically relevant predictive tool to estimate the probability of a patient with schizophrenia attempting suicide. METHODS We conducted a cross-sectional assessment on a sample of 345 participants diagnosed with schizophrenia spectrum disorders. Suicide attempters and non-attempters were clearly identified using the Columbia Suicide Severity Rating Scale (C-SSRS) and the Beck Suicide Ideation Scale (BSS). We developed four classification algorithms using regularized regression, random forest, elastic net and support vector machine models, with sociocultural and clinical variables as features to train the models. RESULTS All classification models performed similarly in identifying suicide attempters and non-attempters. Our regularized logistic regression model demonstrated an accuracy of 67% and an area under the curve (AUC) of 0.71, while the random forest model demonstrated 66% accuracy and an AUC of 0.67. The support vector classifier (SVC) model demonstrated an accuracy of 67% and an AUC of 0.70, and the elastic net model demonstrated an accuracy of 65% and an AUC of 0.71. CONCLUSION Machine learning algorithms offer a relatively successful method for incorporating many clinical features to predict individuals at risk for future suicide attempts. Increased performance of these models using clinically relevant variables offers the potential to facilitate early treatment and intervention to prevent future suicide attempts.
Affiliation(s)
- Nuwan C Hettige
- Group for Suicide Studies, Centre for Addiction and Mental Health, 250 College Street, Toronto, Ontario M5T 1R8, Canada; Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada
| | - Thai Binh Nguyen
- Group for Suicide Studies, Centre for Addiction and Mental Health, 250 College Street, Toronto, Ontario M5T 1R8, Canada
| | - Chen Yuan
- Group for Suicide Studies, Centre for Addiction and Mental Health, 250 College Street, Toronto, Ontario M5T 1R8, Canada
| | - Thanara Rajakulendran
- Group for Suicide Studies, Centre for Addiction and Mental Health, 250 College Street, Toronto, Ontario M5T 1R8, Canada
| | - Jermeen Baddour
- Group for Suicide Studies, Centre for Addiction and Mental Health, 250 College Street, Toronto, Ontario M5T 1R8, Canada
| | - Nikhil Bhagwat
- Institute of Biomaterials & Biomedical Engineering, University of Toronto, Toronto, Canada
| | - Ali Bani-Fatemi
- Group for Suicide Studies, Centre for Addiction and Mental Health, 250 College Street, Toronto, Ontario M5T 1R8, Canada; Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada
| | - Aristotle N Voineskos
- Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada; Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - M Mallar Chakravarty
- Douglas Mental Health University Institute, McGill University, Montreal, Canada; Department of Psychiatry, McGill University, Montreal, Canada; Biological and Biomedical Engineering, McGill University, Montreal, Canada
| | - Vincenzo De Luca
- Group for Suicide Studies, Centre for Addiction and Mental Health, 250 College Street, Toronto, Ontario M5T 1R8, Canada; Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada; Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada.
|
67
|
Schlieker L, Telaar A, Lueking A, Schulz-Knappe P, Theek C, Ickstadt K. Multivariate binary classification of imbalanced datasets-A case study based on high-dimensional multiplex autoimmune assay data. Biom J 2017. [PMID: 28626952 DOI: 10.1002/bimj.201600207] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
Abstract
The classification of a population by a specific trait is a major task in medicine, for example when in a diagnostic setting groups of patients with specific diseases are identified, but also when in predictive medicine a group of patients is classified into specific disease severity classes that might profit from different treatments. When the sizes of those subgroups become small, for example in rare diseases, imbalances between the classes are more the rule than the exception and make statistical classification problematic when the error rate of the minority class is high. Many observations are classified as belonging to the majority class, while the error rate of the majority class is low. This case study aims to investigate class imbalance for Random Forests and Powered Partial Least Squares Discriminant Analysis (PPLS-DA) and to evaluate the performance of these classifiers when they are combined with methods to compensate for imbalance (sampling methods, cost-sensitive learning approaches). We evaluate all approaches with a scoring system taking the classification results into consideration. This case study is based on one high-dimensional multiplex autoimmune assay dataset describing immune response to antigens and consisting of two classes of patients: Rheumatoid Arthritis (RA) and Systemic Lupus Erythematosus (SLE). Datasets with varying degrees of imbalance are created by successively reducing the class of RA patients. Our results indicate a possible benefit of cost-sensitive learning approaches for Random Forests. Although further research is needed to verify our findings by investigating other datasets or large-scale simulation studies, we claim that this work has the potential to increase awareness among practitioners of the problem of class imbalance and stresses the importance of considering methods to compensate for class imbalance.
Affiliation(s)
- Laura Schlieker
- ClinStat GmbH, Max-Planck-Str. 22a, 50858 Cologne, formerly Protagen AG, Otto-Hahn-Str. 15, 44227, Dortmund, Germany
| | - Anna Telaar
- Berufskolleg am Wassertum, 46399 Bocholt, formerly Protagen AG, Otto-Hahn-Str. 15, 44227, Dortmund, Germany
| | | | | | - Carmen Theek
- Chiltern International GmbH, Am Kronberger Hang 3, 65824 Schwalbach, formerly Protagen AG, Otto-Hahn-Str. 15, 44227, Dortmund, Germany
| | - Katja Ickstadt
- Department of Mathematical Statistics with Applications in Biometrics, Faculty of Statistics, Technical University Dortmund, Vogelpothsweg 87, 44227, Dortmund, Germany
|
68
|
Xu Y. Maximum Margin of Twin Spheres Support Vector Machine for Imbalanced Data Classification. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:1540-1550. [PMID: 27116760 DOI: 10.1109/tcyb.2016.2551735] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 06/05/2023]
Abstract
Twin support vector machine (TSVM) finds two nonparallel planes by solving a pair of smaller-sized quadratic programming problems (QPPs) rather than a single large one as in the conventional support vector machine (SVM); this makes the learning speed of TSVM approximately four times faster than that of the standard SVM. One major limitation of TSVM is that it involves an expensive matrix inverse operation when solving the dual problem. In addition, TSVM is less effective when dealing with imbalanced data. In this paper, we propose a maximum margin of twin spheres support vector machine (MMTSSVM) for imbalanced data classification. MMTSSVM only needs to find two homocentric spheres. On one hand, the small sphere captures as many samples in the majority class as possible; on the other hand, the large sphere pushes out most samples in the minority class by increasing the margin between the two homocentric spheres. MMTSSVM involves a QPP and a linear programming problem as opposed to a pair of QPPs as in classical TSVM or a larger-sized QPP in SVM; thus it greatly increases the computational speed. More importantly, MMTSSVM avoids the matrix inverse operation. The properties of the parameters in MMTSSVM are discussed and verified by one artificial experiment. Experimental results on nine benchmark datasets demonstrate the effectiveness of the proposed MMTSSVM in comparison with state-of-the-art algorithms. Finally, we apply MMTSSVM to an Alzheimer's disease medical experiment and also obtain a better experimental result.
|
69
|
Weiner MW, Veitch DP, Aisen PS, Beckett LA, Cairns NJ, Green RC, Harvey D, Jack CR, Jagust W, Morris JC, Petersen RC, Saykin AJ, Shaw LM, Toga AW, Trojanowski JQ. Recent publications from the Alzheimer's Disease Neuroimaging Initiative: Reviewing progress toward improved AD clinical trials. Alzheimers Dement 2017; 13:e1-e85. [PMID: 28342697 DOI: 10.1016/j.jalz.2016.11.007] [Citation(s) in RCA: 177] [Impact Index Per Article: 22.1] [Received: 09/07/2016] [Revised: 11/21/2016] [Accepted: 11/28/2016] [Indexed: 01/31/2023]
Abstract
INTRODUCTION The Alzheimer's Disease Neuroimaging Initiative (ADNI) has continued development and standardization of methodologies for biomarkers and has provided an increased depth and breadth of data available to qualified researchers. This review summarizes the over 400 publications using ADNI data during 2014 and 2015. METHODS We used standard searches to find publications using ADNI data. RESULTS (1) Structural and functional changes, including subtle changes to hippocampal shape and texture, atrophy in areas outside of hippocampus, and disruption to functional networks, are detectable in presymptomatic subjects before hippocampal atrophy; (2) In subjects with abnormal β-amyloid deposition (Aβ+), biomarkers become abnormal in the order predicted by the amyloid cascade hypothesis; (3) Cognitive decline is more closely linked to tau than Aβ deposition; (4) Cerebrovascular risk factors may interact with Aβ to increase white-matter (WM) abnormalities which may accelerate Alzheimer's disease (AD) progression in conjunction with tau abnormalities; (5) Different patterns of atrophy are associated with impairment of memory and executive function and may underlie psychiatric symptoms; (6) Structural, functional, and metabolic network connectivities are disrupted as AD progresses. 
Models of prion-like spreading of Aβ pathology along WM tracts predict known patterns of cortical Aβ deposition and declines in glucose metabolism; (7) New AD risk and protective gene loci have been identified using biologically informed approaches; (8) Cognitively normal and mild cognitive impairment (MCI) subjects are heterogeneous and include groups typified not only by "classic" AD pathology but also by normal biomarkers, accelerated decline, and suspected non-Alzheimer's pathology; (9) Selection of subjects at risk of imminent decline on the basis of one or more pathologies improves the power of clinical trials; (10) Sensitivity of cognitive outcome measures to early changes in cognition has been improved and surrogate outcome measures using longitudinal structural magnetic resonance imaging may further reduce clinical trial cost and duration; (11) Advances in machine learning techniques such as neural networks have improved diagnostic and prognostic accuracy especially in challenges involving MCI subjects; and (12) Network connectivity measures and genetic variants show promise in multimodal classification and some classifiers using single modalities are rivaling multimodal classifiers. DISCUSSION Taken together, these studies fundamentally deepen our understanding of AD progression and its underlying genetic basis, which in turn informs and improves clinical trial design.
Affiliation(s)
- Michael W Weiner
- Department of Veterans Affairs Medical Center, Center for Imaging of Neurodegenerative Diseases, San Francisco, CA, USA; Department of Radiology, University of California, San Francisco, CA, USA; Department of Medicine, University of California, San Francisco, CA, USA; Department of Psychiatry, University of California, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, CA, USA.
| | - Dallas P Veitch
- Department of Veterans Affairs Medical Center, Center for Imaging of Neurodegenerative Diseases, San Francisco, CA, USA
| | - Paul S Aisen
- Alzheimer's Therapeutic Research Institute, University of Southern California, San Diego, CA, USA
| | - Laurel A Beckett
- Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA, USA
| | - Nigel J Cairns
- Knight Alzheimer's Disease Research Center, Washington University School of Medicine, Saint Louis, MO, USA; Department of Neurology, Washington University School of Medicine, Saint Louis, MO, USA
| | - Robert C Green
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Danielle Harvey
- Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA, USA
| | | | - William Jagust
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
| | - John C Morris
- Alzheimer's Therapeutic Research Institute, University of Southern California, San Diego, CA, USA
| | | | - Andrew J Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Leslie M Shaw
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Arthur W Toga
- Laboratory of Neuroimaging, Institute of Neuroimaging and Informatics, Keck School of Medicine of University of Southern California, Los Angeles, CA, USA
| | - John Q Trojanowski
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Institute on Aging, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Alzheimer's Disease Core Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Udall Parkinson's Research Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
70
|
Zarogianni E, Storkey AJ, Johnstone EC, Owens DGC, Lawrie SM. Improved individualized prediction of schizophrenia in subjects at familial high risk, based on neuroanatomical data, schizotypal and neurocognitive features. Schizophr Res 2017; 181:6-12. [PMID: 27613509 DOI: 10.1016/j.schres.2016.08.027] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 07/20/2016] [Revised: 08/29/2016] [Accepted: 08/29/2016] [Indexed: 01/11/2023]
Abstract
To date, there are no reliable markers for predicting onset of schizophrenia in individuals at high risk (HR). Substantial promise is, however, shown by a variety of pattern classification approaches to neuroimaging data. Here, we examined the predictive accuracy of support vector machine (SVM) in later diagnosing schizophrenia, at a single-subject level, using a cohort of HR individuals drawn from multiply affected families and a combination of neuroanatomical, schizotypal and neurocognitive variables. Baseline structural magnetic resonance imaging (MRI), schizotypal and neurocognitive data from 17 HR subjects, who subsequently developed schizophrenia and a matched group of 17 HR subjects who did not make the transition, yet had psychotic symptoms, were included in the analysis. We employed recursive feature elimination (RFE), in a nested cross-validation scheme to identify the most significant predictors of disease transition and enhance diagnostic performance. Classification accuracy was 94% when a self-completed measure of schizotypy, a declarative memory test and structural MRI data were combined into a single learning algorithm; higher than when either quantitative measure was used alone. The discriminative neuroanatomical pattern involved gray matter volume differences in frontal, orbito-frontal and occipital lobe regions bilaterally as well as parts of the superior, medial temporal lobe and cerebellar regions. Our findings suggest that an early SVM-based prediction of schizophrenia is possible and can be improved by combining schizotypal and neurocognitive features with neuroanatomical variables. However, our predictive model needs to be tested by classifying a new, independent HR cohort in order to estimate its validity.
Affiliation(s)
- Eleni Zarogianni
- Division of Psychiatry, School of Clinical Sciences, University of Edinburgh, The Royal Edinburgh Hospital, Morningside Park, UK.
| | - Amos J Storkey
- Institute for Adaptive and Neural Computation, University of Edinburgh, UK
| | - Eve C Johnstone
- Division of Psychiatry, School of Clinical Sciences, University of Edinburgh, The Royal Edinburgh Hospital, Morningside Park, UK
| | - David G C Owens
- Division of Psychiatry, School of Clinical Sciences, University of Edinburgh, The Royal Edinburgh Hospital, Morningside Park, UK
| | - Stephen M Lawrie
- Division of Psychiatry, School of Clinical Sciences, University of Edinburgh, The Royal Edinburgh Hospital, Morningside Park, UK
|
71
|
CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION − MICCAI 2017 2017. [DOI: 10.1007/978-3-319-66179-7_73] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 01/05/2023]
|
72
|
Liu Y, Yieh L, Yang T, Drinkenburg W, Peeters P, Steckler T, Narayan VA, Wittenberg G, Ye J. Metabolomic biosignature differentiates melancholic depressive patients from healthy controls. BMC Genomics 2016; 17:669. [PMID: 27549765 PMCID: PMC4994306 DOI: 10.1186/s12864-016-2953-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 04/17/2015] [Accepted: 07/19/2016] [Indexed: 12/28/2022] Open
Abstract
Background Major depressive disorder (MDD) is a heterogeneous disease at the level of clinical symptoms, and this heterogeneity is likely reflected at the level of biology. Two clinical subtypes within MDD that have garnered interest are “melancholic depression” and “anxious depression”. Metabolomics enables us to characterize hundreds of small molecules that comprise the metabolome, and recent work suggests the blood metabolome may be able to inform treatment decisions for MDD; however, work is at an early stage. Here we examine a metabolomics data set to (1) test whether clinically homogeneous MDD subtypes are also more biologically homogeneous, and hence more predictable, (2) devise a robust machine learning framework that preserves biological meaning, and (3) describe the metabolomic biosignature for melancholic depression. Results With the proposed computational system we achieve around 80% classification accuracy, sensitivity and specificity for melancholic depression, but only ~72% for anxious depression or MDD, suggesting the blood metabolome contains more information about melancholic depression. We develop an ensemble feature selection framework (EFSF) in which features are first clustered, and learning then takes place on the cluster centroids, retaining information about correlated features during the feature selection process rather than discarding them as most machine learning methods will do. Analysis of the most discriminative feature clusters revealed differences in metabolic classes such as amino acids and lipids as well as pathways studied extensively in MDD such as the activation of cortisol in chronic stress. Conclusions We find that greater clinical homogeneity does indeed lead to better prediction based on biological measurements in the case of melancholic depression. Melancholic depression is shown to be associated with changes in amino acids, catecholamines, lipids, stress hormones, and immune-related metabolites.
The proposed computational framework can be adapted to analyze data from many other biomedical applications where the data has similar characteristics. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2953-2) contains supplementary material, which is available to authorized users.
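The core idea described in this abstract, clustering correlated features and then learning on cluster centroids instead of discarding correlated features, can be sketched as follows. This is only an illustration under stated assumptions: the greedy correlation-threshold clustering, the function names `pearson`, `cluster_features` and `centroids`, the `threshold` value, and the toy measurements are hypothetical and are not taken from the EFSF described in the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length feature columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / sqrt(vx * vy)


def cluster_features(columns, threshold=0.9):
    """Greedy clustering: a feature joins the first cluster whose seed
    feature it correlates with at or above threshold, else starts a new one."""
    clusters = []
    for j, col in enumerate(columns):
        for members in clusters:
            if pearson(columns[members[0]], col) >= threshold:
                members.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def centroids(columns, clusters):
    """Per-sample mean of the member features of each cluster."""
    n = len(columns[0])
    return [
        [sum(columns[j][i] for j in members) / len(members) for i in range(n)]
        for members in clusters
    ]


# Hypothetical toy data: three features measured on four samples.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.1],  # nearly proportional to the first feature
    [4.0, 1.0, 3.0, 2.0],
]
groups = cluster_features(features)
centroid_features = centroids(features, groups)
```

A classifier would then be trained on `centroid_features`, and a selected centroid can be traced back through `groups` to the correlated features it summarizes.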
Affiliation(s)
- Yashu Liu
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ, 85287, USA
| | - Lynn Yieh
- Janssen Research & Development, LLC, 3210 Merryfield Row, San Diego, CA, 92121, USA
| | - Tao Yang
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ, 85287, USA
| | | | - Pieter Peeters
- Janssen Research & Development, LLC, Turnhoutseweg 30, 2340, Beerse, Belgium
| | - Thomas Steckler
- Janssen Research & Development, LLC, Turnhoutseweg 30, 2340, Beerse, Belgium
| | - Vaibhav A Narayan
- Janssen Research & Development, LLC, 1125 Trenton-Harbourton Road, Titusville, NJ, USA
| | - Gayle Wittenberg
- Janssen Research & Development, LLC, 1125 Trenton-Harbourton Road, Titusville, NJ, USA
| | - Jieping Ye
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ, 85287, USA.
|
73
|
Jian C, Gao J, Ao Y. A new sampling method for classifying imbalanced data based on support vector machine ensemble. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.02.006] [Citation(s) in RCA: 88] [Impact Index Per Article: 9.8] [Indexed: 11/26/2022]
|
74
|
Wei R, Li C, Fogelson N, Li L. Prediction of Conversion from Mild Cognitive Impairment to Alzheimer's Disease Using MRI and Structural Network Features. Front Aging Neurosci 2016; 8:76. [PMID: 27148045 PMCID: PMC4836149 DOI: 10.3389/fnagi.2016.00076] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 01/11/2016] [Accepted: 03/29/2016] [Indexed: 12/30/2022] Open
Abstract
Optimized magnetic resonance imaging (MRI) features and abnormalities of brain network architectures may allow earlier detection and accurate prediction of the progression from mild cognitive impairment (MCI) to Alzheimer's disease (AD). In this study, we proposed a classification framework to distinguish MCI converters (MCIc) from MCI non-converters (MCInc) by using a combination of FreeSurfer-derived MRI features and nodal features derived from the thickness network. At the feature selection step, we first employed sparse linear regression with stability selection, for the selection of discriminative features in the iterative combinations of MRI and network measures. Subsequently the top K features of available combinations were selected as optimal features for classification. To obtain unbiased results, support vector machine (SVM) classifiers with nested cross validation were used for classification. The combination of 10 features including those from MRI and network measures attained accuracies of 66.04, 76.39, 74.66, and 73.91% for mixed conversion time, 6, 12, and 18 months before diagnosis of probable AD, respectively. Analysis of the diagnostic power of different time periods before diagnosis of probable AD showed that short-term prediction (6 and 12 months) achieved more stable and higher AUC scores compared with long-term prediction (18 months), with K-values from 1 to 30. The present results suggest that meaningful predictors composed of MRI and network measures may offer the possibility for early detection of progression from MCI to AD.
Affiliation(s)
- Rizhen Wei
- Key Laboratory for NeuroInformation of Ministry of Education, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, Center for Information in Medicine, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
| | - Chuhan Li
- Key Laboratory for NeuroInformation of Ministry of Education, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, Center for Information in Medicine, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Noa Fogelson
- EEG and Cognition Laboratory, University of A Coruña, A Coruña, Spain
| | - Ling Li
- Key Laboratory for NeuroInformation of Ministry of Education, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, Center for Information in Medicine, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
|
75
|
An empirical study of a hybrid imbalanced-class DT-RST classification procedure to elucidate therapeutic effects in uremia patients. Med Biol Eng Comput 2016; 54:983-1001. [DOI: 10.1007/s11517-016-1482-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 02/25/2014] [Accepted: 03/04/2016] [Indexed: 12/13/2022]
|
76
|
Sakurai K, Imabayashi E, Tokumaru AM, Ito K, Shimoji K, Nakagawa M, Ozawa Y, Shimohira M, Ogawa M, Morimoto S, Aiba I, Matsukawa N, Shibamoto Y. Volume of Interest Analysis of Spatially Normalized PRESTO Imaging to Differentiate between Parkinson Disease and Atypical Parkinsonian Syndrome. Magn Reson Med Sci 2016; 16:16-22. [PMID: 27001391 PMCID: PMC5600039 DOI: 10.2463/mrms.mp.2015-0132] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/26/2022] Open
Abstract
Purpose: Various magnetic resonance imaging (MRI) techniques including T2*-weighted imaging, susceptibility-weighted imaging, and MR relaxometry have been performed to evaluate different patterns of brain iron depositions in Parkinsonian syndrome. The aim of the present study was to evaluate the diagnostic value of a volume of interest (VOI) analysis on the principles of echo shifting with a train of observations (PRESTO) imaging using the statistical parametric mapping (SPM) 8 and the WFU PickAtlas program for the diagnosis of Parkinsonian syndrome. Methods: Fifty subjects, including 13 with the Parkinsonian variant of multiple system atrophy (MSA-P), 12 with progressive supranuclear palsy (PSP), 12 with Parkinson’s disease (PD) and 13 controls were evaluated in this study. After the spatial normalization of PRESTO images on SPM8, the WFU PickAtlas program was used to create target VOIs in the putamen, red nucleus, substantia nigra, subthalamic nucleus, and dentate nucleus. The signal intensity ratio (SIR) was calculated by normalizing the signal of each VOI to that of the cerebrospinal fluid space. These SIRs were used as determinants in receiver operating characteristic (ROC) analyses. Results: SIR of the putamen was significantly lower in MSA-P than in PSP (P = 0.0051) and controls (P = 0.0004). In contrast, SIR of the red nucleus was significantly lower in PSP than in MSA-P (P = 0.0003), PD (P = 0.0029), and controls (P = 0.0011). In ROC analyses, SIR of the putamen exhibited the highest areas under the curves (AUCs) of 0.83 (vs. PSP) and 0.91 (vs. controls) in the diagnosis of MSA-P. On the other hand, SIR of the red nucleus exhibited the highest AUCs of 0.87 (vs. MSA-P), 0.90 (vs. PD), and 0.89 (vs. controls) in the diagnosis of PSP. Conclusions: The VOI analysis based on spatially normalized PRESTO images may be useful for depicting hypointensity, indicative of abnormal iron depositions, of the putamen and red nucleus in the diagnosis of MSA-P and PSP.
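As a rough illustration of the two computations named in this abstract, the sketch below normalizes a VOI signal to the cerebrospinal fluid signal to obtain an SIR and then computes an ROC AUC from SIRs as the probability that a randomly chosen patient scores above a randomly chosen control. The function names and all signal values are hypothetical, not data from the study. Because a lower SIR indicates disease here, the negated SIR is used as the score.

```python
def signal_intensity_ratio(voi_signal, csf_signal):
    """Normalize a VOI signal to the cerebrospinal fluid signal."""
    return voi_signal / csf_signal


def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half (equivalent to the Mann-Whitney U statistic)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical raw signals (VOI, CSF) for three patients and three controls.
patient_signals = [(40.0, 100.0), (45.0, 100.0), (50.0, 100.0)]
control_signals = [(48.0, 100.0), (60.0, 100.0), (65.0, 100.0)]

patient_sir = [signal_intensity_ratio(v, c) for v, c in patient_signals]
control_sir = [signal_intensity_ratio(v, c) for v, c in control_signals]

# Lower SIR suggests disease, so negate SIR to make higher scores "positive".
auc = roc_auc([-s for s in patient_sir], [-s for s in control_sir])
```

With these made-up values, eight of the nine patient-control pairs are ordered correctly.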
Affiliation(s)
- Keita Sakurai
- Department of Radiology, Nagoya City University Graduate School of Medical Sciences
|
77
|
Passos IC, Mwangi B, Cao B, Hamilton JE, Wu MJ, Zhang XY, Zunta-Soares GB, Quevedo J, Kauer-Sant'Anna M, Kapczinski F, Soares JC. Identifying a clinical signature of suicidality among patients with mood disorders: A pilot study using a machine learning approach. J Affect Disord 2016; 193:109-16. [PMID: 26773901 PMCID: PMC4744514 DOI: 10.1016/j.jad.2015.12.066] [Citation(s) in RCA: 100] [Impact Index Per Article: 11.1] [Received: 10/11/2015] [Revised: 12/09/2015] [Accepted: 12/26/2015] [Indexed: 12/31/2022]
Abstract
OBJECTIVE A growing body of evidence has put forward clinical risk factors associated with patients with mood disorders that attempt suicide. However, what is not known is how to integrate clinical variables into a clinically useful tool in order to estimate the probability of an individual patient attempting suicide. METHOD A total of 144 patients with mood disorders were included. Clinical variables associated with suicide attempts among patients with mood disorders and demographic variables were used to 'train' a machine learning algorithm. The resulting algorithm was utilized in identifying novel or 'unseen' individual subjects as either suicide attempters or non-attempters. Three machine learning algorithms were implemented and evaluated. RESULTS All algorithms distinguished individual suicide attempters from non-attempters with prediction accuracy ranging between 65% and 72% (p<0.05). In particular, the relevance vector machine (RVM) algorithm correctly predicted 103 out of 144 subjects translating into 72% accuracy (72.1% sensitivity and 71.3% specificity) and an area under the curve of 0.77 (p<0.0001). The most relevant predictor variables in distinguishing attempters from non-attempters included previous hospitalizations for depression, a history of psychosis, cocaine dependence and post-traumatic stress disorder (PTSD) comorbidity. CONCLUSION Risk for suicide attempt among patients with mood disorders can be estimated at an individual subject level by incorporating both demographic and clinical variables. Future studies should examine the performance of this model in other populations and its subsequent utility in facilitating selection of interventions to prevent suicide.
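The accuracy, sensitivity and specificity reported in this abstract are all derived from a 2x2 confusion matrix, and a short sketch makes the relationship explicit. The abstract states only that 103 of 144 subjects were predicted correctly; the split into 43 attempters and 101 non-attempters used below is an assumption chosen for illustration because it is consistent with the reported percentages, and it is not taken from the paper.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts (assumed, not reported): 31 of 43 attempters and
# 72 of 101 non-attempters classified correctly, 103 of 144 overall.
metrics = confusion_metrics(tp=31, fn=12, tn=72, fp=29)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts the accuracy is 103/144, which rounds to the 72% quoted in the abstract.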
Affiliation(s)
- Ives Cavalcante Passos
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA; Bipolar Disorder Program and Laboratory of Molecular Psychiatry, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
- Benson Mwangi
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, TX, USA.
- Bo Cao
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Jane E Hamilton
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Mon-Ju Wu
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Xiang Yang Zhang
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA; Beijing HuiLongGuan Hospital, Peking University, Beijing, China
- Giovana B. Zunta-Soares
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Joao Quevedo
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Marcia Kauer-Sant'Anna
- Bipolar Disorder Program and Laboratory of Molecular Psychiatry, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
- Flávio Kapczinski
- Bipolar Disorder Program and Laboratory of Molecular Psychiatry, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
- Jair C. Soares
- Center of Excellence on Mood Disorder, Department of Psychiatry and Behavioral Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, USA
|
78
|
Weiner MW, Veitch DP, Aisen PS, Beckett LA, Cairns NJ, Cedarbaum J, Green RC, Harvey D, Jack CR, Jagust W, Luthman J, Morris JC, Petersen RC, Saykin AJ, Shaw L, Shen L, Schwarz A, Toga AW, Trojanowski JQ. 2014 Update of the Alzheimer's Disease Neuroimaging Initiative: A review of papers published since its inception. Alzheimers Dement 2015; 11:e1-120. [PMID: 26073027 PMCID: PMC5469297 DOI: 10.1016/j.jalz.2014.11.001] [Citation(s) in RCA: 214] [Impact Index Per Article: 21.4] [Revised: 04/18/2013] [Indexed: 01/18/2023]
Abstract
The Alzheimer's Disease Neuroimaging Initiative (ADNI) is an ongoing, longitudinal, multicenter study designed to develop clinical, imaging, genetic, and biochemical biomarkers for the early detection and tracking of Alzheimer's disease (AD). The initial study, ADNI-1, enrolled 400 subjects with early mild cognitive impairment (MCI), 200 with early AD, and 200 cognitively normal elderly controls. ADNI-1 was extended by a 2-year Grand Opportunities grant in 2009 and by a competitive renewal, ADNI-2, which enrolled an additional 550 participants and will run until 2015. This article reviews all papers published since the inception of the initiative and summarizes the results to the end of 2013. The major accomplishments of ADNI have been as follows: (1) the development of standardized methods for clinical tests, magnetic resonance imaging (MRI), positron emission tomography (PET), and cerebrospinal fluid (CSF) biomarkers in a multicenter setting; (2) elucidation of the patterns and rates of change of imaging and CSF biomarker measurements in control subjects, MCI patients, and AD patients. CSF biomarkers are largely consistent with disease trajectories predicted by β-amyloid cascade (Hardy, J Alzheimer's Dis 2006;9(Suppl 3):151-3) and tau-mediated neurodegeneration hypotheses for AD, whereas brain atrophy and hypometabolism levels show predicted patterns but exhibit differing rates of change depending on region and disease severity; (3) the assessment of alternative methods of diagnostic categorization. Currently, the best classifiers select and combine optimum features from multiple modalities, including MRI, [(18)F]-fluorodeoxyglucose-PET, amyloid PET, CSF biomarkers, and clinical tests; (4) the development of blood biomarkers for AD as potentially noninvasive and low-cost alternatives to CSF biomarkers for AD diagnosis and the assessment of α-syn as an additional biomarker; (5) the development of methods for the early detection of AD. 
CSF biomarkers, β-amyloid 42 and tau, as well as amyloid PET may reflect the earliest steps in AD pathology in mildly symptomatic or even nonsymptomatic subjects and are leading candidates for the detection of AD in its preclinical stages; (6) the improvement of clinical trial efficiency through the identification of subjects most likely to undergo imminent future clinical decline and the use of more sensitive outcome measures to reduce sample sizes. Multimodal methods incorporating APOE status and longitudinal MRI proved most highly predictive of future decline. Refinements of clinical tests used as outcome measures such as clinical dementia rating-sum of boxes further reduced sample sizes; (7) the pioneering of genome-wide association studies that leverage quantitative imaging and biomarker phenotypes, including longitudinal data, to confirm recently identified loci, CR1, CLU, and PICALM and to identify novel AD risk loci; (8) worldwide impact through the establishment of ADNI-like programs in Japan, Australia, Argentina, Taiwan, China, Korea, Europe, and Italy; (9) understanding the biology and pathobiology of normal aging, MCI, and AD through integration of ADNI biomarker and clinical data to stimulate research that will resolve controversies about competing hypotheses on the etiopathogenesis of AD, thereby advancing efforts to find disease-modifying drugs for AD; and (10) the establishment of infrastructure to allow sharing of all raw and processed data without embargo to interested scientific investigators throughout the world.
Affiliation(s)
- Michael W Weiner
- Department of Veterans Affairs Medical Center, Center for Imaging of Neurodegenerative Diseases, San Francisco, CA, USA; Department of Radiology, University of California, San Francisco, CA, USA; Department of Medicine, University of California, San Francisco, CA, USA; Department of Psychiatry, University of California, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, CA, USA.
- Dallas P Veitch
- Department of Veterans Affairs Medical Center, Center for Imaging of Neurodegenerative Diseases, San Francisco, CA, USA
- Paul S Aisen
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Laurel A Beckett
- Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA, USA
- Nigel J Cairns
- Knight Alzheimer's Disease Research Center, Washington University School of Medicine, Saint Louis, MO, USA; Department of Neurology, Washington University School of Medicine, Saint Louis, MO, USA
- Jesse Cedarbaum
- Neurology Early Clinical Development, Biogen Idec, Cambridge, MA, USA
- Robert C Green
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Danielle Harvey
- Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA, USA
- William Jagust
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
- Johan Luthman
- Neuroscience Clinical Development, Neuroscience & General Medicine Product Creation Unit, Eisai Inc., Philadelphia, PA, USA
- John C Morris
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Andrew J Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Leslie Shaw
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Li Shen
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
- Adam Schwarz
- Tailored Therapeutics, Eli Lilly and Company, Indianapolis, IN, USA
- Arthur W Toga
- Laboratory of Neuroimaging, Institute of Neuroimaging and Informatics, Keck School of Medicine of University of Southern California, Los Angeles, CA, USA
- John Q Trojanowski
- Institute on Aging, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Alzheimer's Disease Core Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Udall Parkinson's Research Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Department of Pathology and Laboratory Medicine, Center for Neurodegenerative Research, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
79
|
Rahman HAA, Wah YB, He H, Bulgiba A. Comparisons of ADABOOST, KNN, SVM and Logistic Regression in Classification of Imbalanced Dataset. COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE 2015:54-64. [DOI: 10.1007/978-981-287-936-3_6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 09/02/2023]
|
80
|
Nie Z, Yang T, Liu Y, Lin B, Li Q, Narayan VA, Wittenberg G, Ye J. Melancholic depression prediction by identifying representative features in metabolic and microarray profiles with missing values. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2015:455-466. [PMID: 25592604 PMCID: PMC4299923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Recent studies have revealed that melancholic depression, one major subtype of depression, is closely associated with the concentration of some metabolites and biological functions of certain genes and pathways. Meanwhile, recent advances in biotechnologies have allowed us to collect a large amount of genomic data, e.g., metabolites and microarray gene expression. With such a huge amount of information available, one approach that can give us new insights into the understanding of the fundamental biology underlying melancholic depression is to build disease status prediction models using classification or regression methods. However, the existence of strong empirical correlations, e.g., those exhibited by genes sharing the same biological pathway in microarray profiles, tremendously limits the performance of these methods. Furthermore, the occurrence of missing values which are ubiquitous in biomedical applications further complicates the problem. In this paper, we hypothesize that the problem of missing values might in some way benefit from the correlation between the variables and propose a method to learn a compressed set of representative features through an adapted version of sparse coding which is capable of identifying correlated variables and addressing the issue of missing values simultaneously. An efficient algorithm is also developed to solve the proposed formulation. We apply the proposed method on metabolic and microarray profiles collected from a group of subjects consisting of both patients with melancholic depression and healthy controls. Results show that the proposed method can not only produce meaningful clusters of variables but also generate a set of representative features that achieve superior classification performance over those generated by traditional clustering and data imputation techniques. 
In particular, on both datasets, we found that in comparison with the competing algorithms, the representative features learned by the proposed method give rise to significantly improved sensitivity scores, suggesting that the learned features allow prediction with high accuracy of disease status in those who are diagnosed with melancholic depression. To our best knowledge, this is the first work that applies sparse coding to deal with high feature correlations and missing values, which are common challenges in many biomedical applications. The proposed method can be readily adapted to other biomedical applications involving incomplete and high-dimensional data.
Affiliation(s)
- Zhi Nie
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
- Tao Yang
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
- Yashu Liu
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
- Binbin Lin
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
- Qingyang Li
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
- Vaibhav A Narayan
- Johnson & Johnson Pharmaceutical Research & Development, LLC, Titusville, NJ, USA
- Gayle Wittenberg
- Johnson & Johnson Pharmaceutical Research & Development, LLC, Titusville, NJ, USA
- Jieping Ye
- Department of Computer Science and Engineering, Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
|
81
|
Yang T, Zhao X, Lin B, Zeng T, Ji S, Ye J. Automated gene expression pattern annotation in the mouse brain. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2015; 20:144-155. [PMID: 25592576 PMCID: PMC4299912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Brain tumor is a fatal central nervous system disease that occurs in around 250,000 people each year globally and it is the second cause of cancer in children. It has been widely acknowledged that genetic factor is one of the significant risk factors for brain cancer. Thus, accurate descriptions of the locations of where the relative genes are active and how these genes express are critical for understanding the pathogenesis of brain tumor and for early detection. The Allen Developing Mouse Brain Atlas is a project on gene expression over the course of mouse brain development stages. Utilizing mouse models allows us to use a relatively homogeneous system to reveal the genetic risk factor of brain cancer. In the Allen atlas, about 435,000 high-resolution spatiotemporal in situ hybridization images have been generated for approximately 2,100 genes and currently the expression patterns over specific brain regions are manually annotated by experts, which does not scale with the continuously expanding collection of images. In this paper, we present an efficient computational approach to perform automated gene expression pattern annotation on brain images. First, the gene expression information in the brain images is captured by invariant features extracted from local image patches. Next, we adopt an augmented sparse coding method, called Stochastic Coordinate Coding, to construct high-level representations. Different pooling methods are then applied to generate gene-level features. To discriminate gene expression patterns at specific brain regions, we employ supervised learning methods to build accurate models for both binary-class and multi-class cases. Random undersampling and majority voting strategies are utilized to deal with the inherently imbalanced class distribution within each annotation task in order to further improve predictive performance. 
In addition, we propose a novel structure-based multi-label classification approach, which makes use of label hierarchy based on brain ontology during model learning. Extensive experiments have been conducted on the atlas and results show that the proposed approach produces higher annotation accuracy than several baseline methods. Our approach is shown to be robust on both binary-class and multi-class tasks and even with a relatively low training ratio. Our results also show that the use of label hierarchy can significantly improve the annotation accuracy at all brain ontology levels.
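The abstract mentions random undersampling combined with majority voting as the way class imbalance is handled. A minimal sketch of that general idea follows, assuming a toy nearest-centroid base learner and made-up two-dimensional data; the function names, the base learner and the data are all illustrative assumptions and are not taken from the paper, which uses its own features and supervised models.

```python
import random
from collections import Counter


def undersample(X, y, rng):
    """Randomly keep only as many samples per class as the smallest class has."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    n_min = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for features in rng.sample(rows, n_min):
            Xb.append(features)
            yb.append(label)
    return Xb, yb


def fit_centroids(X, y):
    """Toy base learner (assumption): store the mean feature vector of each class."""
    counts = Counter(y)
    sums = {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(model, features):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sqdist(model[label]))


def train_ensemble(X, y, n_models=5, seed=0):
    """Train one base learner per independently undersampled, balanced subset."""
    rng = random.Random(seed)
    return [fit_centroids(*undersample(X, y, rng)) for _ in range(n_models)]


def predict_majority(models, features):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Made-up imbalanced data: 20 samples of class 0, only 3 of class 1.
X = [(0.1 * i, 0.1 * i) for i in range(20)] + [(5.0, 5.0), (5.5, 4.5), (4.5, 5.5)]
y = [0] * 20 + [1] * 3
models = train_ensemble(X, y)
print(predict_majority(models, (4.8, 5.1)), predict_majority(models, (0.5, 0.5)))
```

Each base learner sees a balanced subset, so the minority class is not swamped during training, and using several subsets with a vote recovers some of the majority-class information that a single undersampled model would discard. An odd number of models avoids tied votes in the binary case.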
Affiliation(s)
- Tao Yang
- Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA.
|
82
|
The utility of cerebral perfusion SPECT analysis using SPM8, eZIS and vbSEE for the diagnosis of multiple system atrophy-parkinsonism. Ann Nucl Med 2014; 29:206-13. [DOI: 10.1007/s12149-014-0928-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/30/2014] [Accepted: 11/09/2014] [Indexed: 12/12/2022]
|
83
|
An assessment on producing synthetic samples by fuzzy C-means for limited number of data in prediction models. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2014.06.056] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 11/24/2022]
|
84
|
Sakurai K, Imabayashi E, Tokumaru AM, Hasebe S, Murayama S, Morimoto S, Kanemaru K, Takao M, Shibamoto Y, Matsukawa N. The feasibility of white matter volume reduction analysis using SPM8 plus DARTEL for the diagnosis of patients with clinically diagnosed corticobasal syndrome and Richardson's syndrome. NEUROIMAGE-CLINICAL 2014; 7:605-10. [PMID: 26082887 PMCID: PMC4459051 DOI: 10.1016/j.nicl.2014.02.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 11/20/2013] [Revised: 02/17/2014] [Accepted: 02/19/2014] [Indexed: 11/29/2022]
Abstract
Purpose: Diagnosing corticobasal degeneration (CBD) and progressive supranuclear palsy (PSP) is often difficult due to the wide variety of symptoms and overlaps in the similar clinical courses and neurological findings. The purpose of this study was to evaluate the utility of white matter (WM) atrophy for the diagnosis of patients with clinically diagnosed CBD (corticobasal syndrome, CBS) and PSP (Richardson’s syndrome, RS).
Methods: We randomly divided the 3D T1-weighted MR images of 18 CBS patients, 33 RS patients, and 32 age-matched controls into two groups. We obtained segmented WM images in the first group using Voxel-based specific regional analysis system for Alzheimer’s disease (VSRAD) based on statistical parametric mapping (SPM) 8 plus diffeomorphic anatomical registration through exponentiated Lie algebra. A target volume of interest (VOI) for disease-specific atrophy was subsequently determined in this group using SPM8 group analyses of WM atrophy between patient groups and controls. We then evaluated the utility of these VOIs for diagnosing CBS and RS patients in the second group. Z score values in these VOIs were used as the determinant in receiver operating characteristic (ROC) analyses.
Results: Specific target VOIs were determined in the bilateral frontal subcortical WM for CBS and in the midbrain tegmentum for RS. In ROC analyses, the target VOIs of CBS and RS compared to those of controls exhibited an area under curve (AUC) of 0.99 and 0.84, respectively, which indicated an adequate diagnostic power. The VOI of CBS revealed a higher AUC than that of RS for differentiating between CBS and RS (AUC, 0.75 vs 0.53).
Conclusions: Bilateral frontal WM volume reduction demonstrated a higher power for differentiating CBS from RS. This VOI analysis is useful for clinically diagnosing CBS and RS.
- We evaluate the utility of white matter (WM) atrophy for the diagnosis of patients with corticobasal syndrome (CBS) and Richardson’s syndrome (RS).
- We obtained segmented WM images using Voxel-based specific regional analysis system for Alzheimer’s disease based on statistical parametric mapping 8 plus diffeomorphic anatomical registration through exponentiated Lie algebra.
- The most significant areas of atrophy observed in CBS patients compared to the controls were in the bilateral frontal subcortical WM.
- The most significant areas of atrophy observed in RS patients compared to the controls were in the midbrain.
- The volume of interest analysis using bilateral frontal WM volume reduction demonstrated a higher power for differentiating CBS from RS.
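The abstract uses Z scores within a volume of interest as the decision variable in ROC analyses. The sketch below shows, under stated assumptions, how such Z scores and an AUC could be computed: the volumes are made-up numbers, the sign convention (control mean minus individual value, so that atrophy gives a positive Z) is an assumption, and the AUC is computed with the generic pairwise-comparison definition rather than with the software used in the study.

```python
from statistics import mean, stdev


def atrophy_z_scores(values, controls):
    """Z score of each value relative to the control group.

    Assumed sign convention: (control mean - value) / control SD,
    so a smaller volume (more atrophy) yields a larger positive Z.
    """
    mu, sigma = mean(controls), stdev(controls)
    return [(mu - v) / sigma for v in values]


def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up white matter volumes in a target VOI (arbitrary units).
control_volumes = [10.0, 10.5, 9.5, 10.2, 9.8]
patient_volumes = [8.0, 9.0, 9.9, 7.5]

patient_z = atrophy_z_scores(patient_volumes, control_volumes)
control_z = atrophy_z_scores(control_volumes, control_volumes)
auc = roc_auc(patient_z, control_z)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which the reported values of 0.99, 0.84, 0.75 and 0.53 should be read.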
Affiliation(s)
- Keita Sakurai
- Department of Diagnostic Radiology, Tokyo Metropolitan Medical Center of Gerontology
- Etsuko Imabayashi
- Department of Diagnostic Radiology, Tokyo Metropolitan Medical Center of Gerontology
- Aya M Tokumaru
- Department of Diagnostic Radiology, Tokyo Metropolitan Medical Center of Gerontology
- Shin Hasebe
- Department of Diagnostic Radiology, Tokyo Metropolitan Medical Center of Gerontology
- Shigeo Murayama
- Department of Neurology, Tokyo Metropolitan Geriatric Hospital
- Satoru Morimoto
- Department of Neurology, Tokyo Metropolitan Geriatric Hospital
- Masaki Takao
- Department of Neuropathology (the Brain Bank for Aging Research), Tokyo Metropolitan Geriatric Hospital, Tokyo Metropolitan Geriatric Hospital and Institute of Gerontology
- Yuta Shibamoto
- Department of Radiology, Nagoya City University Graduate School of Medical Sciences
- Noriyuki Matsukawa
- Department of Neurology and Neuroscience, Nagoya City University Graduate School of Medical Sciences
|