1
Wang Y, Huang C, Li P, Niu B, Fan T, Wang H, Zhou Y, Chai Y. Machine learning-based discrimination of unipolar depression and bipolar disorder with streamlined shortlist in adolescents of different ages. Comput Biol Med 2024; 182:109107. [PMID: 39288554 DOI: 10.1016/j.compbiomed.2024.109107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 08/30/2024] [Accepted: 09/02/2024] [Indexed: 09/19/2024]
Abstract
BACKGROUND Variations in symptoms and indistinguishable depressive episodes make the discrimination of unipolar depression (UD) and bipolar disorder (BD) difficult and time-consuming. For adolescents, in whom disease prevalence is high, an efficient diagnostic tool is important for the discrimination and treatment of BD and UD. METHODS This multi-center cross-sectional study involved 1587 UD and 246 BD adolescents aged 12-18. A combination of standard questionnaires and demographic information was collected to construct a full-item list. The unequal patient numbers were balanced with 3 data balancing algorithms, and 4 machine learning algorithms were compared for their ability to discriminate UD from BD in 3 age groups: all ages, 12-15, and 16-18. Random forest (RF), which had the highest accuracy, was used to rank the importance of features/items and construct the 25-item shortlist. A separate dataset was used for the final performance evaluation with the shortlist, and the discrimination ability for UD and BD was investigated. RESULTS RF performed best for UD and BD discrimination in all 3 age groups (AUC 0.88-0.90). The most important features differentiating UD from BD belonged to the Parental Bonding Instrument (PBI) and the Loneliness Scale of the University of California at Los Angeles (UCLA). With RF and the 25-item shortlist, the diagnostic accuracy still reached around 80%, about 95% of the accuracy obtained with all features. CONCLUSIONS Through machine learning algorithms, the most influential factors for UD and BD classification were recombined and applied for rapid diagnosis. This highly feasible method holds potential for convenient and accurate diagnosis of young patients in research and clinical practice.
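The two mechanical steps in this abstract, balancing the unequal classes and keeping the 25 highest-ranked items, can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not name its three balancing algorithms, so random oversampling is swapped in here as one common option, and the importance scores passed to `top_k_items` are hypothetical placeholders rather than values from the study.

```python
import random
from collections import Counter


def random_oversample(labels, seed=0):
    """Return sample indices in which every class is as frequent as the largest class.

    Assumption: random oversampling stands in for the (unnamed) balancing
    algorithms used in the study.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    target = max(len(idxs) for idxs in by_class.values())
    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)
        # Draw extra samples with replacement until the class reaches the target size.
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))
    return balanced


def top_k_items(importances, k=25):
    """Return the names of the k items with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Class sizes taken from the abstract: 1587 UD and 246 BD adolescents.
labels = ["UD"] * 1587 + ["BD"] * 246
balanced_idx = random_oversample(labels)
counts = Counter(labels[i] for i in balanced_idx)

# Hypothetical importance scores, for illustration only.
shortlist = top_k_items({"item_a": 0.10, "item_b": 0.50, "item_c": 0.30}, k=2)
```

In practice balancing would be applied to the training data only, so that the separate evaluation set keeps its original class ratio.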
Affiliation(s)
- Yang Wang
  - College of Management, Shenzhen University, Shenzhen, China
- Cheng Huang
  - Greater Bay Area International Institute for Innovations, Shenzhen University, Shenzhen, China
- Pingping Li
  - Greater Bay Area International Institute for Innovations, Shenzhen University, Shenzhen, China
- Ben Niu
  - College of Management, Shenzhen University, Shenzhen, China
- Tingxuan Fan
  - Greater Bay Area International Institute for Innovations, Shenzhen University, Shenzhen, China
- Hairong Wang
  - Greater Bay Area International Institute for Innovations, Shenzhen University, Shenzhen, China
- Yujuan Chai
  - School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, 518060, China.
2
Zhu T, Liu X, Wang J, Kou R, Hu Y, Yuan M, Yuan C, Luo L, Zhang W. Explainable machine-learning algorithms to differentiate bipolar disorder from major depressive disorder using self-reported symptoms, vital signs, and blood-based markers. Computer Methods and Programs in Biomedicine 2023; 240:107723. [PMID: 37480646 DOI: 10.1016/j.cmpb.2023.107723] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/18/2022] [Revised: 06/26/2023] [Accepted: 07/15/2023] [Indexed: 07/24/2023]
Abstract
BACKGROUND AND OBJECTIVE Because of shared genetic risk factors and similar neuropsychological symptoms, bipolar disorder (BD) and major depressive disorder (MDD) are at high risk of misdiagnosis, which is associated with ineffective treatment and worsening of outcomes. We aimed to develop a machine learning (ML)-based diagnostic system, based on electronic medical records (EMR) data, that mimics the clinical reasoning of human physicians to differentiate MDD and BD (especially BD depressive episodes) in patients about to be admitted to a hospital and, hence, reduce the misdiagnosis of BD as MDD on admission. In addition, we examined to what extent our ML model could be made interpretable by quantifying and visualizing the features that drive the predictions. METHODS We identified 16,311 patients admitted to a hospital in western China between 2009 and 2018 with a recorded main diagnosis of MDD or BD and established three sub-cohorts with different combinations of features for each of the MDD-BD cohort and the MDD-BD depressive episodes cohort. Four ML algorithms (logistic regression, extreme gradient boosting (XGBoost), random forest, and support vector machine) and four train-test splits were used to train and validate diagnostic models, and explainable methods (SHAP and Break Down) were used to analyze the contribution of each feature at both the population level and the individual level, including feature importance, feature interaction, and feature effect on the prediction decision for a specific subject. RESULTS The XGBoost algorithm provided the best test performance (AUC: 0.838 (0.810-0.867), PPV: 0.810 and NPV: 0.834) for separating patients with BD from those with MDD. Core predictors included symptoms (mood-up, exciting, bad sleep, loss of interest, talking, mood-down, provoke), along with age, job, myocardial enzyme markers (creatine kinase, hydroxybutyrate dehydrogenase), diabetes-associated marker (glucose), bone function marker (alkaline phosphatase), non-enzymatic antioxidant (uric acid), markers of immune/inflammation (white blood cell count, lymphocyte count, basophil percentage, monocyte count), cardiovascular function marker (low density lipoprotein), renal marker (total protein), liver biochemistry marker (indirect bilirubin), and vital signs such as pulse. For separating patients with BD depressive episodes from those with MDD, the test AUC was 0.777 (0.732-0.822), with PPV 0.576 and NPV 0.899. Additional validation in models built with self-reported symptoms removed from the feature set showed a test AUC of 0.701 (0.666-0.736) for differentiating BD and MDD, and an AUC of 0.564 (0.515-0.614) for distinguishing patients in BD depressive episodes from MDD patients. Validation in the datasets without removing the patients with comorbidity showed an AUC of 0.826 (0.806-0.846). CONCLUSION The diagnostic system accurately identified patients with BD in various clinical scenarios, and differences in patterns of peripheral markers between BD and MDD could enrich our understanding of their potential underlying pathophysiological mechanisms.
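For readers less familiar with the metrics reported here, the following minimal sketch shows how AUC, PPV, and NPV are defined for a binary classifier. It uses plain Python and a tiny made-up example; the labels, scores, and the 0.5 decision threshold are illustrative assumptions, not data or settings from the study.

```python
def auc(y_true, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv_npv(y_true, y_pred):
    """Positive and negative predictive value from hard 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Illustrative toy data: 1 = BD, 0 = MDD (hypothetical values).
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

toy_auc = auc(y_true, scores)
toy_ppv, toy_npv = ppv_npv(y_true, y_pred)
```

AUC is threshold-free, whereas PPV and NPV depend on both the chosen threshold and the class prevalence, which is one reason PPV can drop (0.576 in the BD depressive episode cohort) while NPV stays high.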
Affiliation(s)
- Ting Zhu
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Med-X Center for Informatics, Sichuan University, Chengdu, China
- Xiaofei Liu
  - Business School, Sichuan University, Chengdu, China
- Junren Wang
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Med-X Center for Informatics, Sichuan University, Chengdu, China
- Ran Kou
  - Business School, Sichuan University, Chengdu, China
- Yao Hu
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Med-X Center for Informatics, Sichuan University, Chengdu, China
- Minlan Yuan
  - Mental Health Center of West China Hospital, Sichuan University, Chengdu, China
- Cui Yuan
  - Sichuan Provincial Center for Mental Health, The Center of Psychosomatic Medicine of Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China
- Li Luo
  - Business School, Sichuan University, Chengdu, China
- Wei Zhang
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Med-X Center for Informatics, Sichuan University, Chengdu, China; Mental Health Center of West China Hospital, Sichuan University, Chengdu, China.
3
Zeng J, Zhang Y, Xiang Y, Liang S, Xue C, Zhang J, Ran Y, Cao M, Huang F, Huang S, Deng W, Li T. Optimizing multi-domain hematologic biomarkers and clinical features for the differential diagnosis of unipolar depression and bipolar depression. npj Mental Health Research 2023; 2:4. [PMID: 38609642 PMCID: PMC10955811 DOI: 10.1038/s44184-023-00024-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2023] [Accepted: 03/01/2023] [Indexed: 04/14/2024]
Abstract
There is a lack of objective features for the differential diagnosis of unipolar and bipolar depression, especially features that are readily available in practical settings. We investigated whether clinical features of disease course, biomarkers from complete blood count, and blood biochemical markers could accurately classify unipolar and bipolar depression using machine learning methods. This retrospective study included 1160 eligible patients (918 with unipolar depression and 242 with bipolar depression). Patient data were randomly split into training (85%) and open test (15%) sets 1000 times, and the average performance was reported. XGBoost achieved the optimal open-test performance using selected biomarkers and clinical features: AUC 0.889, sensitivity 0.831, specificity 0.839, and accuracy 0.863. The importance of features for differential diagnosis was measured using SHapley Additive exPlanations (SHAP) values. The most informative features included (1) clinical features of disease duration and age of onset, (2) biochemical markers of albumin, low density lipoprotein (LDL), and potassium, and (3) complete blood count-derived biomarkers of white blood cell count (WBC), platelet-to-lymphocyte ratio (PLR), and monocytes (MONO). Overall, onset features and hematologic biomarkers appear to be reliable information that can be readily obtained in clinical settings to facilitate the differential diagnosis of unipolar and bipolar depression.
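The evaluation protocol described here, 1000 random 85%/15% splits with averaged performance, can be sketched as follows. The sketch is an assumption-laden illustration: the `evaluate` callback is a hypothetical stand-in (a majority-class baseline) for training and scoring the study's XGBoost model, and only the cohort sizes (918 unipolar, 242 bipolar) come from the abstract. The PLR helper simply spells out the ratio named in the text.

```python
import random
from statistics import mean


def repeated_holdout(n_samples, evaluate, n_repeats=1000, test_fraction=0.15, seed=0):
    """Average a score over repeated random train/test splits.

    `evaluate(train_idx, test_idx)` is a caller-supplied (hypothetical) function
    that fits a model on the training indices and returns a test-set score.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    n_test = round(n_samples * test_fraction)
    scores = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        scores.append(evaluate(train_idx, test_idx))
    return mean(scores)


def platelet_to_lymphocyte_ratio(platelet_count, lymphocyte_count):
    """PLR: platelet count divided by lymphocyte count (both in the same units)."""
    return platelet_count / lymphocyte_count


# Cohort sizes from the abstract: 0 = unipolar (918), 1 = bipolar (242).
labels = [0] * 918 + [1] * 242


def majority_baseline_accuracy(train_idx, test_idx):
    """Stand-in evaluator: predict the most frequent training class for every test case."""
    train_labels = [labels[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)


mean_acc = repeated_holdout(len(labels), majority_baseline_accuracy)
example_plr = platelet_to_lymphocyte_ratio(250.0, 2.0)
```

The majority-class baseline lands near 918/1160, roughly 0.79, which is a useful reference point when reading accuracy figures on a cohort this imbalanced.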
Affiliation(s)
- Jinkun Zeng
  - Hangzhou Seventh People's Hospital, Affiliated Mental Health Center, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Yaoyun Zhang
  - Alibaba Damo Academy, 969 West Wen Yi Road, Yu Hang District, Hangzhou, Zhejiang, China
- Yutao Xiang
  - Center for Cognition and Brain Sciences, Unit of Psychiatry, Institute of Translational Medicine, University of Macau, Macao, China
- Sugai Liang
  - Hangzhou Seventh People's Hospital, Affiliated Mental Health Center, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Chuang Xue
  - Hangzhou Seventh People's Hospital, Affiliated Mental Health Center, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Junhang Zhang
  - Hangzhou Seventh People's Hospital, Affiliated Mental Health Center, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Ya Ran
  - West China Hospital, Sichuan University, Sichuan, China
- Minne Cao
  - Hangzhou Seventh People's Hospital, Affiliated Mental Health Center, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Fei Huang
  - Alibaba Damo Academy, 969 West Wen Yi Road, Yu Hang District, Hangzhou, Zhejiang, China
- Songfang Huang
  - Alibaba Damo Academy, 969 West Wen Yi Road, Yu Hang District, Hangzhou, Zhejiang, China
- Wei Deng
  - Department of Neurobiology, Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, Zhejiang University School of Medicine, Hangzhou, China.
  - Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, 311121, Hangzhou, China.
- Tao Li
  - Department of Neurobiology, Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, Zhejiang University School of Medicine, Hangzhou, China.
  - Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, 311121, Hangzhou, China.
  - NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, 310058, Hangzhou, China.
4
Jan Z, Ai-Ansari N, Mousa O, Abd-Alrazaq A, Ahmed A, Alam T, Househ M. The Role of Machine Learning in Diagnosing Bipolar Disorder: Scoping Review. J Med Internet Res 2021; 23:e29749. [PMID: 34806996 PMCID: PMC8663682 DOI: 10.2196/29749] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/19/2021] [Revised: 07/02/2021] [Accepted: 07/27/2021] [Indexed: 01/10/2023] Open
Abstract
Background Bipolar disorder (BD) is the 10th most common cause of frailty in young individuals and has triggered morbidity and mortality worldwide. Patients with BD have a life expectancy 9 to 17 years lower than that of the general population. BD is a predominant mental disorder, but it can be misdiagnosed as depressive disorder, which leads to difficulties in treating affected patients. Approximately 60% of patients with BD are treated for depression. However, machine learning provides advanced skills and techniques for better diagnosis of BD. Objective This review aims to explore the machine learning algorithms used for the detection and diagnosis of bipolar disorder and its subtypes. Methods The study protocol adopted the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines. We explored 3 databases, namely Google Scholar, ScienceDirect, and PubMed. To enhance the search, we performed backward screening of all the references of the included studies. Based on the predefined selection criteria, 2 levels of screening were performed: title and abstract review, and full review of the articles that met the inclusion criteria. Data extraction was performed independently by all investigators. To synthesize the extracted data, a narrative synthesis approach was followed. Results We retrieved 573 potential articles from the 3 databases. After preprocessing and screening, only 33 articles that met our inclusion criteria were identified. The most commonly used data belonged to the clinical category (19, 58%). We identified different machine learning models used in the selected studies, including classification models (18, 55%), regression models (5, 16%), model-based clustering methods (2, 6%), natural language processing (1, 3%), clustering algorithms (1, 3%), and deep learning–based models (3, 9%). Magnetic resonance imaging data were most commonly used for classifying bipolar patients compared to other groups (11, 34%), whereas microarray expression data sets and genomic data were the least commonly used. The maximum reported accuracy was 98%, whereas the minimum was 64%. Conclusions This scoping review provides an overview of recent studies based on machine learning models used to diagnose patients with BD regardless of their demographics or whether they were compared to patients with psychiatric diagnoses. Further research can be conducted to provide clinical decision support in the health industry.
Affiliation(s)
- Zainab Jan
  - College of Health and Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, Education City, Doha, Qatar
- Noor Ai-Ansari
  - Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Education City, Doha, Qatar
- Osama Mousa
  - Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Education City, Doha, Qatar
- Alaa Abd-Alrazaq
  - Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Education City, Doha, Qatar
- Arfan Ahmed
  - Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Education City, Doha, Qatar; Department of Psychiatry, Weill Cornell Medicine, Education City, Doha, Qatar
- Tanvir Alam
  - Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Education City, Doha, Qatar
- Mowafa Househ
  - Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Education City, Doha, Qatar
5
Jan Z, Ai-Ansari N, Mousa O, Abd-Alrazaq A, Ahmed A, Alam T, Househ M. The Role of Machine Learning in Diagnosing Bipolar Disorder: Scoping Review (Preprint). [DOI: 10.2196/preprints.29749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
BACKGROUND
Bipolar disorder (BD) is the 10th most common cause of frailty in young individuals and has triggered morbidity and mortality worldwide. Patients with BD have a life expectancy 9 to 17 years lower than that of the general population. BD is a predominant mental disorder, but it can be misdiagnosed as depressive disorder, which leads to difficulties in treating affected patients. Approximately 60% of patients with BD are treated for depression. However, machine learning provides advanced skills and techniques for better diagnosis of BD.
OBJECTIVE
This review aims to explore the machine learning algorithms used for the detection and diagnosis of bipolar disorder and its subtypes.
METHODS
The study protocol adopted the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines. We explored 3 databases, namely Google Scholar, ScienceDirect, and PubMed. To enhance the search, we performed backward screening of all the references of the included studies. Based on the predefined selection criteria, 2 levels of screening were performed: title and abstract review, and full review of the articles that met the inclusion criteria. Data extraction was performed independently by all investigators. To synthesize the extracted data, a narrative synthesis approach was followed.
RESULTS
We retrieved 573 potential articles from the 3 databases. After preprocessing and screening, only 33 articles that met our inclusion criteria were identified. The most commonly used data belonged to the clinical category (19, 58%). We identified different machine learning models used in the selected studies, including classification models (18, 55%), regression models (5, 16%), model-based clustering methods (2, 6%), natural language processing (1, 3%), clustering algorithms (1, 3%), and deep learning–based models (3, 9%). Magnetic resonance imaging data were most commonly used for classifying bipolar patients compared to other groups (11, 34%), whereas microarray expression data sets and genomic data were the least commonly used. The maximum reported accuracy was 98%, whereas the minimum was 64%.
CONCLUSIONS
This scoping review provides an overview of recent studies based on machine learning models used to diagnose patients with BD regardless of their demographics or whether they were compared to patients with psychiatric diagnoses. Further research can be conducted to provide clinical decision support in the health industry.