1
Tabashum T, Snyder RC, O'Brien MK, Albert MV. Machine Learning Models for Parkinson Disease: Systematic Review. JMIR Med Inform 2024; 12:e50117. [PMID: 38771237 PMCID: PMC11112052 DOI: 10.2196/50117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 02/12/2024] [Accepted: 04/01/2024] [Indexed: 05/22/2024]
Abstract
Background: With the increasing availability of data, computing resources, and easier-to-use software libraries, machine learning (ML) is increasingly used in disease detection and prediction, including for Parkinson disease (PD). Despite the large number of studies published every year, very few ML systems have been adopted for real-world use. In particular, a lack of external validity may result in poor performance of these systems in clinical practice. Additional methodological issues in ML design and reporting can also hinder clinical adoption, even for applications that would benefit from such data-driven systems.
Objective: To sample the current ML practices in PD applications, we conducted a systematic review of studies published in 2020 and 2021 that used ML models to diagnose PD or track PD progression.
Methods: We conducted a systematic literature review in accordance with PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines in PubMed between January 2020 and April 2021, using the following exact string: "Parkinson's" AND ("ML" OR "prediction" OR "classification" OR "detection" or "artificial intelligence" OR "AI"). The search resulted in 1085 publications. After a search query and review, we found 113 publications that used ML for the classification or regression-based prediction of PD or PD-related symptoms.
Results: Only 65.5% (74/113) of studies used a holdout test set to avoid potentially inflated accuracies, and approximately half (25/46, 54%) of the studies without a holdout test set did not state this as a potential concern. Surprisingly, 38.9% (44/113) of studies did not report on how or if models were tuned, and an additional 27.4% (31/113) used ad hoc model tuning, which is generally frowned upon in ML model optimization. Only 15% (17/113) of studies performed direct comparisons of results with other models, severely limiting the interpretation of results.
Conclusions: This review highlights the notable limitations of current ML systems and techniques that may contribute to a gap between reported performance in research and the real-life applicability of ML models aiming to detect and predict diseases such as PD.
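The holdout practice mentioned in the Results can be illustrated with a minimal sketch. It assumes a generic list of labelled records and is not taken from any of the reviewed studies; the function name `holdout_split`, the 20% test fraction, and the placeholder data are illustrative choices only.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle once, then reserve a fixed fraction as a test set kept out of tuning."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled records become the test set; the rest are for training.
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data: 100 record identifiers standing in for study samples.
train, test = holdout_split(range(100))
print(len(train), len(test))
```

In such a setup, model selection and tuning would use only `train` (for example through cross-validation), and `test` would be scored once at the end. When several recordings come from the same person, splitting by subject rather than by sample avoids leakage between the two sets; that grouping step is omitted here.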
Affiliation(s)
- Thasina Tabashum
- Department of Computer Science and Engineering, University of North Texas, Denton, TX, United States
- Robert Cooper Snyder
- Department of Computer Science and Engineering, University of North Texas, Denton, TX, United States
- Megan K O'Brien
- Technology and Innovation Hub, Shirley Ryan AbilityLab, Chicago, IL, United States
- Department of Physical Medicine & Rehabilitation, Northwestern University, Chicago, IL, United States
- Mark V Albert
- Department of Computer Science and Engineering, University of North Texas, Denton, TX, United States
- Department of Biomedical Engineering, University of North Texas, Denton, TX, United States
2
Bhidayasiri R, Sringean J, Phumphid S, Anan C, Thanawattano C, Deoisres S, Panyakaew P, Phokaewvarangkul O, Maytharakcheep S, Buranasrikul V, Prasertpan T, Khontong R, Jagota P, Chaisongkram A, Jankate W, Meesri J, Chantadunga A, Rattanajun P, Sutaphan P, Jitpugdee W, Chokpatcharavate M, Avihingsanon Y, Sittipunt C, Sittitrai W, Boonrach G, Phonsrithong A, Suvanprakorn P, Vichitcholchai J, Bunnag T. The rise of Parkinson's disease is a global challenge, but efforts to tackle this must begin at a national level: a protocol for national digital screening and "eat, move, sleep" lifestyle interventions to prevent or slow the rise of non-communicable diseases in Thailand. Front Neurol 2024; 15:1386608. [PMID: 38803644 PMCID: PMC11129688 DOI: 10.3389/fneur.2024.1386608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/19/2024] [Indexed: 05/29/2024]
Abstract
The rising prevalence of Parkinson's disease (PD) globally presents a significant public health challenge for national healthcare systems, particularly in low-to-middle income countries, such as Thailand, which may have insufficient resources to meet these escalating healthcare needs. There are also many undiagnosed cases of early-stage PD, a period when therapeutic interventions would have the most value and least cost. The traditional "passive" approach, whereby clinicians wait for patients with symptomatic PD to seek treatment, is inadequate. Proactive, early identification of PD will allow timely therapeutic interventions, and digital health technologies can be scaled up in the identification and early diagnosis of cases. The Parkinson's disease risk survey (TCTR20231025005) aims to evaluate a digital population screening platform to identify undiagnosed PD cases in the Thai population. Recognizing the long prodromal phase of PD, the target demographic for screening is people aged ≥ 40 years, approximately 20 years before the usual emergence of motor symptoms. Thailand has a highly rated healthcare system with an established universal healthcare program for citizens, making it ideal for deploying a national screening program using digital technology. Designed by a multidisciplinary group of PD experts, the digital platform comprises a 20-item questionnaire about PD symptoms along with objective tests of eight digital markers: voice vowel, voice sentences, resting and postural tremor, alternate finger tapping, a "pinch-to-size" test, gait and balance, with performance recorded using a mobile application and smartphone's sensors. Machine learning tools use the collected data to identify subjects at risk of developing, or with early signs of, PD. This article describes the selection and validation of questionnaire items and digital markers, with results showing the chosen parameters and data analysis methods to be robust, reliable, and reproducible. 
This digital platform could serve as a model for similar screening strategies for other non-communicable diseases in Thailand.
Affiliation(s)
- Roongroj Bhidayasiri
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- The Academy of Science, The Royal Society of Thailand, Bangkok, Thailand
- Jirada Sringean
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Saisamorn Phumphid
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Chanawat Anan
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Suwijak Deoisres
- National Electronics and Computer Technology Centre, Pathum Thani, Thailand
- Pattamon Panyakaew
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Onanong Phokaewvarangkul
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Suppata Maytharakcheep
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Vijittra Buranasrikul
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Tittaya Prasertpan
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Sawanpracharak Hospital, Nakhon Sawan, Thailand
- Priya Jagota
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Araya Chaisongkram
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Worawit Jankate
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Jeeranun Meesri
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Araya Chantadunga
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Piyaporn Rattanajun
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Phantakarn Sutaphan
- Chulalongkorn Centre of Excellence for Parkinson’s Disease and Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Weerachai Jitpugdee
- Department of Rehabilitation Medicine, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Marisa Chokpatcharavate
- Chulalongkorn Parkinson's Disease Support Group, Department of Medicine, Faculty of Medicine, Chulalongkorn Centre of Excellence for Parkinson's Disease and Related Disorders, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Bangkok, Thailand
- Yingyos Avihingsanon
- Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Thai Red Cross Society, Bangkok, Thailand
- Chanchai Sittipunt
- Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Thai Red Cross Society, Bangkok, Thailand
- Tej Bunnag
- Thai Red Cross Society, Bangkok, Thailand
3
Mohsen F, Al-Absi HRH, Yousri NA, El Hajj N, Shah Z. A scoping review of artificial intelligence-based methods for diabetes risk prediction. NPJ Digit Med 2023; 6:197. [PMID: 37880301 PMCID: PMC10600138 DOI: 10.1038/s41746-023-00933-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Accepted: 09/25/2023] [Indexed: 10/27/2023]
Abstract
The increasing prevalence of type 2 diabetes mellitus (T2DM) and its associated health complications highlight the need to develop predictive models for early diagnosis and intervention. While many artificial intelligence (AI) models for T2DM risk prediction have emerged, a comprehensive review of their advancements and challenges is currently lacking. This scoping review maps out the existing literature on AI-based models for T2DM prediction, adhering to the PRISMA extension for Scoping Reviews guidelines. A systematic search of longitudinal studies was conducted across four databases, including PubMed, Scopus, IEEE-Xplore, and Google Scholar. Forty studies that met our inclusion criteria were reviewed. Classical machine learning (ML) models dominated these studies, with electronic health records (EHR) being the predominant data modality, followed by multi-omics, while medical imaging was the least utilized. Most studies employed unimodal AI models, with only ten adopting multimodal approaches. Both unimodal and multimodal models showed promising results, with the latter being superior. Almost all studies performed internal validation, but only five conducted external validation. Most studies utilized the area under the curve (AUC) for discrimination measures. Notably, only five studies provided insights into the calibration of their models. Half of the studies used interpretability methods to identify key risk predictors revealed by their models. Although a minority highlighted novel risk predictors, the majority reported commonly known ones. Our review provides valuable insights into the current state and limitations of AI-based models for T2DM prediction and highlights the challenges associated with their development and clinical integration.
Affiliation(s)
- Farida Mohsen
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Hamada R H Al-Absi
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Noha A Yousri
- Genetic Medicine, Weill Cornell Medicine-Qatar, Qatar Foundation, Doha, Qatar
- College of Health and Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Computer and Systems Engineering, Alexandria University, Alexandria, Egypt
- Nady El Hajj
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- College of Health and Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Zubair Shah
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar.
4
Dipietro L, Gonzalez-Mego P, Ramos-Estebanez C, Zukowski LH, Mikkilineni R, Rushmore RJ, Wagner T. The evolution of Big Data in neuroscience and neurology. JOURNAL OF BIG DATA 2023; 10:116. [PMID: 37441339 PMCID: PMC10333390 DOI: 10.1186/s40537-023-00751-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/28/2023] [Accepted: 05/08/2023] [Indexed: 07/15/2023]
Abstract
Neurological diseases are on the rise worldwide, leading to increased healthcare costs and diminished quality of life in patients. In recent years, Big Data has started to transform the fields of Neuroscience and Neurology. Scientists and clinicians are collaborating in global alliances, combining diverse datasets on a massive scale, and solving complex computational problems that demand the utilization of increasingly powerful computational resources. This Big Data revolution is opening new avenues for developing innovative treatments for neurological diseases. Our paper surveys Big Data's impact on neurological patient care, as exemplified through work done in a comprehensive selection of areas, including Connectomics, Alzheimer's Disease, Stroke, Depression, Parkinson's Disease, Pain, and Addiction (e.g., Opioid Use Disorder). We present an overview of research and the methodologies utilizing Big Data in each area, as well as their current limitations and technical challenges. Despite the potential benefits, the full potential of Big Data in these fields currently remains unrealized. We close with recommendations for future research aimed at optimizing the use of Big Data in Neuroscience and Neurology for improved patient outcomes. Supplementary Information The online version contains supplementary material available at 10.1186/s40537-023-00751-2.
Affiliation(s)
- Paola Gonzalez-Mego
- Spaulding Rehabilitation/Neuromodulation Lab, Harvard Medical School, Cambridge, MA USA
- Timothy Wagner
- Highland Instruments, Cambridge, MA USA
- Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA USA
5
Rani S, Jain A. Optimizing healthcare system by amalgamation of text processing and deep learning: a systematic review. MULTIMEDIA TOOLS AND APPLICATIONS 2023:1-25. [PMID: 37362695 PMCID: PMC10183315 DOI: 10.1007/s11042-023-15539-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Revised: 05/18/2022] [Accepted: 04/19/2023] [Indexed: 06/28/2023]
Abstract
The explosion of clinical textual data has drawn the attention of researchers. Owing to the abundance of clinical data, it is becoming difficult for healthcare professionals to take real-time measures. The tools and methods are lacking when compared to the amount of clinical data generated every day. This review aims to survey the text processing pipeline with deep learning methods such as CNN, RNN, LSTM, and GRU in the healthcare domain and discuss various applications such as clinical concept detection and extraction, medically aware dialogue systems, sentiment analysis of drug reviews shared online, clinical trial matching, and pharmacovigilance. In addition, we highlighted the major challenges in deploying text processing with deep learning to clinical textual data and identified the scope of research in this domain. Furthermore, we have discussed various resources that can be used in the future to optimize the healthcare domain by amalgamating text processing and deep learning.
Affiliation(s)
- Somiya Rani
- Department of Computer Science and Engineering, NSUT East Campus (erstwhile AIACTR), Affiliated to Guru Gobind Singh Indraprastha University, Delhi, India
- Amita Jain
- Department of Computer Science and Engineering, Netaji Subhas University of Technology, Delhi, India
6
Xu C, Neuroth T, Fujiwara T, Liang R, Ma KL. A Predictive Visual Analytics System for Studying Neurodegenerative Disease Based on DTI Fiber Tracts. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:2020-2035. [PMID: 34965212 DOI: 10.1109/tvcg.2021.3137174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Diffusion tensor imaging (DTI) has been used to study the effects of neurodegenerative diseases on neural pathways, which may lead to more reliable and early diagnosis of these diseases as well as a better understanding of how they affect the brain. We introduce a predictive visual analytics system for studying patient groups based on their labeled DTI fiber tract data and corresponding statistics. The system's machine-learning-augmented interface guides the user through an organized and holistic analysis space, including the statistical feature space, the physical space, and the space of patients over different groups. We use a custom machine learning pipeline to help narrow down this large analysis space and then explore it pragmatically through a range of linked visualizations. We conduct several case studies using DTI and T1-weighted images from the research database of Parkinson's Progression Markers Initiative.
7
Merkelbach K, Schaper S, Diedrich C, Fritsch SJ, Schuppert A. Novel architecture for gated recurrent unit autoencoder trained on time series from electronic health records enables detection of ICU patient subgroups. Sci Rep 2023; 13:4053. [PMID: 36906642 PMCID: PMC10008580 DOI: 10.1038/s41598-023-30986-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 03/03/2023] [Indexed: 03/13/2023]
Abstract
Electronic health records (EHRs) are used in hospitals to store diagnoses, clinician notes, examinations, lab results, and interventions for each patient. Grouping patients into distinct subsets, for example, via clustering, may enable the discovery of unknown disease patterns or comorbidities, which could eventually lead to better treatment through personalized medicine. Patient data derived from EHRs is heterogeneous and temporally irregular. Therefore, traditional machine learning methods like PCA are ill-suited for analysis of EHR-derived patient data. We propose to address these issues with a new methodology based on training a gated recurrent unit (GRU) autoencoder directly on health record data. Our method learns a low-dimensional feature space by training on patient data time series, where the time of each data point is expressed explicitly. We use positional encodings for time, allowing our model to better handle the temporal irregularity of the data. We apply our method to data from the Medical Information Mart for Intensive Care (MIMIC-III). Using our data-derived feature space, we can cluster patients into groups representing major classes of disease patterns. Additionally, we show that our feature space exhibits a rich substructure at multiple scales.
Affiliation(s)
- Kilian Merkelbach
- JRC-COMBINE, RWTH Aachen University, MTZ, Pauwelsstrasse 19, Level 3, 52074, Aachen, Germany
- Steffen Schaper
- Pharmacometrics / Modeling and Simulation, Bayer AG - Pharmaceuticals, Leverkusen, Germany
- Christian Diedrich
- Pharmacometrics / Modeling and Simulation, Bayer AG - Pharmaceuticals, Leverkusen, Germany
- Sebastian Johannes Fritsch
- Department of Intensive Care Medicine, University Hospital RWTH Aachen, Pauwelsstrasse 30, 52074, Aachen, Germany; Juelich Supercomputing Centre, Forschungszentrum Juelich, Wilhelm-Johnen-Straße, 52428, Juelich, Germany
- Andreas Schuppert
- JRC-COMBINE, RWTH Aachen University, MTZ, Pauwelsstrasse 19, Level 3, 52074, Aachen, Germany.
8
Kline A, Wang H, Li Y, Dennis S, Hutch M, Xu Z, Wang F, Cheng F, Luo Y. Multimodal machine learning in precision health: A scoping review. NPJ Digit Med 2022; 5:171. [PMID: 36344814 PMCID: PMC9640667 DOI: 10.1038/s41746-022-00712-8] [Citation(s) in RCA: 65] [Impact Index Per Article: 32.5] [Received: 06/20/2022] [Accepted: 10/14/2022] [Indexed: 11/09/2022]
Abstract
Machine learning is frequently being leveraged to tackle problems in the health sector including utilization for clinical decision-support. Its use has historically been focused on single modal data. Attempts to improve prediction and mimic the multimodal nature of clinical expert decision-making have been met in the biomedical field of machine learning by fusing disparate data. This review was conducted to summarize the current studies in this field and identify topics ripe for future research. We conducted this review in accordance with the PRISMA extension for Scoping Reviews to characterize multi-modal data fusion in health. Search strings were established and used in databases: PubMed, Google Scholar, and IEEEXplore from 2011 to 2021. A final set of 128 articles was included in the analysis. The most common health areas utilizing multi-modal methods were neurology and oncology. Early fusion was the most common data merging strategy. Notably, there was an improvement in predictive performance when using data fusion. Lacking from the papers were clear clinical deployment strategies, FDA-approval, and analysis of how using multimodal approaches from diverse sub-populations may improve biases and healthcare disparities. These findings provide a summary on multimodal data fusion as applied to health diagnosis/prognosis problems. Few papers compared the outputs of a multimodal approach with a unimodal prediction. However, those that did achieved an average increase of 6.4% in predictive accuracy. Multi-modal machine learning, while more robust in its estimations over unimodal methods, has drawbacks in its scalability and the time-consuming nature of information concatenation.
Affiliation(s)
- Adrienne Kline
- Department of Preventive Medicine, Northwestern University, Chicago, 60201, IL, USA
- Hanyin Wang
- Department of Preventive Medicine, Northwestern University, Chicago, 60201, IL, USA
- Yikuan Li
- Department of Preventive Medicine, Northwestern University, Chicago, 60201, IL, USA
- Saya Dennis
- Department of Preventive Medicine, Northwestern University, Chicago, 60201, IL, USA
- Meghan Hutch
- Department of Preventive Medicine, Northwestern University, Chicago, 60201, IL, USA
- Zhenxing Xu
- Department of Population Health Sciences, Cornell University, New York, 10065, NY, USA
- Fei Wang
- Department of Population Health Sciences, Cornell University, New York, 10065, NY, USA
- Feixiong Cheng
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, 44195, OH, USA
- Yuan Luo
- Department of Preventive Medicine, Northwestern University, Chicago, 60201, IL, USA.
9
Kowalczyk A, Kosiek K, Godycki-Cwirko M, Zakowska I. Community determinants of COPD exacerbations in elderly patients in Lodz province, Poland: a retrospective observational Big Data cohort study. BMJ Open 2022; 12:e060247. [PMID: 36270759 PMCID: PMC9594524 DOI: 10.1136/bmjopen-2021-060247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
OBJECTIVES: To evaluate the prevalence and identify demographic, economic and environmental local community determinants of chronic obstructive pulmonary disease (COPD) exacerbations in elderly patients in primary care using a Big Data approach.
DESIGN: Retrospective observational case-control study based on Big Data from the National Health Fund, Tax Office and National Statistics Center databases in 2016.
SETTING: Primary care clinics in the Lodz province in Poland.
PARTICIPANTS: 472 314 patients aged 65 and older in primary care, including 17 240 patients with COPD and 1784 with exacerbations (including deaths).
OUTCOME MEASURES: Exacerbations with demographic, economic and environmental local community determinants were retrieved. Conditional logistic regression for matched pairs was used to evaluate the local community determinants of COPD exacerbations among patients with COPD.
RESULTS: The overall prevalence of COPD in the population of elderly patients registered in primary healthcare clinics in Lodz province in 2016 was 3.65%, 95% CI (3.60% to 3.70%) and the prevalence of exacerbations was 10.35%, 95% CI (9.89% to 10.80%). A high number of consultations in primary care clinics was associated with higher risk of COPD exacerbations (p=0.0687). High-income patients were less likely to have exacerbations than low-income patients (high vs low OR 0.601, 95% CI (0.385 to 0.939)). The specialisation of the primary care physician did not have an effect on exacerbations (OR 1.076, 95% CI (0.920 to 1.257)). Neither the forest cover per gmina (high vs low OR 0.897, 95% CI (0.605 to 1.331); medium vs low OR 0.925, 95% CI (0.648 to 1.322)), nor location of gmina (urban vs urban-rural OR 1.044, 95% CI (0.673 to 1.620); rural vs urban-rural OR 0.897, 95% CI (0.630 to 1.277)) appears to influence COPD exacerbations.
CONCLUSIONS: Big Data statistical analysis facilitated the evaluation of the prevalence and determinants of COPD exacerbation in the elderly residents of Lodz province, Poland. Modification of identified local community determinants may potentially decrease the number of exacerbations in elderly patients with COPD.
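The two prevalence figures can be checked against the participant counts given in the abstract. The sketch below assumes a normal-approximation (Wald) interval with z = 1.96; the abstract does not say which interval method the authors used, so this is a consistency check under that assumption and not a reproduction of their analysis. The helper name `wald_ci` is illustrative.

```python
from math import sqrt


def wald_ci(events, total, z=1.96):
    """Return (proportion, lower, upper) for a normal-approximation interval."""
    p = events / total
    half_width = z * sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


# Counts taken from the PARTICIPANTS section of the abstract.
copd = wald_ci(17240, 472314)   # COPD cases among all elderly patients
exac = wald_ci(1784, 17240)     # exacerbations among COPD cases

for label, (p, lo, hi) in (("COPD prevalence", copd),
                           ("Exacerbation prevalence", exac)):
    print(f"{label}: {100 * p:.2f}% (95% CI {100 * lo:.2f}% to {100 * hi:.2f}%)")
```

Under this assumption both intervals round to the published values: 3.65% (3.60% to 3.70%) and 10.35% (9.89% to 10.80%).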
Affiliation(s)
- Anna Kowalczyk
- Centre for Family and Community Medicine, Faculty of Medical Sciences, Medical University of Lodz, Lodz, Poland
- Maciek Godycki-Cwirko
- Centre for Family and Community Medicine, Faculty of Medical Sciences, Medical University of Lodz, Lodz, Poland
- Izabela Zakowska
- Centre for Family and Community Medicine, Faculty of Medical Sciences, Medical University of Lodz, Lodz, Poland
10
Abstract
Artificial intelligence is already innovating in the provision of neurologic care. This review explores key artificial intelligence concepts; their application to neurologic diagnosis, prognosis, and treatment; and challenges that await their broader adoption. The development of new diagnostic biomarkers, individualization of prognostic information, and improved access to treatment are among the plethora of possibilities. These advances, however, reflect only the tip of the iceberg for the ways in which artificial intelligence may transform neurologic care in the future.
Affiliation(s)
- James M Hillis
- Digital Clinical Research Organization, Data Science Office, Mass General Brigham, Boston, Massachusetts; Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Bernardo C Bizzo
- Digital Clinical Research Organization, Data Science Office, Mass General Brigham, Boston, Massachusetts; Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
11
Diagnostic classification of Parkinson’s disease based on non-motor manifestations and machine learning strategies. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07256-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Non-motor manifestations of Parkinson’s disease (PD) appear early and have a significant impact on the quality of life of patients, but few studies have evaluated their predictive potential with machine learning algorithms. We evaluated 9 algorithms for discriminating PD patients from controls using a wide collection of non-motor clinical PD features from two databases: Biocruces (96 subjects) and PPMI (687 subjects). In addition, we evaluated whether the combination of both databases could improve the individual results. For each database 2 versions with different granularity were created and a feature selection process was performed. We observed that most of the algorithms were able to detect PD patients with high accuracy (>80%). Support Vector Machine and Multi-Layer Perceptron obtained the best performance, with an accuracy of 86.3% and 84.7%, respectively. Likewise, feature selection led to a significant reduction in the number of variables and to better performance. Besides, the enrichment of Biocruces database with data from PPMI moderately benefited the performance of the classification algorithms, especially the recall and to a lesser extent the accuracy, while the precision worsened slightly. The use of interpretable rules obtained by the RIPPER algorithm showed that simply using two variables (autonomic manifestations and olfactory dysfunction), it was possible to achieve an accuracy of 84.4%. Our study demonstrates that the analysis of non-motor parameters of PD through machine learning techniques can detect PD patients with high accuracy and recall, and allows us to select the most discriminative non-motor variables to create potential tools for PD screening.
|
12
|
Salari N, Kazeminia M, Sagha H, Daneshkhah A, Ahmadi A, Mohammadi M. The performance of various machine learning methods for Parkinson’s disease recognition: a systematic review. CURRENT PSYCHOLOGY 2022. [DOI: 10.1007/s12144-022-02949-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022]
|
13
|
Personalizing decision-making for persons with Parkinson's disease: where do we stand and what to improve? J Neurol 2022; 269:3569-3578. [PMID: 35084559 PMCID: PMC9217860 DOI: 10.1007/s00415-022-10969-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2021] [Revised: 01/10/2022] [Accepted: 01/11/2022] [Indexed: 11/05/2022]
Abstract
Background The large variety in symptoms and treatment effects across different persons with Parkinson’s disease (PD) warrants a personalized approach, ensuring that the best decision is made for each individual. We aimed to further clarify this process of personalized decision-making, from the perspective of medical professionals. Methods We audio-taped 52 consultations with PD patients and their neurologist or PD nurse-specialist, in 6 outpatient clinics. We focused coding of the transcripts on which decisions were made and on if and how decisions were personalized. We subsequently interviewed professionals to elaborate on how and why decisions were personalized, and which decisions would benefit most from a more personalized approach. Results Most decisions were related to medication, referral or lifestyle. Professionals balanced clinical factors, including individual (disease-) characteristics, and non-clinical factors, including patients’ preference, for each type of decision. These factors were often not explicitly discussed with the patient. Professionals experienced difficulties in personalizing decisions, mostly because evidence on the impact of characteristics of an individual patient on the outcome of the decision is unavailable. Categories of decisions for which professionals emphasized the importance of a more personalized perspective include choices not only for medication and advanced treatments, but also for referrals, lifestyle and diagnosis. Conclusions Clinical decision-making is a complex process, influenced by many different factors that differ for each decision and for each individual. In daily practice, it proves difficult to tailor decisions to individual (disease-) characteristics, probably because sufficient evidence on the impact of these individual characteristics on outcomes is lacking.
|
14
|
Mining imaging and clinical data with machine learning approaches for the diagnosis and early detection of Parkinson's disease. NPJ Parkinsons Dis 2022; 8:13. [PMID: 35064123 PMCID: PMC8783003 DOI: 10.1038/s41531-021-00266-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/31/2021] [Accepted: 12/10/2021] [Indexed: 12/14/2022] Open
Abstract
Parkinson’s disease (PD) is a common, progressive, and currently incurable neurodegenerative movement disorder. The diagnosis of PD is challenging, especially in the differential diagnosis of parkinsonism and in early PD detection. Due to the advantages of machine learning such as learning complex data patterns and making inferences for individuals, machine-learning techniques have been increasingly applied to the diagnosis of PD, and have shown some promising results. Machine-learning-based imaging applications have made it possible to help differentiate parkinsonism and detect PD at early stages automatically in a number of neuroimaging studies. Comparative studies have shown that machine-learning-based SPECT image analysis applications in PD have outperformed conventional semi-quantitative analysis in detecting PD-associated dopaminergic degeneration, performed comparably well as experts’ visual inspection, and helped improve PD diagnostic accuracy of radiologists. Using combined multi-modal (imaging and clinical) data in these applications may further enhance PD diagnosis and early detection. To integrate machine-learning-based diagnostic applications into clinical systems, further validation and optimization of these applications are needed to make them accurate and reliable. It is anticipated that machine-learning techniques will further help improve differential diagnosis of parkinsonism and early detection of PD, which may reduce the error rate of PD diagnosis and help detect PD at pre-motor stage to make it possible for early treatments (e.g., neuroprotective treatment) to slow down PD progression, prevent severe motor symptoms from emerging, and relieve patients from suffering.
|
15
|
E E, Carey JJ, Wang T, Yang L, Chan WP, Whelan B, Silke C, O'Sullivan M, Rooney B, McPartland A, O'Malley G, Brennan A, Yu M, Dempsey M. Conceptual design of the dual X-ray absorptiometry health informatics prediction system for osteoporosis care. Health Informatics J 2022; 28:14604582211066465. [PMID: 35257612 DOI: 10.1177/14604582211066465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Osteoporotic fractures are a major and growing public health problem, which is strongly associated with other illnesses and multi-morbidity. Big data analytics has the potential to improve care for osteoporotic fractures and other non-communicable diseases (NCDs), reduce healthcare costs, and improve healthcare decision-making for patients with multi-disorders. However, robust and comprehensive utilization of healthcare big data in osteoporosis care practice remains unsatisfactory. In this paper, we present a conceptual design of an intelligent analytics system, namely, the dual X-ray absorptiometry (DXA) health informatics prediction (HIP) system, for healthcare big data research and development. Comprising data source, extraction, transformation, loading, modelling and application, the DXA HIP system was applied in an osteoporosis healthcare context for fracture risk prediction and the investigation of multi-morbidity risk. Data was sourced from four DXA machines located in three healthcare centres in Ireland. The DXA HIP system is novel within the Irish context as it enables the study of fracture-related issues in a larger and more representative Irish population than previous studies. We propose that this system can also be applied to investigate other NCDs, with the potential to improve the overall quality of patient care and substantially reduce the burden and cost of all NCDs.
Affiliation(s)
| | - Attracta Brennan
- Department of Industrial Engineering, Tsinghua University, Beijing, China
| |
|
16
|
Rainey S, Erden YJ, Resseguier A. AIM, Philosophy, and Ethics. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
17
|
Li W, Zhou X, Yang Q. Designing medical artificial intelligence for in- and out-groups. COMPUTERS IN HUMAN BEHAVIOR 2021. [DOI: 10.1016/j.chb.2021.106929] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 02/07/2023]
|
18
|
Gul S, Bano S, Shah T. Exploring data mining: facets and emerging trends. DIGITAL LIBRARY PERSPECTIVES 2021. [DOI: 10.1108/dlp-08-2020-0078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Purpose
Data mining along with its varied technologies like numerical mining, textual mining, multimedia mining, web mining, sentiment analysis and big data mining proves itself as an emerging field and manifests itself in the form of different techniques such as information mining; big data mining; big data mining and Internet of Things (IoT); and educational data mining. This paper aims to discuss how these technologies and techniques are used to derive information and, eventually, knowledge from data.
Design/methodology/approach
An extensive review of literature on data mining and its allied techniques was carried out to ascertain the emerging procedures and techniques in the domain of data mining. Clarivate Analytics’ Web of Science and Sciverse Scopus were explored to discover the extent of literature published on data mining and its varied facets. Literature was searched against various keywords such as data mining; information mining; big data; big data and IoT; and educational data mining. Further, the works citing the literature on data mining were also explored to visualize a broad gamut of emerging techniques about this growing field.
Findings
The study validates that knowledge discovery in databases has rendered data mining as an emerging field; the data present in these databases paves the way for data mining techniques and analytics. This paper provides a unique view about the usage of data, and logical patterns derived from it, how new procedures, algorithms and mining techniques are being continuously upgraded for their multipurpose use for the betterment of human life and experiences.
Practical implications
The paper highlights different aspects of data mining, its different technological approaches, and how these emerging data technologies are used to derive logical insights from data and make data more meaningful.
Originality/value
The paper tries to highlight the current trends and facets of data mining.
|
19
|
Anderson C, Bekele Z, Qiu Y, Tschannen D, Dinov ID. Modeling and prediction of pressure injury in hospitalized patients using artificial intelligence. BMC Med Inform Decis Mak 2021; 21:253. [PMID: 34461876 PMCID: PMC8406893 DOI: 10.1186/s12911-021-01608-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/23/2021] [Accepted: 08/08/2021] [Indexed: 02/03/2023] Open
Abstract
BACKGROUND Hospital-acquired pressure injuries (PIs) induce significant patient suffering, inflate healthcare costs, and increase clinical co-morbidities. PIs are mostly due to bed-immobility, sensory impairment, bed positioning, and length of hospital stay. In this study, we use electronic health records and administrative data to examine the contributing factors to PI development using artificial intelligence (AI). METHODS We used advanced data science techniques to first preprocess the data and then train machine learning classifiers to predict the probability of developing PIs. The AI training was based on large, incongruent, incomplete, heterogeneous, and time-varying data of hospitalized patients. Both model-based statistical methods and model-free AI strategies were used to forecast PI outcomes and determine the salient features that are highly predictive of the outcomes. RESULTS Our findings reveal that PI prediction by model-free techniques outperforms model-based forecasts. The performance of all AI methods is improved by rebalancing the training data and by including the Braden score in the model learning phase. Compared to neural networks and linear modeling, with and without rebalancing or using Braden scores, random forest consistently generated the optimal PI forecasts. CONCLUSIONS AI techniques show promise to automatically identify patients at risk for hospital-acquired PIs in different surgical services. Our PI prediction model provides a first generation of AI guidance to prescreen patients at risk for developing PIs. CLINICAL IMPACT This study provides a foundation for designing, implementing, and assessing novel interventions addressing specific healthcare needs. Specifically, this approach allows examining the impact of various dynamic, personalized, and clinical-environment effects on PI prevention for hospital patients receiving care from various surgical services.
Affiliation(s)
- Christine Anderson
- School of Nursing, University of Michigan, Ann Arbor, MI 48109 USA
| | - Zerihun Bekele
- Statistics Online Computational Resource (SOCR), University of Michigan, Ann Arbor, MI 48109 USA
| | - Yongkai Qiu
- Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN 46556 USA
| | - Dana Tschannen
- School of Nursing, University of Michigan, Ann Arbor, MI 48109 USA
| | - Ivo D. Dinov
- School of Nursing, University of Michigan, Ann Arbor, MI 48109 USA
- Statistics Online Computational Resource (SOCR), University of Michigan, Ann Arbor, MI 48109 USA
- Department of Health Behavior and Biological Sciences (HBBS), School of Nursing, University of Michigan, Ann Arbor, MI 48109 USA
- Michigan Institute for Data Science (MIDAS), University of Michigan, Ann Arbor, MI 48109 USA
| |
|
20
|
Vellameeran FA, Brindha T. An integrated review on machine learning approaches for heart disease prediction: Direction towards future research gaps. BIO-ALGORITHMS AND MED-SYSTEMS 2021. [DOI: 10.1515/bams-2020-0069] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
Abstract
Objectives
To provide a clear literature review of state-of-the-art heart disease prediction models.
Methods
It reviews 61 research papers and states the significant analysis. Initially, the analysis addresses the contributions of each work and observes the simulation environment. Here, the different types of machine learning algorithms deployed in each contribution are identified. In addition, the datasets utilized by existing heart disease prediction models are observed.
Results
The performance measures computed across the papers, such as prediction accuracy, prediction error, specificity, sensitivity, and F-measure, are examined. Further, the best performance is also checked to confirm the effectiveness of the contributions.
Conclusions
The comprehensive research challenges and the gap are portrayed based on the development of intelligent methods concerning the unresolved challenges in heart disease prediction using data mining techniques.
Affiliation(s)
| | - Thomas Brindha
- Department of Information Technology , Noorul Islam Centre for Higher Education , Kanyakumari , India
| |
|
21
|
Zhao L, Batta I, Matloff W, O'Driscoll C, Hobel S, Toga AW. Neuroimaging PheWAS (Phenome-Wide Association Study): A Free Cloud-Computing Platform for Big-Data, Brain-Wide Imaging Association Studies. Neuroinformatics 2021; 19:285-303. [PMID: 32822005 DOI: 10.1007/s12021-020-09486-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Large-scale, case-control genome-wide association studies (GWASs) have revealed genetic variations associated with diverse neurological and psychiatric disorders. Recent advances in neuroimaging and genomic databases of large healthy and diseased cohorts have empowered studies to characterize effects of the discovered genetic factors on brain structure and function, implicating neural pathways and genetic mechanisms in the underlying biology. However, the unprecedented scale and complexity of the imaging and genomic data requires new advanced biomedical data science tools to manage, process and analyze the data. In this work, we introduce Neuroimaging PheWAS (phenome-wide association study): a web-based system for searching over a wide variety of brain-wide imaging phenotypes to discover true system-level gene-brain relationships using a unified genotype-to-phenotype strategy. This design features a user-friendly graphical user interface (GUI) for anonymous data uploading, study definition and management, and interactive result visualizations as well as a cloud-based computational infrastructure and multiple state-of-art methods for statistical association analysis and multiple comparison correction. We demonstrated the potential of Neuroimaging PheWAS with a case study analyzing the influences of the apolipoprotein E (APOE) gene on various brain morphological properties across the brain in the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Benchmark tests were performed to evaluate the system's performance using data from UK Biobank. The Neuroimaging PheWAS system is freely available. It simplifies the execution of PheWAS on neuroimaging data and provides an opportunity for imaging genetics studies to elucidate routes at play for specific genetic variants on diseases in the context of detailed imaging phenotypic data.
Affiliation(s)
- Lu Zhao
- Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
| | - Ishaan Batta
- Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
| | - William Matloff
- Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
| | - Caroline O'Driscoll
- Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
| | - Samuel Hobel
- Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
| | - Arthur W Toga
- Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA.
| |
|
22
|
Mei J, Desrosiers C, Frasnelli J. Machine Learning for the Diagnosis of Parkinson's Disease: A Review of Literature. Front Aging Neurosci 2021; 13:633752. [PMID: 34025389 PMCID: PMC8134676 DOI: 10.3389/fnagi.2021.633752] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 11/26/2020] [Accepted: 03/22/2021] [Indexed: 12/26/2022] Open
Abstract
Diagnosis of Parkinson's disease (PD) is commonly based on medical observations and assessment of clinical signs, including the characterization of a variety of motor symptoms. However, traditional diagnostic approaches may suffer from subjectivity as they rely on the evaluation of movements that are sometimes subtle to human eyes and therefore difficult to classify, leading to possible misclassification. In the meantime, early non-motor symptoms of PD may be mild and can be caused by many other conditions. Therefore, these symptoms are often overlooked, making diagnosis of PD at an early stage challenging. To address these difficulties and to refine the diagnosis and assessment procedures of PD, machine learning methods have been implemented for the classification of PD and healthy controls or patients with similar clinical presentations (e.g., movement disorders or other Parkinsonian syndromes). To provide a comprehensive overview of data modalities and machine learning methods that have been used in the diagnosis and differential diagnosis of PD, in this study, we conducted a literature review of studies published until February 14, 2020, using the PubMed and IEEE Xplore databases. A total of 209 studies were included, extracted for relevant information and presented in this review, with an investigation of their aims, sources of data, types of data, machine learning methods and associated outcomes. These studies demonstrate a high potential for adaptation of machine learning methods and novel biomarkers in clinical decision making, leading to increasingly systematic, informed diagnosis of PD.
Affiliation(s)
- Jie Mei
- Chemosensory Neuroanatomy Lab, Department of Anatomy, Université du Québec à Trois-Rivières (UQTR), Trois-Rivières, QC, Canada
| | - Christian Desrosiers
- Laboratoire d'Imagerie, de Vision et d'Intelligence Artificielle (LIVIA), Department of Software and IT Engineering, École de Technologie Supérieure, Montreal, QC, Canada
| | - Johannes Frasnelli
- Chemosensory Neuroanatomy Lab, Department of Anatomy, Université du Québec à Trois-Rivières (UQTR), Trois-Rivières, QC, Canada
- Centre de Recherche de l'Hôpital du Sacré-Coeur de Montréal, Centre Intégré Universitaire de Santé et de Services Sociaux du Nord-de-l'Île-de-Montréal (CIUSSS du Nord-de-l'Île-de-Montréal), Montreal, QC, Canada
| |
|
23
|
van den Heuvel L, Dorsey RR, Prainsack B, Post B, Stiggelbout AM, Meinders MJ, Bloem BR. Quadruple Decision Making for Parkinson's Disease Patients: Combining Expert Opinion, Patient Preferences, Scientific Evidence, and Big Data Approaches to Reach Precision Medicine. JOURNAL OF PARKINSONS DISEASE 2021; 10:223-231. [PMID: 31561387 PMCID: PMC7029360 DOI: 10.3233/jpd-191712] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 12/11/2022]
Abstract
Clinical decision making for Parkinson’s disease patients is supported by a combination of three distinct information resources: best available scientific evidence, professional expertise, and the personal needs and preferences of patients. All three sources have clear value but also share several important limitations, mainly regarding subjectivity, generalizability and variability. For example, current scientific evidence, especially from controlled clinical trials, is often based on selected study populations, making it difficult to translate the outcome to the care for individual patients in everyday clinical practice. Big data, including data from real-life unselected Parkinson populations, can help to bridge this information gap. Fine-grained patient profiles created from big data have the potential to aid in identifying therapeutic approaches that will be most effective given each patient’s individual characteristics, which is particularly important for a disorder characterized by such tremendous interindividual variability as Parkinson’s disease. In this viewpoint, we argue that big data approaches should be acknowledged and harnessed, not to replace existing information resources, but rather as a fourth and complementary source of information in clinical decision making, helping to represent the full complexity of individual patients. We introduce the ‘quadruple decision making’ model and illustrate its mode of action by showing how this can be used to pursue precision medicine for persons living with Parkinson’s disease.
Affiliation(s)
- Lieneke van den Heuvel
- Department of Neurology, Radboud University Medical Centre, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
| | - Ray R Dorsey
- Department of Neurology, University of Rochester Medical Centre, Rochester, NY, USA
| | - Barbara Prainsack
- Department of Political Science, University of Vienna, AT; and Department of Global Health & Social Medicine, King's College London, London, UK
| | - Bart Post
- Department of Neurology, Radboud University Medical Centre, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
| | - Anne M Stiggelbout
- Medical Decision Making, Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, The Netherlands
| | - Marjan J Meinders
- Radboud University Medical Centre, Radboud Institute for Health Sciences; Scientific Centre for Quality of Healthcare, Nijmegen, The Netherlands
| | - Bastiaan R Bloem
- Department of Neurology, Radboud University Medical Centre, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
| |
|
24
|
Veronesi G, Grassi G, Savelli G, Quatto P, Zambon A. Big data, observational research and P-value: a recipe for false-positive findings? A study of simulated and real prospective cohorts. Int J Epidemiol 2021; 49:876-884. [PMID: 31620789 PMCID: PMC7394945 DOI: 10.1093/ije/dyz206] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 09/11/2019] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND An increasing number of observational studies combine large sample sizes with low participation rates, which could lead to standard inference failing to control the false-discovery rate. We investigated if the 'empirical calibration of P-value' method (EPCV), reliant on negative controls, can preserve type I error in the context of survival analysis. METHODS We used simulated cohort studies with 50% participation rate and two different selection bias mechanisms, and a real-life application on predictors of cancer mortality using data from four population-based cohorts in Northern Italy (n = 6976 men and women aged 25-74 years at baseline and 17 years of median follow-up). RESULTS Type I error for the standard Cox model was above the 5% nominal level in 15 out of 16 simulated settings; for n = 10 000, the chances of a null association with hazard ratio = 1.05 having a P-value < 0.05 were 42.5%. Conversely, EPCV with 10 negative controls preserved the 5% nominal level in all the simulation settings, reducing bias in the point estimate by 80-90% when its main assumption was verified. In the real case, 15 out of 21 (71%) blood markers with no association with cancer mortality according to literature had a P-value < 0.05 in age- and gender-adjusted Cox models. After calibration, only 1 (4.8%) remained statistically significant. CONCLUSIONS In the analyses of large observational studies prone to selection bias, the use of empirical distribution to calibrate P-values can substantially reduce the number of trivial results needing further screening for relevance and external validity.
Affiliation(s)
- Giovanni Veronesi
- Research Center in Epidemiology and Preventive Medicine, Department of Medicine and Surgery, University of Insubria, Varese, Italy
| | - Guido Grassi
- Clinica Medica, Department of Medicine and Surgery, University of Milano-Bicocca, Milano, Italy
| | - Giordano Savelli
- U.O. Medicina Nucleare, Fondazione Poliambulanza Istituto Ospedaliero, Brescia, Italy
| | - Piero Quatto
- Department of Economics, Management and Statistics
| | - Antonella Zambon
- Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Milano, Italy
| |
|
25
|
Varrecchia T, Castiglia SF, Ranavolo A, Conte C, Tatarelli A, Coppola G, Di Lorenzo C, Draicchio F, Pierelli F, Serrao M. An artificial neural network approach to detect presence and severity of Parkinson's disease via gait parameters. PLoS One 2021; 16:e0244396. [PMID: 33606730 PMCID: PMC7894951 DOI: 10.1371/journal.pone.0244396] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/17/2020] [Accepted: 12/08/2020] [Indexed: 01/16/2023] Open
Abstract
Introduction Gait deficits are debilitating in people with Parkinson’s disease (PwPD), which inevitably deteriorate over time. Gait analysis is a valuable method to assess disease-specific gait patterns and their relationship with the clinical features and progression of the disease. Objectives Our study aimed to i) develop an automated diagnostic algorithm based on machine-learning techniques (artificial neural networks [ANNs]) to classify the gait deficits of PwPD according to disease progression in the Hoehn and Yahr (H-Y) staging system, and ii) identify a minimum set of gait classifiers. Methods We evaluated 76 PwPD (H-Y stage 1–4) and 67 healthy controls (HCs) by computerized gait analysis. We computed the time-distance parameters and the ranges of angular motion (RoMs) of the hip, knee, ankle, trunk, and pelvis. Principal component analysis was used to define a subset of features including all gait variables. An ANN approach was used to identify gait deficits according to the H-Y stage. Results We identified a combination of a small number of features that distinguished PwPDs from HCs (one combination of two features: knee and trunk rotation RoMs) and identified the gait patterns between different H-Y stages (two combinations of four features: walking speed and hip, knee, and ankle RoMs; walking speed and hip, knee, and trunk rotation RoMs). Conclusion The ANN approach enabled automated diagnosis of gait deficits in several symptomatic stages of Parkinson’s disease. These results will inspire future studies to test the utility of gait classifiers for the evaluation of treatments that could modify disease progression.
Affiliation(s)
- Tiwana Varrecchia
- Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL, Monte Porzio Catone Rome, Rome, Italy
| | - Stefano Filippo Castiglia
- Department of Medico-Surgical Sciences and Biotechnologies, University of Rome Sapienza, Latina, Italy
| | - Alberto Ranavolo
- Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL, Monte Porzio Catone Rome, Rome, Italy
| | - Antonella Tatarelli
- Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL, Monte Porzio Catone Rome, Rome, Italy
- Department of Human Neurosciences, University of Rome Sapienza, Rome, Italy
| | - Gianluca Coppola
- Department of Medico-Surgical Sciences and Biotechnologies, University of Rome Sapienza, Latina, Italy
| | - Cherubino Di Lorenzo
- Department of Medico-Surgical Sciences and Biotechnologies, University of Rome Sapienza, Latina, Italy
| | - Francesco Draicchio
- Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL, Monte Porzio Catone Rome, Rome, Italy
| | - Francesco Pierelli
- Department of Medico-Surgical Sciences and Biotechnologies, University of Rome Sapienza, Latina, Italy
| | - Mariano Serrao
- Department of Medico-Surgical Sciences and Biotechnologies, University of Rome Sapienza, Latina, Italy
| |
|
26
|
Folador JP, Vieira MF, Pereira AA, Andrade ADO. Open-source data management system for Parkinson's disease follow-up. PeerJ Comput Sci 2021; 7:e396. [PMID: 33817042 PMCID: PMC7959639 DOI: 10.7717/peerj-cs.396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2020] [Accepted: 01/26/2021] [Indexed: 06/12/2023]
Abstract
BACKGROUND Parkinson's disease (PD) is a neurodegenerative condition of the central nervous system that causes motor and non-motor dysfunctions. The disease affects 1% of the world population over 60 years and remains cureless. Knowledge and monitoring of PD are essential to provide better living conditions for patients. Thus, diagnostic exams and monitoring of the disease can generate a large amount of data from a given patient. This study proposes the development and usability evaluation of an integrated system, which can be used in clinical and research settings to manage biomedical data collected from PD patients. METHODS A system, so-called Sistema Integrado de Dados Biomédicos (SIDABI) (Integrated Biomedical Data System), was designed following the model-view-controller (MVC) standard. A modularized architecture was created in which all the other modules are connected to a central security module. Thirty-six examiners evaluated the system usability through the System Usability Scale (SUS). The agreement between examiners was measured by Kendall's coefficient with a significance level of 1%. RESULTS The free and open-source web-based system was implemented using modularized and responsive methods to adapt the system features on multiple platforms. The mean SUS score was 82.99 ± 13.97 points. The overall agreement was 70.2%, as measured by Kendall's coefficient (p < 0.001). CONCLUSION According to the SUS scores, the developed system has good usability. The system proposed here can help researchers to organize and share information, avoiding data loss and fragmentation. Furthermore, it can help in the follow-up of PD patients, in the training of professionals involved in the treatment of the disorder, and in studies that aim to find hidden correlations in data.
Affiliation(s)
- João Paulo Folador
  - Centre for Innovation and Technology Assessment in Health, Postgraduate Program in Electrical and Biomedical Engineering, Faculty of Electrical Engineering, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
- Marcus Fraga Vieira
  - Bioengineering and Biomechanics Laboratory, Federal University of Goiás, Goiânia, Goiás, Brazil
- Adriano Alves Pereira
  - Centre for Innovation and Technology Assessment in Health, Postgraduate Program in Electrical and Biomedical Engineering, Faculty of Electrical Engineering, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
- Adriano de Oliveira Andrade
  - Centre for Innovation and Technology Assessment in Health, Postgraduate Program in Electrical and Biomedical Engineering, Faculty of Electrical Engineering, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
|
27
|
Kedia S, Pahwa B, Bali O, Goyal S. Applications of Machine Learning in Pediatric Hydrocephalus: A Systematic Review. Neurol India 2021; 69:S380-S389. [DOI: 10.4103/0028-3886.332287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
|
28
|
AIM, Philosophy and Ethics. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_243-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
29
|
Xicoy H, Vila M, Laguna A. Systems Medicine in Parkinson's Disease: Joining Efforts to Change History. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11612-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
|
30
|
Bhidayasiri R, Mari Z. Digital phenotyping in Parkinson's disease: Empowering neurologists for measurement-based care. Parkinsonism Relat Disord 2020; 80:35-40. [DOI: 10.1016/j.parkreldis.2020.08.038] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/03/2020] [Revised: 08/26/2020] [Accepted: 08/28/2020] [Indexed: 12/24/2022]
|
31
|
Su C, Tong J, Wang F. Mining genetic and transcriptomic data using machine learning approaches in Parkinson's disease. NPJ PARKINSONS DISEASE 2020; 6:24. [PMID: 32964109 PMCID: PMC7481248 DOI: 10.1038/s41531-020-00127-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/28/2020] [Accepted: 08/13/2020] [Indexed: 01/08/2023]
Abstract
High-throughput techniques have generated abundant genetic and transcriptomic data of Parkinson’s disease (PD) patients but data analysis approaches such as traditional statistical methods have not provided much in the way of insightful integrated analysis or interpretation of the data. As an advanced computational approach, machine learning, which enables people to identify complex patterns and insight from data, has consequently been harnessed to analyze and interpret large, highly complex genetic and transcriptomic data toward a better understanding of PD. In particular, machine learning models have been developed to integrate patient genotype data alone or combined with demographic, clinical, neuroimaging, and other information, for PD outcome study. They have also been used to identify biomarkers of PD based on transcriptomic data, e.g., gene expression profiles from microarrays. This study overviews the relevant literature on using machine learning models for genetic and transcriptomic data analysis in PD, points out remaining challenges, and suggests future directions accordingly. Undoubtedly, the use of machine learning is amplifying PD genetic and transcriptomic achievements for accelerating the study of PD. Existing studies have demonstrated the great potential of machine learning in discovering hidden patterns within genetic or transcriptomic information and thus revealing clues underpinning pathology and pathogenesis. Moving forward, by addressing the remaining challenges, machine learning may advance our ability to precisely diagnose, prognose, and treat PD.
Affiliation(s)
- Chang Su
  - Department of Population Health Sciences, Weill Cornell Medical College, Cornell University, New York, NY, USA
- Jie Tong
  - Department of Mechanical and Aerospace Engineering, New York University, New York, NY, USA
- Fei Wang
  - Department of Population Health Sciences, Weill Cornell Medical College, Cornell University, New York, NY, USA
|
32
|
Katapodi MC, Ming C, Northouse LL, Duffy SA, Duquette D, Mendelsohn-Victor KE, Milliron KJ, Merajver SD, Dinov ID, Janz NK. Genetic Testing and Surveillance of Young Breast Cancer Survivors and Blood Relatives: A Cluster Randomized Trial. Cancers (Basel) 2020; 12:cancers12092526. [PMID: 32899538 PMCID: PMC7563571 DOI: 10.3390/cancers12092526] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/02/2020] [Revised: 09/02/2020] [Accepted: 09/03/2020] [Indexed: 01/11/2023]
Abstract
Simple Summary Identifying breast cancer patients with pathogenic mutations that run in their families may improve the follow-up care they receive and breast cancer screening of their close relatives. In this study we identified breast cancer patients with high chances of having a pathogenic mutation and their close female relatives. We developed and tested two different kinds of letters and booklets that presented either personalized or generic information about screening and breast cancer that runs in families, and we encouraged participants to seek genetic evaluation. We found that both types of letters worked equally well for breast cancer patients and for relatives, regardless of their racial background. The personalized letters had slightly better outcomes. Some breast cancer patients and their relatives used genetic services and improved their screening practices. Black patients and their relatives were more satisfied with the booklets than other participants. Abstract We compared a tailored and a targeted intervention designed to increase genetic testing, clinical breast exam (CBE), and mammography in young breast cancer survivors (YBCS) (diagnosed <45 years old) and their blood relatives. A two-arm cluster randomized trial recruited a random sample of YBCS from the Michigan cancer registry and up to two of their blood relatives. Participants were stratified according to race and randomly assigned as family units to the tailored (n = 637) or the targeted (n = 595) intervention. Approximately 40% of participants were Black. Based on intention-to-treat analyses, YBCS in the tailored arm reported higher self-efficacy for genetic services (p = 0.0205) at 8-months follow-up. Genetic testing increased approximately 5% for YBCS in the tailored and the targeted arm (p ≤ 0.001; p < 0.001) and for Black and White/Other YBCS (p < 0.001; p < 0.001). 
CBEs and mammograms increased significantly in both arms, 5% for YBCS and 10% for relatives and were similar for Blacks and White/Others. YBCS and relatives needing less support from providers reported significantly higher self-efficacy and intention for genetic testing and surveillance. Black participants reported significantly higher satisfaction and acceptability. Effects of these two low-resource interventions were comparable to previous studies. Materials are suitable for Black women at risk for hereditary breast/ovarian cancer (HBOC).
Affiliation(s)
- Maria C. Katapodi
  - Department of Clinical Research, Faculty of Medicine, University of Basel, 4055 Basel, Switzerland
  - School of Nursing, University of Michigan, Ann Arbor, MI 48109-5482, USA; (L.L.N.); (K.E.M.-V.)
  - Correspondence: Tel.: +41-61-207-04-30
- Chang Ming
  - Department of Clinical Research, Faculty of Medicine, University of Basel, 4055 Basel, Switzerland
- Laurel L. Northouse
  - School of Nursing, University of Michigan, Ann Arbor, MI 48109-5482, USA; (L.L.N.); (K.E.M.-V.)
- Sonia A. Duffy
  - College of Nursing, Ohio State University, Columbus, OH 43210, USA
- Debra Duquette
  - Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Kara J. Milliron
  - Comprehensive Cancer Center, University of Michigan, Ann Arbor, MI 48109-5618, USA
- Sofia D. Merajver
  - School of Public Health, University of Michigan, Ann Arbor, MI 48109-5618, USA; (S.D.M.); (N.K.J.)
- Ivo D. Dinov
  - Statistics Online Computational Resource, School of Nursing, University of Michigan, Ann Arbor, MI 48109-2003, USA
- Nancy K. Janz
  - School of Public Health, University of Michigan, Ann Arbor, MI 48109-5618, USA; (S.D.M.); (N.K.J.)
|
33
|
Machine learning-based lifetime breast cancer risk reclassification compared with the BOADICEA model: impact on screening recommendations. Br J Cancer 2020; 123:860-867. [PMID: 32565540 PMCID: PMC7463251 DOI: 10.1038/s41416-020-0937-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/02/2019] [Revised: 05/13/2020] [Accepted: 05/29/2020] [Indexed: 12/17/2022]
Abstract
Background The clinical utility of machine-learning (ML) algorithms for breast cancer risk prediction and screening practices is unknown. We compared classification of lifetime breast cancer risk based on ML and the BOADICEA model. We explored the differences in risk classification and their clinical impact on screening practices. Methods We used three different ML algorithms and the BOADICEA model to estimate lifetime breast cancer risk in a sample of 112,587 individuals from 2481 families from the Oncogenetic Unit, Geneva University Hospitals. Performance of algorithms was evaluated using the area under the receiver operating characteristic (AU-ROC) curve. Risk reclassification was compared for 36,146 breast cancer-free women of ages 20–80. The impact on recommendations for mammography surveillance was based on the Swiss Surveillance Protocol. Results The predictive accuracy of ML-based algorithms (0.843 ≤ AU-ROC ≤ 0.889) was superior to BOADICEA (AU-ROC = 0.639) and reclassified 35.3% of women in different risk categories. The largest reclassification (20.8%) was observed in women characterised as ‘near population’ risk by BOADICEA. Reclassification had the largest impact on screening practices of women younger than 50. Conclusion ML-based reclassification of lifetime breast cancer risk occurred in approximately one in three women. Reclassification is important for younger women because it impacts clinical decision-making for the initiation of screening.
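The AU-ROC values quoted above can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The sketch below computes that quantity directly from pairwise comparisons. It is an illustration of the metric only, not the study's pipeline; the function name `auc_roc` and the toy labels and scores are hypothetical.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for cases, 0 for non-cases.
    scores: predicted risk, higher meaning more likely to be a case.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the sample size, so for cohorts of the size described in the abstract a rank-based implementation from an established library would be the practical choice.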
|
34
|
Investigating the Impact of Big Data Analytics on Perceived Sales Performance: The Mediating Role of Customer Relationship Management Capabilities. COMPLEXITY 2020. [DOI: 10.1155/2020/5186870] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022]
Abstract
A persistent question for information technology researchers and practitioners is how big data analytics (BDA) can improve sales performance. Therefore, this study proposed a research model to investigate the impact of BDA on perceived sales performance in accordance with the resource-based view (RBV) and dynamic capability theory. The 416 valid responses collected from the employees of pharmaceutical organizations were analyzed using structural equation modelling to test the proposed research model. Results indicated that the BDA and customer relationship management (CRM) capabilities shared a strong positive impact on perceived sales performance. BDA, as organizational resources, creates organizational dynamic capabilities, such as CRM capabilities. BDA and CRM capabilities can influence perceived sales performance. Furthermore, CRM capabilities have a significant mediating impact on the relationships between BDA and perceived sales performance. This study also highlighted the practical and theoretical implications of the proposed model, the research limitations, and the future research directions.
|
35
|
Oh B, Yun JY, Yeo EC, Kim DH, Kim J, Cho BJ. Prediction of Suicidal Ideation among Korean Adults Using Machine Learning: A Cross-Sectional Study. Psychiatry Investig 2020; 17:331-340. [PMID: 32213803 PMCID: PMC7176567 DOI: 10.30773/pi.2019.0270] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/10/2019] [Accepted: 02/07/2020] [Indexed: 12/30/2022]
Abstract
OBJECTIVE Suicidal ideation (SI) precedes an actual suicidal event, so screening individuals for SI is important for suicide prevention. This study aimed to identify the factors associated with SI and to build prediction models in Korean adults using machine learning methods. METHODS The 2010-2013 dataset of the Korea National Health and Nutritional Examination Survey was used as the training dataset (n=16,437), and the subset collected in 2015 was used as the testing dataset (n=3,788). Various machine learning algorithms were applied and compared to the conventional logistic regression (LR)-based model. RESULTS Common risk factors for SI included stress awareness, experience of continuous depressive mood, EQ-5D score, depressive disorder, household income, educational status, alcohol abuse, and unmet medical service needs. The prediction performances of the machine learning models, as measured by the area under the receiver operating characteristic curve, ranged from 0.794 to 0.877; some exceeded that of the conventional LR model (0.867). The Bayesian network, LogitBoost with LR, and ANN models outperformed the conventional LR model. CONCLUSION A machine learning-based approach could provide better SI prediction performance than a conventional LR-based model. Such models may help primary care physicians identify patients at risk of SI and facilitate the early prevention of suicide.
Affiliation(s)
- Bumjo Oh
  - Department of Family Medicine, SMG-SNU Boramae Medical Center, Seoul, Republic of Korea
- Je-Yeon Yun
  - Seoul National University Hospital, Seoul, Republic of Korea
  - Yeongeon Student Support Center, Seoul National University College of Medicine, Seoul, Republic of Korea
- Eun Chong Yeo
  - School of Software, Hallym University, Chuncheon, Republic of Korea
- Dong-Hoi Kim
  - School of Software, Hallym University, Chuncheon, Republic of Korea
- Jin Kim
  - School of Software, Hallym University, Chuncheon, Republic of Korea
- Bum-Joo Cho
  - Department of Ophthalmology, Hallym University Sacred Heart Hospital, Hallym University College of Medicine, Anyang, Republic of Korea
  - Medical Artificial Intelligence Center, Hallym University Medical Center, Anyang, Republic of Korea
  - Institute of New Frontier Research, Hallym University College of Medicine, Chuncheon, Republic of Korea
|
36
|
Dinov ID. Modernizing the Methods and Analytics Curricula for Health Science Doctoral Programs. Front Public Health 2020; 8:22. [PMID: 32117857 PMCID: PMC7031195 DOI: 10.3389/fpubh.2020.00022] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/17/2019] [Accepted: 01/23/2020] [Indexed: 12/24/2022]
Abstract
This perspective provides a rationale for redesigning and a framework for expanding the graduate health science analytics and biomedical doctoral program curricula. It responds to digital revolution pressures, ubiquitous proliferation of big biomedical data, substantial recent advances in scientific technologies, and rapid progress in health analytics. Specifically, the paper presents a set of common prerequisites, a proposal for core computational and data analytic curriculum, and a list of expected outcome competencies for graduates of doctoral health science and biomedical programs. The manuscript emphasizes the necessity for coordinated efforts of all stakeholders, including trainees, educators, academic institutions, funding agencies, and policy makers. Concrete recommendations are presented of how to ensure graduates with terminal health science analytics and biomedical degrees are trained and able to continuously self-learn, effectively communicate across disciplines, and promote adaptation and change to counteract the relentless pace of automation and the law of diminishing returns.
Affiliation(s)
- Ivo D. Dinov
  - Statistics Online Computational Resource, Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, United States
  - Neuroscience Graduate Program, University of Michigan, Ann Arbor, MI, United States
  - Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
|
37
|
Tang M, Gao C, Goutman SA, Kalinin A, Mukherjee B, Guan Y, Dinov ID. Model-Based and Model-Free Techniques for Amyotrophic Lateral Sclerosis Diagnostic Prediction and Patient Clustering. Neuroinformatics 2019; 17:407-421. [PMID: 30460455 PMCID: PMC6527505 DOI: 10.1007/s12021-018-9406-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a complex progressive neurodegenerative disorder with an estimated prevalence of about 5 per 100,000 people in the United States. In this study, ALS disease progression is measured by the change in the Amyotrophic Lateral Sclerosis Functional Rating Scale (ALSFRS) score over time. The study aims to provide clinical decision support for timely forecasting of the ALS trajectory as well as accurate and reproducible computable phenotypic clustering of participants. Patient data are extracted from DREAM-Phil Bowen ALS Prediction Prize4Life Challenge data, most of which are from the Pooled Resource Open-Access ALS Clinical Trials Database (PRO-ACT) archive. We employed model-based and model-free machine-learning methods to predict the change of the ALSFRS score over time. Using training and testing data, we quantified and compared the performance of different techniques. We also used unsupervised machine learning methods to cluster the patients into separate computable phenotypes and interpret the derived subcohorts. Direct prediction of univariate clinical outcomes based on model-based (linear models) or model-free (machine learning techniques: random forest and Bayesian additive regression trees, BART) methods was only moderately successful. The correlation coefficients between clinically observed changes in ALSFRS scores and their model-based/model-free predicted counterparts were 0.427 (random forest) and 0.545 (BART). The reliability of these results was assessed using internal statistical cross-validation as well as external data validation. Unsupervised clustering generated very reliable and consistent partitions of the patient cohort into four computable phenotypic subgroups. These clusters were explicated by identifying specific salient clinical features included in the PRO-ACT archive that discriminate between the derived subcohorts. There are differences between alternative analytical methods in forecasting specific clinical phenotypes. Although predicting univariate clinical outcomes may be challenging, our results suggest that modern data science strategies are useful in clustering patients and generating evidence-based ALS hypotheses about complex interactions of multivariate factors. Predicting univariate clinical outcomes using the PRO-ACT data yields only marginal accuracy (about 70%). However, unsupervised clustering of participants into sub-groups generates stable, reliable and consistent (exceeding 95%) computable phenotypes whose explication requires interpretation of multivariate sets of features. HIGHLIGHTS: • Used a large ALS data archive of 8,000 patients consisting of 3 million records, including 200 clinical features tracked over 12 months. • Employed model-based and model-free methods to predict ALSFRS changes over time, cluster patients into cohorts, and derive computable phenotypes. • Research findings include stable, reliable, and consistent (95%) patient stratification into computable phenotypes. However, clinical explication of the results requires interpretation of multivariate information.
Affiliation(s)
- Ming Tang
  - Statistics Online Computational Resource, Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Chao Gao
  - Statistics Online Computational Resource, Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Stephen A Goutman
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109, USA
- Alexandr Kalinin
  - Statistics Online Computational Resource, Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Ivo D Dinov
  - Statistics Online Computational Resource, Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, 48109, USA
|
38
|
Ming C, Viassolo V, Probst-Hensch N, Chappuis PO, Dinov ID, Katapodi MC. Machine learning techniques for personalized breast cancer risk prediction: comparison with the BCRAT and BOADICEA models. Breast Cancer Res 2019; 21:75. [PMID: 31221197 PMCID: PMC6585114 DOI: 10.1186/s13058-019-1158-4] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 08/22/2018] [Accepted: 05/28/2019] [Indexed: 02/07/2023]
Abstract
BACKGROUND Comprehensive breast cancer risk prediction models enable identifying and targeting women at high risk, while reducing interventions in those at low risk. Breast cancer risk prediction models used in clinical practice have low discriminatory accuracy (0.53-0.64). Machine learning (ML) offers an alternative approach to standard prediction modeling that may address current limitations and improve accuracy of those tools. The purpose of this study was to compare the discriminatory accuracy of ML-based estimates against a pair of established methods: the Breast Cancer Risk Assessment Tool (BCRAT) and Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA) models. METHODS We quantified and compared the performance of eight different ML methods to the performance of BCRAT and BOADICEA using eight simulated datasets and two retrospective samples: a random population-based sample of U.S. breast cancer patients and their cancer-free female relatives (N = 1143), and a clinical sample of Swiss breast cancer patients and cancer-free women seeking genetic evaluation and/or testing (N = 2481). RESULTS Predictive accuracy (AU-ROC curve) reached 88.28% using ML-adaptive boosting and 88.89% using ML-random forest versus 62.40% with BCRAT for the U.S. population-based sample. Predictive accuracy reached 90.17% using ML-adaptive boosting and 89.32% using ML-Markov chain Monte Carlo generalized linear mixed model versus 59.31% with BOADICEA for the Swiss clinic-based sample. CONCLUSIONS There was a striking improvement in the accuracy of classification of women with and without breast cancer achieved with ML algorithms compared to the state-of-the-art model-based approaches. High-accuracy prediction techniques are important in personalized medicine because they facilitate stratification of prevention strategies and individualized clinical management.
Affiliation(s)
- Chang Ming
  - Nursing Science, Faculty of Medicine, University of Basel, Bernoullistrasse 28, Room 118, 4056, Basel, Switzerland
- Valeria Viassolo
  - Oncogenetics and Cancer Prevention, Geneva University Hospitals, Geneva, Switzerland
- Nicole Probst-Hensch
  - Swiss Tropical and Public Health Institute, University of Basel, Basel, Switzerland
- Pierre O Chappuis
  - Oncogenetics and Cancer Prevention, Geneva University Hospitals, Geneva, Switzerland
  - Genetic Medicine, Geneva University Hospitals, Geneva, Switzerland
- Ivo D Dinov
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, USA
  - Statistics Online Computational Resource, University of Michigan, Ann Arbor, MI, USA
  - University of Michigan School of Nursing, Ann Arbor, MI, USA
- Maria C Katapodi
  - Nursing Science, Faculty of Medicine, University of Basel, Bernoullistrasse 28, Room 118, 4056, Basel, Switzerland
  - University of Michigan School of Nursing, Ann Arbor, MI, USA
|
39
|
Xu J, Zhang M. Use of Magnetic Resonance Imaging and Artificial Intelligence in Studies of Diagnosis of Parkinson's Disease. ACS Chem Neurosci 2019; 10:2658-2667. [PMID: 31083923 DOI: 10.1021/acschemneuro.9b00207] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/17/2022]
Abstract
Parkinson's disease (PD) is a common neurodegenerative disorder with an insidious onset and slow progression. Its clinical manifestations are highly heterogeneous, so the diagnostic process is complex and depends mainly on the professional knowledge and experience of the physician. Magnetic resonance imaging (MRI) can detect small changes in the brains of PD patients, and quantitative analysis of brain MRI may improve the efficiency of clinical diagnosis. However, because of the complexity of the clinical course of PD and the high dimensionality of multimodal MRI data, traditional mathematical analysis cannot effectively extract the large amount of information they contain. To date, the accuracy of PD diagnosis in large samples remains unsatisfactory. As artificial intelligence (AI) matures, a variety of statistical models and machine learning (ML) algorithms have been used for quantitative analysis of imaging data to support diagnosis. This review provides an overview of recent research that used statistical ML/AI methods to perform quantitative analysis of MR image data for the study of PD diagnosis. First, we review recent research in three subareas: diagnosis, differential diagnosis, and subtyping of PD. We then describe the overall workflow from MR image to classification result. Finally, we critically assess the current research and provide some recommendations for likely future research developments and trends.
Affiliation(s)
- Jingjing Xu
  - Department of Radiology, the Second Affiliated Hospital of Zhejiang University, School of Medicine, No.88 Jiefang Road, Shangcheng District, Hangzhou 31000, China
- Minming Zhang
  - Department of Radiology, the Second Affiliated Hospital of Zhejiang University, School of Medicine, No.88 Jiefang Road, Shangcheng District, Hangzhou 31000, China
|
40
|
Godfrey A, Brodie M, van Schooten KS, Nouredanesh M, Stuart S, Robinson L. Inertial wearables as pragmatic tools in dementia. Maturitas 2019; 127:12-17. [PMID: 31351515 DOI: 10.1016/j.maturitas.2019.05.010] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/12/2019] [Revised: 05/22/2019] [Accepted: 05/23/2019] [Indexed: 01/02/2023]
Abstract
Dementia is a critically important issue due to its wide impact on health services as well as its personal and societal costs. Limitations exist for current dementia protocols, and there are calls to introduce modern technology that facilitates the addition of digital biomarkers to routine clinical practice. Wearable technology (wearables) are nearly ubiquitous in everyday life, gathering discrete and continuous digital data on habitual activities, but their utility in modern medicine remains low. Due to advances in data analytics, wearables are now commonly discussed as pragmatic tools to aid the diagnosis and treatment of a range of neurological disorders. Inertial sensor-based wearables are one such technology; they offer a low-cost approach to quantify routine movements that are fundamental to normal activities of daily living, most notably postural control and gait. Here, we provide a narrative review of how wearables are providing useful postural control and gait data to facilitate the capture of digital markers to aid dementia research. We outline the history of wearables, from their humble beginnings to their current use beyond the clinic, and explore their integration into modern systems, as well as the ongoing standardisation and regulatory efforts to integrate their use in clinical trials.
Affiliation(s)
- A Godfrey
  - Department of Computer and Information Sciences, Northumbria University, Newcastle, UK
- M Brodie
  - Falls Balance & Injury Research Centre, Neuroscience Research Australia, NSW, Australia
  - Graduate School of Biomedical Engineering, University of New South Wales, NSW, Australia
- K S van Schooten
  - Neuroscience Research Australia, University of New South Wales, Sydney, Australia
  - School of Public Health and Community Medicine, University of New South Wales, NSW, Australia
- M Nouredanesh
  - Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Canada
- S Stuart
  - Department of Neurology, Oregon Health & Science University, Portland, Oregon, USA
- L Robinson
  - Institute for Ageing, Newcastle University, Newcastle upon Tyne, UK
|
41
|
Olivera P, Danese S, Jay N, Natoli G, Peyrin-Biroulet L. Big data in IBD: a look into the future. Nat Rev Gastroenterol Hepatol 2019; 16:312-321. [PMID: 30659247 DOI: 10.1038/s41575-019-0102-5] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Indexed: 12/17/2022]
Abstract
Big data methodologies, made possible with the increasing generation and availability of digital data and enhanced analytical capabilities, have produced new insights to improve outcomes in many disciplines. Application of big data in the health-care sector is in its early stages, although the potential for leveraging underutilized data to gain a better understanding of disease and improve quality of care is enormous. Owing to the intrinsic characteristics of inflammatory bowel disease (IBD) and the management dilemmas that it imposes, the implementation of big data research strategies not only can complement current research efforts but also could represent the only way to disentangle the complexity of the disease. In this Review, we explore important potential applications of big data in IBD research, including predictive models of disease course and response to therapy, characterization of disease heterogeneity, drug safety and development, precision medicine and cost-effectiveness of care. We also discuss the strengths and limitations of potential data sources that big data analytics could draw from in the field of IBD, including electronic health records, clinical trial data, e-health applications and genomic, transcriptomic, proteomic, metabolomic and microbiomic data.
Affiliation(s)
- Pablo Olivera
- Gastroenterology Section, Department of Internal Medicine, Centro de Educación Médica e Investigaciones Clínicas (CEMIC), Buenos Aires, Argentina
| | - Silvio Danese
- IBD Center, Department of Gastroenterology, Humanitas Clinical and Research Centre, Rozzano, Milan, Italy; Humanitas Clinical Research Hospital, Rozzano, Milan, Italy
| | - Nicolas Jay
- Orpailleur and Department of Medical Information, LORIA and Nancy University Hospital, Vandoeuvre-lès-Nancy, Nancy, France
| | | | - Laurent Peyrin-Biroulet
- INSERM U954 and Department of Hepatogastroenterology, Nancy University Hospital, Université de Lorraine, Vandoeuvre-lès-Nancy, Nancy, France.
| |
|
42
|
Zhou Y, Zhao L, Zhou N, Zhao Y, Marino S, Wang T, Sun H, Toga AW, Dinov ID. Predictive Big Data Analytics using the UK Biobank Data. Sci Rep 2019; 9:6012. [PMID: 30979917 PMCID: PMC6461626 DOI: 10.1038/s41598-019-41634-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/31/2018] [Accepted: 03/13/2019] [Indexed: 12/04/2022] Open
Abstract
The UK Biobank is a rich national health resource that provides enormous opportunities for international researchers to examine, model, and analyze census-like multisource healthcare data. The archive presents several challenges related to aggregation and harmonization of complex data elements, feature heterogeneity and salience, and health analytics. Using 7,614 imaging, clinical, and phenotypic features of 9,914 subjects we performed deep computed phenotyping using unsupervised clustering and derived two distinct sub-cohorts. Using parametric and nonparametric tests, we determined the top 20 most salient features contributing to the cluster separation. Our approach generated decision rules to predict the presence and progression of depression or other mental illnesses by jointly representing and modeling the significant clinical and demographic variables along with the derived salient neuroimaging features. We reported consistency and reliability measures of the derived computed phenotypes and the top salient imaging biomarkers that contributed to the unsupervised clustering. This clinical decision support system identified and utilized holistically the most critical biomarkers for predicting mental health, e.g., depression. External validation of this technique on different populations may lead to reducing healthcare expenses and improving the processes of diagnosis, forecasting, and tracking of normal and pathological aging.
Affiliation(s)
- Yiwang Zhou
- Statistics Online Computational Resource (SOCR), Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Lu Zhao
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, USA
| | - Nina Zhou
- Statistics Online Computational Resource (SOCR), Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Yi Zhao
- Statistics Online Computational Resource (SOCR), Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Simeone Marino
- Statistics Online Computational Resource (SOCR), Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Tuo Wang
- Statistics Online Computational Resource (SOCR), Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, USA; Department of Statistics, University of Michigan, Ann Arbor, MI, USA
| | - Hanbo Sun
- Statistics Online Computational Resource (SOCR), Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, USA; Department of Statistics, University of Michigan, Ann Arbor, MI, USA
| | - Arthur W Toga
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, USA
| | - Ivo D Dinov
- Statistics Online Computational Resource (SOCR), Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, MI, USA; Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, USA.
| |
|
43
|
Madduri R, Chard K, D’Arcy M, Jung SC, Rodriguez A, Sulakhe D, Deutsch E, Funk C, Heavner B, Richards M, Shannon P, Glusman G, Price N, Kesselman C, Foster I. Reproducible big data science: A case study in continuous FAIRness. PLoS One 2019; 14:e0213013. [PMID: 30973881 PMCID: PMC6459504 DOI: 10.1371/journal.pone.0213013] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 06/21/2018] [Accepted: 02/13/2019] [Indexed: 01/22/2023] Open
Abstract
Big biomedical data create exciting opportunities for discovery, but make it difficult to capture analyses and outputs in forms that are findable, accessible, interoperable, and reusable (FAIR). In response, we describe tools that make it easy to capture, and assign identifiers to, data and code throughout the data lifecycle. We illustrate the use of these tools via a case study involving a multi-step analysis that creates an atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data. We show how the tools automate routine but complex tasks, capture analysis algorithms in understandable and reusable forms, and harness fast networks and powerful cloud computers to process data rapidly, all without sacrificing usability or reproducibility, thus ensuring that big data are not hard-to-(re)use data. We evaluate our approach via a user study, and show that 91% of participants were able to replicate a complex analysis involving considerable data volumes.
Affiliation(s)
- Ravi Madduri
- Globus, University of Chicago, Chicago, Illinois, United States of America
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois, United States of America
| | - Kyle Chard
- Globus, University of Chicago, Chicago, Illinois, United States of America
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois, United States of America
| | - Mike D’Arcy
- Information Sciences Institute, University of Southern California, Los Angeles, California, United States of America
| | - Segun C. Jung
- Globus, University of Chicago, Chicago, Illinois, United States of America
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois, United States of America
| | - Alexis Rodriguez
- Globus, University of Chicago, Chicago, Illinois, United States of America
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois, United States of America
| | - Dinanath Sulakhe
- Globus, University of Chicago, Chicago, Illinois, United States of America
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois, United States of America
| | - Eric Deutsch
- Institute for Systems Biology, Seattle, Washington, United States of America
| | - Cory Funk
- Institute for Systems Biology, Seattle, Washington, United States of America
| | - Ben Heavner
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, Washington, United States of America
| | - Matthew Richards
- Institute for Systems Biology, Seattle, Washington, United States of America
| | - Paul Shannon
- Institute for Systems Biology, Seattle, Washington, United States of America
| | - Gustavo Glusman
- Institute for Systems Biology, Seattle, Washington, United States of America
| | - Nathan Price
- Institute for Systems Biology, Seattle, Washington, United States of America
| | - Carl Kesselman
- Information Sciences Institute, University of Southern California, Los Angeles, California, United States of America
| | - Ian Foster
- Globus, University of Chicago, Chicago, Illinois, United States of America
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois, United States of America
- Department of Computer Science, University of Chicago, Chicago, Illinois, United States of America
| |
|
44
|
Silverio A, Cavallo P, De Rosa R, Galasso G. Big Health Data and Cardiovascular Diseases: A Challenge for Research, an Opportunity for Clinical Care. Front Med (Lausanne) 2019; 6:36. [PMID: 30873409 PMCID: PMC6401640 DOI: 10.3389/fmed.2019.00036] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 11/16/2018] [Accepted: 02/05/2019] [Indexed: 12/12/2022] Open
Abstract
Cardiovascular disease (CVD) accounts for the majority of deaths and hospitalizations, health care expenditures, and loss of productivity in developed countries. CVD research thus plays a key role in improving patients' outcomes as well as in the sustainability of health systems. The increasing costs and complexity of modern medicine, along with the fragmentation of healthcare organizations, interfere with improving quality of care and represent a missed opportunity for research. Advances in the diagnosis, therapy, and prognostic evaluation of patients with CVD are frustrated by data access limited to small, selected patient populations, disease definitions that are neither standardized nor computable, and a lack of approved, relevant patient-centered outcomes. These critical issues result in a deep mismatch between randomized controlled trials and real-world settings, heterogeneity in treatment response, and wide inter-individual variation in prognosis. The big data approach combines millions of people's electronic health records (EHR) from different sources and provides a new methodology that expands data collection in three directions: high volume, wide variety, and extreme acquisition speed. Large population studies based on EHR hold much promise owing to low costs, diminished study participant burden, and reduced selection bias, thus offering an alternative to traditional ascertainment through biomedical screening and tracing processes. By merging and harmonizing large data sets, researchers aspire to build algorithms that allow targeted and personalized CVD treatments. In this paper, we provide a critical review of big health data for cardiovascular research, focusing on the opportunities of this largely free data analytics and the challenges in its realization.
Affiliation(s)
- Angelo Silverio
- Cardiology Unit, Cardiovascular and Thoracic Department, University Hospital "San Giovanni di Dio e Ruggi d'Aragona", Salerno, Italy
| | - Pierpaolo Cavallo
- Department of Physics "E.R. Caianiello", University of Salerno, Salerno, Italy
| | - Roberta De Rosa
- Cardiology Unit, Cardiovascular and Thoracic Department, University Hospital "San Giovanni di Dio e Ruggi d'Aragona", Salerno, Italy
| | - Gennaro Galasso
- Cardiology Unit, Cardiovascular and Thoracic Department, University Hospital "San Giovanni di Dio e Ruggi d'Aragona", Salerno, Italy
| |
|
45
|
Chaudhuri KR, Titova N. Societal Burden and Persisting Unmet Needs of Parkinson’s Disease. European Neurological Review 2019. [DOI: 10.17925/enr.2019.14.1.28] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/24/2022]
|
46
|
Zhang X, He L, Chen K, Luo Y, Zhou J, Wang F. Multi-View Graph Convolutional Network and Its Applications on Neuroimage Analysis for Parkinson's Disease. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2018:1147-1156. [PMID: 30815157 PMCID: PMC6371363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Parkinson's Disease (PD) is one of the most prevalent neurodegenerative diseases that affects tens of millions of Americans. PD is highly progressive and heterogeneous. Quite a few studies have been conducted in recent years on predictive or disease progression modeling of PD using clinical and biomarker data. Neuroimaging, as another important information source for neurodegenerative disease, has also attracted considerable interest from the PD community. In this paper, we propose a deep learning method based on Graph Convolutional Networks (GCN) for fusing multiple modalities of brain images in relationship prediction, which is useful for distinguishing PD cases from controls. On the Parkinson's Progression Markers Initiative (PPMI) cohort, our approach achieved an AUC of 0.9537±0.0587, compared with an AUC of 0.6443±0.0223 achieved by traditional approaches such as PCA.
Affiliation(s)
- Xi Zhang
- Department of Healthcare Policy and Research, Weill Cornell Medical College, Cornell University, NY
- Equal Contribution. Corresponding author.
| | - Lifang He
- Department of Healthcare Policy and Research, Weill Cornell Medical College, Cornell University, NY
- Equal Contribution. Corresponding author.
| | - Kun Chen
- Department of Statistics, University of Connecticut, CT
| | - Yuan Luo
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, IL
| | - Jiayu Zhou
- Department of Computer Science and Engineering, Michigan State University, MI
| | - Fei Wang
- Department of Healthcare Policy and Research, Weill Cornell Medical College, Cornell University, NY
- Equal Contribution. Corresponding author.
| |
|
47
|
Oelsner EC, Balte PP, Cassano PA, Couper D, Enright PL, Folsom AR, Hankinson J, Jacobs DR, Kalhan R, Kaplan R, Kronmal R, Lange L, Loehr LR, London SJ, Navas Acien A, Newman AB, O’Connor GT, Schwartz JE, Smith LJ, Yeh F, Zhang Y, Moran AE, Mwasongwe S, White WB, Yende S, Barr RG. Harmonization of Respiratory Data From 9 US Population-Based Cohorts: The NHLBI Pooled Cohorts Study. Am J Epidemiol 2018; 187:2265-2278. [PMID: 29982273 PMCID: PMC6211239 DOI: 10.1093/aje/kwy139] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 11/06/2017] [Revised: 04/13/2018] [Accepted: 04/17/2018] [Indexed: 12/13/2022] Open
Abstract
Chronic lower respiratory diseases (CLRDs) are the fourth leading cause of death in the United States. To support investigations into CLRD risk determinants and new approaches to primary prevention, we aimed to harmonize and pool respiratory data from US general population-based cohorts. Data were obtained from prospective cohorts that performed prebronchodilator spirometry and were harmonized following 2005 ATS/ERS standards. In cohorts conducting follow-up for noncardiovascular events, CLRD events were defined as hospitalizations/deaths adjudicated as CLRD-related or assigned relevant administrative codes. Coding and variable names were applied uniformly. The pooled sample included 65,251 adults in 9 cohorts followed up for CLRD-related mortality over 653,380 person-years during 1983-2016. Average baseline age was 52 years; 56% were female; 49% were never-smokers; and racial/ethnic composition was 44% white, 22% black, 28% Hispanic/Latino, and 5% American Indian. Over 96% had complete data on smoking, clinical CLRD diagnoses, and dyspnea. After excluding invalid spirometry examinations (13%), there were 105,696 valid examinations (median, 2 per participant). Of 29,351 participants followed for CLRD hospitalizations, median follow-up was 14 years; only 5% were lost to follow-up at 10 years. The NHLBI Pooled Cohorts Study provides a harmonization standard applied to a large, US population-based sample that may be used to advance epidemiologic research on CLRD.
MESH Headings
- Adolescent
- Adult
- Aged
- Aged, 80 and over
- Body Weights and Measures
- Bronchiectasis/epidemiology
- Bronchiectasis/physiopathology
- Chronic Disease
- Cohort Studies
- Ethnicity/statistics & numerical data
- Female
- Hispanic or Latino/statistics & numerical data
- Hospitalization/statistics & numerical data
- Humans
- Indians, North American/statistics & numerical data
- Inhalation Exposure/statistics & numerical data
- Lung Diseases, Obstructive/epidemiology
- Lung Diseases, Obstructive/ethnology
- Lung Diseases, Obstructive/mortality
- Lung Diseases, Obstructive/physiopathology
- Male
- Middle Aged
- National Heart, Lung, and Blood Institute (U.S.)/organization & administration
- National Heart, Lung, and Blood Institute (U.S.)/standards
- Phenotype
- Racial Groups/statistics & numerical data
- Respiratory Function Tests
- Risk Factors
- Smoking/epidemiology
- Socioeconomic Factors
- United States/epidemiology
- White People/statistics & numerical data
- Young Adult
Affiliation(s)
- Elizabeth C Oelsner
- Division of General Medicine, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York
| | - Pallavi P Balte
- Division of General Medicine, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York
| | - Patricia A Cassano
- Division of Nutritional Sciences, Weill Cornell Medical College, Ithaca, New York
| | - David Couper
- Collaborative Studies Coordinating Center, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, Chapel Hill, North Carolina
| | - Paul L Enright
- Department of Medicine, College of Medicine, University of Arizona, Tucson, Arizona
| | - Aaron R Folsom
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota
| | | | - David R Jacobs
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota
| | | | - Robert Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, New York, New York
| | - Richard Kronmal
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, Washington
| | - Leslie Lange
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado, Denver, Colorado
| | - Laura R Loehr
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, Chapel Hill, North Carolina
| | - Stephanie J London
- National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina
| | - Ana Navas Acien
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York
| | - Anne B Newman
- Department of Epidemiology, Pitt Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - George T O’Connor
- Department of Medicine, School of Medicine, Boston University, Boston, Massachusetts
| | - Joseph E Schwartz
- Division of Cardiology, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York
- Department of Psychiatry and Behavioral Sciences, School of Medicine, Stony Brook University, Stony Brook, New York
| | | | - Fawn Yeh
- Biostatistics and Epidemiology, College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma
| | - Yiyi Zhang
- Division of General Medicine, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York
| | - Andrew E Moran
- Division of General Medicine, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York
| | | | - Wendy B White
- Jackson Heart Study, Undergraduate Training and Education Center, Tougaloo College, Tougaloo, Mississippi
| | - Sachin Yende
- Division of Pulmonary and Critical Care, Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - R Graham Barr
- Division of General Medicine, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York
| |
|
48
|
Kamble SS, Gunasekaran A, Goswami M, Manda J. A systematic perspective on the applications of big data analytics in healthcare management. International Journal of Healthcare Management 2018. [DOI: 10.1080/20479700.2018.1531606] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 01/06/2023]
Affiliation(s)
- Sachin S. Kamble
- Operations and Supply Chain Management, National Institute of Industrial Engineering, Mumbai, India
| | - Angappa Gunasekaran
- School of Business and Public Administration, California State University, Bakersfield, Bakersfield, CA, USA
| | - Milind Goswami
- National Institute of Industrial Engineering, Mumbai, India
| | - Jaswant Manda
- National Institute of Industrial Engineering, Mumbai, India
| |
|
49
|
Marshall LJ, Willett C. Parkinson's disease research: adopting a more human perspective to accelerate advances. Drug Discov Today 2018; 23:1950-1961. [PMID: 30240875 DOI: 10.1016/j.drudis.2018.09.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/17/2018] [Revised: 08/20/2018] [Accepted: 09/12/2018] [Indexed: 12/21/2022]
Abstract
Parkinson's disease (PD) affects 1% of the population over 60 years old and, with global increases in the aging population, presents huge economic and societal burdens. The etiology of PD remains unknown; most cases are idiopathic, presumed to result from genetic and environmental risk factors. Although 200 years have passed since the first description of PD, the mechanisms behind initiation and progression of the characteristic neurodegenerative processes are not known. Here, we review progress and limitations of the multiple PD animal models available and identify advances that could be implemented to better understand pathological processes, improve disease outcome, and reduce dependence on animal models. Lessons learned from reducing animal use in PD research could serve as guideposts for wider biomedical research.
Affiliation(s)
- Lindsay J Marshall
- Humane Society International, The Humane Society of the United States, 700 Professional Drive, Gaithersburg, MD 20879, USA
| | - Catherine Willett
- Humane Society International, The Humane Society of the United States, 700 Professional Drive, Gaithersburg, MD 20879, USA.
| |
|
50
|
Abstract
Population health management and specifically chronic disease management depend on the ability of providers to prevent the development of high-cost and high-risk conditions such as diabetes, heart failure, and chronic respiratory diseases and to control them. The advent of big data analytics has the potential to empower health care providers to make timely and truly evidence-based decisions to provide more effective and personalized treatment while reducing the costs of this care to patients. The goal of this study was to identify real-world health care applications of big data analytics to determine its effectiveness in both patient outcomes and the relief of financial burdens. The methodology for this study was a literature review utilizing 49 articles. Evidence of big data analytics being largely beneficial in the areas of risk prediction, diagnostic accuracy and patient outcome improvement, hospital readmission reduction, treatment guidance, and cost reduction was noted. Initial applications of big data analytics have proved useful in various phases of chronic disease management and could help reduce the chronic disease burden.
|