1
Dipietro L, Eden U, Elkin-Frankston S, El-Hagrassy MM, Camsari DD, Ramos-Estebanez C, Fregni F, Wagner T. Integrating Big Data, Artificial Intelligence, and motion analysis for emerging precision medicine applications in Parkinson's Disease. Journal of Big Data 2024; 11:155. [PMID: 39493349 PMCID: PMC11525280 DOI: 10.1186/s40537-024-01023-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2024] [Accepted: 10/13/2024] [Indexed: 11/05/2024]
Abstract
One of the key challenges in Big Data for clinical research and healthcare is how to integrate new sources of data, whose relation to disease processes is often not well understood, with multiple classical clinical measurements that have been used by clinicians for years to describe disease processes and interpret therapeutic outcomes. Without such integration, even the most promising data from emerging technologies may have limited, if any, clinical utility. This paper presents an approach to address this challenge, illustrated through an example in Parkinson's Disease (PD) management. We show how data from various sensing sources can be integrated with traditional clinical measurements used in PD; furthermore, we show how leveraging Big Data frameworks, augmented by Artificial Intelligence (AI) algorithms, can distinctively enrich the data resources available to clinicians. We showcase the potential of this approach in a cohort of 50 PD patients who underwent both evaluations with an Integrated Motion Analysis Suite (IMAS), composed of a battery of multimodal, portable, and wearable sensors, and traditional Unified Parkinson's Disease Rating Scale (UPDRS)-III evaluations. Through techniques including Principal Component Analysis (PCA), elastic net regression, and clustering analysis, we demonstrate how this combined approach can be used to improve clinical motor assessments and to develop personalized treatments. The scalability of our approach enables systematic data generation and analysis on increasingly larger datasets, confirming the integration potential of IMAS, whose use in PD assessments is validated herein, within Big Data paradigms. Compared to existing approaches, our solution offers a more comprehensive, multi-dimensional view of patient data, enabling deeper clinical insights and greater potential for personalized treatment strategies.
Additionally, we show how IMAS can be integrated into established clinical practices, facilitating its adoption in routine care and complementing emerging methods, for instance, non-invasive brain stimulation. Future work will aim to augment our data repositories with additional clinical data, such as imaging and biospecimen data, to further broaden and enhance these foundational methodologies, leveraging the full potential of Big Data and AI.
Affiliation(s)
- Uri Eden
- Boston University, Boston, MA USA
- Seth Elkin-Frankston
- U.S. Army DEVCOM Soldier Center, Natick, MA USA
- Center for Applied Brain and Cognitive Sciences, Tufts University, Medford, MA USA
- Mirret M. El-Hagrassy
- Department of Neurology, UMass Chan Medical School, UMass Memorial, Worcester, MA USA
- Deniz Doruk Camsari
- Mindpath College Health, Isla Vista, Goleta, CA USA
- Mayo Clinic, Rochester, MN USA
- Felipe Fregni
- Spaulding Rehabilitation/Neuromodulation Lab, Harvard Medical School, Cambridge, MA USA
- Timothy Wagner
- Highland Instruments, Cambridge, MA USA
- Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA USA
2
Raschka T, Li Z, Gaßner H, Kohl Z, Jukic J, Marxreiter F, Fröhlich H. Unraveling progression subtypes in people with Huntington's disease. EPMA J 2024; 15:275-287. [PMID: 38841617 PMCID: PMC11148000 DOI: 10.1007/s13167-024-00368-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 05/09/2024] [Indexed: 06/07/2024]
Abstract
Background: Huntington's disease (HD) is a progressive neurodegenerative disease caused by a CAG trinucleotide expansion in the huntingtin gene. The length of the CAG repeat is inversely correlated with disease onset. HD is characterized by hyperkinetic movement disorder, psychiatric symptoms, and cognitive deficits, which greatly impact patients' quality of life. Despite this clear genetic course, high variability of HD patients' symptoms can be observed. Current clinical diagnosis of HD relies solely on the presence of motor signs, disregarding the other important aspects of the disease. By incorporating a broader approach that encompasses motor as well as non-motor aspects of HD, predictive, preventive, and personalized (3P) medicine can enhance diagnostic accuracy and improve patient care. Methods: Multisymptom disease trajectories of HD patients collected from the Enroll-HD study were first aligned on a common disease timescale to account for heterogeneity in disease symptom onset and diagnosis. Following this, the aligned disease trajectories were clustered using the previously published Variational Deep Embedding with Recurrence (VaDER) algorithm, and the resulting progression subtypes were clinically characterized. Lastly, an AI/ML model was trained to predict the progression subtype from only first visit data or with data from additional follow-up visits. Results: Results demonstrate two distinct subtypes, one large cluster (n = 7122) showing a relatively stable disease progression and a second, smaller cluster (n = 411) showing a dramatically more progressive disease trajectory. Clinical characterization of the two subtypes correlates with CAG repeat length, as well as several neurobehavioral, psychiatric, and cognitive scores. In fact, cognitive impairment was found to be the major difference between the two subtypes. Additionally, a prognostic model shows the ability to predict HD subtypes from patients' first visit only.
Conclusion: In summary, this study works towards the paradigm shift from reactive to preventive and personalized medicine by showing that non-motor symptoms are of vital importance for predicting and categorizing each patient's disease progression pattern, as cognitive decline is oftentimes more reflective of HD progression than its motor aspects. Considering these aspects during counseling and therapy definition will personalize each individual's treatment. The ability to provide patients with an objective assessment of their disease progression, and thus a perspective for their life with HD, is the key to improving their quality of life. By conducting additional analysis on biological data from both subtypes, it is possible to gain a deeper understanding of these subtypes and uncover the underlying biological factors of the disease. This aligns closely with the goal of shifting towards 3P medicine. Supplementary Information: The online version contains supplementary material available at 10.1007/s13167-024-00368-2.
Affiliation(s)
- Tamara Raschka
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
- Zexin Li
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, University of Bonn, Friedrich-Hirzebruch-Allee 6, 53115 Bonn, Germany
- Heiko Gaßner
- Department of Molecular Neurology, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
- Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Am Wolfsmantel 33, 91058 Erlangen, Germany
- Zacharias Kohl
- Department of Neurology, University of Regensburg, Regensburg, Germany
- Jelena Jukic
- Department of Molecular Neurology, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
- Center for Rare Diseases Erlangen (ZSEER), University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
- Franz Marxreiter
- Department of Molecular Neurology, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
- Center for Movement Disorders, Passauer Wolf, 93333 Bad Gögging, Germany
- Center for Rare Diseases Erlangen (ZSEER), University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
- Holger Fröhlich
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
- Bonn-Aachen International Center for IT, University of Bonn, Friedrich-Hirzebruch-Allee 6, 53115 Bonn, Germany
3
Venuto CS, Smith G, Herbst K, Zielinski R, Yung NC, Grosset DG, Dorsey ER, Kieburtz K. Predicting Ambulatory Capacity in Parkinson's Disease to Analyze Progression, Biomarkers, and Trial Design. Mov Disord 2023; 38:1774-1785. [PMID: 37363815 PMCID: PMC10615710 DOI: 10.1002/mds.29519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 05/10/2023] [Accepted: 06/06/2023] [Indexed: 06/28/2023]
Abstract
BACKGROUND In Parkinson's disease (PD), gait and balance are impaired, relatively resistant to available treatment, and associated with falls and disability. Predictive models of ambulatory progression could enhance understanding of gait/balance disturbances and aid in trial design. OBJECTIVES To predict trajectories of ambulatory abilities from baseline clinical data in early PD, relate trajectories to clinical milestones, compare biomarkers, and evaluate trajectories for enrichment of clinical trials. METHODS Data from two multicenter, longitudinal, observational studies were used for model training (Tracking Parkinson's, n = 1598) and external testing (Parkinson's Progression Markers Initiative, n = 407). Models were trained and validated to predict individuals as having a "Progressive" or "Stable" trajectory based on changes of ambulatory capacity scores from the Movement Disorders Society Unified Parkinson's Disease Rating Scale parts II and III. Survival analyses compared time-to-clinical milestones and trial outcomes between predicted trajectories. RESULTS On external evaluation, a support vector machine model predicted Progressive trajectories using baseline clinical data with an accuracy, weighted-F1 (proportionally weighted harmonic mean of precision and sensitivity), and sensitivity/specificity of 0.735, 0.799, and 0.688/0.739, respectively. Over 4 years, the predicted Progressive trajectory was more likely to experience impaired balance, loss of independence, and impaired function and cognition. Baseline dopamine transporter imaging and select biomarkers of neurodegeneration were significantly different between predicted trajectory groups. For an 18-month, randomized (1:1) clinical trial, sample size savings up to 30% were possible when enrollment was enriched for the Progressive trajectory versus no enrichment. CONCLUSIONS It is possible to predict ambulatory abilities from clinical data that are associated with meaningful outcomes in people with early PD.
© 2023 The Authors. Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.
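The evaluation metrics named in the abstract above (accuracy, weighted-F1, sensitivity, specificity) can all be derived from a two-class confusion matrix. The following is a minimal sketch, not the authors' code: the function name `binary_metrics` and the counts in the usage lines are hypothetical, and "weighted-F1" is assumed here to mean the per-class F1 scores averaged in proportion to each class's number of true cases.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return (accuracy, sensitivity, specificity, weighted F1) for a two-class problem.

    tp/fn count true "Progressive" cases predicted correctly/incorrectly;
    tn/fp count true "Stable" cases predicted correctly/incorrectly.
    Assumes every denominator below is non-zero.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # recall for the positive class
    specificity = tn / (tn + fp)  # recall for the negative class
    # Per-class F1 = 2*TP / (2*TP + FP + FN), treating each class in turn as "positive".
    f1_pos = 2 * tp / (2 * tp + fp + fn)
    f1_neg = 2 * tn / (2 * tn + fn + fp)
    # Weight each class's F1 by its share of the true labels (its support).
    weighted_f1 = (f1_pos * (tp + fn) + f1_neg * (tn + fp)) / total
    return accuracy, sensitivity, specificity, weighted_f1


# Hypothetical counts, chosen only to illustrate the arithmetic.
accuracy, sensitivity, specificity, weighted_f1 = binary_metrics(tp=8, fn=2, fp=1, tn=9)
print(accuracy, sensitivity, specificity, round(weighted_f1, 4))
```

Because the weighting follows class support, a weighted-F1 can exceed the accuracy when classes are imbalanced, which is consistent with the pattern of values the abstract reports.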
Affiliation(s)
- Charles S. Venuto
- Center for Health + Technology, University of Rochester, Rochester, NY, USA
- Department of Neurology, University of Rochester, Rochester, NY, USA
- Greta Smith
- Center for Health + Technology, University of Rochester, Rochester, NY, USA
- Konnor Herbst
- Center for Health + Technology, University of Rochester, Rochester, NY, USA
- Robert Zielinski
- Center for Health + Technology, University of Rochester, Rochester, NY, USA
- Department of Biostatistics, Brown University, Providence, RI, USA
- Norman C.W. Yung
- Center for Health + Technology, University of Rochester, Rochester, NY, USA
- Donald G. Grosset
- School of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
- E. Ray Dorsey
- Center for Health + Technology, University of Rochester, Rochester, NY, USA
- Department of Neurology, University of Rochester, Rochester, NY, USA
- Karl Kieburtz
- Center for Health + Technology, University of Rochester, Rochester, NY, USA
- Department of Neurology, University of Rochester, Rochester, NY, USA
4
Two-year clinical progression in focal and diffuse subtypes of Parkinson's disease. NPJ Parkinsons Dis 2023; 9:29. [PMID: 36806285 PMCID: PMC9937525 DOI: 10.1038/s41531-023-00466-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/18/2022] [Accepted: 01/06/2023] [Indexed: 02/19/2023]
Abstract
Heterogeneity in Parkinson's disease (PD) presents a barrier to understanding disease mechanisms and developing new treatments. This challenge may be partially overcome by stratifying patients into clinically meaningful subtypes. A recent subtyping scheme classifies de novo PD patients into three subtypes: mild-motor predominant, intermediate, or diffuse-malignant, based on motor impairment, cognitive function, rapid eye movement sleep behavior disorder (RBD) symptoms, and autonomic symptoms. We aimed to validate this approach in a large longitudinal cohort of early-to-moderate PD (n = 499) by assessing the influence of subtyping on clinical characteristics at baseline and on two-year progression. Compared to mild-motor predominant patients (42%), diffuse-malignant patients (12%) showed involvement of more clinical domains, more diffuse hypokinetic-rigid motor symptoms (decreased lateralization and hand/foot focality), and faster two-year progression. These findings extend the classification of diffuse-malignant and mild-motor predominant subtypes to early-to-moderate PD and suggest that different pathophysiological mechanisms (focal versus diffuse cerebral propagation) may underlie distinct subtype classifications.
5
Dadu A, Satone V, Kaur R, Hashemi SH, Leonard H, Iwaki H, Makarious MB, Billingsley KJ, Bandres‐Ciga S, Sargent LJ, Noyce AJ, Daneshmand A, Blauwendraat C, Marek K, Scholz SW, Singleton AB, Nalls MA, Campbell RH, Faghri F. Identification and prediction of Parkinson's disease subtypes and progression using machine learning in two cohorts. NPJ Parkinsons Dis 2022; 8:172. [PMID: 36526647 PMCID: PMC9758217 DOI: 10.1038/s41531-022-00439-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/04/2022] [Accepted: 11/29/2022] [Indexed: 12/23/2022]
Abstract
The clinical manifestations of Parkinson's disease (PD) are characterized by heterogeneity in age at onset, disease duration, rate of progression, and the constellation of motor versus non-motor features. There is an unmet need for the characterization of distinct disease subtypes as well as improved, individualized predictions of the disease course. We used unsupervised and supervised machine learning methods on comprehensive, longitudinal clinical data from the Parkinson's Disease Progression Marker Initiative (n = 294 cases) to identify patient subtypes and to predict disease progression. The resulting models were validated in an independent, clinically well-characterized cohort from the Parkinson's Disease Biomarker Program (n = 263 cases). Our analysis distinguished three distinct disease subtypes with highly predictable progression rates, corresponding to slow, moderate, and fast disease progression. We achieved highly accurate projections of disease progression 5 years after initial diagnosis with an average area under the curve (AUC) of 0.92 (95% CI: 0.95 ± 0.01) for the slower progressing group (PDvec1), 0.87 ± 0.03 for moderate progressors, and 0.95 ± 0.02 for the fast-progressing group (PDvec3). We identified serum neurofilament light as a significant indicator of fast disease progression among other key biomarkers of interest. We replicated these findings in an independent cohort, released the analytical code, and developed models in an open science manner. Our data-driven study provides insights to deconstruct PD heterogeneity. This approach could have immediate implications for clinical trials by improving the detection of significant clinical outcomes. We anticipate that machine learning models will improve patient counseling, clinical trial design, and ultimately individualized patient care.
Affiliation(s)
- Anant Dadu
- Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL 61820 USA
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- Data Tecnica International, Washington, DC 20812 USA
- Vipul Satone
- Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Champaign, IL 61820 USA
- Rachneet Kaur
- Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Champaign, IL 61820 USA
- Sayed Hadi Hashemi
- Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL 61820 USA
- Hampton Leonard
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- Data Tecnica International, Washington, DC 20812 USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
- Hirotaka Iwaki
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- Data Tecnica International, Washington, DC 20812 USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
- Mary B. Makarious
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
- Department of Clinical and Movement Neurosciences, UCL Queen Square Institute of Neurology, London, UK
- UCL Movement Disorders Centre, University College London, London, UK
- Kimberley J. Billingsley
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
- Sara Bandres‐Ciga
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
- Lana J. Sargent
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- School of Nursing, Virginia Commonwealth University, Richmond, VA 23298 USA
- Alastair J. Noyce
- UCL Movement Disorders Centre, University College London, London, UK
- Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London and Department of Neurology, Royal London Hospital, London, UK
- Ali Daneshmand
- Department of Neurology, Boston Medical Center, Boston University School of Medicine, Boston, MA 02118 USA
- Cornelis Blauwendraat
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
- Ken Marek
- InviCRO LLC, Boston, MA USA
- Molecular Neuroimaging, A Division of InviCRO, New Haven, CT USA
- Sonja W. Scholz
- Neurodegenerative Diseases Research Unit, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD USA
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD USA
- Andrew B. Singleton
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
- Mike A. Nalls
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- Data Tecnica International, Washington, DC 20812 USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
- Roy H. Campbell
- Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL 61820 USA
- Faraz Faghri
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892 USA
- Data Tecnica International, Washington, DC 20812 USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892 USA
6
Krishnagopal S. The collective vs individual nature of mountaineering: a network and simplicial approach. Applied Network Science 2022; 7:62. [PMID: 36072295 PMCID: PMC9440880 DOI: 10.1007/s41109-022-00503-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Accepted: 08/22/2022] [Indexed: 06/15/2023]
Abstract
Mountaineering is a sport of contrary forces: teamwork plays a large role in mental fortitude and skills, but the actual act of climbing, and indeed survival, is largely individualistic. This work studies the effects of the structure and topology of relationships among climbers on the level of cooperation and success. It does so using simplicial complexes, where relationships between climbers are captured through simplices that correspond to joint previous expeditions, with dimension given by the number of climbers minus one and weight given by the number of occurrences of the simplex. First, this analysis establishes the importance of relationships in mountaineering and shows that the chances of failing to summit drop drastically when climbing with repeated partners. From a climber-centric perspective, it finds that climbers who belong to simplices with large dimension were more likely to be successful, across all experience levels. Then, the distribution of relationships within a group is explored to categorize collective human behavior in expeditions, on a spectrum from polarized to cooperative. Expeditions containing simplices with large dimension and usually low weight (weak relationships), implying that a large number of people participated in a small number of joint expeditions, tended to be more cooperative, improving the chances of success of all members of the group, not just those who were part of the simplex. On the other hand, the existence of small, usually high-weight simplices (i.e., subgroups with strong relationships) led to a polarized style in which climbers who were not part of the subgroup were less likely to succeed. Lastly, this work examines the effects of individual features (such as age, gender, and climber experience) and expedition-wide factors (such as number of camps and total number of days), which are the more important determinants of success in individualistic and cooperative expeditions, respectively.
Centrality indicates that individual features of youth and oxygen use while ascending are the most important predictors of success. Of expedition-wide factors, the expedition size and number of expedition days are found to be strongly correlated with success rate.
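The simplex construction described in the abstract above (dimension equal to the number of climbers minus one, weight equal to the number of occurrences) can be illustrated with a small sketch. This is an illustration under stated assumptions, not the paper's implementation: each expedition is treated as a single simplex over its full team, and the function names and climber names are hypothetical.

```python
from collections import Counter


def build_simplices(expeditions):
    """Map each distinct team (a simplex) to its weight.

    The weight is the number of expeditions the same set of climbers
    undertook together. Assumption: one simplex per full expedition team.
    """
    return Counter(frozenset(team) for team in expeditions)


def dimension(simplex):
    """Dimension of a simplex: number of climbers (vertices) minus one."""
    return len(simplex) - 1


# Hypothetical expedition rosters.
expeditions = [
    ["ana", "ben", "chen"],
    ["chen", "ana", "ben"],  # same team again, so the weight becomes 2
    ["dev", "eli"],
]
simplices = build_simplices(expeditions)
for simplex, weight in simplices.items():
    print(sorted(simplex), "dimension:", dimension(simplex), "weight:", weight)
```

Using `frozenset` makes the simplex independent of roster order, so repeated partnerships accumulate weight regardless of how the team is listed.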
Affiliation(s)
- Sanjukta Krishnagopal
- Department of Mathematics, University of California, Los Angeles, Los Angeles, United States
- Berkeley Artificial Intelligence Research Lab, University of California, Berkeley, Berkeley, United States
7
Krishnagopal S, Lohse K, Braun R. Stroke recovery phenotyping through network trajectory approaches and graph neural networks. Brain Inform 2022; 9:13. [PMID: 35717640 PMCID: PMC9206968 DOI: 10.1186/s40708-022-00160-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/11/2021] [Accepted: 04/23/2022] [Indexed: 11/23/2022]
Abstract
Stroke is a leading cause of neurological injury characterized by impairments in multiple neurological domains including cognition, language, sensory and motor functions. Clinical recovery in these domains is tracked using a wide range of measures that may be continuous, ordinal, interval or categorical in nature, which can present challenges for multivariate regression approaches. This has hindered stroke researchers’ ability to achieve an integrated picture of the complex time-evolving interactions among symptoms. Here, we use tools from network science and machine learning that are particularly well-suited to extracting underlying patterns in such data, and may assist in prediction of recovery patterns. To demonstrate the utility of this approach, we analyzed data from the NINDS tPA trial using the Trajectory Profile Clustering (TPC) method to identify distinct stroke recovery patterns for 11 different neurological domains at 5 discrete time points. Our analysis identified 3 distinct stroke trajectory profiles that align with clinically relevant stroke syndromes, characterized both by distinct clusters of symptoms, as well as differing degrees of symptom severity. We then validated our approach using graph neural networks to determine how well our model performed predictively for stratifying patients into these trajectory profiles at early vs. later time points post-stroke. We demonstrate that trajectory profile clustering is an effective method for identifying clinically relevant recovery subtypes in multidimensional longitudinal datasets, and for early prediction of symptom progression subtypes in individual patients. This paper is the first work introducing network trajectory approaches for stroke recovery phenotyping, and is aimed at enhancing the translation of such novel computational approaches for practical clinical application.
Affiliation(s)
- Sanjukta Krishnagopal
- Gatsby Computational Neuroscience Unit, University College London, London, W1T 4JG, UK.
- Keith Lohse
- Physical Therapy and Neurology, Washington University School of Medicine, 4444 Forest Park Ave., Suite 1101, St. Louis, MO, 63108-2212, USA
- Robynne Braun
- Department of Neurology, University of Maryland School of Medicine, 655 W. Baltimore Street, Bressler Research Building, 12th Floor, Baltimore, MD, 21201, USA, on behalf of the GPAS Collaboration, Phenotyping Core
8
Predicting Parkinson disease related genes based on PyFeat and gradient boosted decision tree. Sci Rep 2022; 12:10004. [PMID: 35705654 PMCID: PMC9200794 DOI: 10.1038/s41598-022-14127-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2022] [Accepted: 06/01/2022] [Indexed: 11/10/2022]
Abstract
Identifying genes related to Parkinson’s disease (PD) is an active research topic in biomedical analysis and plays a critical role in diagnosis and treatment. Recently, many studies have proposed different techniques for predicting disease-related genes. However, few of these techniques are designed or developed for PD gene prediction. Most of these PD techniques are developed to identify only protein genes and discard long noncoding RNA (lncRNA) genes, which play an essential role in biological processes and in the transformation and development of diseases. This paper proposes a novel prediction system to identify protein and lncRNA genes related to PD that can aid in early diagnosis. First, we preprocessed the genes into DNA FASTA sequences from the University of California Santa Cruz (UCSC) genome browser and removed the redundancies. Second, we extracted significant features of the DNA FASTA sequences using the PyFeat method, with AdaBoost for feature selection. These selected features achieved promising results compared with features extracted by some state-of-the-art feature extraction techniques. Finally, the features were fed to a gradient-boosted decision tree (GBDT) to diagnose different tested cases. Seven performance metrics were used to evaluate the proposed system. The proposed system achieved an average accuracy of 78.6%, an area under the curve of 84.5%, an area under the precision-recall curve (AUPR) of 85.3%, an F1-score of 78.3%, a Matthews correlation coefficient (MCC) of 0.575, a sensitivity (SEN) of 77.1%, and a specificity (SPC) of 80.2%. The experiments demonstrate promising results compared with other systems. The predicted top-ranked protein and lncRNA genes are verified based on a literature review.
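Of the seven metrics listed above, the Matthews correlation coefficient (MCC) is the only one reported on a scale from -1 to 1 rather than as a percentage. As a rough sketch of how it is obtained from a binary confusion matrix (the function below is a plain restatement of the standard formula; the counts are hypothetical and unrelated to the paper's data):

```python
import math


def matthews_corrcoef(tp, fn, fp, tn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, for illustration only.
mcc = matthews_corrcoef(tp=8, fn=2, fp=1, tn=9)
print(round(mcc, 4))
```

A value of 1 indicates perfect agreement between predictions and labels, 0 indicates performance no better than chance, and -1 indicates total disagreement.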
9
Hendricks RM, Khasawneh MT. A Systematic Review of Parkinson's Disease Cluster Analysis Research. Aging Dis 2021; 12:1567-1586. [PMID: 34631208 PMCID: PMC8460306 DOI: 10.14336/ad.2021.0519] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/12/2021] [Accepted: 05/18/2021] [Indexed: 12/17/2022]
Abstract
One way to understand the Parkinson’s disease (PD) population is to investigate the similarities and differences among patients through cluster analysis, which may lead to defined patient subgroups for diagnosis, progression tracking, and treatment planning. This paper provides a systematic review of PD patient clustering research, evaluating the variables included in clustering, the cluster methods applied, the resulting patient subgroups, and evaluation metrics. A search covering 1999 to 2021 was conducted on the PubMed database, using various search terms including: Parkinson’s disease, cluster, and analysis. The majority of studies included a variety of clinical scale scores for clustering, many of which provide a numerical, but ordinal, categorical value. Even though the scale scores are ordinal, they were treated as numerical values, with numerical and continuous values being the focus of the clustering and limited attention given to categorical variables, such as gender and family history, which may also provide useful insights into disease diagnosis, progression, and treatment. The results pointed to two to five patient clusters, with similarities among the age of onset and disease duration. The studies lacked the use of existing clustering evaluation metrics, which points to a need for a thorough analysis framework and consensus on the appropriate variables to include in cluster analysis. Accurate cluster analysis may assist with determining if PD patients’ symptoms can be treated based on a subgroup of features, if personalized care is required, or if a mix of individualized and group-based care is the best approach.
Affiliation(s)
- Renee M Hendricks
- Department of Systems Science and Industrial Engineering, Binghamton University, Binghamton, NY 13902, USA
- Mohammad T Khasawneh
- Department of Systems Science and Industrial Engineering, Binghamton University, Binghamton, NY 13902, USA
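The review above notes that the surveyed studies rarely applied existing clustering evaluation metrics. As one illustration of such a metric, the following is a minimal sketch of the mean silhouette coefficient in plain Python. The function name `silhouette_score`, the use of Euclidean distance, and the toy points are assumptions made for this example; none of them comes from the reviewed studies, and ordinal scale scores may call for a different distance.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs >= 2 clusters).

    Euclidean distance is an assumption of this sketch.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (invented): two tight, well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels match the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the pairs
```

Values near 1 indicate compact, well-separated clusters, while negative values suggest that points sit closer to another cluster than to their own; here `good` is high and `bad` is negative.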
|
10
|
Comprehensive subtyping of Parkinson's disease patients with similarity fusion: a case study with BioFIND data. NPJ PARKINSONS DISEASE 2021; 7:83. [PMID: 34535682 PMCID: PMC8448859 DOI: 10.1038/s41531-021-00228-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/23/2020] [Accepted: 08/30/2021] [Indexed: 12/28/2022]
Abstract
Parkinson’s disease (PD) is a complex neurodegenerative disorder with diverse clinical manifestations. To better understand this disease, research has been done to categorize, or subtype, patients, using an array of criteria derived from clinical assessments and biospecimen analyses. In this study, using data from the BioFIND cohort, we aimed to identify subtypes of moderate-to-advanced PD by comprehensively considering motor and non-motor manifestations. A total of 103 patients were included for analysis. Through the use of a patient-wise similarity matrix fusion technique and hierarchical agglomerative clustering analysis, three unique subtypes emerged from the clustering results. Subtype I, comprising 60 patients (~58.3%), was characterized by mild symptoms, both motor and non-motor. Subtype II, comprising 20 patients (~19.4%), was characterized by an intermediate severity, with a high tremor score and mild non-motor symptoms. Subtype III, comprising 23 patients (~22.3%), was characterized by more severe motor and non-motor symptoms. These subtypes show statistically significant differences in motor (on and off medication) clinical features and non-motor clinical features, while there was no clear difference in demographics, biomarker levels, and genetic risk scores.
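To make the two-step pipeline in this abstract concrete (fuse patient-wise similarity matrices, then cluster hierarchically), here is a minimal plain-Python sketch. It is not the study's method: the fusion step is replaced by a simple element-wise average of per-view Gaussian-kernel similarities, the clustering uses average linkage, and all function names and the six toy "patients" are invented for illustration.

```python
from math import dist, exp


def similarity_matrix(features, scale=1.0):
    """Gaussian-kernel similarity between every pair of patients in one view."""
    n = len(features)
    return [
        [exp(-dist(features[i], features[j]) ** 2 / (2 * scale ** 2))
         for j in range(n)]
        for i in range(n)
    ]


def fuse(matrices):
    """Assumed stand-in for similarity fusion: element-wise mean of the views."""
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
        for i in range(n)
    ]


def agglomerative(similarity, k):
    """Average-linkage agglomerative clustering on distance = 1 - similarity."""
    clusters = [[i] for i in range(len(similarity))]

    def linkage(a, b):
        total = sum(1.0 - similarity[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        p, q = min(
            ((x, y) for x in range(len(clusters))
             for y in range(x + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return sorted(sorted(c) for c in clusters)


# Toy data (invented): one motor score and one non-motor score per patient.
motor = [(1.0,), (1.2,), (5.0,), (5.1,), (9.0,), (9.2,)]
non_motor = [(0.5,), (0.4,), (0.6,), (0.5,), (4.0,), (4.2,)]

fused = fuse([similarity_matrix(motor), similarity_matrix(non_motor)])
subtypes = agglomerative(fused, k=3)
```

In this toy example the non-motor view alone cannot separate patients 0 to 3, but the fused matrix yields three groups: mild on both views, intermediate motor with mild non-motor, and severe on both.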
|
11
|
Hagan RD, Langston MA. Molecular Subtyping and Outlier Detection in Human Disease Using the Paraclique Algorithm. ALGORITHMS 2021; 14:63. [PMID: 36092474 PMCID: PMC9455766 DOI: 10.3390/a14020063] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
Recent discoveries of distinct molecular subtypes have led to remarkable advances in treatment for a variety of diseases. While subtyping via unsupervised clustering has received a great deal of interest, most approaches rely on basic statistical or machine learning methods. At the same time, techniques based on graph clustering, particularly clique-based strategies, have been successfully used to identify disease biomarkers and gene networks. A graph theoretical approach based on the paraclique algorithm is described that can easily be employed to identify putative disease subtypes and serve as an aid in outlier detection as well. The feasibility and potential effectiveness of this method are demonstrated on publicly available gene co-expression data derived from patient samples covering twelve different disease families.
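The abstract does not spell out the paraclique procedure, so the following is only a sketch under a commonly described formulation, assumed here: seed with a maximum clique, then repeatedly "glom on" any outside vertex that is missing at most `glom` edges to the current members. The function names, the `glom` parameter name, and the toy graph are hypothetical; in a subtyping setting the vertices might be patient samples and the edges thresholded correlations, which is likewise an assumption.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def max_clique(adj):
    """Largest clique via Bron-Kerbosch with pivoting (small graphs only)."""
    best = set()

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = set(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best


def paraclique(adj, glom=1):
    """Grow a maximum clique by adding vertices missing at most `glom` edges."""
    members = set(max_clique(adj))
    grew = True
    while grew:
        grew = False
        for v in sorted(set(adj) - members):
            missing = len(members - adj[v])  # members that v is not adjacent to
            if missing <= glom:
                members.add(v)
                grew = True
    return members


# Toy graph (invented): a 4-clique a-b-c-d, a near-member e, and stragglers f, g.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("e", "a"), ("e", "b"), ("e", "c"),
    ("f", "a"), ("f", "e"), ("g", "f"),
]
graph = build_adjacency(edges)
group = paraclique(graph, glom=1)
leftover = set(graph) - group
```

With `glom=1` the dense group absorbs the vertex that lacks a single edge, while `f` and `g` stay outside; vertices left out in this way are the kind of candidates an outlier check would inspect.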
|
12
|
Krishnagopal S. Multi-layer Trajectory Clustering: a Network Algorithm for Disease Subtyping. Biomed Phys Eng Express 2020; 6. [PMID: 35046146 DOI: 10.1088/2057-1976/abad8f] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/10/2020] [Accepted: 08/06/2020] [Indexed: 01/16/2023]
Abstract
Many diseases display heterogeneity in clinical features and their progression, indicative of the existence of disease subtypes. Extracting patterns of disease variable progression for subtypes has tremendous application in medicine, for example, in early prognosis and personalized medical therapy. This work presents a novel, data-driven, network-based Trajectory Clustering (TC) algorithm for identifying Parkinson's subtypes based on disease trajectory. Modeling patient-variable interactions as a bipartite network, TC first extracts communities of co-expressing disease variables at different stages of progression. Then, it identifies Parkinson's subtypes by clustering similar patient trajectories that are characterized by severity of disease variables through a multi-layer network. Determination of trajectory similarity accounts for direct overlaps between trajectories as well as second-order similarities, i.e., common overlap with a third set of trajectories. This work clusters trajectories across two types of layers: (a) temporal, and (b) ranges of an independent outcome variable (representative of disease severity), both of which yield four distinct subtypes. The former subtypes exhibit differences in progression of disease domains (Cognitive, Mental Health, etc.), whereas the latter subtypes exhibit different degrees of progression, i.e., some remain mild, whereas others show significant deterioration after 5 years. The TC approach is validated through statistical analyses and consistency of the identified subtypes with medical literature. This generalizable and robust method can easily be extended to other progressive multi-variate disease datasets, and can effectively assist in targeted subtype-specific treatment in the field of personalized medicine.
Affiliation(s)
- Sanjukta Krishnagopal
- Department of Physics, University of Maryland, College Park, Maryland, 20742, United States of America; Gatsby Computational Neuroscience Unit, University College London, London, W1T4JG, United Kingdom
|