1
Ahangaran M, Dawalatabad N, Karjadi C, Glass J, Au R, Kolachalama VB. Obfuscation via pitch-shifting for balancing privacy and diagnostic utility in voice-based cognitive assessment. medRxiv 2024:2024.11.25.24317900. [PMID: 39649616 PMCID: PMC11623733 DOI: 10.1101/2024.11.25.24317900] [Indexed: 12/11/2024]
Abstract
Introduction: Digital voice analysis is gaining traction as a tool to differentiate cognitively normal from impaired individuals. However, voice data poses privacy risks because automated systems can potentially identify speakers.
Methods: We developed a framework that uses weighted linear interpolation of privacy and utility metrics to balance speaker obfuscation and cognitive integrity in cognitive assessments. This framework applies pitch-shifting for speaker obfuscation while preserving cognitive speech features. We tested it on digital voice recordings from the Framingham Heart Study (FHS; N=128) and the DementiaBank Delaware corpus (N=85), both containing responses to neuropsychological tests.
Results: The tool effectively obfuscated speaker identity while maintaining cognitive feature integrity, achieving an accuracy of 0.6465 in classifying individuals with normal cognition, mild cognitive impairment, and dementia in the FHS cohort.
Discussion: Our approach enables the development of digital markers for dementia assessment while protecting sensitive personal information, offering a scalable solution for privacy-preserving voice-based diagnostics.
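The abstract does not state the exact formula, so the following is only a minimal sketch of what a weighted linear interpolation of a privacy metric and a utility metric could look like when choosing a pitch-shift level. The function names, the weight `alpha`, and the candidate scores are hypothetical illustrations, not values or code from the study.

```python
def combined_score(privacy: float, utility: float, alpha: float) -> float:
    """Weighted linear interpolation of two metrics, both assumed to lie in [0, 1].

    alpha = 1 scores privacy only; alpha = 0 scores utility only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * privacy + (1.0 - alpha) * utility


def best_setting(candidates: dict, alpha: float):
    """Return the candidate key whose (privacy, utility) pair scores highest."""
    return max(candidates, key=lambda k: combined_score(*candidates[k], alpha))


# Hypothetical (privacy, utility) pairs keyed by pitch shift in semitones.
candidates = {
    0: (0.10, 0.70),
    2: (0.50, 0.65),
    4: (0.80, 0.60),
    6: (0.90, 0.40),
}

print(best_setting(candidates, alpha=0.5))
```

Sweeping `alpha` from 0 to 1 in such a sketch traces how the preferred setting moves from the most useful to the most private option.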
Affiliation(s)
- Meysam Ahangaran
  - Department of Medicine, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA 02118, USA
- Nauman Dawalatabad
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
- Cody Karjadi
  - Department of Anatomy and Neurobiology, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA 02118, USA
  - The Framingham Heart Study, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA 02118, USA
- James Glass
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
- Rhoda Au
  - Department of Medicine, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA 02118, USA
  - Department of Anatomy and Neurobiology, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA 02118, USA
  - The Framingham Heart Study, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA 02118, USA
  - Department of Neurology, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA 02118, USA
  - Department of Epidemiology, Boston University School of Public Health, 715 Albany St, Boston, MA 02118, USA
  - Boston University Alzheimer’s Disease Research Center, 72 E. Concord St, Boston, MA 02118, USA
- Vijaya B. Kolachalama
  - Department of Medicine, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA 02118, USA
  - Boston University Alzheimer’s Disease Research Center, 72 E. Concord St, Boston, MA 02118, USA
  - Department of Computer Science, Boston University, 665 Commonwealth Ave, Boston, MA 02215, USA
  - Faculty of Computing and Data Sciences, Boston University, 665 Commonwealth Ave, Boston, MA 02215, USA
2
Lin YC, Yan HT, Lin CH, Chang HH. Identifying and Estimating Frailty Phenotypes by Vocal Biomarkers: Cross-Sectional Study. J Med Internet Res 2024; 26:e58466. [PMID: 39515817 PMCID: PMC11584546 DOI: 10.2196/58466] [Received: 03/16/2024] [Revised: 06/04/2024] [Accepted: 10/07/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND: Researchers have developed a variety of indices to assess frailty. Recent research indicates that the human voice reflects frailty status. Frailty phenotypes are seldom discussed in the literature on the aging voice.
OBJECTIVE: This study aims to examine potential phenotypes of frail older adults and determine their correlation with vocal biomarkers.
METHODS: Participants aged ≥60 years who visited the geriatric outpatient clinic of a teaching hospital in central Taiwan between 2020 and 2021 were recruited. We identified 4 frailty phenotypes: energy-based frailty, sarcopenia-based frailty, hybrid-based frailty-energy, and hybrid-based frailty-sarcopenia. Participants were asked to pronounce a sustained vowel "/a/" for approximately 1 second. The speech signals were digitized and analyzed. Four voice parameters, namely the average number of zero crossings (A1), variations in local peaks and valleys (A2), variations in first and second formant frequencies (A3), and spectral energy ratio (A4), were used to analyze changes in voice. Logistic regression was used to build the prediction model.
RESULTS: Among 277 older adults, an increase in A1 values was associated with a lower likelihood of energy-based frailty (odds ratio [OR] 0.81, 95% CI 0.68-0.96), whereas an increase in A2 values was associated with a higher likelihood of sarcopenia-based frailty (OR 1.34, 95% CI 1.18-1.52). Respondents with larger A3 and A4 values had a higher likelihood of hybrid-based frailty-sarcopenia (OR 1.03, 95% CI 1.002-1.06) and hybrid-based frailty-energy (OR 1.43, 95% CI 1.02-2.01), respectively.
CONCLUSIONS: Vocal biomarkers may be useful in estimating frailty phenotypes. Clinicians can use 2 crucial acoustic parameters, namely A1 and A2, to diagnose a frailty phenotype that is associated with insufficient energy or reduced muscle function. The assessment of A3 and A4 involves a complex frailty phenotype.
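The abstract names the average number of zero crossings (A1) without defining how it is computed. As a rough illustration only, a zero crossing can be counted as a sign change between consecutive samples and averaged over fixed-length frames. The frame length and the handling of exact zeros below are assumptions, not details from the study.

```python
def zero_crossings(samples):
    """Count sign changes between consecutive samples.

    Exact zeros are treated as non-negative (an arbitrary choice for this sketch).
    """
    signs = [s >= 0 for s in samples]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def mean_zero_crossings(samples, frame_len):
    """Average zero-crossing count over non-overlapping frames of frame_len samples."""
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    if not frames:
        raise ValueError("signal is shorter than one frame")
    return sum(zero_crossings(f) for f in frames) / len(frames)


# Toy signal: the first frame alternates in sign, the second never crosses zero.
signal = [1, -1, 1, -1, 1, 1, 1, 1]
print(mean_zero_crossings(signal, frame_len=4))
```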
Affiliation(s)
- Yu-Chun Lin
  - Graduate Institute of Integrated Medicine, College of Chinese Medicine, China Medical University, Taichung, Taiwan
  - Department of Chinese Medicine, China Medical University Hospital, Taichung, Taiwan
- Huang-Ting Yan
  - Institute of Political Science, Academia Sinica, Taipei, Taiwan
- Chih-Hsueh Lin
  - School of Medicine, College of Medicine, China Medical University, Taichung, Taiwan
  - Department of Family Medicine, China Medical University Hospital, Taichung, Taiwan
- Hen-Hong Chang
  - Graduate Institute of Integrated Medicine, College of Chinese Medicine, China Medical University, Taichung, Taiwan
  - Department of Chinese Medicine, China Medical University Hospital, Taichung, Taiwan
  - Chinese Medicine Research Centre, China Medical University, Taichung, Taiwan
3
Mahon E, Lachman ME. Voice biomarkers in middle and later adulthood as predictors of cognitive changes. Front Psychol 2024; 15:1422376. [PMID: 39492818 PMCID: PMC11527629 DOI: 10.3389/fpsyg.2024.1422376] [Received: 04/23/2024] [Accepted: 09/16/2024] [Indexed: 11/05/2024]
Abstract
Background: Prosody voice measures, especially jitter and shimmer, have been associated with cognitive impairment and hold potential as early indicators of risk for cognitive decline. Prior research suggests that voice measures assessed concurrently with longitudinal cognitive outcomes are associated with 10-year cognitive declines in middle-aged and older adults from the Midlife in the U.S. (MIDUS) study.
Results: Using a subsample from the same study, we expanded previous research to examine voice measures that were (1) collected 8 years before cognitive outcomes, (2) derived from narrative speech in logical memory tests instead of word list recall tests, and (3) independent of the cognitive outcomes. Multilevel analyses controlled for covariates of age, sex, education, neurological conditions, depressive symptoms, and chronic conditions. The results indicated that higher jitter and lower shimmer predicted greater 10-year declines in episodic memory and working memory.
Conclusion: These findings extend previous research by highlighting prosody voice measures assessed 8 years earlier as predictors of subsequent cognitive declines over a decade.
Affiliation(s)
- Elizabeth Mahon
  - Department of Psychology, Brandeis University, Waltham, MA, United States
4
Libon DJ, Swenson R, Price CC, Lamar M, Cosentino S, Bezdicek O, Kling MA, Tobyne S, Jannati A, Banks R, Pascual-Leone A. Digital assessment of cognition in neurodegenerative disease: a data driven approach leveraging artificial intelligence. Front Psychol 2024; 15:1415629. [PMID: 39035083 PMCID: PMC11258860 DOI: 10.3389/fpsyg.2024.1415629] [Received: 04/10/2024] [Accepted: 06/12/2024] [Indexed: 07/23/2024]
Abstract
Introduction: A rapid and reliable neuropsychological protocol is essential for the efficient assessment of neurocognitive constructs related to emergent neurodegenerative diseases. We developed an AI-assisted, digitally administered/scored neuropsychological protocol that can be remotely administered in ~10 min. This protocol assesses the requisite neurocognitive constructs associated with emergent neurodegenerative illnesses.
Methods: The protocol was administered to 77 ambulatory care/memory clinic patients (56.40% women; 88.50% Caucasian). The protocol includes a 6-word version of the Philadelphia (repeatable) Verbal Learning Test [P(r)VLT], three trials of 5 digits backward from the Backwards Digit Span Test (BDST), and the "animal" fluency test. The protocol provides a comprehensive set of traditional "core" measures that are typically obtained through paper-and-pencil tests (i.e., serial list learning, immediate and delayed free recall, recognition hits, percent correct serial order backward digit span, and "animal" fluency output). Additionally, the protocol includes variables that quantify errors and detail the processes used in administering the tests. It also features two separate, norm-referenced summary scores specifically designed to measure executive control and memory.
Results: We applied cluster analysis to four core measures to classify participants into four groups: cognitively unimpaired (CU; n = 23), amnestic mild cognitive impairment (MCI; n = 17), dysexecutive MCI (n = 23), and dementia (n = 14). Subsequent analyses of error and process variables operationally defined key features of amnesia (i.e., rapid forgetting, extra-list intrusions, profligate responding to recognition foils); key features underlying reduced executive abilities (i.e., BDST items and dysexecutive errors); and the strength of the semantic association between successive responses on the "animal" fluency test. Executive and memory index scores effectively distinguished between all four groups. There was over 90% agreement between the classification of patients by cluster analysis of digitally obtained measures and their classification using a traditional comprehensive neuropsychological protocol. The correlations between digitally obtained outcome variables and analogous paper/pencil measures were robust.
Discussion: The digitally administered protocol demonstrated a capacity to identify patterns of impaired performance and classification similar to those observed with standard paper/pencil neuropsychological tests. The inclusion of both core measures and detailed error/process variables suggests that this protocol can detect subtle, nuanced signs of early emergent neurodegenerative illness efficiently and comprehensively.
Affiliation(s)
- David J. Libon
  - Department of Geriatrics and Gerontology, New Jersey Institute for Successful Aging, Rowan-Virtua School of Osteopathic Medicine, Glassboro, NJ, United States
  - Department of Psychology, Rowan University, Glassboro, NJ, United States
- Rod Swenson
  - Department of Psychiatry and Behavioral Health, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, United States
- Catherine C. Price
  - Department of Clinical and Health Psychology, University of Florida, Gainesville, FL, United States
- Melissa Lamar
  - Rush Alzheimer's Disease Center and the Department of Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, United States
- Stephanie Cosentino
  - Columbia University Medical Center, Department of Neurology, Cognitive Neuroscience Division, Taub Institute and Sergievsky Center, New York, NY, United States
- Ondrej Bezdicek
  - Department of Neurology and Center of Clinical Neuroscience, First Faculty of Medicine, Charles University, Prague, Czechia
- Mitchel A. Kling
  - Department of Geriatrics and Gerontology, New Jersey Institute for Successful Aging, Rowan-Virtua School of Osteopathic Medicine, Glassboro, NJ, United States
- Ali Jannati
  - Linus Health, Boston, MA, United States
  - Department of Neurology, Harvard Medical School, Boston, MA, United States
- Alvaro Pascual-Leone
  - Linus Health, Boston, MA, United States
  - Hinda and Arthur Marcus Institute for Aging Research and Deanna and Sidney Wolk Center for Memory Health, Hebrew SeniorLife, Boston, MA, United States
5
Harris C, Tang Y, Birnbaum E, Cherian C, Mendhe D, Chen MH. Digital Neuropsychology beyond Computerized Cognitive Assessment: Applications of Novel Digital Technologies. Arch Clin Neuropsychol 2024; 39:290-304. [PMID: 38520381 PMCID: PMC11485276 DOI: 10.1093/arclin/acae016] [Received: 02/05/2024] [Accepted: 02/16/2024] [Indexed: 03/25/2024]
Abstract
Compared with other health disciplines, there is a stagnation in technological innovation in the field of clinical neuropsychology. Traditional paper-and-pencil tests have a number of shortcomings, such as low-frequency data collection and limitations in ecological validity. While computerized cognitive assessment may help overcome some of these issues, current computerized paradigms do not address the majority of these limitations. In this paper, we review recent literature on the applications of novel digital health approaches, including ecological momentary assessment, smartphone-based assessment and sensors, wearable devices, passive driving sensors, smart homes, voice biomarkers, and electronic health record mining, in neurological populations. We describe how each digital tool may be applied to neurologic care and overcome limitations of traditional neuropsychological assessment. Ethical considerations, limitations of current research, as well as our proposed future of neuropsychological practice are also discussed.
Affiliation(s)
- Che Harris
  - Institute for Health, Health Care Policy and Aging Research, Rutgers University, New Brunswick, NJ, USA
  - Department of Neurology, Robert Wood Johnson Medical School, Rutgers University, New Brunswick, NJ, USA
- Yingfei Tang
  - Institute for Health, Health Care Policy and Aging Research, Rutgers University, New Brunswick, NJ, USA
  - Department of Neurology, Robert Wood Johnson Medical School, Rutgers University, New Brunswick, NJ, USA
- Eliana Birnbaum
  - Institute for Health, Health Care Policy and Aging Research, Rutgers University, New Brunswick, NJ, USA
- Christine Cherian
  - Institute for Health, Health Care Policy and Aging Research, Rutgers University, New Brunswick, NJ, USA
- Dinesh Mendhe
  - Institute for Health, Health Care Policy and Aging Research, Rutgers University, New Brunswick, NJ, USA
- Michelle H Chen
  - Institute for Health, Health Care Policy and Aging Research, Rutgers University, New Brunswick, NJ, USA
  - Department of Neurology, Robert Wood Johnson Medical School, Rutgers University, New Brunswick, NJ, USA
6
Rosen-Lang Y, Zoubi S, Cialic R, Orenstein T. Using voice biomarkers for frailty classification. GeroScience 2024; 46:1175-1179. [PMID: 37480417 PMCID: PMC10828289 DOI: 10.1007/s11357-023-00872-9] [Received: 05/14/2023] [Accepted: 07/11/2023] [Indexed: 07/24/2023]
Abstract
Clinicians use the patient's voice intuitively to evaluate general health and frailty. Voice is an emerging health indicator but has been scarcely studied in the context of frailty. This study explored voice parameters as possible predictors of frailty in older adults. Fifty-three participants over 70 years old were recruited from rehabilitation wards at a tertiary medical center. Participants' frailty was assessed using the Rockwood frailty index, and they were classified as most-frail (n = 33, 68%) or less-frail (n = 20, 32%). Participants were recorded counting from 1 to 10 and backwards using a smartphone recording application. The following voice biomarkers were derived: peak and average volume, peak/average volume ratio, total pause length, and pause length standard deviation. The most-frail group had a higher peak/average volume ratio (p = 0.03) and greater variance in the lengths of pauses between speech segments (p = 0.002). These parameters indicate greater speech irregularity in the most-frail group than in the less-frail group. The most-frail group also had a longer total duration of pauses (p = 0.02). No statistically significant difference was found in peak and average volume (p = 0.75 and 0.39). The speech of most-frail participants thus differed in its characteristics from that of less-frail participants. This is a first step toward developing an AI-based frailty assessment tool that can assist in identifying our most vulnerable patients.
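The pause-based biomarkers named in this abstract can be illustrated with a small sketch. It assumes that speech segments have already been detected and are given as sorted, non-overlapping (start, end) times in seconds. The segmentation step, the function name, and the example values are hypothetical and not taken from the study.

```python
import statistics


def pause_stats(segments):
    """Return (total pause length, pause length standard deviation) in seconds.

    segments: sorted list of (start_s, end_s) speech intervals.
    A pause is the gap between the end of one segment and the start of the next.
    """
    pauses = [
        next_start - end
        for (_, end), (next_start, _) in zip(segments, segments[1:])
        if next_start > end
    ]
    total = sum(pauses)
    # The sample standard deviation needs at least two pauses.
    spread = statistics.stdev(pauses) if len(pauses) > 1 else 0.0
    return total, spread


# Hypothetical speech segments from a counting task.
segments = [(0.0, 1.0), (1.5, 2.0), (3.0, 4.0)]
total, spread = pause_stats(segments)
print(f"total pause: {total:.2f} s, pause SD: {spread:.3f} s")
```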
Affiliation(s)
- Yael Rosen-Lang
  - Joseph Sagol Neuroscience Center, Sheba Medical Center, Ramat-Gan, Israel
- Saad Zoubi
  - Geriatric Division, Tel-Aviv Sourasky Medical Center, Tel-Aviv, Israel
- Ron Cialic
  - Geriatric Division, Tel-Aviv Sourasky Medical Center, Tel-Aviv, Israel
- Tal Orenstein
  - Geriatric Division, Tel-Aviv Sourasky Medical Center, Tel-Aviv, Israel
7
Rudisch DM, Krasko MN, Barnett DGS, Mueller KD, Russell JA, Connor NP, Ciucci MR. Early ultrasonic vocalization deficits and related thyroarytenoid muscle pathology in the transgenic TgF344-AD rat model of Alzheimer's disease. Front Behav Neurosci 2024; 17:1294648. [PMID: 38322496 PMCID: PMC10844490 DOI: 10.3389/fnbeh.2023.1294648] [Received: 09/15/2023] [Accepted: 12/01/2023] [Indexed: 02/08/2024]
Abstract
Background: Alzheimer's disease (AD) is a progressive neurologic disease and the most common cause of dementia. Classic pathology in AD is characterized by inflammation, abnormal presence of tau protein, and aggregation of β-amyloid that disrupt normal neuronal function and lead to cell death. Deficits in communication also occur during disease progression and significantly reduce health, well-being, and quality of life. Because clinical diagnosis occurs in the mid-stage of the disease, characterizing the prodrome and early stages in humans is currently challenging. To overcome these challenges, we use the validated TgF344-AD (F344-Tg(Prp-APP, Prp-PS1)19/Rrrc) transgenic rat model that manifests cognitive, behavioral, and neuropathological dysfunction akin to AD in humans.
Objectives: The overarching goal of our work is to test the central hypothesis that pathology and related behavioral deficits, such as communication dysfunction, manifest in part in the peripheral nervous system and corresponding target tissues already in the early stages. The primary aims of this study are to test the hypotheses that (1) changes in ultrasonic vocalizations (USV) occur in the prodromal stage at 6 months of age and worsen at 9 months of age, and (2) inflammation as well as AD-related pathology can be found in the thyroarytenoid muscle (TA) at 12 months of age (experimental endpoint tissue harvest), and (3) to demonstrate that the TgF344-AD rat model is an appropriate model for preclinical investigations of early AD-related vocal deficits.
Methods: USVs were collected from male TgF344-AD (N = 19) and wildtype (WT) Fischer-344 rats (N = 19) at 6 months (N = 38; WT: n = 19; TgF344-AD: n = 19) and 9 months of age (N = 18; WT: n = 10; TgF344-AD: n = 8) and acoustically analyzed for duration, mean power, principal frequency, low frequency, high frequency, peak frequency, and call type. RT-qPCR was used to assay peripheral inflammation and AD-related pathology via gene expression in the TA muscle of male TgF344-AD rats (n = 6) and WT rats (n = 6) at 12 months of age.
Results: This study revealed a significant reduction in mean power of ultrasonic calls from 6 to 9 months of age and increased peak frequency levels over time in TgF344-AD rats compared to WT controls. Additionally, significant downregulation of the AD-related genes Uqcrc2, Bace2, Serpina3n, and Igf2, as well as downregulation of the pro-inflammatory gene Myd88, was found in the TA muscle of TgF344-AD rats at 12 months of age.
Discussion: Our findings demonstrate early and progressive vocal deficits in the TgF344-AD rat model. We further provide evidence of dysregulation of AD-pathology-related genes as well as inflammatory genes in the TA muscles of TgF344-AD rats in the early stage of the disease, confirming this rat model for early-stage investigations of voice deficits and related pathology.
Affiliation(s)
- Denis Michael Rudisch
  - Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, United States
  - Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, UW School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
  - UW Institute for Clinical and Translational Research, UW School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
- Maryann N Krasko
  - Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, United States
  - Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, UW School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
- David G S Barnett
  - Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, UW School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
- Kimberly D Mueller
  - Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, United States
  - Wisconsin Alzheimer's Disease Research Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
- John A Russell
  - Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, UW School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
- Nadine P Connor
  - Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, United States
  - Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, UW School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
- Michelle R Ciucci
  - Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, United States
  - Department of Surgery, Division of Otolaryngology - Head and Neck Surgery, UW School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
  - Neuroscience Training Program, University of Wisconsin-Madison, Madison, WI, United States
8
Park CY, Kim M, Shim Y, Ryoo N, Choi H, Jeong HT, Yun G, Lee H, Kim H, Kim S, Youn YC. Harnessing the Power of Voice: A Deep Neural Network Model for Alzheimer's Disease Detection. Dement Neurocogn Disord 2024; 23:1-10. [PMID: 38362055 PMCID: PMC10864696 DOI: 10.12779/dnd.2024.23.1.1] [Received: 10/11/2023] [Revised: 12/03/2023] [Accepted: 12/08/2023] [Indexed: 02/17/2024]
Abstract
Background and Purpose: Voice, reflecting cerebral functions, holds potential for analyzing and understanding brain function, especially in the context of cognitive impairment (CI) and Alzheimer's disease (AD). This study used voice data to distinguish between normal cognition and CI or Alzheimer's disease dementia (ADD).
Methods: This study enrolled 3 groups of subjects: 1) 52 subjects with subjective cognitive decline; 2) 110 subjects with mild CI; and 3) 59 subjects with ADD. Voice features were extracted using Mel-frequency cepstral coefficients and Chroma.
Results: A deep neural network (DNN) model showed promising performance, with an accuracy of roughly 81% in 10 trials in predicting ADD, which increased to an average value of about 82.0%±1.6% when evaluated against an unseen test dataset.
Conclusions: Although results did not demonstrate the level of accuracy necessary for a definitive clinical tool, they provided a compelling proof-of-concept for the potential use of voice data in cognitive status assessment. DNN algorithms using voice offer a promising approach to early detection of AD. They could improve the accuracy and accessibility of diagnosis, ultimately leading to better outcomes for patients.
Affiliation(s)
- Chan-Young Park
  - Department of Neurology, Chung-Ang University College of Medicine, Seoul, Korea
- Minsoo Kim
  - Research and Development, Baikal AI Inc., Seoul, Korea
- YongSoo Shim
  - Department of Neurology, Eunpyeong St. Mary's Hospital, The Catholic University of Korea, Seoul, Korea
- Nayoung Ryoo
  - Department of Neurology, Eunpyeong St. Mary's Hospital, The Catholic University of Korea, Seoul, Korea
- Hyunjoo Choi
  - Department of Communication Disorders, Korea Nazarene University, Cheonan, Korea
- Ho Tae Jeong
  - Department of Neurology, Chung-Ang University College of Medicine, Seoul, Korea
- Gihyun Yun
  - Research and Development, Baikal AI Inc., Seoul, Korea
- Hunboc Lee
  - Research and Development, Baikal AI Inc., Seoul, Korea
- Hyungryul Kim
  - Research and Development, Baikal AI Inc., Seoul, Korea
- SangYun Kim
  - Department of Neurology, Seoul National University College of Medicine and Seoul National University Bundang Hospital, Seongnam, Korea
- Young Chul Youn
  - Department of Neurology, Chung-Ang University College of Medicine, Seoul, Korea
  - Department of Medical Informatics, Chung-Ang University College of Medicine, Seoul, Korea
9
Parlak MM, Saylam G, Babademez MA, Munis ÖB, Tokgöz SA. Voice analysis results in individuals with Alzheimer's disease: How do age and cognitive status affect voice parameters? Brain Behav 2023; 13:e3271. [PMID: 37794703 PMCID: PMC10636380 DOI: 10.1002/brb3.3271] [Received: 08/10/2023] [Revised: 09/17/2023] [Accepted: 09/23/2023] [Indexed: 10/06/2023]
Abstract
BACKGROUND: Reports of acoustic changes in the voice in individuals with Alzheimer's disease (AD) and the relationship of acoustic changes with age and cognitive status are still limited.
OBJECTIVE: This study aims to determine the changes in voice analysis results in AD, as well as the effects of age and cognitive status on voice parameters.
METHODS: The study included 47 (AD: 30; healthy: 17) women with a mean age of 76.13 years. The acoustic voice parameters mean fundamental frequency (F0), relative average perturbation (RAP), jitter percent (Jitt), shimmer percent (Shim), and noise-to-harmonic ratio were measured. The mini-mental state examination (MMSE) was used to assess cognitive status.
RESULTS: F0, Shim, Jitt, and RAP values were statistically significantly higher in individuals with AD than in healthy individuals. There was a significant negative correlation between MMSE and F0, Jitt, RAP, and Shim, and the MMSE score had a significant negative effect on F0, Jitt, and RAP (p < .05).
CONCLUSION: Cognitive status was found to significantly affect the voice, with fundamental frequency and frequency and amplitude perturbations increasing as cognitive level decreases. To support the therapy process for voice disorders, cognitive functions can be addressed in addition to voice therapy.
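For readers unfamiliar with the perturbation measures listed in the Methods, the sketch below uses common textbook definitions of jitter percent, shimmer percent, and RAP, computed from per-cycle pitch periods and peak amplitudes. It is an illustration under those assumptions, not the analysis software used in the study. The function names and input values are hypothetical, and the extraction of periods and amplitudes from audio is not shown.

```python
def _mean(values):
    return sum(values) / len(values)


def perturbation_percent(values):
    """Mean absolute difference between consecutive cycles, relative to the mean, in percent.

    Applied to pitch periods this gives jitter percent;
    applied to peak amplitudes it gives shimmer percent.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return 100.0 * _mean(diffs) / _mean(values)


def rap_percent(periods):
    """Relative average perturbation: deviation of each period from the mean of
    itself and its two neighbours, relative to the mean period, in percent."""
    if len(periods) < 3:
        raise ValueError("need at least three cycles")
    devs = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
        for i in range(1, len(periods) - 1)
    ]
    return 100.0 * _mean(devs) / _mean(periods)


# Hypothetical per-cycle measurements (periods in ms, amplitudes in arbitrary units).
periods = [5.00, 5.02, 4.98, 5.01, 4.99]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]
print(perturbation_percent(periods), perturbation_percent(amplitudes), rap_percent(periods))
```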
Affiliation(s)
- Mümüne Merve Parlak
  - Department of Speech and Language Therapy, Faculty of Health Sciences, Ankara Yıldırım Beyazıt University, Ankara, Turkey
- Güleser Saylam
  - Department of Otolaryngology, Etlik City Hospital, Ankara, Turkey
10
Yamada Y, Shinkawa K, Nemoto M, Nemoto K, Arai T. A mobile application using automatic speech analysis for classifying Alzheimer's disease and mild cognitive impairment. Comput Speech Lang 2023. [DOI: 10.1016/j.csl.2023.101514] [Indexed: 03/16/2023]
11
Karjadi C, Xue C, Cordella C, Kiran S, Paschalidis IC, Au R, Kolachalama VB. Fusion of Low-Level Descriptors of Digital Voice Recordings for Dementia Assessment. J Alzheimers Dis 2023; 96:507-514. [PMID: 37840494 PMCID: PMC10657667 DOI: 10.3233/jad-230560] [Accepted: 08/29/2023] [Indexed: 10/17/2023]
Abstract
Digital voice recordings can offer affordable, accessible ways to evaluate behavior and function. We assessed how combining different low-level voice descriptors can evaluate cognitive status. Using voice recordings from neuropsychological exams at the Framingham Heart Study, we developed a machine learning framework fusing spectral, prosodic, and sound quality measures early in the training cycle. The model's area under the receiver operating characteristic curve was 0.832 (±0.034) in differentiating persons with dementia from those who had normal cognition. This offers a data-driven framework for analyzing minimally processed voice recordings for cognitive assessment, highlighting the value of digital technologies in disease detection and intervention.
Collapse
Affiliation(s)
- Cody Karjadi
- The Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Departments of Anatomy & Neurobiology and Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
| | - Chonghua Xue
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
| | | | - Swathi Kiran
- Sargent College, Boston University, Boston, MA, USA
- Faculty of Computing & Data Sciences, Boston University, Boston, MA, USA
| | - Ioannis Ch. Paschalidis
- Faculty of Computing & Data Sciences, Boston University, Boston, MA, USA
- Departments of Electrical & Computer Engineering, Systems Engineering and Biomedical Engineering, Boston University, Boston, MA, USA
| | - Rhoda Au
- The Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Departments of Anatomy & Neurobiology and Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Alzheimer’s Disease Research Center, Boston University, Boston, MA, USA
| | - Vijaya B. Kolachalama
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Faculty of Computing & Data Sciences, Boston University, Boston, MA, USA
- Alzheimer’s Disease Research Center, Boston University, Boston, MA, USA
- Department of Computer Science, Boston University, Boston, MA, USA
|
12
|
Owens AP, Krebs C, Kuruppu S, Brem AK, Kowatsch T, Aarsland D, Klöppel S. Broadened assessments, health education and cognitive aids in the remote memory clinic. Front Public Health 2022; 10:1033515. [PMID: 36568790 PMCID: PMC9768191 DOI: 10.3389/fpubh.2022.1033515] [Received: 08/31/2022] [Accepted: 11/01/2022] [Indexed: 12/12/2022]
Abstract
The prevalence of dementia is increasing and poses a health challenge for individuals and society. Despite the desire to know their risks and the importance of initiating early therapeutic options, large parts of the population do not get access to memory clinic-based assessments. Remote memory clinics facilitate low-level access to cognitive assessments by eschewing the need for face-to-face meetings. At the same time, patients with detected impairment or increased risk can receive non-pharmacological treatment remotely. Sensor technology can evaluate the efficiency of this remote treatment and identify cognitive decline. With remote and (partly) automatized technology, the process of cognitive decline can be monitored but, more importantly, also modified by guiding early interventions and a dementia-preventative lifestyle. We highlight how sensor technology aids the expansion of assessments beyond cognition and to other domains, e.g., depression. We also illustrate applications for aiding remote treatment and describe how remote tools can facilitate health education, which is the cornerstone for long-lasting lifestyle changes. Tools such as transcranial electric stimulation or sleep-based interventions have so far mostly been used in a face-to-face context but have the potential for remote deployment, a step already taken with memory training apps. Many of the presented methods are readily scalable and low-cost, and there is a range of target populations, from the worried well to late-stage dementia.
Affiliation(s)
- Andrew P. Owens
- Department of Old Age Psychiatry, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
| | - Christine Krebs
- University Hospital of Old Age Psychiatry and Psychotherapy, University of Bern, Bern, Switzerland
| | - Sajini Kuruppu
- Department of Old Age Psychiatry, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
| | - Anna-Katharine Brem
- Department of Old Age Psychiatry, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom; University Hospital of Old Age Psychiatry and Psychotherapy, University of Bern, Bern, Switzerland
| | - Tobias Kowatsch
- Institute for Implementation Science in Health Care, University of Zurich, Zurich, Switzerland; School of Medicine, University of St. Gallen, St. Gallen, Switzerland; Centre for Digital Health Interventions, Department Management, Technology, and Economics at ETH Zurich, Zurich, Switzerland
| | - Dag Aarsland
- Department of Old Age Psychiatry, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
| | - Stefan Klöppel
- University Hospital of Old Age Psychiatry and Psychotherapy, University of Bern, Bern, Switzerland; *Correspondence: Stefan Klöppel
|