1
|
Lorenzen KP, Heremans ERM, de Vos M, Mikkelsen KB. Personalization of Automatic Sleep Scoring: How Best to Adapt Models to Personal Domains in Wearable EEG. IEEE J Biomed Health Inform 2024; 28:5804-5815. [PMID: 38833404 DOI: 10.1109/jbhi.2024.3409165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2024]
Abstract
Wearable EEG enables us to capture large amounts of high-quality sleep data for diagnostic purposes. To make full use of this capacity we need high-performance automatic sleep scoring models. To this end, it has been noted that domain mismatch between recording equipment can be considerable, e.g. PSG to wearable EEG, but a previously observed benefit from personalizing models to individual subjects further indicates a personal domain in sleep EEG. In this work, we have investigated the extent of such a personal domain in wearable EEG, and review supervised and unsupervised approaches to personalization as found in the literature. We investigated the personalization effect of the unsupervised Adversarial Domain Adaptation and implemented an unsupervised method based on statistics alignment. No beneficial personalization effect was observed using these unsupervised methods. We find that supervised personalization leads to a substantial performance improvement on the target subject ranging from 15% Cohen's Kappa for subjects with poor performance ( ) to roughly 2% on subjects with high performance ( ). This improvement was present for models trained on both small and large data sets, indicating that even high-performance models benefit from supervised personalization. We found that this personalization can be beneficially regularized using Kullback-Leibler regularization, leading to lower variance with negligible cost to improvement. Based on the experiments, we recommend model personalization using Kullback-Leibler regularization.
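As a rough illustration of what Kullback-Leibler regularization of supervised personalization could look like, the sketch below adds a KL penalty between the base model's prediction and the personalized model's prediction to the usual cross-entropy on the target subject's labels. The abstract does not give the exact loss, so the direction of the KL term, the weight `lam`, and all function names are assumptions made here for illustration only.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def personalization_loss(personal_logits, base_logits, label, lam=0.1):
    """Cross-entropy on the subject's label plus a KL penalty that keeps the
    personalized prediction close to the frozen base model's prediction.
    The KL direction and the weight lam are assumptions, not the paper's values."""
    p_personal = softmax(personal_logits)
    p_base = softmax(base_logits)
    cross_entropy = -math.log(p_personal[label] + 1e-12)
    return cross_entropy + lam * kl_divergence(p_base, p_personal)
```

With `lam=0` this reduces to plain fine-tuning; larger values pull the personalized model back toward the base model, which is one plausible reading of the reported lower variance.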
|
2
|
Kazemi K, Abiri A, Zhou Y, Rahmani A, Khayat RN, Liljeberg P, Khine M. Improved sleep stage predictions by deep learning of photoplethysmogram and respiration patterns. Comput Biol Med 2024; 179:108679. [PMID: 39033682 DOI: 10.1016/j.compbiomed.2024.108679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 05/28/2024] [Accepted: 05/29/2024] [Indexed: 07/23/2024]
Abstract
Sleep staging is a crucial tool for diagnosing and monitoring sleep disorders, but the standard clinical approach using polysomnography (PSG) in a sleep lab is time-consuming, expensive, uncomfortable, and limited to a single night. Advancements in sensor technology have enabled home sleep monitoring, but existing devices still lack sufficient accuracy to inform clinical decisions. To address this challenge, we propose a deep learning architecture that combines a convolutional neural network and bidirectional long short-term memory to accurately classify sleep stages. By supplementing photoplethysmography (PPG) signals with respiratory sensor inputs, we demonstrated significant improvements in prediction accuracy and Cohen's kappa (k) for 2- (92.7 %; k = 0.768), 3- (80.2 %; k = 0.714), 4- (76.8 %, k = 0.550), and 5-stage (76.7 %, k = 0.616) sleep classification using raw data. This relatively translatable approach, with a less intensive AI model and leveraging only a few, inexpensive sensors, shows promise in accurately staging sleep. This has potential for diagnosing and managing sleep disorders in a more accessible and practical manner, possibly even at home.
Affiliation(s)
| | - Arash Abiri
- Department of Biomedical Engineering, University of California Irvine, Irvine, CA, United States
| | - Yongxiao Zhou
- Department of Biomedical Engineering, University of California Irvine, Irvine, CA, United States
| | - Amir Rahmani
- Department of Computer Science, University of California, Irvine, Irvine, CA, United States; School of Nursing, University of California, Irvine, Irvine, CA, United States
| | - Rami N Khayat
- Division of Pulmonary and Critical Care Medicine, The UCI Comprehensive Sleep Center, University of California, Irvine, Newport Beach, CA, United States
| | | | - Michelle Khine
- Department of Biomedical Engineering, University of California Irvine, Irvine, CA, United States.
|
3
|
Zheng NS, Annis J, Master H, Han L, Gleichauf K, Ching JH, Nasser M, Coleman P, Desine S, Ruderfer DM, Hernandez J, Schneider LD, Brittain EL. Sleep patterns and risk of chronic disease as measured by long-term monitoring with commercial wearable devices in the All of Us Research Program. Nat Med 2024; 30:2648-2656. [PMID: 39030265 PMCID: PMC11405268 DOI: 10.1038/s41591-024-03155-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 06/25/2024] [Indexed: 07/21/2024]
Abstract
Poor sleep health is associated with increased all-cause mortality and incidence of many chronic conditions. Previous studies have relied on cross-sectional and self-reported survey data or polysomnograms, which have limitations with respect to data granularity, sample size and longitudinal information. Here, using objectively measured, longitudinal sleep data from commercial wearable devices linked to electronic health record data from the All of Us Research Program, we show that sleep patterns, including sleep stages, duration and regularity, are associated with chronic disease incidence. Of the 6,785 participants included in this study, 71% were female, 84% self-identified as white and 71% had a college degree; the median age was 50.2 years (interquartile range = 35.7, 61.5) and the median sleep monitoring period was 4.5 years (2.5, 6.5). We found that rapid eye movement sleep and deep sleep were inversely associated with the odds of incident atrial fibrillation and that increased sleep irregularity was associated with increased odds of incident obesity, hyperlipidemia, hypertension, major depressive disorder and generalized anxiety disorder. Moreover, J-shaped associations were observed between average daily sleep duration and hypertension, major depressive disorder and generalized anxiety disorder. These findings show that sleep stages, duration and regularity are all important factors associated with chronic disease development and may inform evidence-based recommendations on healthy sleeping habits.
Affiliation(s)
- Neil S Zheng
- Yale School of Medicine, Yale University, New Haven, CT, USA
- Brigham and Women's Hospital, Boston, MA, USA
| | - Jeffrey Annis
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
- Center for Digital Genomic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Hiral Master
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
- Center for Digital Genomic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Lide Han
- Center for Digital Genomic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Division of Genetic Medicine, Department of Medicine, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | | | | | - Peyton Coleman
- Center for Digital Genomic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Stacy Desine
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Douglas M Ruderfer
- Center for Digital Genomic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Division of Genetic Medicine, Department of Medicine, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Logan D Schneider
- Google, Mountain View, CA, USA
- Sleep Medicine Center, Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Redwood City, CA, USA
| | - Evan L Brittain
- Center for Digital Genomic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
- Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
|
4
|
Ganglberger W, Nasiri S, Sun H, Kim S, Shin C, Westover MB, Thomas RJ. Refining sleep staging accuracy: Transfer learning coupled with scorability models. Sleep 2024:zsae202. [PMID: 39215679 DOI: 10.1093/sleep/zsae202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Indexed: 09/04/2024]
Abstract
STUDY OBJECTIVES: This study aimed to 1) improve sleep staging accuracy through transfer learning, to achieve or exceed human inter-expert agreement; 2) introduce a scorability model to assess the quality and trustworthiness of automated sleep staging. METHODS: A deep neural network (base model) was trained on a large multi-site polysomnography (PSG) dataset from the United States. Transfer learning was used to calibrate the model to a reduced montage and limited samples from the Korean Genome and Epidemiology Study (KoGES) dataset. Model performance was compared to inter-expert reliability among three human experts. A scorability assessment was developed to predict the agreement between the model and human experts. RESULTS: Initial sleep staging by the base model showed lower agreement with experts (κ=0.55) compared to inter-expert agreement (κ=0.62). Calibration with 324 randomly sampled training cases matched expert agreement levels. Further targeted sampling improved performance, with models exceeding inter-expert agreement (κ=0.70). The scorability assessment, combining biosignal quality and model confidence features, predicted model-expert agreement moderately well (R²=0.42). Recordings with higher scorability scores demonstrated greater model-expert agreement than inter-expert agreement. Even with lower scorability scores, model performance was comparable to inter-expert agreement. CONCLUSIONS: Fine-tuning a pre-trained neural network through targeted transfer learning significantly enhances sleep staging performance for an atypical montage, achieving and surpassing human expert agreement levels. The introduction of a scorability assessment provides a robust measure of reliability, ensuring quality control and enhancing the practical application of the system before deployment. This approach marks an important advancement in automated sleep analysis, demonstrating the potential for AI to exceed human performance in clinical settings.
Affiliation(s)
- Wolfgang Ganglberger
- Department of Neurology, Beth Israel Deaconess Medical Center, Boston, MA, USA
- McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Samaneh Nasiri
- McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Emory School of Medicine, Atlanta, GA, USA
| | - Haoqi Sun
- Department of Neurology, Beth Israel Deaconess Medical Center, Boston, MA, USA
- McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Soriul Kim
- Institute of Human Genomic Study, College of Medicine, Korea University, Seoul, Republic of Korea
| | - Chol Shin
- Institute of Human Genomic Study, College of Medicine, Korea University, Seoul, Republic of Korea
- Biomedical Research Center, Korea University Ansan Hospital, Ansan, Republic of Korea
| | - M Brandon Westover
- Department of Neurology, Beth Israel Deaconess Medical Center, Boston, MA, USA
- McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Robert J Thomas
- Harvard Medical School, Boston, MA, USA
- Department of Medicine, Division of Pulmonary Critical Care & Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
|
5
|
McMahon M, Goldin J, Kealy ES, Wicks DJ, Zilberg E, Freeman W, Aliahmad B. Performance Investigation of Somfit Sleep Staging Algorithm. Nat Sci Sleep 2024; 16:1027-1043. [PMID: 39071546 PMCID: PMC11277903 DOI: 10.2147/nss.s463026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 07/01/2024] [Indexed: 07/30/2024]
Abstract
Purpose: To investigate the accuracy of the sleep staging algorithm in a new miniaturized home sleep monitoring device, Compumedics® Somfit. Somfit is attached to the patient's forehead and combines channels specified for a pulse arterial tonometry (PAT)-based home sleep apnea testing (HSAT) device with the neurological signals. The Somfit sleep staging deep learning algorithm is based on a convolutional neural network architecture. Patients and Methods: One hundred and ten participants referred for sleep investigation with suspected or preexisting obstructive sleep apnea (OSA) in need of a review were enrolled into the study involving simultaneous recording of full overnight polysomnography (PSG) and Somfit data. The recordings were conducted at three centers in Australia. The reported statistics include standard measures of agreement between the Somfit automatic hypnogram and the consensus PSG hypnogram. Results: Overall percent agreement across five sleep stages (N1, N2, N3, REM, and wake) between Somfit automatic and consensus PSG hypnograms was 76.14 (SE: 0.79). The percent agreements between different pairs of sleep technologists' PSG hypnograms varied from 74.36 (1.93) to 85.50 (0.64), with interscorer agreement being greater for scorers from the same sleep laboratory. The estimate of kappa between Somfit and consensus PSG was 0.672 (0.002). Percent agreement for sleep/wake discrimination was 89.30 (0.37). The accuracy of the Somfit sleep staging algorithm varied with increasing OSA severity: percent agreement was 79.67 (1.87) for the normal subjects, 77.38 (1.06) for mild OSA, 74.83 (1.79) for moderate OSA and 72.93 (1.68) for severe OSA. Conclusion: Agreement between Somfit and PSG hypnograms was non-inferior to PSG interscorer agreement for a number of scorers, thus confirming acceptability of electrode placement at the center of the forehead. The directions for algorithm improvement include additional arousal detection, integration of motion and oximetry signals, and separate inference models for individual sleep stages.
Affiliation(s)
- Marcus McMahon
- Department of Respiratory and Sleep Medicine, Epworth Hospital, Richmond, Victoria, Australia and Department of Respiratory and Sleep Medicine, Austin Health, Heidelberg, Victoria, Australia
| | - Jeremy Goldin
- Department of Respiratory and Sleep Medicine, Royal Melbourne Hospital, Parkville, Victoria, Australia
| | | | | | - Eugene Zilberg
- Medical Innovations, Compumedics Limited, Abbotsford, Victoria, Australia
| | - Warwick Freeman
- Medical Innovations, Compumedics Limited, Abbotsford, Victoria, Australia
| | - Behzad Aliahmad
- Medical Innovations, Compumedics Limited, Abbotsford, Victoria, Australia
|
6
|
Azarbarzin A, Labarca G, Kwon Y, Wellman A. Physiologic Consequences of Upper Airway Obstruction in Sleep Apnea. Chest 2024:S0012-3692(24)00708-6. [PMID: 38885898 DOI: 10.1016/j.chest.2024.05.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Revised: 05/22/2024] [Accepted: 05/27/2024] [Indexed: 06/20/2024]
Abstract
OSA is diagnosed and managed by a metric called the apnea-hypopnea index (AHI). The AHI quantifies the number of respiratory events (apnea or hypopnea), disregarding important information on the characteristics and physiologic consequences of respiratory events, including degrees of ventilatory deficit and associated hypoxemia, cardiac autonomic response, and cortical activity. The oversimplification of the disorder by the AHI is considered one of the reasons for divergent findings on the associations of OSA and cardiovascular disease (CVD) in observational and randomized controlled trial studies. Prospective observational cohort studies have demonstrated strong associations of OSA with several cardiovascular diseases, and randomized controlled trials of CPAP intervention have not been able to detect a benefit of CPAP to reduce the risk of CVD. Over the last several years, novel methodologies have been proposed to better quantify the magnitude of OSA-related breathing disturbance and its physiologic consequences. As a result, stronger associations with cardiovascular and neurocognitive outcomes have been observed. In this review, we focus on the methods that capture polysomnographic heterogeneity of OSA.
Affiliation(s)
- Ali Azarbarzin
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital and Harvard Medical School, Boston, MA.
| | - Gonzalo Labarca
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital and Harvard Medical School, Boston, MA; Department of Respiratory Diseases, School of Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile
| | - Younghoon Kwon
- Department of Medicine, University of Washington, Seattle, WA
| | - Andrew Wellman
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital and Harvard Medical School, Boston, MA
|
7
|
Bechny M, Monachino G, Fiorillo L, van der Meer J, Schmidt MH, Bassetti CLA, Tzovara A, Faraci FD. Bridging AI and Clinical Practice: Integrating Automated Sleep Scoring Algorithm with Uncertainty-Guided Physician Review. Nat Sci Sleep 2024; 16:555-572. [PMID: 38827394 PMCID: PMC11143488 DOI: 10.2147/nss.s455649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 04/18/2024] [Indexed: 06/04/2024]
Abstract
Purpose: This study aims to enhance the clinical use of automated sleep-scoring algorithms by incorporating an uncertainty estimation approach to efficiently assist clinicians in the manual review of predicted hypnograms, a necessity due to the notable inter-scorer variability inherent in polysomnography (PSG) databases. Our efforts target the extent of review required to achieve predefined agreement levels, examining both in-domain (ID) and out-of-domain (OOD) data, and considering subjects' diagnoses. Patients and Methods: A total of 19,578 PSGs from 13 open-access databases were used to train U-Sleep, a state-of-the-art sleep-scoring algorithm. We leveraged a comprehensive clinical database of an additional 8832 PSGs, covering a full spectrum of ages (0-91 years) and sleep-disorders, to refine U-Sleep and to evaluate different uncertainty-quantification approaches, including our novel confidence network. The ID data consisted of PSGs scored by over 50 physicians, and the two OOD sets comprised recordings each scored by a unique senior physician. Results: U-Sleep demonstrated robust performance, with Cohen's kappa (κ) at 76.2% on ID and 73.8-78.8% on OOD data. The confidence network excelled at identifying uncertain predictions, achieving AUROC scores of 85.7% on ID and 82.5-85.6% on OOD data. Independently of sleep-disorder status, statistical evaluations revealed significant differences in confidence scores between aligning vs discording predictions, and significant correlations of confidence scores with classification performance metrics. To achieve κ ≥ 90% with physician intervention, examining less than 29.0% of uncertain epochs was required, substantially reducing physicians' workload, and facilitating near-perfect agreement. Conclusion: Inter-scorer variability limits the accuracy of the scoring algorithms to ~80%. By integrating an uncertainty estimation with U-Sleep, we enhance the review of predicted hypnograms, to align with the scoring taste of a responsible physician. Validated across ID and OOD data and various sleep-disorders, our approach offers a strategy to boost automated scoring tools' usability in clinical settings.
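The uncertainty-guided review described in this abstract can be pictured with a small sketch: rank epochs by model confidence, have the physician re-score only the least confident fraction, and measure agreement afterwards. This is an illustrative toy, not the authors' confidence network; the function names, the use of raw percent agreement rather than kappa, and the example data are all assumptions.

```python
def review_least_confident(predicted, confidence, physician, fraction):
    """Replace the model's stage with the physician's stage on the
    `fraction` of epochs where the model is least confident."""
    n = len(predicted)
    n_review = int(round(fraction * n))
    # Epoch indices sorted from least to most confident.
    order = sorted(range(n), key=lambda i: confidence[i])
    corrected = list(predicted)
    for i in order[:n_review]:
        corrected[i] = physician[i]
    return corrected

def percent_agreement(a, b):
    """Fraction of epochs on which two hypnograms give the same stage."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Hypothetical five-epoch example.
predicted = ["W", "N1", "N2", "N3", "REM"]
physician = ["W", "N2", "N2", "N2", "REM"]
confidence = [0.90, 0.30, 0.80, 0.40, 0.95]
corrected = review_least_confident(predicted, confidence, physician, fraction=0.4)
```

In this toy case both disagreements fall on the two least confident epochs, so reviewing 40% of epochs lifts agreement from 0.6 to 1.0; real data will rarely be this clean.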
Affiliation(s)
- Michal Bechny
- Institute of Computer Science, University of Bern, Bern, Switzerland
- Institute of Digital Technologies for Personalized Healthcare (Meditech), University of Applied Sciences and Arts of Southern Switzerland, Lugano, Switzerland
| | - Giuliana Monachino
- Institute of Computer Science, University of Bern, Bern, Switzerland
- Institute of Digital Technologies for Personalized Healthcare (Meditech), University of Applied Sciences and Arts of Southern Switzerland, Lugano, Switzerland
| | - Luigi Fiorillo
- Institute of Digital Technologies for Personalized Healthcare (Meditech), University of Applied Sciences and Arts of Southern Switzerland, Lugano, Switzerland
| | | | - Markus H Schmidt
- Department of Neurology, University Hospital of Bern, Bern, Switzerland
- Ohio Sleep Medicine Institute, Dublin, OH, USA
| | | | - Athina Tzovara
- Institute of Computer Science, University of Bern, Bern, Switzerland
- Department of Neurology, University Hospital of Bern, Bern, Switzerland
| | - Francesca D Faraci
- Institute of Digital Technologies for Personalized Healthcare (Meditech), University of Applied Sciences and Arts of Southern Switzerland, Lugano, Switzerland
|
8
|
Tankéré P, Taillard J, Armeni MA, Petitjean T, Berthomier C, Strauss M, Peter-Derex L. Revisiting the maintenance of wakefulness test: from intra-/inter-scorer agreement to normative values in patients treated for obstructive sleep apnea. J Sleep Res 2024; 33:e13961. [PMID: 37287324 DOI: 10.1111/jsr.13961] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/25/2023] [Revised: 05/06/2023] [Accepted: 05/20/2023] [Indexed: 06/09/2023]
Abstract
The Maintenance of Wakefulness Test is widely used to objectively assess sleepiness and make safety-related decisions, but its interpretation is subjective and normative values remain debated. Our work aimed to determine normative thresholds in non-subjectively sleepy patients with well-treated obstructive sleep apnea, and to assess intra- and inter-scorer variability. We included maintenance of wakefulness tests of 141 consecutive patients with treated obstructive sleep apnea (90% men, mean (SD) age 47.5 (9.2) years, mean (SD) pre-treatment apnea-hypopnea index of 43.8 (20.3) events/h). Sleep onset latencies were independently scored by two experts. Discordant scorings were reviewed to reach a consensus and half of the cohort was double-scored by each scorer. Intra- and inter-scorer variability was assessed using Cohen's kappa for 40, 33, and 19 min mean sleep latency thresholds. Consensual mean sleep latencies were compared between four groups according to subjective sleepiness (Epworth Sleepiness Scale score < versus ≥11) and residual apnea-hypopnea index (< versus ≥15 events/h). In well-treated non-sleepy patients (n = 76), the consensual mean (SD) sleep latency was 38.4 (4.2) min (lower normal limit [mean - 2SD] = 30 min), and 80% of them did not fall asleep. Intra-scorer agreement on mean sleep latency was high but inter-scorer was only fair (Cohen's kappa 0.54 for 33-min threshold, 0.27 for 19-min threshold), resulting in changes in latency category in 4%-12% of patients. A higher sleepiness score but not the residual apnea-hypopnea index was significantly associated with a lower mean sleep latency. Our findings suggest a higher than usually accepted normative threshold (30 min) in this context and emphasise the need for more reproducible scoring approaches.
Affiliation(s)
- Pierre Tankéré
- Reference Center for Rare Pulmonary Diseases, Pulmonary Medicine and Intensive Care Unit, Dijon University Hospital, Dijon, France
- Center for Sleep Medicine and Respiratory Disease, Croix-Rousse Hospital, Hospices Civils de Lyon, Lyon, France
| | - Jacques Taillard
- Sommeil, Addiction et Neuropsychiatrie, Université de Bordeaux, SANPSY, USR 3413, Bordeaux, France
- CNRS, SANPSY, USR 3413, Bordeaux, France
| | - Marc-Antoine Armeni
- Center for Sleep Medicine and Respiratory Disease, Croix-Rousse Hospital, Hospices Civils de Lyon, Lyon, France
| | - Thierry Petitjean
- Center for Sleep Medicine and Respiratory Disease, Croix-Rousse Hospital, Hospices Civils de Lyon, Lyon, France
| | | | - Mélanie Strauss
- Hôpital Universitaire de Bruxelles, Site Erasme, Services de Neurologie, Psychiatrie et Laboratoire du Sommeil, Université Libre de Bruxelles, Brussels, Belgium
- Neuropsychology and Functional Imaging Research Group (UR2NF), Center for Research in Cognition and Neurosciences and ULB Neuroscience Institute, Université Libre de Bruxelles, Brussels, Belgium
| | - Laure Peter-Derex
- Center for Sleep Medicine and Respiratory Disease, Croix-Rousse Hospital, Hospices Civils de Lyon, Lyon, France
- Lyon Neuroscience Research Center, PAM Team, INSERM U1028, CNRS UMR 5292, Lyon, France
- Claude Bernard Lyon 1 University, Lyon, France
|
9
|
Jirakittayakorn N, Wongsawat Y, Mitrirattanakul S. ZleepAnlystNet: a novel deep learning model for automatic sleep stage scoring based on single-channel raw EEG data using separating training. Sci Rep 2024; 14:9859. [PMID: 38684765 PMCID: PMC11058251 DOI: 10.1038/s41598-024-60796-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 04/26/2024] [Indexed: 05/02/2024]
Abstract
Numerous models for sleep stage scoring utilizing single-channel raw EEG signal have typically employed CNN and BiLSTM architectures. While these models, incorporating temporal information for sequence classification, demonstrate superior overall performance, they often exhibit low per-class performance for N1-stage, necessitating an adjustment of loss function. However, the efficacy of such adjustment is constrained by the training process. In this study, a pioneering training approach called separating training is introduced, alongside a novel model, to enhance performance. The developed model comprises 15 CNN models with varying loss function weights for feature extraction and 1 BiLSTM for sequence classification. Due to its architecture, this model cannot be trained using an end-to-end approach, necessitating separate training for each component using the Sleep-EDF dataset. Achieving an overall accuracy of 87.02%, MF1 of 82.09%, Kappa of 0.8221, and per-class F1-scores (W 90.34%, N1 54.23%, N2 89.53%, N3 88.96%, and REM 87.40%), our model demonstrates promising performance. Comparison with sleep technicians reveals a Kappa of 0.7015, indicating alignment with reference sleep stages. Additionally, cross-dataset validation and adaptation through training with the SHHS dataset yield an overall accuracy of 84.40%, MF1 of 74.96% and Kappa of 0.7785 when tested with the Sleep-EDF-13 dataset. These findings underscore the generalization potential in model architecture design facilitated by our novel training approach.
Affiliation(s)
- Nantawachara Jirakittayakorn
- Institute for Innovative Learning, Mahidol University, Nakhon Pathom, Thailand
- Faculty of Dentistry, Mahidol University, Bangkok, Thailand
| | - Yodchanan Wongsawat
- Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, Nakhon Pathom, Thailand
| | - Somsak Mitrirattanakul
- Department of Masticatory Science, Faculty of Dentistry, Mahidol University, Bangkok, Thailand.
|
10
|
Birrer V, Elgendi M, Lambercy O, Menon C. Evaluating reliability in wearable devices for sleep staging. NPJ Digit Med 2024; 7:74. [PMID: 38499793 PMCID: PMC10948771 DOI: 10.1038/s41746-024-01016-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 01/18/2024] [Indexed: 03/20/2024]
Abstract
Sleep is crucial for physical and mental health, but traditional sleep quality assessment methods have limitations. This scoping review analyzes 35 articles from the past decade, evaluating 62 wearable setups with varying sensors, algorithms, and features. Our analysis indicates a trend towards combining accelerometer and photoplethysmography (PPG) data for out-of-lab sleep staging. Devices using only accelerometer data are effective for sleep/wake detection but fall short in identifying multiple sleep stages, unlike those incorporating PPG signals. To enhance the reliability of sleep staging wearables, we propose five recommendations: (1) Algorithm validation with equity, diversity, and inclusion considerations, (2) Comparative performance analysis of commercial algorithms across multiple sleep stages, (3) Exploration of feature impacts on algorithm accuracy, (4) Consistent reporting of performance metrics for objective reliability assessment, and (5) Encouragement of open-source classifier and data availability. Implementing these recommendations can improve the accuracy and reliability of sleep staging algorithms in wearables, solidifying their value in research and clinical settings.
Affiliation(s)
- Vera Birrer
- Biomedical and Mobile Health Technology Laboratory, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, Switzerland
| | - Mohamed Elgendi
- Biomedical and Mobile Health Technology Laboratory, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland.
| | - Olivier Lambercy
- Rehabilitation Engineering Laboratory, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
| | - Carlo Menon
- Biomedical and Mobile Health Technology Laboratory, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland.
|
11
|
Nikkonen S, Somaskandhan P, Korkalainen H, Kainulainen S, Terrill PI, Gretarsdottir H, Sigurdardottir S, Olafsdottir KA, Islind AS, Óskarsdóttir M, Arnardóttir ES, Leppänen T. Multicentre sleep-stage scoring agreement in the Sleep Revolution project. J Sleep Res 2024; 33:e13956. [PMID: 37309714 PMCID: PMC10909532 DOI: 10.1111/jsr.13956] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/15/2023] [Revised: 05/04/2023] [Accepted: 05/11/2023] [Indexed: 06/14/2023]
Abstract
Determining sleep stages accurately is an important part of the diagnostic process for numerous sleep disorders. However, as the sleep stage scoring is done manually following visual scoring rules there can be considerable variation in the sleep staging between different scorers. Thus, this study aimed to comprehensively evaluate the inter-rater agreement in sleep staging. A total of 50 polysomnography recordings were manually scored by 10 independent scorers from seven different sleep centres. We used the 10 scorings to calculate a majority score by taking the sleep stage that was the most scored stage for each epoch. The overall agreement for sleep staging was κ = 0.71 and the mean agreement with the majority score was 0.86. The scorers were in perfect agreement in 48% of all scored epochs. The agreement was highest in rapid eye movement sleep (κ = 0.86) and lowest in N1 sleep (κ = 0.41). The agreement with the majority scoring varied between the scorers from 81% to 91%, with large variations between the scorers in sleep stage-specific agreements. Scorers from the same sleep centres had the highest pairwise agreements at κ = 0.79, κ = 0.85, and κ = 0.78, while the lowest pairwise agreement between the scorers was κ = 0.58. We also found a moderate negative correlation between sleep staging agreement and the apnea-hypopnea index, as well as the rate of sleep stage transitions. In conclusion, although the overall agreement was high, several areas of low agreement were also found, mainly between non-rapid eye movement stages.
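The majority score and the agreement figures described in this abstract can be illustrated with a minimal Python sketch. The function names and the tie-breaking rule (the first-seen label wins) are assumptions made for illustration. The abstract does not state which kappa variant was used for the overall multi-scorer agreement, so the pairwise Cohen's kappa shown here is only one plausible choice.

```python
from collections import Counter


def majority_score(scorings):
    """Per-epoch majority stage. `scorings` is a list of per-scorer label lists."""
    majority = []
    for epoch_labels in zip(*scorings):
        # most_common(1) picks the most frequent label; ties go to the
        # label seen first (an assumption, not taken from the study).
        majority.append(Counter(epoch_labels).most_common(1)[0][0])
    return majority


def percent_agreement(a, b):
    """Fraction of epochs on which two scorings assign the same stage."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two scorings of the same epochs."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from the two scorers' marginal stage frequencies.
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def perfect_agreement_fraction(scorings):
    """Fraction of epochs on which every scorer assigned the same stage."""
    epochs = list(zip(*scorings))
    return sum(len(set(labels)) == 1 for labels in epochs) / len(epochs)
```

Comparing each scorer against `majority_score(...)` with `percent_agreement` gives the kind of per-scorer figure the abstract reports as ranging from 81% to 91%.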
Affiliation(s)
- Sami Nikkonen
  - Department of Technical Physics, University of Eastern Finland, Kuopio, Finland
  - Diagnostic Imaging Center, Kuopio University Hospital, Kuopio, Finland
- Pranavan Somaskandhan
  - School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Queensland, Australia
- Henri Korkalainen
  - Department of Technical Physics, University of Eastern Finland, Kuopio, Finland
  - Diagnostic Imaging Center, Kuopio University Hospital, Kuopio, Finland
- Samu Kainulainen
  - Department of Technical Physics, University of Eastern Finland, Kuopio, Finland
  - Diagnostic Imaging Center, Kuopio University Hospital, Kuopio, Finland
- Philip I. Terrill
  - School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Queensland, Australia
- Heidur Gretarsdottir
  - Reykjavik University Sleep Institute, School of Technology, Reykjavik University, Reykjavik, Iceland
- Sigridur Sigurdardottir
  - Reykjavik University Sleep Institute, School of Technology, Reykjavik University, Reykjavik, Iceland
- Anna Sigridur Islind
  - Reykjavik University Sleep Institute, School of Technology, Reykjavik University, Reykjavik, Iceland
  - Department of Computer Science, Reykjavík University, Reykjavík, Iceland
- María Óskarsdóttir
  - Reykjavik University Sleep Institute, School of Technology, Reykjavik University, Reykjavik, Iceland
  - Department of Computer Science, Reykjavík University, Reykjavík, Iceland
- Erna Sif Arnardóttir
  - Reykjavik University Sleep Institute, School of Technology, Reykjavik University, Reykjavik, Iceland
- Timo Leppänen
  - Department of Technical Physics, University of Eastern Finland, Kuopio, Finland
  - Diagnostic Imaging Center, Kuopio University Hospital, Kuopio, Finland
  - School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Queensland, Australia
12
Liao YS, Wu MC, Li CX, Lin WK, Lin CY, Liang SF. Polysomnography scoring-related training and quantitative assessment for improving interscorer agreement. J Clin Sleep Med 2024; 20:271-278. [PMID: 37811900 PMCID: PMC10835767 DOI: 10.5664/jcsm.10852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 09/29/2023] [Accepted: 10/05/2023] [Indexed: 10/10/2023]
Abstract
STUDY OBJECTIVES To efficiently improve the scoring competency of scorers with varying levels of experience across regions in Taiwan, we developed a training program with a cloud-based polysomnography scoring platform to evaluate and improve interscorer agreement. METHODS A total of 70 scorers from 34 sleep centers in Taiwan (job tenure: 0.5-39.0 years) completed a scoring test. All scorers scored a 742-epoch (30 s/epoch) overnight polysomnography recording of a patient with a moderate apnea-hypopnea index. Subsequently, 8 scoring experts delivered 8 interactive online lectures (each lasting 30 minutes). The training program included identifying scoring weaknesses, highlighting the latest scoring rules, and providing physicians' perspectives. Afterward, the scorers completed the second scoring test on the same participant. Changes in agreement from the first to second scoring test were identified. Sleep staging, sleep parameters, and respiratory events were considered for evaluating scoring agreement. RESULTS The scorers' agreement in overall sleep stage scoring significantly increased from 74.6 to 82.3% (median score). The proportion of scorers with an agreement of ≥ 80% increased from 20.0% (14/70) to 58.6% (41/70) after the online training program. In addition, the scorers' agreement in overall respiratory-event scoring increased to 88.8% (median score) after training. The scorers with a job tenure of 2.0-4.9 years exhibited the highest level of improvement in overall sleep staging (their median agreement increased from 72.8 to 84.9%; P < .001). CONCLUSIONS Our interactive online training program efficiently targeted the scorers' scoring weaknesses identified in the first scoring test, leading to substantial improvements in scoring proficiency. CITATION Liao Y-S, Wu M-C, Li C-X, Lin W-K, Lin C-Y, Liang S-F. Polysomnography scoring-related training and quantitative assessment for improving interscorer agreement. J Clin Sleep Med. 2024;20(2):271-278.
Affiliation(s)
- Ying-Siou Liao
  - Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Meng-Chun Wu
  - Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Cheng-Xue Li
  - Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Wen-Kuei Lin
  - Sleep Medicine Center, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Cheng-Yu Lin
  - Sleep Medicine Center, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
  - Department of Otolaryngology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Sheng-Fu Liang
  - Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
  - Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan
13
Kainec KA, Caccavaro J, Barnes M, Hoff C, Berlin A, Spencer RMC. Evaluating Accuracy in Five Commercial Sleep-Tracking Devices Compared to Research-Grade Actigraphy and Polysomnography. Sensors (Basel, Switzerland) 2024; 24:635. [PMID: 38276327 PMCID: PMC10820351 DOI: 10.3390/s24020635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 01/12/2024] [Accepted: 01/16/2024] [Indexed: 01/27/2024]
Abstract
The development of consumer sleep-tracking technologies has outpaced the scientific evaluation of their accuracy. In this study, five consumer sleep-tracking devices, research-grade actigraphy, and polysomnography were used simultaneously to monitor the overnight sleep of fifty-three young adults in the lab for one night. Biases and limits of agreement were assessed to determine how sleep stage estimates for each device and research-grade actigraphy differed from polysomnography-derived measures. Every device, except the Garmin Vivosmart, was able to estimate total sleep time comparably to research-grade actigraphy. All devices overestimated nights with shorter wake times and underestimated nights with longer wake times. For light sleep, absolute bias was low for the Fitbit Inspire and Fitbit Versa. The Withings Mat and Garmin Vivosmart overestimated shorter light sleep and underestimated longer light sleep. The Oura Ring underestimated light sleep of any duration. For deep sleep, bias was low for the Withings Mat and Garmin Vivosmart while other devices overestimated shorter and underestimated longer times. For REM sleep, bias was low for all devices. Taken together, these results suggest that proportional bias patterns in consumer sleep-tracking technologies are prevalent and could have important implications for their overall accuracy.
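The bias and limits-of-agreement analysis mentioned in this abstract is commonly done in the Bland-Altman style. The sketch below assumes that convention (bias as the mean device-minus-PSG difference, limits at ±1.96 sample standard deviations) and estimates proportional bias as the least-squares slope of the differences against the mean of the two methods. The abstract does not give the exact computation, so the function names and these choices are illustrative assumptions.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Return (bias, (lower_limit, upper_limit)) for paired measurements."""
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


def proportional_bias_slope(device, reference):
    """Least-squares slope of (device - reference) against their mean.

    A negative slope means short durations are overestimated and long
    durations underestimated, the pattern the abstract reports for
    several devices.
    """
    avgs = [(d + r) / 2 for d, r in zip(device, reference)]
    diffs = [d - r for d, r in zip(device, reference)]
    mean_avg, mean_diff = mean(avgs), mean(diffs)
    numerator = sum((a - mean_avg) * (d - mean_diff) for a, d in zip(avgs, diffs))
    denominator = sum((a - mean_avg) ** 2 for a in avgs)
    return numerator / denominator
```

Inputs would be per-night estimates in a common unit (for example, minutes of light sleep) from one device and from polysomnography.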
Affiliation(s)
- Kyle A. Kainec
  - Neuroscience & Behavior Program, French Hall, University of Massachusetts Amherst, 230 Stockbridge Road, Amherst, MA 01003, USA
  - Institute for Applied Life Sciences, Life Science Laboratories, University of Massachusetts Amherst, 240 Thatcher Road, Amherst, MA 01003, USA
- Jamie Caccavaro
  - Department of Psychological and Brain Sciences, Tobin Hall, University of Massachusetts Amherst, 135 Hicks Way, Amherst, MA 01003, USA
- Morgan Barnes
  - Institute for Applied Life Sciences, Life Science Laboratories, University of Massachusetts Amherst, 240 Thatcher Road, Amherst, MA 01003, USA
  - Department of Psychological and Brain Sciences, Tobin Hall, University of Massachusetts Amherst, 135 Hicks Way, Amherst, MA 01003, USA
- Chloe Hoff
  - Institute for Applied Life Sciences, Life Science Laboratories, University of Massachusetts Amherst, 240 Thatcher Road, Amherst, MA 01003, USA
  - Department of Psychological and Brain Sciences, Tobin Hall, University of Massachusetts Amherst, 135 Hicks Way, Amherst, MA 01003, USA
- Annika Berlin
  - Institute for Applied Life Sciences, Life Science Laboratories, University of Massachusetts Amherst, 240 Thatcher Road, Amherst, MA 01003, USA
  - Department of Psychological and Brain Sciences, Tobin Hall, University of Massachusetts Amherst, 135 Hicks Way, Amherst, MA 01003, USA
- Rebecca M. C. Spencer
  - Neuroscience & Behavior Program, French Hall, University of Massachusetts Amherst, 230 Stockbridge Road, Amherst, MA 01003, USA
  - Institute for Applied Life Sciences, Life Science Laboratories, University of Massachusetts Amherst, 240 Thatcher Road, Amherst, MA 01003, USA
  - Department of Psychological and Brain Sciences, Tobin Hall, University of Massachusetts Amherst, 135 Hicks Way, Amherst, MA 01003, USA
14
Berra F, Fasiello E, Zucconi M, Casoni F, De Gennaro L, Ferini-Strambi L, Galbiati A. Neurophysiological Parameters Influencing Sleep-Wake Discrepancy in Insomnia Disorder: A Preliminary Analysis on Alpha Rhythm during Sleep Onset. Brain Sci 2024; 14:97. [PMID: 38275517 PMCID: PMC10813212 DOI: 10.3390/brainsci14010097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/15/2024] [Accepted: 01/17/2024] [Indexed: 01/27/2024] Open
Abstract
Sleep state misperception (SSM) is a common issue in insomnia disorder (ID), causing a discrepancy between objective and subjective sleep/wake time estimation and increased daytime impairments. In this context, the hyperarousal theory assumes that sustained central nervous system activation contributes to the SSM. This study investigates factors influencing SSM during sleep latency (SL) and total sleep time (TST). Objective polysomnographic sleep variables (the alpha density index, latency-to-sleep stages and the first K-complex, and Rapid Eye Movement (REM) arousal density) and subjective sleep indices, taken from sleep diaries, were analyzed in 16 ID patients. Correlation analyses revealed a positive association between the degree of SL misperception (SLm) and the percentage of epochs that contained a visually scored stereotyped alpha rhythm during objective SL. A regression analysis showed that the REM arousal density and alpha density index significantly predicted TST misperception (TSTm). Furthermore, the degree of SLm was associated with an increased probability of transitioning from stage 1 of non-REM sleep to wakefulness during subjective SL. These findings support the role of hyperarousal in SSM and highlight the importance of alpha activity in unravelling the heterogeneous underpinnings of SSM.
Affiliation(s)
- Francesca Berra
  - Department of Psychology, “Vita-Salute” San Raffaele University, 20132 Milan, Italy
  - IRCCS San Raffaele Scientific Institute, Department of Clinical Neurosciences, Neurology–Sleep Disorders Center, 20132 Milan, Italy
- Elisabetta Fasiello
  - Department of Psychology, “Vita-Salute” San Raffaele University, 20132 Milan, Italy
  - IRCCS San Raffaele Scientific Institute, Department of Clinical Neurosciences, Neurology–Sleep Disorders Center, 20132 Milan, Italy
- Marco Zucconi
  - IRCCS San Raffaele Scientific Institute, Department of Clinical Neurosciences, Neurology–Sleep Disorders Center, 20132 Milan, Italy
- Francesca Casoni
  - IRCCS San Raffaele Scientific Institute, Department of Clinical Neurosciences, Neurology–Sleep Disorders Center, 20132 Milan, Italy
- Luigi De Gennaro
  - Department of Psychology, Sapienza University of Rome, 00185 Rome, Italy
  - Body and Action Lab, IRCCS Fondazione Santa Lucia, 00179 Rome, Italy
- Luigi Ferini-Strambi
  - Department of Psychology, “Vita-Salute” San Raffaele University, 20132 Milan, Italy
  - IRCCS San Raffaele Scientific Institute, Department of Clinical Neurosciences, Neurology–Sleep Disorders Center, 20132 Milan, Italy
- Andrea Galbiati
  - Department of Psychology, “Vita-Salute” San Raffaele University, 20132 Milan, Italy
  - IRCCS San Raffaele Scientific Institute, Department of Clinical Neurosciences, Neurology–Sleep Disorders Center, 20132 Milan, Italy
15
Park MJ, Choi JH, Kim SY, Ha TK. A deep learning algorithm model to automatically score and grade obstructive sleep apnea in adult polysomnography. Digit Health 2024; 10:20552076241291707. [PMID: 39430691 PMCID: PMC11489947 DOI: 10.1177/20552076241291707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Accepted: 09/27/2024] [Indexed: 10/22/2024] Open
Abstract
Objective: Polysomnography (PSG) is unique in diagnosing sleep disorders, notably obstructive sleep apnea (OSA). Despite its advantages, manual PSG data grading is time-consuming and laborious. Thus, this research evaluated a deep learning-based automated scoring system for respiratory events in sleep-disordered breathing patients. Methods: PSG data from a total of 1000 cases were used to develop a deep learning algorithm. Of these, 700 were allocated for training, 200 for validation, and 100 for testing. The respiratory events scoring deep learning model is composed of five sequential layers: an initial layer of perceptrons, followed by three consecutive layers of long short-term memory cells, and ultimately, an additional two layers of perceptrons. Results: The PSG data of 100 patients (simple snoring, mild, moderate, and severe OSA; n = 25 in each group) were selected for validation and testing of the deep learning model. The algorithm demonstrated high sensitivity (95% CI: 98.06-98.51) and specificity (95% CI: 95.46-97.79) across all OSA severities in detecting apnea/hypopnea events, compared to manual PSG analysis. The deep learning model's area under the curve values for predicting OSA in apnea-hypopnea index ≥ 5, 15, and 30 groups were 0.9402, 0.9388, and 0.9442, respectively, showing no significant differences between the groups. Conclusion: The deep learning algorithm employed in our study showed high accuracy in identifying apnea/hypopnea episodes and assessing the severity of OSA, suggesting the potential for enhancing both the efficiency and accuracy of automated respiratory event scoring in PSG through advanced deep learning techniques.
Affiliation(s)
- Marn Joon Park
  - Department of Otorhinolaryngology-Head and Neck Surgery, Inha University Hospital, Inha University School of Medicine, Incheon, Republic of Korea
- Ji Ho Choi
  - Department of Otorhinolaryngology-Head and Neck Surgery, Soonchunhyang University College of Medicine, Bucheon Hospital, Bucheon, Republic of Korea
- Shin Young Kim
  - Department of Otorhinolaryngology-Head and Neck Surgery, Soonchunhyang University College of Medicine, Bucheon Hospital, Bucheon, Republic of Korea
- Tae Kyoung Ha
  - Honeynaps Research and Development Center, Honeynaps Co. Ltd, Seoul, Republic of Korea
16
Jeong J, Yoon W, Lee JG, Kim D, Woo Y, Kim DK, Shin HW. Standardized image-based polysomnography database and deep learning algorithm for sleep-stage classification. Sleep 2023; 46:zsad242. [PMID: 37703391 DOI: 10.1093/sleep/zsad242] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/10/2022] [Revised: 08/11/2023] [Indexed: 09/15/2023] Open
Abstract
STUDY OBJECTIVES Polysomnography (PSG) scoring is labor-intensive, subjective, and often ambiguous. Although several deep learning (DL) models for automated sleep scoring have recently been developed, they are tied to a fixed number of input channels and a fixed resolution. In this study, we constructed a standardized image-based PSG dataset in order to overcome the heterogeneity of raw signal data obtained from various PSG devices and various sleep laboratory environments. METHODS All individually exported European data format files containing raw signals were converted into images with an annotation file, which contained the demographics, diagnoses, and sleep statistics. An image-based DL model for automatic sleep staging was developed, compared with a signal-based model, and validated in an external dataset. RESULTS We constructed 10253 image-based PSG datasets using a standardized format. Among these, 7745 diagnostic PSG data were used to develop our DL model. The DL model using the image dataset showed similar performance to the signal-based dataset for the same subject. The overall DL accuracy was greater than 80%, even with severe obstructive sleep apnea. Moreover, for the first time, we showed explainable DL in the field of sleep medicine by visualizing key inference regions using Eigen-class activation maps. Furthermore, in external validation, the DL model for sleep scoring achieved relatively good performance. CONCLUSIONS Our main contribution demonstrates the availability of a standardized image-based dataset, and highlights that changing the data sampling rate or number of sensors may not require retraining, although performance decreases slightly as the number of sensors decreases.
Affiliation(s)
- Jaemin Jeong
  - Department of Computer Engineering, School of Software, Hallym University, Chuncheon, Republic of Korea
- Jeong-Gun Lee
  - Department of Computer Engineering, School of Software, Hallym University, Chuncheon, Republic of Korea
- Dongyoung Kim
  - Department of Computer Engineering, School of Software, Hallym University, Chuncheon, Republic of Korea
- Yunhee Woo
  - Institute of New Frontier Research, Division of Big Data and Artificial Intelligence, Chuncheon Sacred Heart Hospital, Hallym University College of Medicine, Chuncheon, Republic of Korea
- Dong-Kyu Kim
  - OUaR LaB, Inc, Seoul, Republic of Korea
  - Institute of New Frontier Research, Division of Big Data and Artificial Intelligence, Chuncheon Sacred Heart Hospital, Hallym University College of Medicine, Chuncheon, Republic of Korea
  - Department of Otorhinolaryngology-Head and Neck Surgery, Chuncheon Sacred Heart Hospital, Hallym University College of Medicine, Chuncheon, Republic of Korea
- Hyun-Woo Shin
  - OUaR LaB, Inc, Seoul, Republic of Korea
  - Obstructive Upper Airway Research (OUaR) Laboratory, Department of Pharmacology, Seoul National University College of Medicine, Seoul, Republic of Korea
  - Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
  - Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
  - Sensory Organ Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
  - Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul, Republic of Korea
17
Grassi M, Daccò S, Caldirola D, Perna G, Schruers K, Defillo A. Enhanced sleep staging with artificial intelligence: a validation study of new software for sleep scoring. Front Artif Intell 2023; 6:1278593. [PMID: 38145233 PMCID: PMC10739507 DOI: 10.3389/frai.2023.1278593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 11/14/2023] [Indexed: 12/26/2023] Open
Abstract
Manual sleep staging (MSS) using polysomnography is a time-consuming task, requires significant training, and can lead to significant variability among scorers. STAGER is a software program based on machine learning algorithms that has been developed by Medibio Limited (Savage, MN, USA) to perform automatic sleep staging using only EEG signals from polysomnography. This study aimed to extensively investigate its agreement with MSS performed during clinical practice and by three additional expert sleep technicians. Forty consecutive polysomnographic recordings of patients referred to three US sleep clinics for sleep evaluation were retrospectively collected and analyzed. Three experienced technicians independently staged the recordings using the electroencephalography, electromyography, and electrooculography signals according to the American Academy of Sleep Medicine guidelines. The staging initially performed during clinical practice was also considered. Several agreement statistics between the automatic sleep staging (ASS) and MSS, among the different MSSs, and their differences were calculated. Bootstrap resampling was used to calculate 95% confidence intervals and the statistical significance of the differences. STAGER's ASS was mostly comparable with, or statistically significantly better than, the MSS, except for a partial reduction in the positive percent agreement in the wake stage. These promising results indicate that STAGER software can perform ASS of inpatient polysomnographic recordings accurately in comparison with MSS.
Affiliation(s)
- Massimiliano Grassi
  - Medibio Limited, Savage, MN, United States
  - Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, Italy
  - Department of Clinical Neurosciences, Villa San Benedetto Menni Hospital, Hermanas Hospitalarias, Albese con Cassano, Italy
- Silvia Daccò
  - Medibio Limited, Savage, MN, United States
  - Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, Italy
  - Department of Clinical Neurosciences, Villa San Benedetto Menni Hospital, Hermanas Hospitalarias, Albese con Cassano, Italy
  - Humanitas San Pio X, Personalized Medicine Center for Anxiety and Panic Disorders, Milan, Italy
- Daniela Caldirola
  - Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, Italy
  - Department of Clinical Neurosciences, Villa San Benedetto Menni Hospital, Hermanas Hospitalarias, Albese con Cassano, Italy
  - Humanitas San Pio X, Personalized Medicine Center for Anxiety and Panic Disorders, Milan, Italy
- Giampaolo Perna
  - Medibio Limited, Savage, MN, United States
  - Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, Italy
  - Department of Clinical Neurosciences, Villa San Benedetto Menni Hospital, Hermanas Hospitalarias, Albese con Cassano, Italy
  - Humanitas San Pio X, Personalized Medicine Center for Anxiety and Panic Disorders, Milan, Italy
  - Department of Psychiatry and Neuropsychology, Faculty of Health, Medicine, and Life Sciences, Research Institute of Mental Health and Neuroscience, Maastricht University, Maastricht, Netherlands
- Koen Schruers
  - Department of Psychiatry and Neuropsychology, Faculty of Health, Medicine, and Life Sciences, Research Institute of Mental Health and Neuroscience, Maastricht University, Maastricht, Netherlands
18
Yun R, Rembado I, Perlmutter SI, Rao RPN, Fetz EE. Local field potentials and single unit dynamics in motor cortex of unconstrained macaques during different behavioral states. Front Neurosci 2023; 17:1273627. [PMID: 38075283 PMCID: PMC10702227 DOI: 10.3389/fnins.2023.1273627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Accepted: 11/09/2023] [Indexed: 02/12/2024] Open
Abstract
Different sleep stages have been shown to be vital for a variety of brain functions, including learning, memory, and skill consolidation. However, our understanding of neural dynamics during sleep and the role of prominent LFP frequency bands remain incomplete. To elucidate such dynamics and differences between behavioral states we collected multichannel LFP and spike data in primary motor cortex of unconstrained macaques for up to 24 h using a head-fixed brain-computer interface (Neurochip3). Each 8-s bin of time was classified into awake-moving (Move), awake-resting (Rest), REM sleep (REM), or non-REM sleep (NREM) by using dimensionality reduction and clustering on the average spectral density and the acceleration of the head. LFP power showed high delta during NREM, high theta during REM, and high beta when the animal was awake. Cross-frequency phase-amplitude coupling typically showed higher coupling during NREM between all pairs of frequency bands. Two notable exceptions were high delta-high gamma and theta-high gamma coupling during Move, and high theta-beta coupling during REM. Single units showed decreased firing rate during NREM, though with increased short ISIs compared to other states. Spike-LFP synchrony showed high delta synchrony during Move, and higher coupling with all other frequency bands during NREM. These results altogether reveal potential roles and functions of different LFP bands that have previously been unexplored.
Affiliation(s)
- Richy Yun
  - Department of Bioengineering, University of Washington, Seattle, WA, United States
  - Center for Neurotechnology, University of Washington, Seattle, WA, United States
  - Washington National Primate Research Center, University of Washington, Seattle, WA, United States
- Irene Rembado
  - Washington National Primate Research Center, University of Washington, Seattle, WA, United States
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA, United States
- Steve I. Perlmutter
  - Center for Neurotechnology, University of Washington, Seattle, WA, United States
  - Washington National Primate Research Center, University of Washington, Seattle, WA, United States
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA, United States
- Rajesh P. N. Rao
  - Center for Neurotechnology, University of Washington, Seattle, WA, United States
  - Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, United States
- Eberhard E. Fetz
  - Department of Bioengineering, University of Washington, Seattle, WA, United States
  - Center for Neurotechnology, University of Washington, Seattle, WA, United States
  - Washington National Primate Research Center, University of Washington, Seattle, WA, United States
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA, United States
19
Gerardy B, Kuna ST, Pack A, Kushida CA, Walsh JK, Staley B, Pien GW, Younes M. An approach for determining the reliability of manual and digital scoring of sleep stages. Sleep 2023; 46:zsad248. [PMID: 37712522 DOI: 10.1093/sleep/zsad248] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/30/2023] [Revised: 08/21/2023] [Indexed: 09/16/2023] Open
Abstract
STUDY OBJECTIVES Inter-scorer variability in sleep staging is largely due to equivocal epochs that contain features of more than one stage. We propose an approach that recognizes the existence of equivocal epochs and evaluates scorers accordingly. METHODS Epoch-by-epoch staging was performed on 70 polysomnograms by six qualified technologists and by a digital system (Michele Sleep Scoring [MSS]). The probability that epochs assigned the same stage by only two of the six technologists (minority score) resulted from the random occurrence of two errors was calculated and found to be <5%, thereby indicating that the stage assigned is an acceptable variant for the epoch. Acceptable stages were identified in each epoch as stages assigned by at least two technologists. Percent agreement between each technologist and the other five technologists, acting as judges, was determined. Agreement was considered to exist if the stage assigned by the tested scorer was one of the acceptable stages for the epoch. The stage assigned by MSS was likewise considered in agreement if included in the acceptable stages made by the technologists. RESULTS Agreement of technologists tested against five qualified judges increased from 80.8% (range 70.5%-86.4% among technologists) when using the majority rule, to 96.1% (89.8%-98.5%) by the proposed approach. Agreement between unedited MSS and the same judges was 90.0% and increased to 92.1% after brief editing. CONCLUSIONS Accounting for equivocal epochs provides a more accurate estimate of a scorer's (human or digital) competence in scoring sleep stages and reduces inter-scorer disagreements. The proposed approach can be implemented in sleep-scoring training and accreditation programs.
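The acceptable-stage rule described in this abstract can be contrasted with the majority rule in a short Python sketch. It assumes the acceptable stages for an epoch are derived from the judges only (the scorers other than the one being tested) and that an epoch with no stage reaching two votes counts as a disagreement. Both points, like the function names, are illustrative assumptions rather than details confirmed by the abstract.

```python
from collections import Counter


def acceptable_stages(judge_labels, min_votes=2):
    """Stages assigned to one epoch by at least `min_votes` judges."""
    counts = Counter(judge_labels)
    return {stage for stage, votes in counts.items() if votes >= min_votes}


def agreement_acceptable(tested, judges, min_votes=2):
    """Fraction of epochs where the tested stage is an acceptable stage."""
    hits = 0
    for i, stage in enumerate(tested):
        epoch_labels = [judge[i] for judge in judges]
        if stage in acceptable_stages(epoch_labels, min_votes):
            hits += 1
    return hits / len(tested)


def agreement_majority(tested, judges):
    """Fraction of epochs where the tested stage equals the judges' majority."""
    hits = 0
    for i, stage in enumerate(tested):
        # Ties go to the first-seen label (an assumption for this sketch).
        top_stage = Counter(judge[i] for judge in judges).most_common(1)[0][0]
        hits += stage == top_stage
    return hits / len(tested)
```

On equivocal epochs, where judges split between two stages, `agreement_acceptable` credits either stage while `agreement_majority` credits only one, which is the mechanism behind the higher agreement the abstract reports for the proposed approach.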
Affiliation(s)
- Samuel T Kuna
  - Department of Medicine, Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA, USA
- Allan Pack
  - Division of Sleep Medicine/Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Clete A Kushida
  - Department of Psychiatry, Stanford University, Palo Alto, CA, USA
- James K Walsh
  - Sleep Medicine and Research Center, St. Luke's Hospital, Chesterfield, MO, USA
- Bethany Staley
  - Division of Sleep Medicine/Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Grace W Pien
  - Division of Pulmonary and Critical Care Medicine, Department of Medicine, Johns Hopkins School of Medicine, Johns Hopkins University, Baltimore, MD, USA
- Magdy Younes
  - YRT Limited, Winnipeg, MB, Canada
  - Department of Medicine, University of Manitoba, Winnipeg, MB, Canada
20
Anido-Alonso A, Alvarez-Estevez D. Decentralized Data-Privacy Preserving Deep-Learning Approaches for Enhancing Inter-Database Generalization in Automatic Sleep Staging. IEEE J Biomed Health Inform 2023; 27:5610-5621. [PMID: 37651482 DOI: 10.1109/jbhi.2023.3310869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
Automatic sleep staging has been an active field of development. Despite multiple efforts, the area remains a focus of research interest. Indeed, while promising results have been reported in past literature, uptake of automatic sleep scoring in the clinical setting remains low. One of the current issues is the difficulty of generalizing performance results beyond the local testing scenario, i.e. across data from different clinics. Issues derived from data-privacy restrictions, which generally apply in the medical domain, pose additional difficulties in the successful development of these methods. We propose the use of several decentralized deep-learning approaches, namely ensemble models and federated learning, for robust inter-database performance generalization and data-privacy preservation in the automatic sleep staging scenario. Specifically, we explore four ensemble combination strategies (max-voting, output averaging, size-proportional weighting, and Nelder-Mead) and present a new federated learning algorithm, so-called sub-sampled federated stochastic gradient descent (ssFedSGD). To evaluate the generalization capabilities of such approaches, experimental procedures are carried out using a leaving-one-database-out direct-transfer scenario on six independent and heterogeneous public sleep staging databases. The resulting performance is compared with respect to two baseline approaches involving single-database and centralized multiple-database derived models. Our results show that the proposed decentralized learning methods outperform baseline local approaches, and provide similar generalization results to centralized database-combined approaches. We conclude that these methods are preferable choices, as they come with additional advantages concerning improved scalability, flexible design, and data-privacy preservation.
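Three of the ensemble combination strategies named in this abstract (max-voting, output averaging, and size-proportional weighting) can be sketched in a few lines of Python. This is an illustrative reading of those terms, not the authors' implementation; the function names, the tie-breaking rule, and the example probabilities and database sizes are assumptions.

```python
def max_voting(prob_list):
    """Each model votes for its most probable class.

    Ties are broken in favor of the lowest class index (an assumption).
    """
    n_classes = len(prob_list[0])
    votes = [0] * n_classes
    for probs in prob_list:
        votes[probs.index(max(probs))] += 1
    return votes.index(max(votes))


def weighted_average(prob_list, weights=None):
    """Average per-class probabilities across models.

    With weights=None this is plain output averaging; passing the training
    set size of each model gives size-proportional weighting.
    """
    if weights is None:
        weights = [1.0] * len(prob_list)
    total = sum(weights)
    n_classes = len(prob_list[0])
    return [
        sum(w * p[c] for w, p in zip(weights, prob_list)) / total
        for c in range(n_classes)
    ]


# Hypothetical outputs of three local models for one epoch (three classes).
model_probs = [
    [0.6, 0.4, 0.0],
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
]
db_sizes = [100, 300, 100]  # hypothetical number of training recordings

print(max_voting(model_probs))                  # class chosen by voting
print(weighted_average(model_probs))            # output averaging
print(weighted_average(model_probs, db_sizes))  # size-proportional weighting
```

In this toy case voting selects class 0, while both averaging variants favor class 1, which shows how the strategies can disagree when one model is very confident.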
21
Jørgensen SD, Kidmose P, Mikkelsen K, Blech M, Hemmsen MC, Rank ML, Kjaer TW. Long-term ear-EEG monitoring of sleep - A case study during shift work. J Sleep Res 2023; 32:e13853. [PMID: 36889935 DOI: 10.1111/jsr.13853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 01/26/2023] [Accepted: 01/26/2023] [Indexed: 03/10/2023]
Abstract
The interest in sleep as a potential clinical biomarker is growing, but the standard method of sleep assessment, polysomnography, is expensive, time consuming, and requires a lot of expert assistance for both set-up and interpretation. To make sleep analysis more available both in research and in the clinic, there is a need for a reliable wearable device for sleep staging. In this case study, we test ear-electroencephalography, a wearable in which electrodes are placed in the outer ear, as a platform for longitudinal at-home recording of sleep. We explore the usability of ear-electroencephalography in a shift work case with alternating sleep conditions. We find the ear-electroencephalography platform to be reliable both in terms of showing substantial agreement with polysomnography after long-time use (with an overall agreement, using Cohen's kappa, of 0.72) and by being unobtrusive enough to wear during night shift conditions. We find that fractions of non-rapid eye movement sleep and transition probability between sleep stages show great potential as sleep metrics when exploring quantitative differences in sleep architecture between shifting sleep conditions. This study shows that the ear-electroencephalography platform holds great potential as a reliable wearable for quantifying sleep "in the wild", pushing this technology further towards clinical adaptation.
Affiliation(s)
- Preben Kidmose
  - Department of Electrical and Computer Engineering, Aarhus University, Aarhus, Denmark
- Kaare Mikkelsen
  - Department of Electrical and Computer Engineering, Aarhus University, Aarhus, Denmark
- Troels Wesenberg Kjaer
  - Department of Neurology, Zealand University Hospital, Roskilde, Denmark
  - Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
  - Department of Neuroscience, University of Copenhagen, Copenhagen, Denmark
22
Abstract
Automatic polysomnography analysis can be leveraged to shorten scoring times, reduce associated costs, and ultimately improve the overall diagnosis of sleep disorders. Multiple and diverse strategies have been attempted for implementation of this technology at scale in the routine workflow of sleep centers. The field, however, is complex and presents unsolved challenges in a number of areas. Recent developments in computer science and artificial intelligence are nevertheless closing the gap. Technological advances are also opening new pathways for expanding our current understanding of the domain and its analysis.
Affiliation(s)
- Diego Alvarez-Estevez
  - Center for Information and Communications Technology Research (CITIC), Universidade da Coruña, 15071 A Coruña, Spain.
23
Zahid AN, Jennum P, Mignot E, Sorensen HBD. MSED: A Multi-Modal Sleep Event Detection Model for Clinical Sleep Analysis. IEEE Trans Biomed Eng 2023; 70:2508-2518. [PMID: 37028083 DOI: 10.1109/tbme.2023.3252368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
Abstract
Clinical sleep analysis requires manual analysis of sleep patterns for correct diagnosis of sleep disorders. However, several studies have shown significant variability in manual scoring of clinically relevant discrete sleep events, such as arousals, leg movements, and sleep disordered breathing (apneas and hypopneas). We investigated whether an automatic method could be used for event detection and if a model trained on all events (joint model) performed better than corresponding event-specific models (single-event models). We trained a deep neural network event detection model on 1653 individual recordings and tested the optimized model on 1000 separate hold-out recordings. F1 scores for the optimized joint detection model were 0.70, 0.63, and 0.62 for arousals, leg movements, and sleep disordered breathing, respectively, compared to 0.65, 0.61, and 0.60 for the optimized single-event models. Index values computed from detected events correlated positively with manual annotations (r² = 0.73, r² = 0.77, r² = 0.78, respectively). We furthermore quantified model accuracy based on temporal difference metrics, which improved overall by using the joint model compared to single-event models. Our automatic model jointly detects arousals, leg movements and sleep disordered breathing events with high correlation with human annotations. Finally, we benchmarked against previous state-of-the-art multi-event detection models and found an overall increase in F1 score with our proposed model despite a 97.5% reduction in model size.
24
West LC, Summers M, Tang S, Hirt L, Halpern CH, Maroni D, Das R, Gliske SV, Abosch A, Kushida CA, Thompson JA. Evaluation of consensus sleep stage scoring of dysregulated sleep in Parkinson's disease. Sleep Med 2023; 107:236-242. [PMID: 37257366 PMCID: PMC10344673 DOI: 10.1016/j.sleep.2023.04.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 04/24/2023] [Accepted: 04/30/2023] [Indexed: 06/02/2023]
Abstract
OBJECTIVE Sleep dysregulation in Parkinson's disease (PD) has been hypothesized to occur, in part, from dysfunction in the basal ganglia-cortical circuit. Assessment of this relationship requires accurate sleep stage determination, a known challenge in this clinical population. Our objective was to optimize the consensus on the sleep staging process and reduce interrater variability in a cohort of advanced PD subjects. METHODS Fifteen PD subjects were enrolled from three sites in a clinical trial that involved recordings from subthalamic nucleus (STN) deep brain stimulation (DBS) leads (NCT04620551). Video polysomnography (vPSG) data for a total of 45 nights were analyzed. Four experienced scorers independently scored data on initial review. Epochs with less than 75% consensus were flagged for secondary review. In secondary review of discordant epochs, two of the original scorers re-assessed epochs, from which the final consensus stage was derived. RESULTS Sleep stage classification agreement averaged 83.10% across all sleep stages on initial scoring (IS), and on secondary consensus scoring (CS) review, agreement reached 96.58%. Greatest disagreement was noted in determination of awake epochs (33.6% of discordant epochs) and non-rapid-eye-movement stage 2 (N2) epochs (31.8% of discordant epochs). Scoring discrepancy was resolved with direct measurement of cortical frequency and amplitudes, physiologic context of the epoch, and video review. CONCLUSION Our method of multi-level initial and then secondary consensus review scoring resulted in consensus scoring agreement superior to conventional standards. This work features a custom-engineered vPSG software and review platform for integration of consensus sleep stage scoring in a multi-site clinical trial.
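The flagging step in this abstract (epochs with less than 75% consensus go to secondary review) can be sketched as follows. The sketch assumes that consensus is the share of scorers who chose the most common stage for an epoch; the function name and example scores are hypothetical.

```python
from collections import Counter


def flag_discordant(scores_by_epoch, threshold=0.75):
    """Return indices of epochs whose consensus is below `threshold`.

    scores_by_epoch: list of per-epoch lists, one stage per scorer.
    Consensus is taken as the fraction of scorers agreeing on the most
    common stage (an assumption about how the 75% figure is computed).
    """
    flagged = []
    for idx, votes in enumerate(scores_by_epoch):
        top_count = Counter(votes).most_common(1)[0][1]
        if top_count / len(votes) < threshold:
            flagged.append(idx)
    return flagged


# Hypothetical scores from four scorers for four epochs.
epochs = [
    ["W", "W", "W", "W"],       # 100% consensus
    ["N2", "N2", "N2", "N1"],   # 75% consensus, not flagged
    ["W", "N1", "N1", "W"],     # 50% consensus, flagged
    ["N2", "N3", "N2", "REM"],  # 50% consensus, flagged
]
print(flag_discordant(epochs))  # [2, 3]
```

With four scorers, an epoch passes only when at least three agree, so any two-versus-two split or three-way split is sent to secondary review.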
Affiliation(s)
- Leslie C West
  - University of California San Francisco, Department of Neurology, USA.
- Michael Summers
  - University of Nebraska Medical Center, Nebraska Medicine Sleep Center, Internal Medicine, Division of Pulmonary, Critical Care & Sleep Medicine, USA
- Siqun Tang
  - Stanford University, Sleep Medicine Division, Department of Psychiatry and Behavioral Science, USA
- Lisa Hirt
  - University of Colorado School of Medicine, Department of Neurosurgery, USA
- Casey H Halpern
  - University of Pennsylvania School of Medicine, Department of Neurosurgery, USA
  - Department of Surgery, Corporal Michael J. Crescenz Veterans Affairs Medical Center, USA
- Dulce Maroni
  - University of Nebraska Medical Center, Department of Neurosurgery, USA
- Rig Das
  - University of Nebraska Medical Center, Department of Neurosurgery, USA
- Stephen V Gliske
  - University of Nebraska Medical Center, Department of Neurosurgery, USA
- Aviva Abosch
  - University of Nebraska Medical Center, Department of Neurosurgery, USA
- Clete A Kushida
  - Stanford University, Sleep Medicine Division, Department of Psychiatry and Behavioral Science, USA
- John A Thompson
  - University of Colorado School of Medicine, Department of Neurosurgery, USA
  - University of Colorado School of Medicine, Department of Neurology, USA
25
Song TA, Chowdhury SR, Malekzadeh M, Harrison S, Hoge TB, Redline S, Stone KL, Saxena R, Purcell SM, Dutta J. AI-Driven sleep staging from actigraphy and heart rate. PLoS One 2023; 18:e0285703. [PMID: 37195925 PMCID: PMC10191307 DOI: 10.1371/journal.pone.0285703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 05/02/2023] [Indexed: 05/19/2023] Open
Abstract
Sleep is an important indicator of a person's health, and its accurate and cost-effective quantification is of great value in healthcare. The gold standard for sleep assessment and the clinical diagnosis of sleep disorders is polysomnography (PSG). However, PSG requires an overnight clinic visit and trained technicians to score the obtained multimodality data. Wrist-worn consumer devices, such as smartwatches, are a promising alternative to PSG because of their small form factor, continuous monitoring capability, and popularity. Unlike PSG, however, wearables-derived data are noisier and far less information-rich because of the smaller number of modalities and less accurate measurements due to their small form factor. Given these challenges, most consumer devices perform two-stage (i.e., sleep-wake) classification, which is inadequate for deep insights into a person's sleep health. The challenging multi-class (three, four, or five-class) staging of sleep using data from wrist-worn wearables remains unresolved. The difference in the data quality between consumer-grade wearables and lab-grade clinical equipment is the motivation behind this study. In this paper, we present an artificial intelligence (AI) technique termed sequence-to-sequence LSTM for automated mobile sleep staging (SLAMSS), which can perform three-class (wake, NREM, REM) and four-class (wake, light, deep, REM) sleep classification from activity (i.e., wrist-accelerometry-derived locomotion) and two coarse heart rate measures, both of which can be reliably obtained from a consumer-grade wrist-wearable device. Our method relies on raw time-series datasets and obviates the need for manual feature selection. We validated our model using actigraphy and coarse heart rate data from two independent study populations: the Multi-Ethnic Study of Atherosclerosis (MESA; N = 808) cohort and the Osteoporotic Fractures in Men (MrOS; N = 817) cohort.
SLAMSS achieves an overall accuracy of 79%, weighted F1 score of 0.80, 77% sensitivity, and 89% specificity for three-class sleep staging and an overall accuracy of 70-72%, weighted F1 score of 0.72-0.73, 64-66% sensitivity, and 89-90% specificity for four-class sleep staging in the MESA cohort. It yielded an overall accuracy of 77%, weighted F1 score of 0.77, 74% sensitivity, and 88% specificity for three-class sleep staging and an overall accuracy of 68-69%, weighted F1 score of 0.68-0.69, 60-63% sensitivity, and 88-89% specificity for four-class sleep staging in the MrOS cohort. These results were achieved with feature-poor inputs with a low temporal resolution. In addition, we extended our three-class staging model to an unrelated Apple Watch dataset. Importantly, SLAMSS predicts the duration of each sleep stage with high accuracy. This is especially significant for four-class sleep staging, where deep sleep is severely underrepresented. We show that, by appropriately choosing the loss function to address the inherent class imbalance, our method can accurately estimate deep sleep time (SLAMSS/MESA: 0.61±0.69 hours, PSG/MESA ground truth: 0.60±0.60 hours; SLAMSS/MrOS: 0.53±0.66 hours, PSG/MrOS ground truth: 0.55±0.57 hours). Deep sleep quality and quantity are vital metrics and early indicators for a number of diseases. Our method, which enables accurate deep sleep estimation from wearables-derived data, is therefore promising for a variety of clinical applications requiring long-term deep sleep monitoring.
Affiliation(s)
- Tzu-An Song
  - University of Massachusetts Amherst, Amherst, MA, United States of America
- Masoud Malekzadeh
  - University of Massachusetts Amherst, Amherst, MA, United States of America
- Stephanie Harrison
  - California Pacific Medical Center Research Institute, San Francisco, CA, United States of America
- Terri Blackwell Hoge
  - California Pacific Medical Center Research Institute, San Francisco, CA, United States of America
- Susan Redline
  - Brigham and Women’s Hospital, Boston, MA, United States of America
- Katie L. Stone
  - California Pacific Medical Center Research Institute, San Francisco, CA, United States of America
- Richa Saxena
  - Massachusetts General Hospital, Boston, MA, United States of America
- Shaun M. Purcell
  - Brigham and Women’s Hospital, Boston, MA, United States of America
- Joyita Dutta
  - University of Massachusetts Amherst, Amherst, MA, United States of America
26
Bakker JP, Ross M, Cerny A, Vasko R, Shaw E, Kuna S, Magalang UJ, Punjabi NM, Anderer P. Scoring sleep with artificial intelligence enables quantification of sleep stage ambiguity: hypnodensity based on multiple expert scorers and auto-scoring. Sleep 2023; 46:6628222. [PMID: 35780449 PMCID: PMC9905781 DOI: 10.1093/sleep/zsac154] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 03/11/2022] [Revised: 06/22/2022] [Indexed: 11/12/2022] Open
Abstract
STUDY OBJECTIVES To quantify the amount of sleep stage ambiguity across expert scorers and to validate a new auto-scoring platform against sleep staging performed by multiple scorers. METHODS We applied a new auto-scoring system to three datasets containing 95 PSGs scored by 6-12 scorers, to compare sleep stage probabilities (hypnodensity; i.e. the probability of each sleep stage being assigned to a given epoch) as the primary output, as well as a single sleep stage per epoch assigned by hierarchical majority rule. RESULTS The percentage of epochs with 100% agreement across scorers was 46 ± 9%, 38 ± 10% and 32 ± 9% for the datasets with 6, 9, and 12 scorers, respectively. The mean intra-class correlation coefficient between sleep stage probabilities from auto- and manual-scoring was 0.91, representing excellent reliability. Within each dataset, agreement between auto-scoring and consensus manual-scoring was significantly higher than agreement between manual-scoring and consensus manual-scoring (0.78 vs. 0.69; 0.74 vs. 0.67; and 0.75 vs. 0.67; all p < 0.01). CONCLUSIONS Analysis of scoring performed by multiple scorers reveals that sleep stage ambiguity is the rule rather than the exception. Probabilities of the sleep stages determined by artificial intelligence auto-scoring provide an excellent estimate of this ambiguity. Compared to consensus manual-scoring, sleep staging derived from auto-scoring is, for each individual PSG, noninferior to manual-scoring, meaning that auto-scoring output is ready for interpretation without the need for manual adjustment.
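A hypnodensity derived from several manual scorers, as defined in this abstract, is simply the per-epoch share of scorers who assigned each stage. The sketch below assumes the five standard stage labels and uses hypothetical function names and data; the hierarchical majority rule mentioned in the abstract is not reproduced because its tie-breaking order is not given here.

```python
from collections import Counter

STAGES = ("W", "N1", "N2", "N3", "REM")


def hypnodensity(votes):
    """Probability of each stage for one epoch, from the scorers' votes."""
    counts = Counter(votes)
    return {stage: counts[stage] / len(votes) for stage in STAGES}


def full_agreement_fraction(epochs):
    """Fraction of epochs on which all scorers assigned the same stage."""
    unanimous = sum(1 for votes in epochs if len(set(votes)) == 1)
    return unanimous / len(epochs)


# Hypothetical scores from six scorers for two epochs.
epochs = [
    ["N2", "N2", "N2", "N1", "N2", "N3"],  # ambiguous epoch
    ["W", "W", "W", "W", "W", "W"],        # unanimous epoch
]
print(hypnodensity(epochs[0]))
print(full_agreement_fraction(epochs))  # 0.5
```

The first epoch yields a distribution spread over N1, N2, and N3, which is the kind of ambiguity the abstract reports as common across scorers.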
Affiliation(s)
- Marco Ross
  - Philips Sleep and Respiratory Care, Vienna, Austria
- Ray Vasko
  - Philips Sleep and Respiratory Care, Pittsburgh, PA, USA
- Edmund Shaw
  - Philips Sleep and Respiratory Care, Pittsburgh, PA, USA
- Samuel Kuna
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Corporal Michael J. Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
- Ulysses J Magalang
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Ohio State University Wexner Medical Center, Columbus, OH, USA
- Naresh M Punjabi
  - Division of Pulmonary, Critical Care, and Sleep Medicine, University of Miami, Miami, FL, USA
27
Huijben IAM, Hermans LWA, Rossi AC, Overeem S, van Gilst MM, van Sloun RJG. Interpretation and further development of the hypnodensity representation of sleep structure. Physiol Meas 2023; 44. [PMID: 36595329 DOI: 10.1088/1361-6579/aca641] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/07/2022] [Accepted: 11/25/2022] [Indexed: 11/27/2022]
Abstract
Objective. The recently-introduced hypnodensity graph provides a probability distribution over sleep stages per data window (i.e. an epoch). This work explored whether this representation reveals continuities that can only be attributed to intra- and inter-rater disagreement of expert scorings, or also to co-occurrence of sleep stage-dependent features within one epoch. Approach. We proposed a simplified model for time series like the ones measured during sleep, and a second model to describe the annotation process by an expert. Generating data according to these models enabled controlled experiments to investigate the interpretation of the hypnodensity graph. Moreover, the influence of both the supervised training strategy and the softmax non-linearity used were investigated. Polysomnography recordings of 96 healthy sleepers (of which 11 were used as an independent test set) were subsequently used to transfer conclusions to real data. Main results. A hypnodensity graph, predicted by a supervised neural classifier, represents the probability with which the sleep expert(s) assigned a label to an epoch. It thus reflects annotator behavior, and is thereby only indirectly linked to the ratio of sleep stage-dependent features in the epoch. Unsupervised training was shown to result in hypnodensity graphs that were slightly less dependent on this annotation process, resulting in, on average, higher-entropy distributions over sleep stages (H_unsupervised = 0.41 versus H_supervised = 0.29). Moreover, pre-softmax predictions were, for both training strategies, found to better reflect the ratio of sleep stage-dependent characteristics in an epoch, as compared to the post-softmax counterparts (i.e. the hypnodensity graph).
In real data, this was observed from the linear relation between pre-softmax N3 predictions and the amount of delta power. Significance. This study provides insights into, and proposes new, representations of sleep that may enhance our comprehension about sleep and sleep disorders.
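The relationship between pre-softmax predictions, the softmax output, and the entropy of the resulting distribution can be made concrete with a small Python sketch. It assumes Shannon entropy with the natural logarithm, since the abstract does not state the base or any normalization; the function names and logit values are hypothetical.

```python
import math


def softmax(logits):
    """Map pre-softmax scores to a probability distribution."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats (natural log is an assumption)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical pre-softmax scores for W, N1, N2, N3, REM in one epoch.
logits = [0.2, 1.0, 2.5, 2.0, 0.1]
probs = softmax(logits)
print(probs, entropy(probs))

# Scaling the same logits sharpens the distribution and lowers its entropy.
sharp = softmax([3 * x for x in logits])
print(entropy(sharp) < entropy(probs))  # True
```

The second comparison illustrates why post-softmax probabilities can hide how close the underlying scores were: the exponential exaggerates differences between logits, which lowers the entropy of the hypnodensity.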
Affiliation(s)
- Iris A M Huijben
  - Dept. of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
  - Onera Health, 5617 BD Eindhoven, The Netherlands
- Lieke W A Hermans
  - Dept. of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
- Sebastiaan Overeem
  - Dept. of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
  - Sleep Medicine Center Kempenhaeghe, 5591 VE Heeze, The Netherlands
- Merel M van Gilst
  - Dept. of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
  - Sleep Medicine Center Kempenhaeghe, 5591 VE Heeze, The Netherlands
- Ruud J G van Sloun
  - Dept. of Electrical Engineering, Eindhoven University of Technology, 5612 AP Eindhoven, The Netherlands
28
Tsai CY, Liu WT, Hsu WH, Majumdar A, Stettler M, Lee KY, Cheng WH, Wu D, Lee HC, Kuan YC, Wu CJ, Lin YC, Ho SC. Screening the risk of obstructive sleep apnea by utilizing supervised learning techniques based on anthropometric features and snoring events. Digit Health 2023; 9:20552076231152751. [PMID: 36896329 PMCID: PMC9989412 DOI: 10.1177/20552076231152751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2022] [Accepted: 01/04/2023] [Indexed: 03/08/2023] Open
Abstract
Objectives Obstructive sleep apnea (OSA) is typically diagnosed by polysomnography (PSG). However, PSG is time-consuming and has some clinical limitations. This study thus aimed to establish machine learning models to screen for the risk of having moderate-to-severe and severe OSA based on easily acquired features. Methods We collected PSG data on 3529 patients from Taiwan and further derived the number of snoring events. Their baseline characteristics and anthropometric measures were obtained, and correlations among the collected variables were investigated. Next, six common supervised machine learning techniques were utilized, including random forest (RF), extreme gradient boosting (XGBoost), k-nearest neighbor (kNN), support vector machine (SVM), logistic regression (LR), and naïve Bayes (NB). First, data were independently separated into a training and validation dataset (80%) and a test dataset (20%). The approach with the highest accuracy in the training and validation phase was employed to classify the test dataset. Next, feature importance was investigated by calculating the Shapley value of every factor, which represented the impact on OSA risk screening. Results The RF produced the highest accuracy (of >70%) in the training and validation phase in screening for both OSA severities. Hence, we employed the RF to classify the test dataset, and results showed a 79.32% accuracy for moderate-to-severe OSA and 74.37% accuracy for severe OSA. Snoring events and the visceral fat level were the most and second most essential features of screening for OSA risk. Conclusions The established model can be considered for screening for the risk of having moderate-to-severe or severe OSA.
Affiliation(s)
- Cheng-Yu Tsai
  - Department of Civil and Environmental Engineering, Imperial College London, London, UK
- Wen-Te Liu
  - School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei, Taiwan
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
  - Sleep Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
  - Research Center of Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan
- Wen-Hua Hsu
  - School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Arnab Majumdar
  - Department of Civil and Environmental Engineering, Imperial College London, London, UK
- Marc Stettler
  - Department of Civil and Environmental Engineering, Imperial College London, London, UK
- Kang-Yun Lee
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
  - Division of Pulmonary Medicine, Department of Internal Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Wun-Hao Cheng
  - Graduate Institute of Clinical Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Dean Wu
  - Sleep Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
  - Department of Neurology, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
  - Department of Neurology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
  - Taipei Neuroscience Institute, Taipei Medical University, Taipei, Taiwan
  - Dementia Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
- Hsin-Chien Lee
  - Department of Psychiatry, Taipei Medical University Hospital, Taipei, Taiwan
- Yi-Chun Kuan
  - Sleep Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
  - Department of Neurology, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
  - Department of Neurology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
  - Taipei Neuroscience Institute, Taipei Medical University, Taipei, Taiwan
  - Dementia Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
- Cheng-Jung Wu
  - Department of Otolaryngology, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
- Yi-Chih Lin
  - Department of Otolaryngology, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
- Shu-Chuan Ho
  - School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei, Taiwan
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
29
Hanna J, Flöel A. An accessible and versatile deep learning-based sleep stage classifier. Front Neuroinform 2023; 17:1086634. [PMID: 36938361 PMCID: PMC10017438 DOI: 10.3389/fninf.2023.1086634] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/01/2022] [Accepted: 02/13/2023] [Indexed: 03/06/2023] Open
Abstract
Manual sleep scoring for research purposes and for the diagnosis of sleep disorders is labor-intensive and often varies significantly between scorers, which has motivated many attempts to design automatic sleep stage classifiers. With the recent introduction of large, publicly available hand-scored polysomnographic data, and concomitant advances in machine learning methods to solve complex classification problems with supervised learning, the problem has received new attention, and a number of new classifiers provide excellent accuracy. Most of these, however, have non-trivial barriers to use. We introduce the Greifswald Sleep Stage Classifier (GSSC), which is free, open source, and can be relatively easily installed and used on any moderately powered computer. In addition, the GSSC has been trained to perform well on a large variety of electrode set-ups, allowing high performance sleep staging with portable systems. The GSSC can also be readily integrated into brain-computer interfaces for real-time inference. These innovations were achieved while simultaneously reaching a level of accuracy equal to, or exceeding, recent state of the art classifiers and human experts, making the GSSC an excellent choice for researchers in need of reliable, automatic sleep staging.
Affiliation(s)
- Jevri Hanna
  - Greifswald University Hospital, Greifswald, Germany
  - *Correspondence: Jevri Hanna,
- Agnes Flöel
  - Greifswald University Hospital, Greifswald, Germany
  - German Center for Neurodegenerative Diseases, Standort Rostock/Greifswald, Greifswald, Germany
30
Choo BP, Mok Y, Oh HC, Patanaik A, Kishan K, Awasthi A, Biju S, Bhattacharjee S, Poh Y, Wong HS. Benchmarking performance of an automatic polysomnography scoring system in a population with suspected sleep disorders. Front Neurol 2023; 14:1123935. [PMID: 36873452 PMCID: PMC9981786 DOI: 10.3389/fneur.2023.1123935] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 12/14/2022] [Accepted: 01/16/2023] [Indexed: 02/19/2023] Open
Abstract
Aim The current gold standard for measuring sleep disorders is polysomnography (PSG), which is manually scored by a sleep technologist. Scoring a PSG is time-consuming and tedious, with substantial inter-rater variability. A deep-learning-based sleep analysis software module can perform autoscoring of PSG. The primary objective of the study is to validate the accuracy and reliability of the autoscoring software. The secondary objective is to measure workflow improvements in terms of time and cost via a time motion study. Methodology The performance of an automatic PSG scoring software was benchmarked against the performance of two independent sleep technologists on PSG data collected from patients with suspected sleep disorders. The technologists at the hospital clinic and a third-party scoring company scored the PSG records independently. The scores were then compared between the technologists and the automatic scoring system. An observational study was also performed where the time taken for sleep technologists at the hospital clinic to manually score PSGs was tracked, along with the time taken by the automatic scoring software to assess for potential time savings. Results Pearson's correlation between the manually scored apnea-hypopnea index (AHI) and the automatically scored AHI was 0.962, demonstrating a near-perfect agreement. The autoscoring system demonstrated similar results in sleep staging. The agreement between automatic staging and manual scoring was higher in terms of accuracy and Cohen's kappa than the agreement between experts. The autoscoring system took an average of 42.7 s to score each record compared with 4,243 s for manual scoring. Following a manual review of the auto scores, an average time savings of 38.6 min per PSG was observed, amounting to 0.25 full-time equivalent (FTE) savings per year. 
Conclusion The findings indicate a potential for a reduction in the burden of manual scoring of PSGs by sleep technologists and may be of operational significance for sleep laboratories in the healthcare setting.
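Cohen's kappa, used in this abstract to compare automatic and manual staging, corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of the standard two-rater formula; the function name and the toy hypnograms are hypothetical and unrelated to the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length label sequences.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e the agreement expected by chance from each rater's label frequencies.
    Undefined (division by zero) when both raters use one identical label
    for every epoch.
    """
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Toy hypnograms (hypothetical): manual versus automatic staging of six epochs.
manual = ["W", "W", "N2", "N2", "REM", "REM"]
auto = ["W", "N2", "N2", "N2", "REM", "W"]
print(cohens_kappa(manual, auto))
```

Here the raters agree on four of six epochs (about 67%), but one third of that agreement is expected by chance, so kappa comes out near 0.5.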
Affiliation(s)
- Bryan Peide Choo
  - Health Services Research, Changi General Hospital, Singapore, Singapore
- Yingjuan Mok
  - Department of Respiratory and Critical Care Medicine, Changi General Hospital, Singapore, Singapore; Department of Sleep Medicine, Surgery and Science, Changi General Hospital, Singapore, Singapore
- Hong Choon Oh
  - Health Services Research, Changi General Hospital, Singapore, Singapore; Duke-NUS Medical School, Singapore, Singapore; Centre for Population Health Research and Implementation, SingHealth Office of Regional Health, Singapore, Singapore
- Animesh Awasthi
  - Department of Biotechnology, Indian Institute of Technology, Kharagpur, India
- Siddharth Biju
  - Department of Biotechnology, Indian Institute of Technology, Kharagpur, India
- Soumya Bhattacharjee
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru, India
- Yvonne Poh
  - Department of Sleep Medicine, Surgery and Science, Changi General Hospital, Singapore, Singapore
- Hang Siang Wong
  - Department of Respiratory and Critical Care Medicine, Changi General Hospital, Singapore, Singapore; Department of Sleep Medicine, Surgery and Science, Changi General Hospital, Singapore, Singapore
31
Tsai CY, Huang HT, Cheng HC, Wang J, Duh PJ, Hsu WH, Stettler M, Kuan YC, Lin YT, Hsu CR, Lee KY, Kang JH, Wu D, Lee HC, Wu CJ, Majumdar A, Liu WT. Screening for Obstructive Sleep Apnea Risk by Using Machine Learning Approaches and Anthropometric Features. Sensors (Basel) 2022; 22:8630. [PMID: 36433227 PMCID: PMC9694257 DOI: 10.3390/s22228630] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/25/2022] [Revised: 10/26/2022] [Accepted: 11/05/2022] [Indexed: 05/14/2023]
Abstract
Obstructive sleep apnea (OSA) is a global health concern and is typically diagnosed using in-laboratory polysomnography (PSG). However, PSG is highly time-consuming and labor-intensive. We, therefore, developed machine learning models based on easily accessed anthropometric features to screen for the risk of moderate to severe and severe OSA. We enrolled 3503 patients from Taiwan and determined their PSG parameters and anthropometric features. Subsequently, we compared the mean values among patients with different OSA severity and considered correlations among all participants. We developed models based on the following machine learning approaches: logistic regression, k-nearest neighbors, naïve Bayes, random forest (RF), support vector machine, and XGBoost. Collected data were first independently split into two data sets (training and validation: 80%; testing: 20%). Thereafter, we adopted the model with the highest accuracy in the training and validation stage to predict the testing set. We explored the importance of each feature in the OSA risk screening by calculating the Shapley values of each input variable. The RF model achieved the highest accuracy for moderate to severe (84.74%) and severe (72.61%) OSA. The level of visceral fat was found to be a predominant feature in the risk screening models of OSA with the aforementioned levels of severity. Our machine learning models can be employed to screen for OSA risk in the populations in Taiwan and in those with similar craniofacial structures.
Affiliation(s)
- Cheng-Yu Tsai
  - Centre for Transport Studies, Department of Civil and Environmental Engineering, Imperial College London, London SW7 2AZ, UK
- Huei-Tyng Huang
  - Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, UK
- Hsueh-Chien Cheng
  - Parasites and Microbes Programme, Wellcome Sanger Institute, Hinxton CB10 1RQ, UK
- Jieni Wang
  - Chemical Engineering and Biotechnology, University of Cambridge, Cambridge CB3 0AS, UK
- Ping-Jung Duh
  - Cognitive Neuroscience, Division of Psychology and Language Science, University College London, London WC1H 0AP, UK
- Wen-Hua Hsu
  - School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei 110301, Taiwan
- Marc Stettler
  - Centre for Transport Studies, Department of Civil and Environmental Engineering, Imperial College London, London SW7 2AZ, UK
- Yi-Chun Kuan
  - Sleep Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
  - Department of Neurology, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
  - Department of Neurology, School of Medicine, College of Medicine, Taipei Medical University, Taipei 110301, Taiwan
  - Taipei Neuroscience Institute, Taipei Medical University, Taipei 110301, Taiwan
  - Dementia Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
- Yin-Tzu Lin
  - Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital at Linkou, Taoyuan 33305, Taiwan
- Chia-Rung Hsu
  - Department of Neurology, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
- Kang-Yun Lee
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
- Jiunn-Horng Kang
  - Department of Physical Medicine and Rehabilitation, Taipei Medical University Hospital, Taipei 110301, Taiwan
  - Research Center of Artificial Intelligence in Medicine, Taipei Medical University, Taipei 110301, Taiwan
  - Graduate Institute of Nanomedicine and Medical Engineering, College of Biomedical Engineering, Taipei Medical University, Taipei 110301, Taiwan
- Dean Wu
  - Sleep Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
  - Department of Neurology, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
  - Department of Neurology, School of Medicine, College of Medicine, Taipei Medical University, Taipei 110301, Taiwan
  - Taipei Neuroscience Institute, Taipei Medical University, Taipei 110301, Taiwan
  - Dementia Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
- Hsin-Chien Lee
  - Department of Psychiatry, Taipei Medical University Hospital, Taipei 110301, Taiwan
- Cheng-Jung Wu
  - Department of Otolaryngology, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
- Arnab Majumdar
  - Centre for Transport Studies, Department of Civil and Environmental Engineering, Imperial College London, London SW7 2AZ, UK
  - Correspondence: (A.M.); (W.-T.L.)
- Wen-Te Liu
  - School of Respiratory Therapy, College of Medicine, Taipei Medical University, Taipei 110301, Taiwan
  - Sleep Center, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
  - Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235041, Taiwan
  - Research Center of Artificial Intelligence in Medicine, Taipei Medical University, Taipei 110301, Taiwan
  - Correspondence: (A.M.); (W.-T.L.)
32
Alvarez-Estevez D, Rijsman RM. Computer-assisted analysis of polysomnographic recordings improves inter-scorer associated agreement and scoring times. PLoS One 2022; 17:e0275530. [PMID: 36174095 PMCID: PMC9522290 DOI: 10.1371/journal.pone.0275530] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/19/2022] [Accepted: 09/19/2022] [Indexed: 11/18/2022] [Open Access]
Abstract
STUDY OBJECTIVES To investigate inter-scorer agreement and scoring time differences associated with visual and computer-assisted analysis of polysomnographic (PSG) recordings. METHODS A group of 12 expert scorers reviewed 5 PSGs that were independently selected in the context of each of the following tasks: (i) sleep staging, (ii) scoring of leg movements, (iii) detection of respiratory (apneic-related) events, and (iv) detection of electroencephalographic (EEG) arousals. All scorers independently reviewed the same recordings, resulting in 20 scoring exercises per scorer from an equal number of different subjects. The procedure was repeated, separately, using the classical visual manual approach and a computer-assisted (semi-automatic) procedure. The resulting inter-scorer agreement and scoring times were examined and compared between the two methods. RESULTS Computer-assisted sleep scoring showed a consistent and statistically relevant effect toward less time required for the completion of each of the PSG scoring tasks. Gain factors ranged from 1.26 (EEG arousals) to 2.41 (leg movements). Inter-scorer kappa agreement also increased consistently with the use of supervised semi-automatic scoring. Specifically, agreement increased from κ = 0.76 to κ = 0.80 (sleep stages), κ = 0.72 to κ = 0.91 (leg movements), κ = 0.55 to κ = 0.66 (respiratory events), and κ = 0.58 to κ = 0.65 (EEG arousals). Inter-scorer agreement on the examined set of diagnostic indices also showed a trend toward higher Interclass Correlation Coefficient scores when using the semi-automatic scoring approach. CONCLUSIONS Computer-assisted analysis can improve inter-scorer agreement and scoring times associated with the review of PSG studies, resulting in higher efficiency and overall quality in the diagnosis of sleep disorders.
Affiliation(s)
- Diego Alvarez-Estevez
  - Center for Information and Communications Technology Research (CITIC), Universidade da Coruña, A Coruña, Spain
- Roselyne M. Rijsman
  - Sleep Center and Clinical Neurophysiology Department, Haaglanden Medisch Centrum, The Hague, The Netherlands
33
Kamon M, Okada S, Furuta M, Yoshida K. Development of a non-contact sleep monitoring system for children. Front Digit Health 2022; 4:877234. [PMID: 36003190 PMCID: PMC9393414 DOI: 10.3389/fdgth.2022.877234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Accepted: 07/19/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
Daily monitoring is important, even for healthy children, because sleep plays a critical role in their development and growth. Polysomnography is necessary for sleep monitoring. However, measuring sleep requires specialized equipment and knowledge and is difficult to do at home. In recent years, smartwatches and other devices have been developed to easily measure sleep. However, they cannot measure children's sleep, and contact devices may disturb their sleep. A non-contact method of measuring sleep is the use of video during sleep. This is most suitable for the daily monitoring of children's sleep, as it is simple and inexpensive. However, the algorithms have been developed only based on adult sleep, whereas children's sleep is known to differ considerably from that of adults. For this reason, we conducted a non-contact estimation of sleep stages for children using video. The participants were children between the ages of 0–6 years old. We estimated the four stages of sleep using the body movement information calculated from the videos recorded. Six parameters were calculated from body movement information. As children's sleep is known to change significantly as they grow, estimation was divided into two groups (0–2 and 3–6 years). The results show average estimation accuracies of 46.7 ± 6.6 and 49.0 ± 4.8% and kappa coefficients of 0.24 ± 0.11 and 0.28 ± 0.06 in the age groups of 0–2 and 3–6 years, respectively. This performance is comparable to or better than that reported in previous adult studies.
Affiliation(s)
- Masamitsu Kamon
  - Department of Robotics, Ritsumeikan University, Shiga, Japan
  - Correspondence: Masamitsu Kamon
- Shima Okada
  - Department of Robotics, Ritsumeikan University, Shiga, Japan
- Masafumi Furuta
  - Technology Research Laboratory, Shimadzu Corporation, Kyoto, Japan
- Koki Yoshida
  - Technology Research Laboratory, Shimadzu Corporation, Kyoto, Japan
34
von Ellenrieder N, Peter-Derex L, Gotman J, Frauscher B. SleepSEEG: Automatic sleep scoring using intracranial EEG recordings only. J Neural Eng 2022; 19. [PMID: 35439736 DOI: 10.1088/1741-2552/ac6829] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/22/2021] [Accepted: 04/18/2022] [Indexed: 11/11/2022]
Abstract
OBJECTIVE To perform automatic sleep scoring based only on intracranial EEG, without the need for scalp electroencephalography (EEG), electrooculography (EOG) and electromyography (EMG), in order to study sleep, epilepsy, and their interaction. APPROACH Data from 33 adult patients was used for development and training of the automatic scoring algorithm using both oscillatory and non-oscillatory spectral features. The first step consisted in unsupervised clustering of channels based on feature variability. For each cluster the classification was done in two steps, a multiclass tree followed by binary classification trees to distinguish the more challenging stage N1. The test data consisted in 11 patients, in whom the classification was done independently for each channel and then combined to get a single stage per epoch. MAIN RESULTS An overall agreement of 78% was observed in the test set between the sleep scoring of the algorithm and two human experts scoring based on scalp EEG, EOG and EMG. Balanced sensitivity and specificity were obtained for the different sleep stages. The performance was excellent for stages W, N2, and N3, and good for stage R, but with high variability across patients. The performance for the challenging stage N1 was poor, but at a similar level as for published algorithms based on scalp EEG. High confidence epochs in different stages (other than N1) can be identified with median per patient specificity >80%. SIGNIFICANCE The automatic algorithm can perform sleep scoring of long term recordings of patients with intracranial electrodes undergoing presurgical evaluation in the absence of scalp EEG, EOG and EMG, which are normally required to define sleep stages but are difficult to use in the context of intracerebral studies. It also constitutes a valuable tool to generate hypotheses regarding local aspects of sleep, and will be significant for sleep evaluation in clinical epileptology and neuroscience research.
Affiliation(s)
- Nicolás von Ellenrieder
  - Montreal Neurological Institute and Hospital, McGill University, 3801 University Street, Montreal, Quebec, H3A 2B4, CANADA
- Laure Peter-Derex
  - PAM Team, Centre de Recherche en Neurosciences de Lyon, 95 Boulevard Pinel, Lyon, Rhône-Alpes, 69675 BRON, FRANCE
- Jean Gotman
  - Montreal Neurological Institute and Hospital, McGill University, 3801 University St, Montreal, Quebec, H3A 2B4, CANADA
- Birgit Frauscher
  - Montreal Neurological Institute and Hospital, McGill University, 3801 University Street, Montreal, Quebec, H3A 2B4, CANADA
35
Lok R, Chawra D, Hon F, Ha M, Kaplan KA, Zeitzer JM. Objective underpinnings of self-reported sleep quality in middle-aged and older adults: The importance of N2 and wakefulness. Biol Psychol 2022; 170:108290. [PMID: 35192907 PMCID: PMC9038649 DOI: 10.1016/j.biopsycho.2022.108290] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/09/2021] [Revised: 02/16/2022] [Accepted: 02/17/2022] [Indexed: 12/16/2022]
Abstract
STUDY OBJECTIVES The measurable aspects of brain function (polysomnography, PSG) that are correlated with sleep satisfaction are poorly understood. Using recent developments in automated sleep scoring, which remove the within- and between-rater error associated with human scoring, we examine whether PSG measures are associated with sleep satisfaction. DESIGN AND SETTING A single night of PSG data was compared to contemporaneously collected measures of sleep satisfaction with Random Forest regressions. Whole and partial night PSG data were scored using a novel machine learning algorithm. PARTICIPANTS Community-dwelling adults (N = 3165) who participated in the Sleep Heart Health Study. INTERVENTIONS None. MEASUREMENTS AND RESULTS Models explained 30% of sleep depth and 27% of sleep restfulness, with a similar top four predictors: minutes of N2 sleep, sleep efficiency, age, and minutes of wake after sleep onset (WASO). With increasing self-reported sleep quality, there was a progressive increase in N2 and decrease in WASO of similar magnitude, without systematic changes in N1, N3 or REM sleep. In comparing those with the best and worst self-reported sleep satisfaction, there was a range of approximately 30 min more N2, 30 min less WASO, an improvement of sleep efficiency of 7-8%, and an age span of 3-5 years. Examination of sleep most proximal to morning awakening revealed no greater explanatory power than the whole-night data set. CONCLUSIONS Higher N2 and concomitant lower wake is associated with improved sleep satisfaction. Interventions that specifically target these may be suitable for improving the self-reported sleep experience.
Affiliation(s)
- Renske Lok
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA
- Dwijen Chawra
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA
- Flora Hon
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; College of Literature, Science, and The Arts, University of Michigan, Ann Arbor, MI 48109, USA
- Michelle Ha
  - Department of Mathematics and Statistics, San Jose State University, San Jose, CA 95112, USA
- Katherine A Kaplan
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA
- Jamie M Zeitzer
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA; Mental Illness Research Education and Clinical Center, VA Palo Alto Health Care System, Palo Alto, CA 94304, USA
36
Kelly JL, Ben Messaoud R, Joyeux-Faure M, Terrail R, Tamisier R, Martinot JB, Le-Dong NN, Morrell MJ, Pépin JL. Diagnosis of Sleep Apnoea Using a Mandibular Monitor and Machine Learning Analysis: One-Night Agreement Compared to in-Home Polysomnography. Front Neurosci 2022; 16:726880. [PMID: 35368281 PMCID: PMC8965001 DOI: 10.3389/fnins.2022.726880] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/17/2021] [Accepted: 02/22/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
BACKGROUND The capacity to diagnose obstructive sleep apnoea (OSA) must be expanded to meet an estimated disease burden of nearly one billion people worldwide. Validated alternatives to the gold standard polysomnography (PSG) will improve access to testing and treatment. This study aimed to evaluate the diagnosis of OSA, using measurements of mandibular movement (MM) combined with automated machine learning analysis, compared to in-home PSG. METHODS 40 suspected OSA patients underwent single overnight in-home sleep testing with PSG (Nox A1, ResMed, Australia) and simultaneous MM monitoring (Sunrise, Sunrise SA, Belgium). PSG recordings were manually analysed by two expert sleep centres (Grenoble and London); MM analysis was automated. The Obstructive Respiratory Disturbance Index calculated from the MM monitoring (MM-ORDI) was compared to the PSG (PSG-ORDI) using intraclass correlation coefficient and Bland-Altman analysis. Receiver operating characteristic curves (ROC) were constructed to optimise the diagnostic performance of the MM monitor at different PSG-ORDI thresholds (5, 15, and 30 events/hour). RESULTS 31 patients were included in the analysis (58% men; mean (SD) age: 48 (15) years; BMI: 30.4 (7.6) kg/m²). Good agreement was observed between MM-ORDI and PSG-ORDI (median bias 0.00; 95% CI −23.25 to +9.73 events/hour). However, for 15 patients with no or mild OSA, MM monitoring overestimated disease severity (PSG-ORDI < 5: MM-ORDI mean overestimation +5.58 (95% CI +2.03 to +7.46) events/hour; PSG-ORDI > 5–15: MM-ORDI overestimation +3.70 (95% CI −0.53 to +18.32) events/hour). In 16 patients with moderate-severe OSA (n = 9 with PSG-ORDI 15–30 events/h and n = 7 with a PSG-ORDI > 30 events/h), there was an underestimation (PSG-ORDI > 15: MM-ORDI underestimation −8.70 (95% CI −28.46 to +4.01) events/hour). ROC optimal cut-off values for PSG-ORDI thresholds of 5, 15, 30 events/hour were: 9.53, 12.65 and 24.81 events/hour, respectively. These cut-off values yielded a sensitivity of 88, 100 and 79%, and a specificity of 100, 75, 96%. The positive predictive values were: 100, 80, 95% and the negative predictive values 89, 100, 82%, respectively. CONCLUSION The diagnosis of OSA, using MM with machine learning analysis, is comparable to manually scored in-home PSG. Therefore, this novel monitor could be a convenient diagnostic tool that can easily be used in the patients' own home. CLINICAL TRIAL REGISTRATION https://clinicaltrials.gov, identifier NCT04262557
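The sensitivity, specificity and predictive values reported above all derive from a 2×2 table of monitor result against PSG result at a chosen threshold. A minimal Python sketch of those four definitions follows. The class and function names and the example counts are hypothetical; the counts are not taken from the study, and the abstract does not report its per-threshold tables.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a 2x2 table of test result versus reference standard."""
    tp: int  # test positive, reference positive
    fp: int  # test positive, reference negative
    tn: int  # test negative, reference negative
    fn: int  # test negative, reference positive


def diagnostic_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV as fractions.

    No guard against empty rows or columns: a zero denominator
    raises ZeroDivisionError.
    """
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# HYPOTHETICAL counts for illustration only (not data from the study).
example = ConfusionCounts(tp=14, fp=3, tn=12, fn=2)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With samples as small as the 31 patients analysed here, a single reclassified patient moves each of these percentages by several points, which is worth bearing in mind when comparing thresholds.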
Affiliation(s)
- Julia L. Kelly
  - National Heart and Lung Institute, Imperial College London, Royal Brompton Hospital, London, United Kingdom
- Raoua Ben Messaoud
  - HP2 Laboratory, Inserm U1300, Grenoble Alpes University, Grenoble, France
- Marie Joyeux-Faure
  - HP2 Laboratory, Inserm U1300, Grenoble Alpes University, Grenoble, France
  - EFCR Laboratory, Thorax and Vessels division, Grenoble Alpes University Hospital, Grenoble, France
- Robin Terrail
  - HP2 Laboratory, Inserm U1300, Grenoble Alpes University, Grenoble, France
  - EFCR Laboratory, Thorax and Vessels division, Grenoble Alpes University Hospital, Grenoble, France
- Renaud Tamisier
  - HP2 Laboratory, Inserm U1300, Grenoble Alpes University, Grenoble, France
  - EFCR Laboratory, Thorax and Vessels division, Grenoble Alpes University Hospital, Grenoble, France
- Jean-Benoît Martinot
  - Sleep Laboratory, CHU Université catholique de Louvain (UCL) Namur Site Sainte-Elisabeth, Namur, Belgium
  - Institute of Experimental and Clinical Research, UCL Bruxelles Woluwe, Brussels, Belgium
- Mary J. Morrell
  - National Heart and Lung Institute, Imperial College London, Royal Brompton Hospital, London, United Kingdom
- Jean-Louis Pépin
  - HP2 Laboratory, Inserm U1300, Grenoble Alpes University, Grenoble, France
  - EFCR Laboratory, Thorax and Vessels division, Grenoble Alpes University Hospital, Grenoble, France
  - Correspondence: Jean-Louis Pépin
37
Ricci A, Calhoun SL, He F, Fang J, Vgontzas AN, Liao D, Bixler EO, Younes M, Fernandez-Mendoza J. Association of a novel EEG metric of sleep depth/intensity with attention-deficit/hyperactivity, learning, and internalizing disorders and their pharmacotherapy in adolescence. Sleep 2022; 45:zsab287. [PMID: 34888687 PMCID: PMC8919202 DOI: 10.1093/sleep/zsab287] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/22/2021] [Revised: 11/17/2021] [Indexed: 01/08/2023] [Open Access]
Abstract
STUDY OBJECTIVES Psychiatric/learning disorders are associated with sleep disturbances, including those arising from abnormal cortical activity. The odds ratio product (ORP) is a standardized electroencephalogram metric of sleep depth/intensity validated in adults, while ORP data in youth are lacking. We tested ORP as a measure of sleep depth/intensity in adolescents with and without psychiatric/learning disorders. METHODS Four hundred eighteen adolescents (median 16 years) underwent a 9-hour, in-lab polysomnography. Of them, 263 were typically developing (TD), 89 were unmedicated, and 66 were medicated for disorders including attention-deficit/hyperactivity (ADHD), learning (LD), and internalizing (ID). Central ORP during non-rapid eye movement (NREM) sleep was the primary outcome. Secondary/exploratory outcomes included central and frontal ORP during NREM stages, in the 9-seconds following arousals (ORP-9), in the first and second halves of the night, during REM sleep and wakefulness. RESULTS Unmedicated youth with ADHD/LD had greater central ORP than TD during stage 3 and in central and frontal regions during stage 2 and the second half of the sleep period, while ORP in youth with ADHD/LD on stimulants did not significantly differ from TD. Unmedicated youth with ID did not significantly differ from TD in ORP, while youth with ID on antidepressants had greater central and frontal ORP than TD during NREM and REM sleep, and higher ORP-9. CONCLUSIONS The greater ORP in unmedicated youth with ADHD/LD, and normalized levels in those on stimulants, suggests ORP is a useful metric of decreased NREM sleep depth/intensity in ADHD/LD. Antidepressants are associated with greater ORP/ORP-9, suggesting these medications induce cortical arousability.
Affiliation(s)
- Anna Ricci
  - Sleep Research and Treatment Center, Department of Psychiatry and Behavioral Health, Penn State College of Medicine, Hershey, PA, USA
- Susan L Calhoun
  - Sleep Research and Treatment Center, Department of Psychiatry and Behavioral Health, Penn State College of Medicine, Hershey, PA, USA
- Fan He
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Jidong Fang
  - Sleep Research and Treatment Center, Department of Psychiatry and Behavioral Health, Penn State College of Medicine, Hershey, PA, USA
- Alexandros N Vgontzas
  - Sleep Research and Treatment Center, Department of Psychiatry and Behavioral Health, Penn State College of Medicine, Hershey, PA, USA
- Duanping Liao
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Edward O Bixler
  - Sleep Research and Treatment Center, Department of Psychiatry and Behavioral Health, Penn State College of Medicine, Hershey, PA, USA
- Magdy Younes
  - Sleep Disorders Centre, Department of Medicine, University of Manitoba, Winnipeg, Manitoba, Canada
- Julio Fernandez-Mendoza
  - Sleep Research and Treatment Center, Department of Psychiatry and Behavioral Health, Penn State College of Medicine, Hershey, PA, USA
38
Lee YJ, Lee JY, Cho JH, Choi JH. Interrater reliability of sleep stage scoring: a meta-analysis. J Clin Sleep Med 2022; 18:193-202. [PMID: 34310277 PMCID: PMC8807917 DOI: 10.5664/jcsm.9538] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 03/31/2021] [Revised: 07/02/2021] [Accepted: 07/02/2021] [Indexed: 01/03/2023]
Abstract
STUDY OBJECTIVES We evaluated the interrater reliabilities of manual polysomnography sleep stage scoring. We included all studies that employed Rechtschaffen and Kales rules or American Academy of Sleep Medicine standards. We sought the overall degree of agreement and those for each stage. METHODS The keywords were "Polysomnography (PSG)," "sleep staging," "Rechtschaffen and Kales (R&K)," "American Academy of Sleep Medicine (AASM)," "interrater (interscorer) reliability," and "Cohen's kappa." We searched PubMed, OVID Medline, EMBASE, the Cochrane library, KoreaMed, KISS, and the MedRIC. The exclusion criteria included automatic scoring and pediatric patients. We collected data on scorer histories, scoring rules, numbers of epochs scored, and the underlying diseases of the patients. RESULTS A total of 101 publications were retrieved; 11 satisfied the selection criteria. The Cohen's kappa for manual, overall sleep scoring was 0.76, indicating substantial agreement (95% confidence interval, 0.71-0.81; P < .001). By sleep stage, the figures were 0.70, 0.24, 0.57, 0.57, and 0.69 for the W, N1, N2, N3, and R stages, respectively. The interrater reliabilities for stage N2 and N3 sleep were moderate, and that for stage N1 sleep was only fair. CONCLUSIONS We conducted a meta-analysis to generalize the variation in manual scoring of polysomnography and provide reference data for automatic sleep stage scoring systems. The reliability of manual scorers of polysomnography sleep stages was substantial. However, for certain stages, the results were poor; validity requires improvement. CITATION Lee YJ, Lee JY, Cho JH, Choi JH. Interrater reliability of sleep stage scoring: a meta-analysis. J Clin Sleep Med. 2022;18(1):193-202.
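Cohen's kappa, the agreement statistic pooled in this meta-analysis, corrects raw agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below computes it for two scorers in plain Python. The epoch labels are hypothetical. The verbal bands follow the Landis and Koch convention, which is assumed here because it matches the abstract's use of "substantial", "moderate" and "fair"; the abstract does not name its scale.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labelling the same epochs."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must label the same non-empty set of epochs")
    n = len(rater_a)
    # Observed agreement: fraction of epochs with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical label")
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa: float) -> str:
    """Verbal band for a kappa value (Landis and Koch convention, assumed)."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# HYPOTHETICAL stage labels for ten 30-second epochs from two scorers.
scorer_1 = ["W", "W", "N1", "N2", "N2", "N2", "N3", "N3", "R", "R"]
scorer_2 = ["W", "N1", "N1", "N2", "N2", "N3", "N3", "N3", "R", "W"]

kappa = cohens_kappa(scorer_1, scorer_2)
print(f"kappa = {kappa:.3f} ({landis_koch_label(kappa)})")
```

In this toy example the scorers agree on 7 of 10 epochs, chance agreement is 0.20, and kappa is 0.625. Raw percent agreement therefore overstates reliability when a few stages dominate the night.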
Affiliation(s)
- Yun Ji Lee
  - Department of Otorhinolaryngology—Head and Neck Surgery, College of Medicine, Soonchunhyang University, Bucheon Hospital, Bucheon, Korea
- Jae Yong Lee
  - Department of Otorhinolaryngology—Head and Neck Surgery, College of Medicine, Soonchunhyang University, Bucheon Hospital, Bucheon, Korea
- Jae Hoon Cho
  - Department of Otorhinolaryngology—Head and Neck Surgery, College of Medicine, Konkuk University, Seoul, Korea
- Ji Ho Choi
  - Department of Otorhinolaryngology—Head and Neck Surgery, College of Medicine, Soonchunhyang University, Bucheon Hospital, Bucheon, Korea
39
Anderer P, Ross M, Cerny A, Shaw E. Automated Scoring of Sleep and Associated Events. Adv Exp Med Biol 2022; 1384:107-130. [PMID: 36217081 DOI: 10.1007/978-3-031-06413-5_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Conventionally, sleep and associated events are scored visually by trained technologists according to the rules summarized in the American Academy of Sleep Medicine Manual. Since its first publication in 2007, the manual has been continuously updated; the most recent version as of this writing was published in 2020. Human expert scoring is considered the gold standard, even though there is increasing evidence of limited interrater reliability between human scorers. Significant advances in machine learning have resulted in powerful methods for addressing complex classification problems such as automated scoring of sleep and associated events. Evidence is increasing that these autoscoring systems deliver performance comparable to manual scoring and offer several advantages over visual scoring: (1) avoidance of the rather expensive, time-consuming, and difficult visual scoring task that can be performed only by well-trained and experienced human scorers, (2) attainment of consistent scoring results, and (3) proposition of added value such as scoring in real time, sleep stage probabilities per epoch (hypnodensity), estimates of signal quality and sleep/wake-related features, identification of periods with clinically relevant ambiguities (confidence trends), configurable sensitivity and rule settings, as well as cardiorespiratory sleep staging for home sleep apnea testing. This chapter describes the development of autoscoring systems from the first attempts in the 1970s up to the most recent solutions based on deep neural network approaches, which achieve an accuracy that allows the autoscoring results to be used directly for review and interpretation by a physician.
Affiliation(s)
- Peter Anderer
  - Philips Sleep and Respiratory Care, Vienna, Austria
  - The Siesta Group Schlafanalyse GmbH, Vienna, Austria
- Marco Ross
  - Philips Sleep and Respiratory Care, Vienna, Austria
- Edmund Shaw
  - Philips Sleep and Respiratory Care, Pittsburgh, PA, USA
40
|
Skovgaard EL, Pedersen J, Møller NC, Grøntved A, Brønd JC. Manual Annotation of Time in Bed Using Free-Living Recordings of Accelerometry Data. SENSORS 2021; 21:s21248442. [PMID: 34960533 PMCID: PMC8707394 DOI: 10.3390/s21248442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Revised: 12/07/2021] [Accepted: 12/14/2021] [Indexed: 12/02/2022]
Abstract
With the emergence of machine learning for the classification of sleep and other human behaviors from accelerometer data, the need for correctly annotated data is higher than ever. We present and evaluate a novel method for the manual annotation of in-bed periods in accelerometer data using the open-source software Audacity®, and we compare the method to the EEG-based sleep monitoring device Zmachine® Insight+ and self-reported sleep diaries. For evaluating the manual annotation method, we calculated the inter- and intra-rater agreement and agreement with Zmachine and sleep diaries using interclass correlation coefficients and Bland–Altman analysis. Our results showed excellent inter- and intra-rater agreement and excellent agreement with Zmachine and sleep diaries. The Bland–Altman limits of agreement were generally around ±30 min for the comparison between the manual annotation and the Zmachine timestamps for the in-bed period. Moreover, the mean bias was minuscule. We conclude that the manual annotation method presented is a viable option for annotating in-bed periods in accelerometer data, which will further qualify datasets without labeling or sleep records.
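The Bland–Altman quantities mentioned in the abstract above (mean bias and limits of agreement) can be illustrated with a minimal Python sketch. It assumes the common definition of the limits as the bias plus or minus 1.96 sample standard deviations of the paired differences; the function name `bland_altman` and the sample values are hypothetical and are not taken from the study.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumption: limits of agreement are bias +/- 1.96 * SD of the
    paired differences, using the sample standard deviation.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)    # systematic offset between the methods
    sd = statistics.stdev(diffs)     # spread of the disagreement
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical in-bed onset times in minutes after 20:00
# (manual annotation vs. a reference device).
manual = [125, 140, 98, 160, 133]
device = [120, 150, 95, 170, 130]
bias, lower, upper = bland_altman(manual, device)
print(f"bias={bias:.1f} min, limits of agreement=({lower:.1f}, {upper:.1f}) min")
```

A narrow interval around a bias close to zero is what the abstract describes as good agreement between the two methods.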
41
Rentz LE, Ulman HK, Galster SM. Deconstructing Commercial Wearable Technology: Contributions toward Accurate and Free-Living Monitoring of Sleep. SENSORS (BASEL, SWITZERLAND) 2021; 21:5071. [PMID: 34372308 PMCID: PMC8348972 DOI: 10.3390/s21155071] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/11/2021] [Revised: 07/09/2021] [Accepted: 07/23/2021] [Indexed: 01/07/2023]
Abstract
Despite prolific demands and sales, commercial sleep assessment is primarily limited by the inability to "measure" sleep itself; rather, secondary physiological signals are captured, combined, and subsequently classified as sleep or a specific sleep state. Using markedly different approaches compared with gold-standard polysomnography, wearable companies purporting to measure sleep have rapidly developed during recent decades. These devices are advertised to monitor sleep via sensors such as accelerometers, electrocardiography, photoplethysmography, and temperature, alone or in combination, to estimate sleep stage based upon physiological patterns. However, without regulatory oversight, this market has historically manufactured products of poor accuracy, and rarely with third-party validation. Specifically, these devices vary in their capacities to capture a signal of interest, process the signal, perform physiological calculations, and ultimately classify a state (sleep vs. wake) or sleep stage during a given time domain. Device performance depends largely on success in all the aforementioned requirements. Thus, this review provides context surrounding the complex hardware and software developed by wearable device companies in their attempts to estimate sleep-related phenomena, and outlines considerations and contributing factors for overall device success.
Affiliation(s)
- Scott M. Galster
  - Human Performance Innovation Center, Rockefeller Neuroscience Institute, West Virginia University, Morgantown, WV 26505, USA; (L.E.R.); (H.K.U.)
42
Cesari M, Stefani A, Penzel T, Ibrahim A, Hackner H, Heidbreder A, Szentkirályi A, Stubbe B, Völzke H, Berger K, Högl B. Interrater sleep stage scoring reliability between manual scoring from two European sleep centers and automatic scoring performed by the artificial intelligence-based Stanford-STAGES algorithm. J Clin Sleep Med 2021; 17:1237-1247. [PMID: 33599203 DOI: 10.5664/jcsm.9174] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 12/23/2022]
Abstract
STUDY OBJECTIVES The objective of this study was to evaluate interrater reliability between manual sleep stage scoring performed in 2 European sleep centers and automatic sleep stage scoring performed by the previously validated artificial intelligence-based Stanford-STAGES algorithm. METHODS Full night polysomnographies of 1,066 participants were included. Sleep stages were manually scored in Berlin and Innsbruck sleep centers and automatically scored with the Stanford-STAGES algorithm. For each participant, we compared (1) Innsbruck to Berlin scorings (INN vs BER); (2) Innsbruck to automatic scorings (INN vs AUTO); (3) Berlin to automatic scorings (BER vs AUTO); (4) epochs where scorers from Innsbruck and Berlin had consensus to automatic scoring (CONS vs AUTO); and (5) both Innsbruck and Berlin manual scorings (MAN) to the automatic ones (MAN vs AUTO). Interrater reliability was evaluated with several measures, including overall and sleep stage-specific Cohen's κ. RESULTS Overall agreement across participants was substantial for INN vs BER (κ = 0.66 ± 0.13), INN vs AUTO (κ = 0.68 ± 0.14), CONS vs AUTO (κ = 0.73 ± 0.14), and MAN vs AUTO (κ = 0.61 ± 0.14), and moderate for BER vs AUTO (κ = 0.55 ± 0.15). Human scorers had the highest disagreement for N1 sleep (κN1 = 0.40 ± 0.16 for INN vs BER). Automatic scoring had lowest agreement with manual scorings for N1 and N3 sleep (κN1 = 0.25 ± 0.14 and κN3 = 0.42 ± 0.32 for MAN vs AUTO). CONCLUSIONS Interrater reliability for sleep stage scoring between human scorers was in line with previous findings, and the algorithm achieved an overall substantial agreement with manual scoring. In this cohort, the Stanford-STAGES algorithm showed similar performances to the ones achieved in the original study, suggesting that it is generalizable to new cohorts. Before its integration in clinical practice, future independent studies should further evaluate it in other cohorts.
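Cohen's κ, the main agreement measure in the abstract above, corrects the observed epoch-by-epoch agreement for the agreement expected by chance. The following Python sketch shows the standard two-rater formula, κ = (p_o − p_e) / (1 − p_e); the function name `cohens_kappa` and the toy hypnograms are hypothetical and do not reproduce the study's data or code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of epochs with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Toy example: six 30-second epochs scored by two hypothetical scorers.
scorer_1 = ["W", "N1", "N2", "N2", "REM", "W"]
scorer_2 = ["W", "N2", "N2", "N2", "REM", "N1"]
print(round(cohens_kappa(scorer_1, scorer_2), 3))  # 0.538
```

In this toy case the raw agreement is 4/6, but κ is lower because part of that agreement would be expected by chance given how often each scorer uses each stage.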
Affiliation(s)
- Matteo Cesari
  - Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
- Ambra Stefani
  - Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
- Thomas Penzel
  - Interdisciplinary Sleep Medicine Center, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Saratov State University, Saratov, Russian Federation
- Abubaker Ibrahim
  - Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
- Heinz Hackner
  - Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
- Anna Heidbreder
  - Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
- András Szentkirályi
  - Institute of Epidemiology and Social Medicine, University of Münster, Münster, Germany
- Beate Stubbe
  - Department of Internal Medicine B, University Medicine Greifswald, Greifswald, Germany
- Henry Völzke
  - Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
- Klaus Berger
  - Institute of Epidemiology and Social Medicine, University of Münster, Münster, Germany
- Birgit Högl
  - Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
43
Peter-Derex L, Berthomier C, Taillard J, Berthomier P, Bouet R, Mattout J, Brandewinder M, Bastuji H. Automatic analysis of single-channel sleep EEG in a large spectrum of sleep disorders. J Clin Sleep Med 2021; 17:393-402. [PMID: 33089777 DOI: 10.5664/jcsm.8864] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 01/01/2023]
Abstract
STUDY OBJECTIVES To assess the performance of the single-channel automatic sleep staging (AS) software ASEEGA in adult patients diagnosed with various sleep disorders. METHODS Sleep recordings were included of 95 patients (38 women, 40.5 ± 13.7 years) diagnosed with insomnia (n = 23), idiopathic hypersomnia (n = 24), narcolepsy (n = 24), and obstructive sleep apnea (n = 24). Visual staging (VS) was performed by two experts (VS1 and VS2) according to the American Academy of Sleep Medicine rules. AS was based on the analysis of a single electroencephalogram channel (Cz-Pz), without any information from electro-oculography nor electromyography. The epoch-by-epoch agreement (concordance and Conger's coefficient [κ]) was compared pairwise (VS1-VS2, AS-VS1, AS-VS2) and between AS and consensual VS. Sleep parameters were also compared. RESULTS The pairwise agreements were: between AS and VS1, 78.6% (κ = 0.70); AS and VS2, 75.0% (0.65); and VS1 and VS2, 79.5% (0.72). Agreement between AS and consensual VS was 85.6% (0.80), with the following distribution: insomnia 85.5% (0.80), narcolepsy 83.8% (0.78), idiopathic hypersomnia 86.1% (0.68), and obstructive sleep disorder 87.2% (0.82). A significant low-amplitude scorer effect was observed for most sleep parameters, not always driven by the same scorer. Hypnograms obtained with AS and VS exhibited very close sleep organization, except for 80% of rapid eye movement sleep onset in the group diagnosed with narcolepsy missed by AS. CONCLUSIONS Agreement between AS and VS in sleep disorders is comparable to that reported in healthy individuals and to interexpert agreement in patients. ASEEGA could therefore be considered as a complementary sleep stage scoring tool in clinical practice, after improvement of rapid eye movement sleep onset detection.
Affiliation(s)
- Laure Peter-Derex
  - Center for Sleep Medicine and Respiratory Diseases, Croix-Rousse Hospital, Lyon, France
  - Lyon Neuroscience Research Center, CNRS 5292 INSERM U1028, Lyon, France
  - Lyon 1 University, Lyon, France
- Jacques Taillard
  - CNRS, Bordeaux University, USR 3413 SANPSY Sleep, Addiction and Neuropsychiatry, Bordeaux, France
- Romain Bouet
  - Lyon Neuroscience Research Center, CNRS 5292 INSERM U1028, Lyon, France
- Jérémie Mattout
  - Lyon Neuroscience Research Center, CNRS 5292 INSERM U1028, Lyon, France
- Hélène Bastuji
  - Center for Sleep Medicine and Respiratory Diseases, Croix-Rousse Hospital, Lyon, France
  - Lyon Neuroscience Research Center, CNRS 5292 INSERM U1028, Lyon, France
  - Functional Neurology and Epilepsy Unit, Neurological Hospital, Hospices Civils de Lyon, Bron, France
44
McConnell BV, Kronberg E, Teale PD, Sillau SH, Fishback GM, Kaplan RI, Fought AJ, Dhanasekaran AR, Berman BD, Ramos AR, McClure RL, Bettcher BM. The Aging Slow Wave: A Shifting Amalgam of Distinct Slow Wave and Spindle Coupling Subtypes Define Slow Wave Sleep Across the Human Lifespan. Sleep 2021; 44:6276901. [PMID: 33999194 PMCID: PMC8503831 DOI: 10.1093/sleep/zsab125] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/06/2020] [Revised: 03/14/2021] [Indexed: 11/14/2022] Open
Abstract
STUDY OBJECTIVES Slow wave and spindle coupling supports memory consolidation, and loss of coupling is linked with cognitive decline and neurodegeneration. Coupling is proposed to be a possible biomarker of neurological disease, yet little is known about the different subtypes of coupling that normally occur throughout human development and aging. Here we identify distinct subtypes of spindles within slow wave upstates and describe their relationships with sleep stage across the human lifespan. METHODS Coupling within a cross-sectional cohort of 582 subjects was quantified from stages N2 and N3 sleep across ages 6-88 years old. Results were analyzed across the study population via mixed model regression. Within a subset of subjects, we further utilized coupling to identify discrete subtypes of slow waves by their coupled spindles. RESULTS Two different subtypes of spindles were identified during the upstates of (distinct) slow waves: an "early-fast" spindle, more common in stage N2 sleep, and a "late-fast" spindle, more common in stage N3. We further found stages N2 and N3 sleep contain a mixture of discrete subtypes of slow waves, each identified by their unique coupled-spindle timing and frequency. The relative contribution of coupling subtypes shifts across the human lifespan, and a deeper sleep phenotype prevails with increasing age. CONCLUSIONS Distinct subtypes of slow waves and coupled spindles form the composite of slow wave sleep. Our findings support a model of sleep-dependent synaptic regulation via discrete slow wave/spindle coupling subtypes and advance a conceptual framework for the development of coupling-based biomarkers in age-associated neurological disease.
Affiliation(s)
- Brice V McConnell
  - Neurology, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Eugene Kronberg
  - Neurology, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Peter D Teale
  - Neurology, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Stefan H Sillau
  - Neurology, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Grace M Fishback
  - Neurology, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Rini I Kaplan
  - Psychological & Brain Sciences Boston University, Boston, MA, USA
- Angela J Fought
  - Neurology, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Brian D Berman
  - Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO, USA
  - Neurology, Virginia Commonwealth University, Richmond, VA, USA
- Alberto R Ramos
  - Neurology, University of Miami Miller School of Medicine, Miami, FL, USA
- Brianne M Bettcher
  - Neurology, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
45
Ricci A, He F, Fang J, Calhoun SL, Vgontzas AN, Liao D, Younes M, Bixler EO, Fernandez-Mendoza J. Maturational trajectories of non-rapid eye movement slow wave activity and odds ratio product in a population-based sample of youth. Sleep Med 2021; 83:271-279. [PMID: 34049047 DOI: 10.1016/j.sleep.2021.05.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/25/2021] [Revised: 04/23/2021] [Accepted: 05/01/2021] [Indexed: 02/06/2023]
Abstract
BACKGROUND Brain maturation is reflected in the sleep electroencephalogram (EEG) by a decline in non-rapid eye movement (NREM) slow wave activity (SWA) throughout adolescence and a related decrease in sleep depth. However, this trajectory and its sex and pubertal differences lack replication in population-based samples. We tested age-related changes in SWA (0.4-4 Hz) power and odds ratio product (ORP), a standardized measure of sleep depth. METHODS We analyzed the sleep EEG of 572 subjects aged 6-21 y (48% female, 26% racial/ethnic minority) and 332 subjects 5-12 y followed-up at 12-22 y. Multivariable-adjusted analyses tested age-related cross-sectional and longitudinal trajectories of SWA and ORP. RESULTS SWA remained stable from age 6 to 10, decreased between ages 11 and 17, and plateaued from age 18 to 21 (p-cubic<0.001); females showed a longitudinal decline 23% greater than males by 13 y, while males experienced a steeper slope after 14 y and their longitudinal decline was 21% greater by 19 y. More mature adolescents (75% female) experienced a greater longitudinal decline in SWA than less mature adolescents by 14 y. ORP showed an age-related increasing trajectory (p-linear<0.001) with no sex or pubertal differences. CONCLUSIONS We provide population-level evidence for the maturational decline and sex and pubertal differences in SWA in the transition from childhood to adolescence, while introducing ORP as a novel metric in youth. Along with previous studies, the distinct trajectories observed suggest that age-related changes in SWA reflect brain maturation and local/synaptic processes during this developmental period, while those of ORP may reflect global/state control of NREM sleep depth.
Affiliation(s)
- Anna Ricci
  - Sleep Research & Treatment Center, Department of Psychiatry & Behavioral Health, Penn State College of Medicine, 500 University Dr., Hershey, PA, 17033 USA
- Fan He
  - Department of Public Health Sciences, Penn State College of Medicine, A210 Public Health Sciences, Hershey, PA, 17033 USA
- Jidong Fang
  - Sleep Research & Treatment Center, Department of Psychiatry & Behavioral Health, Penn State College of Medicine, 500 University Dr., Hershey, PA, 17033 USA
- Susan L Calhoun
  - Sleep Research & Treatment Center, Department of Psychiatry & Behavioral Health, Penn State College of Medicine, 500 University Dr., Hershey, PA, 17033 USA
- Alexandros N Vgontzas
  - Sleep Research & Treatment Center, Department of Psychiatry & Behavioral Health, Penn State College of Medicine, 500 University Dr., Hershey, PA, 17033 USA
- Duanping Liao
  - Department of Public Health Sciences, Penn State College of Medicine, A210 Public Health Sciences, Hershey, PA, 17033 USA
- Magdy Younes
  - Sleep Disorders Centre, University of Manitoba, 1001 Wellington Crescent, Winnipeg, MB, R3M 0A7, Canada
- Edward O Bixler
  - Sleep Research & Treatment Center, Department of Psychiatry & Behavioral Health, Penn State College of Medicine, 500 University Dr., Hershey, PA, 17033 USA
- Julio Fernandez-Mendoza
  - Sleep Research & Treatment Center, Department of Psychiatry & Behavioral Health, Penn State College of Medicine, 500 University Dr., Hershey, PA, 17033 USA
46
Olesen AN, Jørgen Jennum P, Mignot E, Sorensen HBD. Automatic sleep stage classification with deep residual networks in a mixed-cohort setting. Sleep 2021; 44:5897250. [PMID: 32844179 DOI: 10.1093/sleep/zsaa161] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 01/06/2020] [Revised: 06/30/2020] [Indexed: 11/13/2022] Open
Abstract
STUDY OBJECTIVES Sleep stage scoring is performed manually by sleep experts and is prone to subjective interpretation of scoring rules with low intra- and interscorer reliability. Many automatic systems rely on few small-scale databases for developing models, and generalizability to new datasets is thus unknown. We investigated a novel deep neural network to assess the generalizability of several large-scale cohorts. METHODS A deep neural network model was developed using 15,684 polysomnography studies from five different cohorts. We applied four different scenarios: (1) impact of varying timescales in the model; (2) performance of a single cohort on other cohorts of smaller, greater, or equal size relative to the performance of other cohorts on a single cohort; (3) varying the fraction of mixed-cohort training data compared with using single-origin data; and (4) comparing models trained on combinations of data from 2, 3, and 4 cohorts. RESULTS Overall classification accuracy improved with increasing fractions of training data (0.25%: 0.782 ± 0.097, 95% CI [0.777-0.787]; 100%: 0.869 ± 0.064, 95% CI [0.864-0.872]), and with increasing number of data sources (2: 0.788 ± 0.102, 95% CI [0.787-0.790]; 3: 0.808 ± 0.092, 95% CI [0.807-0.810]; 4: 0.821 ± 0.085, 95% CI [0.819-0.823]). Different cohorts show varying levels of generalization to other cohorts. CONCLUSIONS Automatic sleep stage scoring systems based on deep learning algorithms should consider as much data as possible from as many sources available to ensure proper generalization. Public datasets for benchmarking should be made available for future research.
Affiliation(s)
- Alexander Neergaard Olesen
  - Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
  - Stanford Center for Sleep Sciences and Medicine, Stanford University, Palo Alto, CA
  - Danish Center for Sleep Medicine, Department of Clinical Neurophysiology, Rigshospitalet, Glostrup, Denmark
- Poul Jørgen Jennum
  - Danish Center for Sleep Medicine, Department of Clinical Neurophysiology, Rigshospitalet, Glostrup, Denmark
- Emmanuel Mignot
  - Stanford Center for Sleep Sciences and Medicine, Stanford University, Palo Alto, CA
47
Gottlieb E, Churilov L, Werden E, Churchward T, Pase MP, Egorova N, Howard ME, Brodtmann A. Sleep-wake parameters can be detected in patients with chronic stroke using a multisensor accelerometer: a validation study. J Clin Sleep Med 2021; 17:167-175. [PMID: 32975195 DOI: 10.5664/jcsm.8812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
STUDY OBJECTIVES Sleep-wake dysfunction is bidirectionally associated with the pathogenesis and evolution of stroke. Longitudinal and prospective measurement of sleep after chronic stroke remains poorly characterized because of a lack of validated objective and ambulatory sleep measurement tools in neurological populations. This study aimed to validate a multisensor sleep monitor, the SenseWear Armband (SWA), in patients with ischemic stroke and control patients using at-home polysomnography. METHODS Twenty-eight radiologically confirmed patients with ischemic stroke (aged 69.61 ± 7.35 years; mean = 4.1 years poststroke) and 16 control patients (aged 73.75 ± 7.10 years) underwent overnight at-home polysomnography in tandem with the SWA. Lin's concordance correlation coefficient and reduced major axis regressions were employed to assess concordance of SWA vs polysomnography-measured total sleep time, sleep efficiency, sleep onset latency, and wake after sleep onset. Subsequently, data were converted to 30-second epochs to match at-home polysomnography. Epoch-by-epoch agreement between SWA and at-home polysomnography was estimated using crude agreement, Cohen's kappa, sensitivity, and specificity. RESULTS Total sleep time was the most robustly quantified sleep-wake variable (concordance correlation coefficient = 0.49). The SWA performed poorest for sleep measures requiring discrimination of wakefulness (sleep onset latency; concordance correlation coefficient = 0.16). The sensitivity of the SWA was high (95.90%) for patients with stroke and for control patients (95.70%). The specificity of the SWA was fair-moderate for patients with stroke (40.45%) and moderate for control patients (45.60%). Epoch-by-epoch agreement rate was fair (78%) in patients with stroke and fair (74%) in controls. CONCLUSIONS The SWA shows promise as an ambulatory tool to estimate macro parameters of sleep-wake; however, agreement at an epoch level is only moderate-fair. Use of the SWA warrants caution when it is used as a diagnostic tool or in populations with significant sleep-wake fragmentation.
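The epoch-by-epoch metrics named in the abstract above (crude agreement, sensitivity, specificity) can be sketched in Python as follows. The sketch assumes the convention common in sleep-tracker validation, where sleep is the positive class, so sensitivity is the share of reference sleep epochs detected as sleep and specificity is the share of reference wake epochs detected as wake. The function name `sleep_wake_agreement` and the toy epochs are hypothetical.

```python
def sleep_wake_agreement(reference, device):
    """Compare two epoch sequences of booleans (True = sleep, False = wake).

    Assumption: sleep is treated as the positive class.
    Returns a dict with sensitivity, specificity and crude agreement.
    """
    if len(reference) != len(device) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    tp = sum(r and d for r, d in zip(reference, device))          # sleep seen as sleep
    tn = sum(not r and not d for r, d in zip(reference, device))  # wake seen as wake
    fn = sum(r and not d for r, d in zip(reference, device))      # sleep missed
    fp = sum(not r and d for r, d in zip(reference, device))      # wake scored as sleep
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both sleep and wake epochs")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "agreement": (tp + tn) / len(reference),
    }


# Toy example: six epochs, reference (e.g. polysomnography) vs. a wearable.
reference = [True, True, True, True, False, False]
wearable = [True, True, True, False, True, False]
print(sleep_wake_agreement(reference, wearable))
```

A device that labels almost every epoch as sleep scores high sensitivity but low specificity, which is the pattern the abstract reports for the SWA.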
Affiliation(s)
- Elie Gottlieb
  - The Florey Institute of Neuroscience and Mental Health, Melbourne, Victoria, Australia
  - University of Melbourne, Melbourne, Victoria, Australia
- Emilio Werden
  - The Florey Institute of Neuroscience and Mental Health, Melbourne, Victoria, Australia
  - University of Melbourne, Melbourne, Victoria, Australia
- Thomas Churchward
  - Institute for Breathing and Sleep, Melbourne, Victoria, Australia
  - Austin Health, Heidelberg, Victoria, Australia
- Matthew P Pase
  - Turner Institute for Brain and Mental Health, School of Psychological Sciences, Monash University, Victoria, Australia
  - Harvard T.H. Chan School of Public Health, Harvard University, Boston, Massachusetts
- Natalia Egorova
  - The Florey Institute of Neuroscience and Mental Health, Melbourne, Victoria, Australia
  - Melbourne School of Psychological Sciences, University of Melbourne, Melbourne, Victoria, Australia
- Mark E Howard
  - University of Melbourne, Melbourne, Victoria, Australia
  - Institute for Breathing and Sleep, Melbourne, Victoria, Australia
  - Austin Health, Heidelberg, Victoria, Australia
  - Co-senior authors
- Amy Brodtmann
  - The Florey Institute of Neuroscience and Mental Health, Melbourne, Victoria, Australia
  - University of Melbourne, Melbourne, Victoria, Australia
  - Co-senior authors
48
Banfi T, Valigi N, di Galante M, d'Ascanio P, Ciuti G, Faraguna U. Efficient embedded sleep wake classification for open-source actigraphy. Sci Rep 2021; 11:345. [PMID: 33431918 PMCID: PMC7801620 DOI: 10.1038/s41598-020-79294-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/13/2020] [Accepted: 12/04/2020] [Indexed: 11/09/2022] Open
Abstract
This study presents a thorough analysis of sleep/wake detection algorithms for efficient on-device sleep tracking using wearable accelerometric devices. It develops a novel end-to-end algorithm using a convolutional neural network applied to raw accelerometric signals recorded by an open-source wrist-worn actigraph. The aim of the study is to develop an automatic classifier that: (1) is highly generalizable to heterogeneous subjects, (2) does not require manual feature extraction, (3) is computationally lightweight and embeddable on a sleep tracking device, and (4) is suitable for a wide assortment of actigraphs. The authors analyze sleep parameters, such as total sleep time, wake after sleep onset and sleep efficiency, by comparing the outcomes of the proposed algorithm to concurrent gold-standard polysomnographic recordings. The relatively substantial agreement (median Cohen's kappa coefficient of 0.78 ± 0.07) and the low computational cost (2727 floating-point operations) make this solution suitable for an on-board sleep-detection approach.
Affiliation(s)
- Tommaso Banfi
  - The BioRobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy
  - Department of Excellence in Robotics & AI, Scuola Superiore Sant'Anna, Pisa, Italy
  - sleepActa S.R.L, Pontedera, Italy
- Marco di Galante
  - sleepActa S.R.L, Pontedera, Italy
  - Department of Developmental Neuroscience, IRCCS Stella Maris, Pisa, Italy
- Paola d'Ascanio
  - Department of Translational Research and of New Medical and Surgical Technologies, University of Pisa, Pisa, Italy
- Gastone Ciuti
  - The BioRobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy
  - Department of Excellence in Robotics & AI, Scuola Superiore Sant'Anna, Pisa, Italy
- Ugo Faraguna
  - sleepActa S.R.L, Pontedera, Italy
  - Department of Developmental Neuroscience, IRCCS Stella Maris, Pisa, Italy
  - Department of Translational Research and of New Medical and Surgical Technologies, University of Pisa, Pisa, Italy
49
Wang H, Lin G, Li Y, Zhang X, Xu W, Wang X, Han D. Automatic Sleep Stage Classification of Children with Sleep-Disordered Breathing Using the Modularized Network. Nat Sci Sleep 2021; 13:2101-2112. [PMID: 34876865 PMCID: PMC8643215 DOI: 10.2147/nss.s336344] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/27/2021] [Accepted: 10/12/2021] [Indexed: 12/05/2022] Open
Abstract
PURPOSE To develop an automatic sleep stage analysis model for children and evaluate the effect of the model on the diagnosis of sleep-disordered breathing (SDB). PATIENTS AND METHODS Three hundred and forty-four SDB patients aged 2 to 18 years who completed polysomnography (PSG) to assess the severity of the disease were enrolled in this study. We developed deep neural networks to stage sleep from electroencephalography (EEG), electrooculography (EOG) and electromyography (EMG). The model performance was estimated by accuracy, precision, recall, F1-score, and Cohen's Kappa coefficient (κ). We also compared the calculation of sleep parameters among the technicians, the model ensemble, and the single-channel EEG model. RESULTS The raw data were divided into training, validation, and testing sets of 240, 36, and 68, respectively. The best performance was achieved by the model ensemble, with an accuracy of 83.36% (κ=0.7817) for 5-stage classification and 96.76% (κ=0.8236) for 2-stage classification. The single-channel EEG model also classified sleep stages satisfactorily. There was no significant difference in TST, SE, SOL, time in W, time in N1+N2, time in N3, and OAHI between technician and the model (P>0.05). On the sleep-EDF-13 and sleep-EDF-18 datasets, the proposed method achieved average 5-stage classification accuracies of 92.76% and 91.94%, respectively. CONCLUSION This research established a model for pediatric automatic sleep stage classification with satisfactory reliability and generalizability. In addition, it could be applied for calculating quantitative sleep parameters and evaluating the severity of SDB.
Collapse
Affiliation(s)
- Huijun Wang
- Department of Otorhinolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, People's Republic of China.,Obstructive Sleep Apnea-Hypopnea Syndrome Clinical Diagnosis and Therapy and Research Centre, Capital Medical University, Beijing, People's Republic of China.,Key Laboratory of Otolaryngology-Head and Neck Surgery, Ministry of Education, Capital Medical University, Beijing, People's Republic of China
| | - Guodong Lin
- Department of Electronic Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, People's Republic of China
| | - Yanru Li
- Department of Otorhinolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, People's Republic of China.,Obstructive Sleep Apnea-Hypopnea Syndrome Clinical Diagnosis and Therapy and Research Centre, Capital Medical University, Beijing, People's Republic of China.,Key Laboratory of Otolaryngology-Head and Neck Surgery, Ministry of Education, Capital Medical University, Beijing, People's Republic of China
| | - Xiaoqing Zhang
- Department of Otorhinolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, People's Republic of China.,Obstructive Sleep Apnea-Hypopnea Syndrome Clinical Diagnosis and Therapy and Research Centre, Capital Medical University, Beijing, People's Republic of China.,Key Laboratory of Otolaryngology-Head and Neck Surgery, Ministry of Education, Capital Medical University, Beijing, People's Republic of China
| | - Wen Xu
- Department of Otorhinolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, People's Republic of China.,Obstructive Sleep Apnea-Hypopnea Syndrome Clinical Diagnosis and Therapy and Research Centre, Capital Medical University, Beijing, People's Republic of China.,Key Laboratory of Otolaryngology-Head and Neck Surgery, Ministry of Education, Capital Medical University, Beijing, People's Republic of China
| | - Xingjun Wang
- Department of Electronic Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, People's Republic of China
| | - Demin Han
- Department of Otorhinolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, People's Republic of China.,Obstructive Sleep Apnea-Hypopnea Syndrome Clinical Diagnosis and Therapy and Research Centre, Capital Medical University, Beijing, People's Republic of China.,Key Laboratory of Otolaryngology-Head and Neck Surgery, Ministry of Education, Capital Medical University, Beijing, People's Republic of China
| |
|
50
|
Banville H, Chehab O, Hyvarinen A, Engemann D, Gramfort A. Uncovering the structure of clinical EEG signals with self-supervised learning. J Neural Eng 2020; 18. [PMID: 33181507 DOI: 10.1088/1741-2552/abca18] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 07/31/2020] [Accepted: 11/12/2020] [Indexed: 01/28/2023]
Abstract
OBJECTIVE: Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically relevant data, such as electroencephalography (EEG), where labeling can be costly in terms of specialized expertise and human processing time. Consequently, deep learning architectures designed to learn on EEG data have yielded relatively shallow models and performances at best similar to those of traditional feature-based approaches. However, in most situations, unlabeled data is available in abundance. By extracting information from this unlabeled data, it might be possible to reach competitive performance with deep neural networks despite limited access to labels. APPROACH: We investigated self-supervised learning (SSL), a promising technique for discovering structure in unlabeled data, to learn representations of EEG signals. Specifically, we explored two tasks based on temporal context prediction as well as contrastive predictive coding on two clinically relevant problems: EEG-based sleep staging and pathology detection. We conducted experiments on two large public datasets with thousands of recordings and performed baseline comparisons with purely supervised and hand-engineered approaches. MAIN RESULTS: Linear classifiers trained on SSL-learned features consistently outperformed purely supervised deep neural networks in low-labeled data regimes while reaching competitive performance when all labels were available. Additionally, the embeddings learned with each method revealed clear latent structures related to physiological and clinical phenomena, such as age effects. SIGNIFICANCE: We demonstrate the benefit of SSL approaches on EEG data. Our results suggest that self-supervision may pave the way to a wider use of deep learning models on EEG data.
|