1
Dhinagar NJ, Thomopoulos SI, Owens‐Walton C, Stripelis D, Ambite JL, Steeg GV, Thompson PM. Alzheimer’s Disease Detection with a 3D Convolutional Neural Network using Gray Matter Maps from T1‐weighted Brain MRI. Alzheimers Dement 2022. [DOI: 10.1002/alz.066446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Affiliation(s)
- Nikhil J Dhinagar
  - Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, CA, USA
- Sophia I Thomopoulos
  - Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, CA, USA
- Conor Owens‐Walton
  - Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, CA, USA
- Dimitris Stripelis
  - Information Sciences Institute, University of Southern California, Marina del Rey, CA, USA
- Jose Luis Ambite
  - Information Sciences Institute, University of Southern California, Marina del Rey, CA, USA
- Greg Ver Steeg
  - Information Sciences Institute, University of Southern California, Marina del Rey, CA, USA
- Paul M Thompson
  - Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, CA, USA
2
Stripelis D, Thompson PM, Ambite JL. Semi-Synchronous Federated Learning for Energy-Efficient Training and Accelerated Convergence in Cross-Silo Settings. ACM T INTEL SYST TEC 2022. [DOI: 10.1145/3524885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
Abstract
There are situations where data relevant to machine learning problems are distributed across multiple locations that cannot share the data due to regulatory, competitiveness, or privacy reasons. Machine learning approaches that require data to be copied to a single location are hampered by the challenges of data sharing. Federated Learning (FL) is a promising approach to learn a joint model over all the available data across silos. In many cases, the sites participating in a federation have different data distributions and computational capabilities. In these heterogeneous environments existing approaches exhibit poor performance: synchronous FL protocols are communication efficient, but have slow learning convergence and high energy cost; conversely, asynchronous FL protocols have faster convergence with lower energy cost, but higher communication. In this work, we introduce a novel energy-efficient Semi-Synchronous Federated Learning protocol that mixes local models periodically with minimal idle time and fast convergence. We show through extensive experiments over established benchmark datasets in the computer-vision domain as well as in real-world biomedical settings that our approach significantly outperforms previous work in data and computationally heterogeneous environments.
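As an illustration of the periodic mixing idea in the abstract above, the following is a minimal sketch and not the authors' protocol. It assumes that every site trains for the same wall-clock budget (so slower sites simply take fewer local steps and nobody idles) and that local models are then combined by a weighted average. The function names, the time-budget rule, and the weighting by number of training examples are all assumptions made for this example.

```python
from typing import List, Sequence


def local_steps(time_budget_s: float, seconds_per_step: float) -> int:
    """Number of local training steps a site fits into a shared time budget.

    Assumption for this sketch: each site takes at least one step.
    """
    if time_budget_s <= 0 or seconds_per_step <= 0:
        raise ValueError("time budget and step time must be positive")
    return max(1, int(time_budget_s // seconds_per_step))


def mix_models(models: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted average of flat parameter vectors, one vector per site.

    Assumption for this sketch: weights are, for example, the number of
    training examples held by each site.
    """
    if not models or len(models) != len(weights):
        raise ValueError("need one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_params = len(models[0])
    if any(len(m) != n_params for m in models):
        raise ValueError("all models must have the same number of parameters")
    return [
        sum(w * m[i] for m, w in zip(models, weights)) / total
        for i in range(n_params)
    ]


if __name__ == "__main__":
    # A fast site and a slow site share a 10-second budget.
    print(local_steps(10.0, 1.0), local_steps(10.0, 3.0))
    # Two hypothetical local models, the second trained on three times more data.
    print(mix_models([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this sketch the shared time budget is what removes idle time: all sites report at the same moment regardless of how many steps they completed.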
Affiliation(s)
- Paul M. Thompson
  - Imaging Genetics Center, Stevens Neuroimaging and Informatics Institute, University of Southern California, USA
- José Luis Ambite
  - Information Sciences Institute, University of Southern California, USA
3
Li K, Habre R, Deng H, Urman R, Morrison J, Gilliland FD, Ambite JL, Stripelis D, Chiang YY, Lin Y, Bui AA, King C, Hosseini A, Vliet EV, Sarrafzadeh M, Eckel SP. Applying Multivariate Segmentation Methods to Human Activity Recognition From Wearable Sensors' Data. JMIR Mhealth Uhealth 2019; 7:e11201. [PMID: 30730297 PMCID: PMC6386646 DOI: 10.2196/11201] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/01/2018] [Revised: 09/30/2018] [Accepted: 11/14/2018] [Indexed: 12/20/2022]
Abstract
Background: Time-resolved quantification of physical activity can contribute to both personalized medicine and epidemiological research studies, for example, managing and identifying triggers of asthma exacerbations. A growing number of reportedly accurate machine learning algorithms for human activity recognition (HAR) have been developed using data from wearable devices (eg, smartwatch and smartphone). However, many HAR algorithms depend on fixed-size sampling windows that may poorly adapt to real-world conditions in which activity bouts are of unequal duration. A small sliding window can produce noisy predictions under stable conditions, whereas a large sliding window may miss brief bursts of intense activity.
Objective: We aimed to create an HAR framework adapted to variable duration activity bouts by (1) detecting the change points of activity bouts in a multivariate time series and (2) predicting activity for each homogeneous window defined by these change points.
Methods: We applied standard fixed-width sliding windows (4-6 different sizes) or greedy Gaussian segmentation (GGS) to identify break points in filtered triaxial accelerometer and gyroscope data. After standard feature engineering, we applied an XGBoost model to predict physical activity within each window and then converted windowed predictions to instantaneous predictions to facilitate comparison across segmentation methods. We applied these methods in 2 datasets: the human activity recognition using smartphones (HARuS) dataset where a total of 30 adults performed activities of approximately equal duration (approximately 20 seconds each) while wearing a waist-worn smartphone, and the Biomedical REAl-Time Health Evaluation for Pediatric Asthma (BREATHE) dataset where a total of 14 children performed 6 activities for approximately 10 min each while wearing a smartwatch. To mimic a real-world scenario, we generated artificial unequal activity bout durations in the BREATHE data by randomly subdividing each activity bout into 10 segments and randomly concatenating the 60 activity bouts. Each dataset was divided into ~90% training and ~10% holdout testing.
Results: In the HARuS data, GGS produced the least noisy predictions of 6 physical activities and had the second highest accuracy rate of 91.06% (the highest accuracy rate was 91.79% for the sliding window of size 0.8 second). In the BREATHE data, GGS again produced the least noisy predictions and had the highest accuracy rate of 79.4% of predictions for 6 physical activities.
Conclusions: In a scenario with variable duration activity bouts, GGS multivariate segmentation produced smart-sized windows with more stable predictions and a higher accuracy rate than traditional fixed-size sliding window approaches. Overall, accuracy was good in both datasets but, as expected, it was slightly lower in the more real-world study using wrist-worn smartwatches in children (BREATHE) than in the more tightly controlled study using waist-worn smartphones in adults (HARuS). We implemented GGS in an offline setting, but it could be adapted for real-time prediction with streaming data.
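The step in the abstract above that converts windowed predictions to instantaneous predictions can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes non-overlapping windows expressed as half-open sample index ranges, takes break points as already computed (for example by a segmentation method such as GGS, which is not implemented here), and all function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Window = Tuple[int, int]  # half-open sample range [start, end)


def fixed_windows(n_samples: int, width: int) -> List[Window]:
    """Consecutive non-overlapping windows of `width` samples (last one may be shorter)."""
    if width <= 0:
        raise ValueError("width must be positive")
    return [(s, min(s + width, n_samples)) for s in range(0, n_samples, width)]


def windows_from_breakpoints(n_samples: int, breakpoints: Sequence[int]) -> List[Window]:
    """Variable-width windows delimited by interior change points."""
    interior = sorted(b for b in breakpoints if 0 < b < n_samples)
    edges = [0, *interior, n_samples]
    # Skip zero-width windows that duplicate break points would create.
    return [(a, b) for a, b in zip(edges[:-1], edges[1:]) if b > a]


def to_instantaneous(
    windows: Sequence[Window], labels: Sequence[str], n_samples: int
) -> List[Optional[str]]:
    """Assign each window's predicted label to every sample inside that window."""
    out: List[Optional[str]] = [None] * n_samples
    for (start, end), label in zip(windows, labels):
        for i in range(start, end):
            out[i] = label
    return out


def accuracy(pred: Sequence[Optional[str]], truth: Sequence[str]) -> float:
    """Fraction of samples whose instantaneous prediction matches the true label."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


if __name__ == "__main__":
    truth = ["walk"] * 3 + ["run"] * 7
    fixed = to_instantaneous(fixed_windows(10, 5), ["walk", "run"], 10)
    segmented = to_instantaneous(windows_from_breakpoints(10, [3]), ["walk", "run"], 10)
    print(accuracy(fixed, truth), accuracy(segmented, truth))
```

Because both segmentation styles end up as one label per sample, their accuracy can be compared on the same footing even though their windows differ in number and length.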
Affiliation(s)
- Kenan Li
  - Department of Preventive Medicine, Keck School of Medicine of University of Southern California, Los Angeles, CA, United States
- Rima Habre
  - Department of Preventive Medicine, Keck School of Medicine of University of Southern California, Los Angeles, CA, United States
- Huiyu Deng
  - Department of Preventive Medicine, Keck School of Medicine of University of Southern California, Los Angeles, CA, United States
- Robert Urman
  - Department of Preventive Medicine, Keck School of Medicine of University of Southern California, Los Angeles, CA, United States
- John Morrison
  - Department of Preventive Medicine, Keck School of Medicine of University of Southern California, Los Angeles, CA, United States
- Frank D Gilliland
  - Department of Preventive Medicine, Keck School of Medicine of University of Southern California, Los Angeles, CA, United States
- José Luis Ambite
  - Information Sciences Institute, University of Southern California, Los Angeles, CA, United States
- Dimitris Stripelis
  - Information Sciences Institute, University of Southern California, Los Angeles, CA, United States
- Yao-Yi Chiang
  - Spatial Sciences Institute, University of Southern California, Los Angeles, CA, United States
- Yijun Lin
  - Spatial Sciences Institute, University of Southern California, Los Angeles, CA, United States
- Alex At Bui
  - Department of Radiological Sciences, University of California Los Angeles, Los Angeles, CA, United States
- Christine King
  - Department of Biomedical Engineering, University of California, Irvine, Irvine, CA, United States
- Anahita Hosseini
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, United States
- Eleanne Van Vliet
  - Department of Preventive Medicine, Keck School of Medicine of University of Southern California, Los Angeles, CA, United States
- Majid Sarrafzadeh
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, United States
- Sandrah P Eckel
  - Department of Preventive Medicine, Keck School of Medicine of University of Southern California, Los Angeles, CA, United States
4
Stripelis D, Ambite JL, Chiang YY, Eckel SP, Habre R. A Scalable Data Integration and Analysis Architecture for Sensor Data of Pediatric Asthma. Proc Int Conf Data Eng 2017; 2017:1407-1408. [PMID: 29731601 DOI: 10.1109/icde.2017.198] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/05/2022]
Abstract
According to the Centers for Disease Control, in the United States there are 6.8 million children living with asthma. Despite the importance of the disease, the available prognostic tools are not sufficient for biomedical researchers to thoroughly investigate the potential risks of the disease at scale. To overcome these challenges we present a big data integration and analysis infrastructure developed by our Data and Software Coordination and Integration Center (DSCIC) of the NIBIB-funded Pediatric Research using Integrated Sensor Monitoring Systems (PRISMS) program. Our goal is to help biomedical researchers to efficiently predict and prevent asthma attacks. The PRISMS-DSCIC is responsible for collecting, integrating, storing, and analyzing real-time environmental, physiological and behavioral data obtained from heterogeneous sensor and traditional data sources. Our architecture is based on the Apache Kafka, Spark and Hadoop frameworks and PostgreSQL DBMS. A main contribution of this work is extending the Spark framework with a mediation layer, based on logical schema mappings and query rewriting, to facilitate data analysis over a consistent harmonized schema. The system provides both batch and stream analytic capabilities over the massive data generated by wearable and fixed sensors.
Affiliation(s)
- José Luis Ambite
  - Information Sciences Institute, University of Southern California
- Yao-Yi Chiang
  - Spatial Sciences Institute, University of Southern California
- Sandrah P Eckel
  - Department of Preventive Medicine, University of Southern California
- Rima Habre
  - Department of Preventive Medicine, University of Southern California