1
Meng R, Bouchard KE. Bayesian inference of structured latent spaces from neural population activity with the orthogonal stochastic linear mixing model. PLoS Comput Biol 2024; 20:e1011975. [PMID: 38669271 PMCID: PMC11078355 DOI: 10.1371/journal.pcbi.1011975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 05/08/2024] [Accepted: 03/07/2024] [Indexed: 04/28/2024] Open
Abstract
The brain produces diverse functions, from perceiving sounds to producing arm reaches, through the collective activity of populations of many neurons. Determining if and how the features of these exogenous variables (e.g., sound frequency, reach angle) are reflected in population neural activity is important for understanding how the brain operates. Often, high-dimensional neural population activity is confined to low-dimensional latent spaces. However, many current methods fail to extract latent spaces that are clearly structured by exogenous variables. This has contributed to a debate about whether brains should be thought of as dynamical systems or representational systems. Here, we developed a new latent process Bayesian regression framework, the orthogonal stochastic linear mixing model (OSLMM), which introduces an orthogonality constraint amongst time-varying mixture coefficients, and provide Markov chain Monte Carlo inference procedures. We demonstrate superior performance of OSLMM on latent trajectory recovery in synthetic experiments and show superior computational efficiency and prediction performance on several real-world benchmark data sets. We primarily focus on demonstrating the utility of OSLMM in two neural data sets: μECoG recordings from rat auditory cortex during presentation of pure tones and multi-single unit recordings from monkey motor cortex during complex arm reaching. We show that OSLMM achieves superior or comparable predictive accuracy of neural data and decoding of external variables (e.g., reach velocity). Most importantly, in both experimental contexts, we demonstrate that OSLMM latent trajectories directly reflect features of the sounds and reaches, demonstrating that neural dynamics are structured by neural representations. Together, these results demonstrate that OSLMM will be useful for the analysis of diverse, large-scale biological time-series datasets.
Affiliation(s)
- Rui Meng
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Kristofer E. Bouchard
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
  - Scientific Data Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
  - Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, California, United States of America
  - Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, California, United States of America
2
Wang Y, Zhao W, Ross A, You L, Wang H, Zhou X. Revealing chronic disease progression patterns using Gaussian process for stage inference. J Am Med Inform Assoc 2024; 31:396-405. [PMID: 38055638 PMCID: PMC10797260 DOI: 10.1093/jamia/ocad230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 11/06/2023] [Accepted: 11/20/2023] [Indexed: 12/08/2023] Open
Abstract
OBJECTIVE: The early stages of chronic disease typically progress slowly, so symptoms are usually not noticed until the disease is advanced. Slow progression and heterogeneous manifestations make it challenging to model the transition from normal to disease status. As patient conditions are only observed at discrete timestamps with varying intervals, an incomplete understanding of disease progression and heterogeneity affects clinical practice and drug development.
MATERIALS AND METHODS: We developed the Gaussian Process for Stage Inference (GPSI) approach to uncover chronic disease progression patterns and assess the dynamic contribution of clinical features. We tested the ability of the GPSI to reliably stratify synthetic and real-world data for osteoarthritis (OA) in the Osteoarthritis Initiative (OAI), bipolar disorder (BP) in the Adolescent Brain Cognitive Development Study (ABCD), and hepatocellular carcinoma (HCC) in the UTHealth and The Cancer Genome Atlas (TCGA).
RESULTS: First, GPSI identified two subgroups of OA based on image features, where these subgroups corresponded to different genotypes, indicating the bone-remodeling and overweight-related pathways. Second, GPSI differentiated BP into two distinct developmental patterns and defined the contribution of specific brain region atrophy from early to advanced disease stages, demonstrating the ability of the GPSI to identify diagnostic subgroups. Third, HCC progression patterns were well reproduced in the two independent UTHealth and TCGA datasets.
CONCLUSION: Our study demonstrated that an unsupervised approach can disentangle temporal and phenotypic heterogeneity and identify population subgroups with common patterns of disease progression. Based on the differences in these features across stages, physicians can better tailor treatment plans and medications to individual patients.
Affiliation(s)
- Yanfei Wang
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States
- Weiling Zhao
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States
- Angela Ross
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States
- Lei You
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States
- Hongyu Wang
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States
  - Cizik School of Nursing, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States
- Xiaobo Zhou
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States
3
Krock ML, Kleiber W, Hammerling D, Becker S. Modeling massive highly-multivariate nonstationary spatial data with the basis graphical lasso. J Comput Graph Stat 2023. [DOI: 10.1080/10618600.2023.2174126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Affiliation(s)
- Mitchell L. Krock
  - Mathematics and Computer Science Division, Argonne National Laboratory
- William Kleiber
  - Department of Applied Mathematics, University of Colorado Boulder
- Dorit Hammerling
  - Department of Applied Mathematics and Statistics, Colorado School of Mines
- Stephen Becker
  - Department of Applied Mathematics, University of Colorado Boulder
4
Kaplan AD, Greene JD, Liu VX, Ray P. Unsupervised probabilistic models for sequential Electronic Health Records. J Biomed Inform 2022; 134:104163. [PMID: 36038064 PMCID: PMC10588733 DOI: 10.1016/j.jbi.2022.104163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 06/23/2022] [Accepted: 08/11/2022] [Indexed: 11/18/2022]
Abstract
We develop an unsupervised probabilistic model for heterogeneous Electronic Health Record (EHR) data. Utilizing a mixture model formulation, our approach directly models sequences of arbitrary length, such as medications and laboratory results. This allows for subgrouping and incorporation of the dynamics underlying heterogeneous data types. The model consists of a layered set of latent variables that encode underlying structure in the data. These variables represent subject subgroups at the top layer, and unobserved states for sequences in the second layer. We train this model on episodic data from subjects receiving medical care in the Kaiser Permanente Northern California integrated healthcare delivery system. The resulting properties of the trained model generate novel insight from these complex and multifaceted data. In addition, we show how the model can be used to analyze sequences that contribute to assessment of mortality likelihood.
Affiliation(s)
- Alan D Kaplan
  - Computational Engineering Division, Lawrence Livermore National Laboratory, 7000 East Ave., Livermore, CA 94550, United States of America
- John D Greene
  - Kaiser Permanente Division of Research, 2000 Broadway, Oakland, CA 94612, United States of America
- Vincent X Liu
  - Kaiser Permanente Division of Research, 2000 Broadway, Oakland, CA 94612, United States of America
- Priyadip Ray
  - Computational Engineering Division, Lawrence Livermore National Laboratory, 7000 East Ave., Livermore, CA 94550, United States of America
5
Soper BC, Cadena J, Nguyen S, Chan KHR, Kiszka P, Womack L, Work M, Duggan JM, Haller ST, Hanrahan JA, Kennedy DJ, Mukundan D, Ray P. OUP accepted manuscript. J Am Med Inform Assoc 2022; 29:864-872. [PMID: 35137149 PMCID: PMC8903413 DOI: 10.1093/jamia/ocac012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/19/2021] [Revised: 12/15/2021] [Accepted: 01/28/2022] [Indexed: 11/12/2022] Open
Abstract
Objective: The study sought to investigate the disease state–dependent risk profiles of patient demographics and medical comorbidities associated with adverse outcomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections.
Materials and Methods: A covariate-dependent, continuous-time hidden Markov model with 4 states (moderate, severe, discharged, and deceased) was used to model the dynamic progression of COVID-19 during the course of hospitalization. All model parameters were estimated using the electronic health records of 1362 patients from ProMedica Health System admitted between March 20, 2020 and December 29, 2020 with a positive nasopharyngeal PCR test for SARS-CoV-2. Demographic characteristics, comorbidities, vital signs, and laboratory test results were retrospectively evaluated to infer a patient’s clinical progression.
Results: The association between patient-level covariates and risk of progression was found to be disease state dependent. Specifically, while being male, being Black, or having a medical comorbidity were all associated with an increased risk of progressing from the moderate disease state to the severe disease state, these same factors were associated with a decreased risk of progressing from the severe disease state to the deceased state.
Discussion: Recent studies have not included analyses of the temporal progression of COVID-19, making the current study a unique modeling-based approach to understand the dynamics of COVID-19 in hospitalized patients.
Conclusion: Dynamic risk stratification models have the potential to improve clinical outcomes not only in COVID-19, but also in a myriad of other acute and chronic diseases that, to date, have largely been assessed only by static modeling techniques.
Affiliation(s)
- Braden C Soper
  - Corresponding Author: Braden C. Soper, PhD, Computing Directorate, Lawrence Livermore National Laboratory, 7000 East Ave, Livermore, CA 94550, USA
- Jose Cadena
  - Engineering Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Sam Nguyen
  - Engineering Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Kwan Ho Ryan Chan
  - Engineering Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Paul Kiszka
  - Information Technology Services, ProMedica Health System, Inc, Toledo, Ohio, USA
- Lucas Womack
  - Information Technology Services, ProMedica Health System, Inc, Toledo, Ohio, USA
- Mark Work
  - Information Technology Services, ProMedica Health System, Inc, Toledo, Ohio, USA
- Joan M Duggan
  - Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- Steven T Haller
  - Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- Jennifer A Hanrahan
  - Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- David J Kennedy
  - Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- Deepa Mukundan
  - Department of Pediatrics, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- Priyadip Ray
  - Engineering Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA