1
Genkin M, Hughes O, Engel TA. Learning non-stationary Langevin dynamics from stochastic observations of latent trajectories. Nat Commun 2021; 12:5986. [PMID: 34645828 PMCID: PMC8514604 DOI: 10.1038/s41467-021-26202-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/14/2021] [Accepted: 09/22/2021] [Indexed: 11/09/2022]
Abstract
Many complex systems operating far from equilibrium exhibit stochastic dynamics that can be described by a Langevin equation. Inferring Langevin equations from data can reveal how transient dynamics of such systems give rise to their function. However, dynamics are often inaccessible directly and can only be gleaned through a stochastic observation process, which makes the inference challenging. Here we present a non-parametric framework for inferring the Langevin equation, which explicitly models the stochastic observation process and non-stationary latent dynamics. The framework accounts for the non-equilibrium initial and final states of the observed system and for the possibility that the system's dynamics define the duration of observations. Omitting any of these non-stationary components results in incorrect inference, in which erroneous features arise in the dynamics due to the non-stationary data distribution. We illustrate the framework using models of neural dynamics underlying decision making in the brain.
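For orientation, the kind of latent trajectory this abstract refers to can be sketched with a generic Euler-Maruyama integrator for an overdamped Langevin equation, dx = F(x) dt + sqrt(2D) dW. This is only an illustrative forward simulation, not the authors' inference framework; the double-well force, the constant diffusion coefficient, and the names `simulate_langevin` and `double_well_force` are assumptions introduced here.

```python
import math
import random


def simulate_langevin(force, diffusion, x0, dt, n_steps, seed=0):
    """Euler-Maruyama integration of dx = force(x) dt + sqrt(2 D) dW."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    noise_scale = math.sqrt(2.0 * diffusion * dt)
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Deterministic drift plus a Gaussian increment with variance 2 D dt.
        x = x + force(x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def double_well_force(x):
    # Hypothetical potential U(x) = (x^2 - 1)^2, so F(x) = -dU/dx.
    return -4.0 * x * (x * x - 1.0)


# One noisy trajectory started at the barrier top (an assumed setup).
path = simulate_langevin(double_well_force, diffusion=0.5, x0=0.0,
                         dt=1e-3, n_steps=5000)
```

Setting `diffusion=0.0` reduces the integrator to plain gradient descent on the assumed potential, which is a convenient sanity check before any noise is added.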
Affiliation(s)
- Mikhail Genkin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
2
Hodge SR, Berg MA. Nonlinear measurements of kinetics and generalized dynamical modes. I. Extracting the one-dimensional Green's function from a time series. J Chem Phys 2021; 155:024122. [PMID: 34266246 DOI: 10.1063/5.0053422] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
Abstract
Often, a single correlation function is used to measure the kinetics of a complex system. In contrast, a large set of k-vector modes and their correlation functions are commonly defined for motion in free space. This set can be transformed to the van Hove correlation function, which is the Green's function for molecular diffusion. Here, these ideas are generalized to other observables. A set of correlation functions of nonlinear functions of an observable is used to extract the corresponding Green's function. Although this paper focuses on nonlinear correlation functions of an equilibrium time series, the results are directly connected to other types of nonlinear kinetics, including perturbation-response experiments with strong fields. Generalized modes are defined as the orthogonal polynomials associated with the equilibrium distribution. A matrix of mode-correlation functions can be transformed to the complete, single-time-interval (1D) Green's function. Diagonalizing this matrix finds the eigendecays. To understand the advantages and limitations of this approach, Green's functions are calculated for a number of models of complex dynamics within a Gaussian probability distribution. Examples of non-diffusive motion, rate heterogeneity, and range heterogeneity are examined. General arguments are made that a full set of nonlinear 1D measurements is necessary to extract all the information available in a time series. However, when a process is neither dynamically Gaussian nor Markovian, they are not sufficient. In those cases, additional multidimensional measurements are needed.
Affiliation(s)
- Stuart R Hodge
- Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208, USA
- Mark A Berg
- Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208, USA
3
Lee J, Brooks BR. Direct global optimization of Onsager-Machlup action using Action-CSA. Chem Phys 2020. [DOI: 10.1016/j.chemphys.2020.110768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
4
Yin S, Tien M, Yang H. Prior-Apprised Unsupervised Learning of Subpixel Curvilinear Features in Low Signal/Noise Images. Biophys J 2020; 118:2458-2469. [PMID: 32359407 PMCID: PMC7231927 DOI: 10.1016/j.bpj.2020.04.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/06/2019] [Revised: 03/07/2020] [Accepted: 04/09/2020] [Indexed: 11/16/2022]
Abstract
Many biophysical problems involve molecular and nanoscale targets moving next to a curvilinear track, e.g., a cytosolic cargo transported by motor proteins moving along a microtubule. For this type of problem, fluorescence imaging is usually the primary tool of choice. There is, however, an ∼20-fold mismatch between target-localization precision and track-imaging resolution such that questions requiring high-fidelity definition of the target's track remain inaccessible. On the other hand, if the contextual image of the tracks can be refined to a level comparable to that of the target, many intuitive yet mechanistically important issues can begin to be addressed. This work demonstrates that it is possible to statistically infer, to subpixel precision, curvilinear features in a low signal/noise image. This is achieved by a framework that consists of three stages: the Hessian-based feature enhancement, the subimage feature sampling and registration, and the statistical learning of the underlying curvilinear structure using a new, to our knowledge, method developed here for inferring the principal curves. In each stage, the descriptive prior information that the features come from curvilinear elements is explicitly taken into account. It is fully automated without user supervision, which is distinctly different from approaches that require user seeding or well-defined training data sets. Computer simulations of realistic images are used to investigate the performance of the framework and its implementation. The characterization results suggest that curvilinear features are refined to the same order of precision as that of the target and that the bootstrap confidence intervals from the analysis allow an estimate for the statistical bounds of the simulated "true" curve. 
Also shown are analyses of experimental images from three different microscopy modalities: two-photon laser-scanning microscopy, epifluorescence microscopy, and total internal reflection fluorescence microscopy. The practical application of this prior-apprised unsupervised learning framework as well as its potential outlook are discussed.
Affiliation(s)
- Shuhui Yin
- Department of Chemistry, Princeton University, Princeton, New Jersey
- Ming Tien
- Department of Biochemistry and Molecular Biology, Penn State University, University Park, Pennsylvania
- Haw Yang
- Department of Chemistry, Princeton University, Princeton, New Jersey
5
Abstract
Time series obtained from time-dependent experiments contain rich information on kinetics and dynamics of the system under investigation. This work describes an unsupervised learning framework, along with the derivation of the necessary analytical expressions, for the analysis of Gaussian-distributed time series that exhibit discrete states. After the time series has been partitioned into segments in a model-free manner using the previously developed change-point (CP) method, this protocol starts with an agglomerative hierarchical clustering algorithm to classify the detected segments into possible states. The initial state clustering is further refined using an expectation-maximization (EM) procedure, and the number of states is determined by a Bayesian information criterion (BIC). Also introduced here is an achievement scalarization function, usually seen in artificial intelligence literature, for quantitatively assessing the performance of state determination. The statistical learning framework, which comprises three stages (detection of signal change, clustering, and number-of-state determination), was thoroughly characterized using simulated trajectories with random intensity segments that have no underlying kinetics, and its performance was critically evaluated. The application to experimental data is also demonstrated. The results suggest that this general framework, the implementation of which is based on firm theoretical foundations and does not require the imposition of any kinetics model, is powerful in determining the number of states, the parameters contained in each state, and the associated statistical significance.
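The final stage named in this abstract, choosing the number of states by BIC, can be illustrated with a small sketch. It scores a hard assignment of points to Gaussian states using BIC = p ln(n) - 2 ln L and prefers the labeling with the lower score. This is a simplification assumed here for illustration (hard labels instead of EM responsibilities, one mean and one variance per state, 3k - 1 free parameters); it is not the analytical expressions derived in the paper, and `gaussian_bic` and the toy data are hypothetical.

```python
import math


def gaussian_bic(data, labels):
    """BIC of a hard-assigned Gaussian state model (lower is better)."""
    n = len(data)
    states = sorted(set(labels))
    log_lik = 0.0
    for s in states:
        members = [x for x, lab in zip(data, labels) if lab == s]
        m = len(members)
        mean = sum(members) / m
        # Maximum-likelihood variance, floored to avoid log(0).
        var = max(sum((x - mean) ** 2 for x in members) / m, 1e-12)
        # Contribution of the state weight m/n for each member point.
        log_lik += m * math.log(m / n)
        # Gaussian log-likelihood evaluated at the ML mean and variance.
        log_lik += -0.5 * m * (math.log(2.0 * math.pi * var) + 1.0)
    # Assumed parameter count: k means, k variances, k - 1 weights.
    n_params = 3 * len(states) - 1
    return n_params * math.log(n) - 2.0 * log_lik


# Toy data with two well-separated intensity levels (made up for this sketch).
data = [0.0, 0.1, -0.1, 0.05, -0.05, 5.0, 5.1, 4.9, 5.05, 4.95]
bic_one = gaussian_bic(data, [0] * 10)
bic_two = gaussian_bic(data, [0] * 5 + [1] * 5)
best_k = 2 if bic_two < bic_one else 1
```

In a fuller pipeline the candidate labelings would come from the clustering and EM stages for each trial number of states, with the same comparison applied across all of them.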
Affiliation(s)
- Hao Li
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Haw Yang
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
6
Morrell TE, Rafalska-Metcalf IU, Yang H, Chu JW. Compound Molecular Logic in Accessing the Active Site of Mycobacterium tuberculosis Protein Tyrosine Phosphatase B. J Am Chem Soc 2018; 140:14747-14752. [PMID: 30301350 DOI: 10.1021/jacs.8b08070] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Abstract
Protein tyrosine phosphatase B (PtpB) from Mycobacterium tuberculosis (Mtb) extends the bacteria's survival in hosts and hence is a potential target for Mtb-specific drugs. To study how Mtb-specific sequence insertions in PtpB may regulate access to its active site through large-amplitude conformational changes, we performed free-energy calculations using an all-atom explicit solvent model. Corroborated by biochemical assays, the results show that PtpB's active site is controlled via an "either/or" compound conformational gating mechanism, an unexpected discovery that Mtb has evolved to bestow a single enzyme with such intricate logical operations. In addition to providing unprecedented insights for its active-site surroundings, the findings also suggest new ways of inactivating PtpB.
Affiliation(s)
- Thomas E Morrell
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Haw Yang
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Jhih-Wei Chu
- Institute of Bioinformatics and Systems Biology, Department of Biological Science and Technology, and Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, Taiwan 30068, ROC
7
Chu JW, Yang H. Identifying the structural and kinetic elements in protein large-amplitude conformational motions. Int Rev Phys Chem 2017. [DOI: 10.1080/0144235x.2017.1283885] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/20/2023]
8
Sun X, Morrell TE, Yang H. Extraction of Protein Conformational Modes from Distance Distributions Using Structurally Imputed Bayesian Data Augmentation. J Phys Chem B 2016; 120:10469-10482. [PMID: 27642672 DOI: 10.1021/acs.jpcb.6b07767] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]
Abstract
Protein conformational changes are known to play important roles in assorted biochemical and biological processes. Driven by thermal motions of surrounding solvent molecules, such structural remodeling often occurs stochastically. Yet, regardless of how random the conformational reconfiguration may appear, it could in principle be described by a linear combination of a set of orthogonal modes which, in turn, are contained in the intramolecular distance distributions. The central challenge is how to obtain the distribution. This contribution proposes a Bayesian data-augmentation scheme to extract the predominant modes from only a few distance distributions, be they from computational sampling or directly from experiments such as single-molecule Förster-type resonance energy transfer (smFRET). The inference of the complete protein structure from insufficient data was recognized as isomorphic to the missing-data problem in Bayesian statistical learning. Using smFRET data as an example, the missing coordinates were deduced, given protein structural constraints and a limited number of smFRET distances; the Boltzmann weighting of each inferred protein structure was then evaluated using computational modeling to numerically construct the posterior density for the global protein conformation. The conformational modes were then determined from the iteratively converged overall conformational distribution using principal component analysis. Two examples were presented to illustrate these basic ideas as well as their practical implementation. The scheme described herein was based on the theory behind the powerful Tanner-Wang algorithm that guarantees convergence to the true posterior density.
However, instead of assuming a mathematical model to calculate the likelihood as in conventional statistical inference, here the protein structure was treated as a statistical parameter and was imputed from the numerical likelihood function based on structural information, a probability model-free method. The framework put forth here is anticipated to be generally applicable, offering a new way to articulate protein conformational changes in a quantifiable manner.
Affiliation(s)
- Xun Sun
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Thomas E Morrell
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Haw Yang
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
9
Bras W, Koizumi S, Terrill NJ. Beyond simple small-angle X-ray scattering: developments in online complementary techniques and sample environments. IUCrJ 2014; 1:478-91. [PMID: 25485128 PMCID: PMC4224466 DOI: 10.1107/s2052252514019198] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/22/2014] [Accepted: 08/25/2014] [Indexed: 05/20/2023]
Abstract
Small- and wide-angle X-ray scattering (SAXS, WAXS) are standard tools in materials research. The simultaneous measurement of SAXS and WAXS data in time-resolved studies has gained popularity due to the complementary information obtained. Furthermore, the combination of these data with non X-ray based techniques, via either simultaneous or independent measurements, has advanced understanding of the driving forces that lead to the structures and morphologies of materials, which in turn give rise to their properties. The simultaneous measurement of different data regimes and types, using either X-rays or neutrons, and the desire to control parameters that initiate and control structural changes have led to greater demands on sample environments. Examples of developments in technique combinations and sample environment design are discussed, together with a brief speculation about promising future developments.
Affiliation(s)
- Wim Bras
- Netherlands Organization for Scientific Research (NWO), DUBBLE@ESRF, BP 220, 6 Rue Jules Horowitz, Grenoble 38043, France
- Satoshi Koizumi
- College of Engineering, Ibaraki University, 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan
- Nicholas J Terrill
- Science Division, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, Oxfordshire OX11 0DE, UK
10
Vestergaard B, Sayers Z. Investigating increasingly complex macromolecular systems with small-angle X-ray scattering. IUCrJ 2014; 1:523-9. [PMID: 25485132 PMCID: PMC4224470 DOI: 10.1107/s2052252514020843] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 08/03/2014] [Accepted: 09/17/2014] [Indexed: 05/04/2023]
Abstract
The biological solution small-angle X-ray scattering (BioSAXS) field has undergone tremendous development over recent decades. This means that increasingly complex biological questions can be addressed by the method. An intricate synergy between advances in hardware and software development, data collection and evaluation strategies, and implementations that readily allow integration with complementary techniques has produced significant results and a rapidly growing user community with ever-increasing ambitions. Here, a review of these developments, including a selection of novel BioSAXS methodologies and recent results, is given.
Affiliation(s)
- Bente Vestergaard
- Department of Drug Design and Pharmacology, University of Copenhagen, Universitetsparken 2, Copenhagen, DK-2100, Denmark
- Zehra Sayers
- Faculty of Engineering and Natural Science, Sabanci University, Orhanli, Istanbul Tuzla 34956, Turkey
11
Haas KR, Yang H, Chu JW. Trajectory Entropy of Continuous Stochastic Processes at Equilibrium. J Phys Chem Lett 2014; 5:999-1003. [PMID: 26270979 DOI: 10.1021/jz500111p] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 06/04/2023]
Abstract
We propose to quantify the trajectory entropy of a dynamic system as the information content in excess of a free-diffusion reference model. The space-time trajectory is now the dynamic variable, and its path probability is given by the Onsager-Machlup action. For the time propagation of the overdamped Langevin equation, we solved the action path integral in the continuum limit and arrived at an exact analytical expression that emerged as a simple functional of the deterministic mean force and the stochastic diffusion. This work may have direct implications in chemical and phase equilibria, bond isomerization, and conformational changes in biological macromolecules, as well as transport problems in general.
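As background to the path probability mentioned in this abstract, the textbook form of the Onsager-Machlup action for a one-dimensional overdamped Langevin equation is sketched below. It assumes a constant diffusion coefficient D and a drift written as a(x) = \beta D F(x), with F the mean force; these symbols and the constant-D restriction are assumptions made here, and the paper's own notation and level of generality may differ. The second term in the integrand is the one that appears under the midpoint (Stratonovich) discretization of the path.

```latex
% Assumed model (not given in the abstract): constant D, drift a(x) = \beta D F(x).
\begin{align}
  \dot{x}(t) &= a(x) + \sqrt{2D}\,\xi(t),
  \qquad \langle \xi(t)\,\xi(t') \rangle = \delta(t - t'), \\
  P[x(t)] &\propto \exp\!\left(-S[x(t)]\right), \\
  S[x(t)] &= \int_0^{T} \left[ \frac{\left(\dot{x} - a(x)\right)^2}{4D}
            + \frac{1}{2}\,\frac{\partial a}{\partial x} \right] dt .
\end{align}
```

For free diffusion (a = 0), the action reduces to the integral of \dot{x}^2 / 4D, which corresponds to the free-diffusion reference model named in the abstract.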
Affiliation(s)
- Kevin R Haas
- Department of Chemical and Biomolecular Engineering, University of California-Berkeley, 201 Gilman Hall, Berkeley, California 94720, United States
- Haw Yang
- Department of Chemistry, Princeton University, Washington Road, Princeton, New Jersey 08544, United States
- Jhih-Wei Chu
- Department of Biological Science and Technology, National Chiao Tung University, 75 Bo-Ai Street, Hsinchu, Taiwan, ROC
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan, ROC