1
Diessner E, Thomas LJ, Butts CT. Production of Distinct Fibrillar, Oligomeric, and Other Aggregation States from Network Models of Multibody Interaction. J Chem Theory Comput 2024; 20. [PMID: 39259851 PMCID: PMC11448054 DOI: 10.1021/acs.jctc.4c00916] [Received: 07/15/2024] [Revised: 08/28/2024] [Accepted: 08/29/2024] [Indexed: 09/13/2024]
Abstract
Protein aggregation can produce a wide range of states, from fibrillar structures and oligomers to unstructured and semistructured gel phases. Recent work has shown that many of these states can be recapitulated by relatively simple, topological models specified in terms of multibody interaction energies, providing a direct connection between aggregate intermolecular forces and aggregation products. Here, we examine a low-dimensional network Hamiltonian model (NHM) based on four basic multibody interactions found in any aggregate system. We characterize the phase behavior of this NHM family, demonstrating the range of ordered and disordered aggregation states possible with this set of interactions. As we show, fibrils arise from a balance between elongation-inducing and contact-inhibiting forces, existing in a regime bounded by gel-like and disaggregated phases; complex oligomers (including annular oligomers resembling those thought to be toxic species in Alzheimer's disease) also form distinct phases in this regime, controlled in part by closure-inducing forces. We show that phase structure is largely independent of system size, allowing generalization to macroscopic systems, and provide evidence of a rich structure of minor oligomeric phases that can arise from appropriate conditions.
Affiliation(s)
- Elizabeth M. Diessner
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Loring J. Thomas
  - Department of Sociology, University of California, Irvine, California 92697, United States
- Carter T. Butts
  - Department of Sociology, University of California, Irvine, California 92697, United States
  - Departments of Statistics, Computer Science, and EECS, University of California, Irvine, California 92697, United States
2
Grazioli G, Tao A, Bhatia I, Regan P. Genetic Algorithm for Automated Parameterization of Network Hamiltonian Models of Amyloid Fibril Formation. J Phys Chem B 2024; 128:1854-1865. [PMID: 38359362 PMCID: PMC10910512 DOI: 10.1021/acs.jpcb.3c07322] [Received: 11/04/2023] [Revised: 01/07/2024] [Accepted: 02/05/2024] [Indexed: 02/17/2024]
Abstract
The time scales of long-time atomistic molecular dynamics simulations are typically reported in microseconds, while the time scales for experiments studying the kinetics of amyloid fibril formation are typically reported in minutes or hours. This time scale deficit of roughly 9 orders of magnitude presents a major challenge in the design of computer simulation methods for studying protein aggregation events. Coarse-grained molecular simulations offer a computationally tractable path forward for exploring the molecular mechanism driving the formation of these structures, which are implicated in diseases such as Alzheimer's, Parkinson's, and type-II diabetes. Network Hamiltonian models of aggregation are centered around a Hamiltonian function that returns the total energy of a system of aggregating proteins, given the graph structure of the system as an input. In the graph, or network, representation of the system, each protein molecule is represented as a node, and noncovalent bonds between proteins are represented as edges. The parameters, i.e., the set of coefficients that determine the degree to which each topological degree of freedom is favored or disfavored, must be determined for each network Hamiltonian model, and doing so is a well-known technical challenge. The genetic algorithm methodology is first demonstrated by beginning with an initial set of randomly parametrized models of low fibril fraction (<5% fibrillar), and evolving to subsequent generations of models, ultimately leading to high fibril fraction models (>70% fibrillar). The methodology is also demonstrated by applying it to optimizing previously published network Hamiltonian models for the 5 key amyloid fibril topologies that have been reported in the Protein Data Bank (PDB). The models generated by the AI produced fibril fractions that surpass previously published fibril fractions in 3 of 5 cases, including the most naturally abundant amyloid fibril topology, the 1,2 2-ribbon, which features a steric zipper. The authors also aim to encourage more widespread use of the network Hamiltonian methodology for fitting a wide variety of self-assembling systems by releasing a free open-source implementation of the genetic algorithm introduced here.
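The idea of a Hamiltonian that maps a graph to an energy can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, for illustration only, that the energy is a weighted sum of three graph statistics (edges, 2-stars, triangles). The function names and the coefficient values in `PHI` are hypothetical.

```python
from itertools import combinations


def graph_statistics(n_nodes, edges):
    """Return (edge count, 2-star count, triangle count) of an undirected graph."""
    adjacency = {v: set() for v in range(n_nodes)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    n_edges = sum(len(nb) for nb in adjacency.values()) // 2
    # A 2-star is a pair of edges sharing a node: choose 2 of each node's neighbors.
    two_stars = sum(len(nb) * (len(nb) - 1) // 2 for nb in adjacency.values())
    triangles = sum(
        1
        for a, b, c in combinations(range(n_nodes), 3)
        if b in adjacency[a] and c in adjacency[a] and c in adjacency[b]
    )
    return n_edges, two_stars, triangles


def hamiltonian(n_nodes, edges, coefficients):
    """Energy as a weighted sum of graph statistics (assumed linear form)."""
    stats = graph_statistics(n_nodes, edges)
    return sum(c * s for c, s in zip(coefficients, stats))


# Hypothetical coefficients: edges favored, 2-stars and triangles penalized.
PHI = (-2.0, 0.5, 1.0)

chain = [(0, 1), (1, 2), (2, 3)]                   # linear, fibril-like chain
triangle_tail = [(0, 1), (1, 2), (0, 2), (2, 3)]   # closed triangle plus one tail

energy_chain = hamiltonian(4, chain, PHI)
energy_triangle_tail = hamiltonian(4, triangle_tail, PHI)
```

With these illustrative coefficients the chain has the lower energy, so it is the favored topology. Parameterization, as described in the abstract, amounts to searching for coefficient values that favor the target structure.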
Affiliation(s)
- Gianmarc Grazioli
  - Department of Chemistry, San José State University, San Jose, California 95192, United States
- Andy Tao
  - Department of Chemistry, San José State University, San Jose, California 95192, United States
- Inika Bhatia
  - Department of Chemistry, San José State University, San Jose, California 95192, United States
- Patrick Regan
  - Department of Chemistry, San José State University, San Jose, California 95192, United States
3
Butts CT. Continuous Time Graph Processes with Known ERGM Equilibria: Contextual Review, Extensions, and Synthesis. The Journal of Mathematical Sociology 2023; 48:129-171. [PMID: 38681800 PMCID: PMC11043653 DOI: 10.1080/0022250x.2023.2180001] [Received: 03/14/2022] [Accepted: 10/17/2022] [Indexed: 05/01/2024]
Abstract
Graph processes that unfold in continuous time are of obvious theoretical and practical interest. Particularly useful are those whose long-term behavior converges to a graph distribution of known form. Here, we review some of the conditions for such convergence, and provide examples of novel and/or known processes that do so. These include subfamilies of the well-known stochastic actor oriented models, as well as continuum extensions of temporal and separable temporal exponential family random graph models. We also comment on some related threads in the broader work on network dynamics, which provide additional context for the continuous time case.
Graph processes that unfold in continuous time are natural models for social network dynamics: able to directly represent changes in structure as they unfold (rather than, e.g., as snapshots at discrete intervals), such models not only offer the promise of capturing dynamics at high temporal resolution, but are also easily mapped to empirical data without the need to preselect a level of granularity with respect to which the dynamics are defined. Although relatively few general frameworks of this type have been extensively studied, at least one (the stochastic actor-oriented models, or SAOMs) is arguably among the most successful and widely used families of models in the social sciences (see, e.g., Snijders (2001); Steglich et al. (2010); Burk et al. (2007); Sijtsema et al. (2010); de la Haye et al. (2011); Weerman (2011); Schaefer and Kreager (2020) among many others). Work using other continuous time graph processes has also found applications both within (Koskinen and Snijders, 2007; Koskinen et al., 2015; Stadtfeld et al., 2017; Hoffman et al., 2020) and beyond (Grazioli et al., 2019; Yu et al., 2020) the social sciences, suggesting the potential for further advances.
Affiliation(s)
- Carter T Butts
  - Departments of Sociology, Statistics, Computer Science, and EECS and Institute for Mathematical Behavioral Sciences, University of California Irvine
4
Diessner EM, Freites JA, Tobias DJ, Butts CT. Network Hamiltonian Models for Unstructured Protein Aggregates, with Application to γD-Crystallin. J Phys Chem B 2023; 127:685-697. [PMID: 36637342 PMCID: PMC10437096 DOI: 10.1021/acs.jpcb.2c07672] [Indexed: 01/14/2023]
Abstract
Network Hamiltonian models (NHMs) are a framework for topological coarse-graining of protein-protein interactions, in which each node corresponds to a protein, and edges are drawn between nodes representing proteins that are noncovalently bound. Here, this framework is applied to aggregates of γD-crystallin, a structural protein of the eye lens implicated in cataract disease. The NHMs in this study are generated from atomistic simulations of equilibrium distributions of wild-type and the cataract-causing variant W42R in solution, performed by Wong, E. K.; Prytkova, V.; Freites, J. A.; Butts, C. T.; Tobias, D. J. Molecular Mechanism of Aggregation of the Cataract-Related γD-Crystallin W42R Variant from Multiscale Atomistic Simulations. Biochemistry 2019, 58 (35), 3691-3699. Network models are shown to successfully reproduce the aggregate size and structure observed in the atomistic simulation, and provide information about the transient protein-protein interactions therein. The system size is scaled from the original 375 monomers to a system of 10000 monomers, revealing a lowering of the upper tail of the aggregate size distribution of the W42R variant. Extrapolation to higher and lower concentrations is also performed. These results provide an example of the utility of NHMs for coarse-grained simulation of protein systems, as well as their ability to scale to large system sizes and high concentrations, reducing computational costs while retaining topological information about the system.
Affiliation(s)
- Elizabeth M Diessner
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- J Alfredo Freites
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Douglas J Tobias
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Carter T Butts
  - Departments of Sociology, Statistics, Computer Science, and EECS, University of California, Irvine, California 92697, United States
5
Yin F, Butts CT. Highly scalable maximum likelihood and conjugate Bayesian inference for ERGMs on graph sets with equivalent vertices. PLoS One 2022; 17:e0273039. [PMID: 36018834 PMCID: PMC9417041 DOI: 10.1371/journal.pone.0273039] [Received: 10/26/2021] [Accepted: 08/02/2022] [Indexed: 11/18/2022]
Abstract
The exponential family random graph modeling (ERGM) framework provides a highly flexible approach for the statistical analysis of networks (i.e., graphs). As ERGMs with dyadic dependence involve normalizing factors that are extremely costly to compute, practical strategies for ERGM inference generally employ a variety of approximations or other workarounds. Markov Chain Monte Carlo maximum likelihood (MCMC MLE) provides a powerful tool to approximate the maximum likelihood estimator (MLE) of ERGM parameters, and is generally feasible for typical models on single networks with as many as a few thousand nodes. MCMC-based algorithms for Bayesian analysis are more expensive, and high-quality answers are challenging to obtain on large graphs. For both strategies, extension to the pooled case, in which we observe multiple networks from a common generative process, adds further computational cost, with both time and memory scaling linearly in the number of graphs. This becomes prohibitive for large networks, or cases in which large numbers of graph observations are available. Here, we exploit some basic properties of the discrete exponential families to develop an approach for ERGM inference in the pooled case that (where applicable) allows an arbitrarily large number of graph observations to be fit at no additional computational cost beyond preprocessing the data itself. Moreover, a variant of our approach can also be used to perform Bayesian inference under conjugate priors, again with no additional computational cost in the estimation phase. The latter can be employed either for single graph observations, or for observations from graph sets. As we show, the conjugate prior is easily specified, and is well-suited to applications such as regularization. Simulation studies show that the pooled method leads to estimates with good frequentist properties, and posterior estimates under the conjugate prior are well-behaved. We demonstrate the usefulness of our approach with applications to pooled analysis of brain functional connectivity networks and to replicated x-ray crystal structures of hen egg-white lysozyme.
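One standard exponential-family property relevant to the pooled case is that, for independent graphs drawn from the same model, the likelihood depends on the data only through the summed (or mean) sufficient statistics. The Python sketch below illustrates this for the simplest case, an edge-only model, where the pooled MLE has a closed form (the logit of the pooled edge density). It is an illustration under that assumption, not the authors' algorithm, and the function names are hypothetical.

```python
import math


def edge_count(edges):
    """Number of distinct undirected edges in an edge list."""
    return len({frozenset(edge) for edge in edges})


def pooled_edge_mle(graphs, n_nodes):
    """MLE of the edge parameter for an edge-only model fit to pooled graphs.

    Only the mean edge count across graphs is needed, so fitting costs the same
    for any number of graphs once that mean has been computed.
    Assumes the pooled density is strictly between 0 and 1.
    """
    n_dyads = n_nodes * (n_nodes - 1) // 2
    mean_edges = sum(edge_count(g) for g in graphs) / len(graphs)
    density = mean_edges / n_dyads
    return math.log(density / (1.0 - density))


# Three hypothetical graphs on 4 nodes (6 possible edges) with 2, 3, and 4 edges.
graphs = [
    [(0, 1), (2, 3)],
    [(0, 1), (1, 2), (2, 3)],
    [(0, 1), (1, 2), (2, 3), (0, 3)],
]
theta_hat = pooled_edge_mle(graphs, n_nodes=4)
```

Here the mean edge count is 3 out of 6 dyads, so the pooled density is 0.5 and the estimated edge parameter is 0. Models with dyadic dependence have no such closed form, but the same reduction to mean statistics applies to the data-dependent part of the likelihood.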
Affiliation(s)
- Fan Yin
  - Department of Statistics, University of California at Irvine, Irvine, CA, United States of America
- Carter T. Butts
  - Departments of Sociology, Statistics, Computer Science, and EECS, and Institute for Mathematical Behavioral Sciences, University of California at Irvine, Irvine, CA, United States of America
6
Duong VT, Diessner EM, Grazioli G, Martin RW, Butts CT. Neural Upscaling from Residue-Level Protein Structure Networks to Atomistic Structures. Biomolecules 2021; 11:biom11121788. [PMID: 34944432 PMCID: PMC8698800 DOI: 10.3390/biom11121788] [Received: 07/30/2021] [Revised: 11/11/2021] [Accepted: 11/19/2021] [Indexed: 01/01/2023]
Abstract
Coarse-graining is a powerful tool for extending the reach of dynamic models of proteins and other biological macromolecules. Topological coarse-graining, in which biomolecules or sets thereof are represented via graph structures, is a particularly useful way of obtaining highly compressed representations of molecular structures, and simulations operating via such representations can achieve substantial computational savings. A drawback of coarse-graining, however, is the loss of atomistic detail—an effect that is especially acute for topological representations such as protein structure networks (PSNs). Here, we introduce an approach based on a combination of machine learning and physically-guided refinement for inferring atomic coordinates from PSNs. This “neural upscaling” procedure exploits the constraints implied by PSNs on possible configurations, as well as differences in the likelihood of observing different configurations with the same PSN. Using a 1 μs atomistic molecular dynamics trajectory of Aβ1–40, we show that neural upscaling is able to effectively recapitulate detailed structural information for intrinsically disordered proteins, being particularly successful in recovering features such as transient secondary structure. These results suggest that scalable network-based models for protein structure and dynamics may be used in settings where atomistic detail is desired, with upscaling employed to impute atomic coordinates from PSNs.
Affiliation(s)
- Vy T. Duong
  - Department of Chemistry, University of California, Irvine, CA 92697, USA
- Elizabeth M. Diessner
  - Department of Chemistry, University of California, Irvine, CA 92697, USA
- Gianmarc Grazioli
  - Department of Chemistry, San Jose State University, San Jose, CA 95192, USA
- Rachel W. Martin
  - Department of Chemistry, University of California, Irvine, CA 92697, USA
  - Department of Molecular Biology & Biochemistry, University of California, Irvine, CA 92697, USA
  - Correspondence: R.W.M.; C.T.B.
- Carter T. Butts
  - Departments of Sociology, Statistics and Electrical Engineering & Computer Science, University of California, Irvine, CA 92697, USA
  - Correspondence: R.W.M.; C.T.B.
7
Sprague-Piercy MA, Rocha MA, Kwok AO, Martin RW. α-Crystallins in the Vertebrate Eye Lens: Complex Oligomers and Molecular Chaperones. Annu Rev Phys Chem 2021; 72:143-163. [PMID: 33321054 PMCID: PMC8062273 DOI: 10.1146/annurev-physchem-090419-121428] [Indexed: 12/12/2022]
Abstract
α-Crystallins are small heat-shock proteins that act as holdase chaperones. In humans, αA-crystallin is expressed only in the eye lens, while αB-crystallin is found in many tissues. α-Crystallins have a central domain flanked by flexible extensions and form dynamic, heterogeneous oligomers. Structural models show that both the C- and N-terminal extensions are important for controlling oligomerization through domain swapping. α-Crystallin prevents aggregation of damaged β- and γ-crystallins by binding to the client protein using a variety of binding modes. α-Crystallin chaperone activity can be compromised by mutation or posttranslational modifications, leading to protein aggregation and cataract. Because of their high solubility and their ability to form large, functional oligomers, α-crystallins are particularly amenable to structure determination by solid-state nuclear magnetic resonance (NMR) and solution NMR, as well as cryo-electron microscopy.
Affiliation(s)
- Marc A Sprague-Piercy
  - Department of Molecular Biology and Biochemistry, University of California, Irvine, California 92697, USA
- Megan A Rocha
  - Department of Chemistry, University of California, Irvine, California 92697, USA
- Ashley O Kwok
  - Department of Chemistry, University of California, Irvine, California 92697, USA
- Rachel W Martin
  - Department of Molecular Biology and Biochemistry, University of California, Irvine, California 92697, USA
  - Department of Chemistry, University of California, Irvine, California 92697, USA
8
Rocha MA, Sprague-Piercy MA, Kwok AO, Roskamp KW, Martin RW. Chemical Properties Determine Solubility and Stability in βγ-Crystallins of the Eye Lens. Chembiochem 2021; 22:1329-1346. [PMID: 33569867 PMCID: PMC8052307 DOI: 10.1002/cbic.202000739] [Received: 10/26/2020] [Revised: 12/17/2020] [Indexed: 11/10/2022]
Abstract
βγ-Crystallins are the primary structural and refractive proteins found in the vertebrate eye lens. Because crystallins are not replaced after early eye development, their solubility and stability must be maintained for a lifetime, which is even more remarkable given the high protein concentration in the lens. Aggregation of crystallins caused by mutations or post-translational modifications can reduce crystallin protein stability and alter intermolecular interactions. Common post-translational modifications that can cause age-related cataracts include deamidation, oxidation, and tryptophan derivatization. Metal ion binding can also trigger reduced crystallin solubility through a variety of mechanisms. Interprotein interactions are critical to maintaining lens transparency: crystallins can undergo domain swapping, disulfide bonding, and liquid-liquid phase separation, all of which can cause opacity depending on the context. Important experimental techniques for assessing crystallin conformation in the absence of a high-resolution structure include dye-binding assays, circular dichroism, fluorescence, light scattering, and transition metal FRET.
Affiliation(s)
- Megan A. Rocha
  - Department of Chemistry, University of California, Irvine, 1102 Natural Sciences 2, Irvine, CA 92697-2025 (USA)
- Marc A. Sprague-Piercy
  - Department of Molecular Biology and Biochemistry, University of California Irvine, 3205 McGaugh Hall, Irvine, CA 92697-2525
- Ashley O. Kwok
  - Department of Chemistry, University of California, Irvine, 1102 Natural Sciences 2, Irvine, CA 92697-2025 (USA)
- Kyle W. Roskamp
  - Department of Chemistry, University of California, Irvine, 1102 Natural Sciences 2, Irvine, CA 92697-2025 (USA)
- Rachel W. Martin
  - Department of Chemistry, University of California, Irvine, 1102 Natural Sciences 2, Irvine, CA 92697-2025 (USA)
  - Department of Molecular Biology and Biochemistry, University of California Irvine, 3205 McGaugh Hall, Irvine, CA 92697-2525